Studies on the Model Distance Normalization Approach in Text-Independent Speaker Verification
Abstract: Model distance normalization (D-Norm) is a useful score normalization technique in automatic speaker verification (ASV) systems. Its main advantage over other state-of-the-art score normalization approaches is that it requires neither additional speech data nor an external speaker population. However, it still has drawbacks: for example, in conventional D-Norm the Kullback-Leibler (KL) distance between models is estimated with a Monte-Carlo method, which is time-consuming and computationally costly. This paper investigates D-Norm and explores its principles from a perspective different from the original one. In addition, it proposes a simplified D-Norm approach that uses the upper bound of the KL divergence between two statistical speaker models as the measure of model distance. Experiments on the NIST 2006 SRE corpus show that the simplified D-Norm approach achieves system performance close to that of the conventional one, while the computational complexity is greatly reduced.
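For orientation, the following is a minimal LaTeX sketch of the two model-distance estimates contrasted in the abstract. It assumes diagonal-covariance GMM speaker models MAP-adapted (means only) from a common UBM, so that the two models $f$ and $g$ share mixture weights $w_i$, covariances $\Sigma_i$, and component correspondence; these assumptions and the symbols $N$, $M$, $\mu_i^{f}$, $\mu_i^{g}$ are illustrative and not taken from the paper itself.

% Monte-Carlo estimate used in conventional D-Norm, with samples x_n drawn from f:
\[
\mathrm{KL}(f \,\|\, g) \;\approx\; \frac{1}{N} \sum_{n=1}^{N} \log \frac{f(x_n)}{g(x_n)}
\]
% Closed-form upper bound (by convexity of the KL divergence over matched components);
% with shared covariances each Gaussian term reduces to a Mahalanobis distance:
\[
\mathrm{KL}(f \,\|\, g)
\;\le\; \sum_{i=1}^{M} w_i \,\mathrm{KL}\!\left(\mathcal{N}(\mu_i^{f},\Sigma_i)\,\big\|\,\mathcal{N}(\mu_i^{g},\Sigma_i)\right)
\;=\; \frac{1}{2} \sum_{i=1}^{M} w_i \,(\mu_i^{f}-\mu_i^{g})^{\top} \Sigma_i^{-1} (\mu_i^{f}-\mu_i^{g}).
\]

Under these assumptions the bound needs only a single pass over the $M$ component means instead of scoring many sampled frames, which is consistent with the reduction in computational complexity reported in the abstract.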