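Abstract (Chinese version, translated):
Experimental analysis of the probability distributions of feature parameters shows that, in the presence of noise, the features usually exhibit a bimodal distribution. Based on this observation, this paper proposes a new feature normalization method based on a two-Gaussian Gaussian mixture model (GMM) to improve the robustness of speech recognition systems. The method uses a finer-grained two-Gaussian model to represent the cumulative distribution function (CDF) of the features and, using the estimated CDF, transforms the features so that their distributions in both training and recognition are normalized to a standard Gaussian distribution, thereby improving recognition accuracy. Experimental results on the Aurora 2 and Aurora 3 databases show that the proposed method performs significantly better than the conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN) methods, and roughly on par with histogram equalization, a non-parametric feature normalization method.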
Abstract:
This paper proposes a new feature normalization approach based on a two-component Gaussian mixture model (GMM) to improve the robustness of speech recognition systems. Speech features in noisy environments usually follow bimodal distributions. To exploit this characteristic, we represent the cumulative distribution function (CDF) of the features with a finer-grained two-Gaussian model. The features are then transformed according to the estimated CDF, so that their distributions in both training and recognition are normalized to a standard Gaussian distribution, which improves recognition accuracy. Experimental results on the Aurora 2 and Aurora 3 tasks show that the proposed method performs significantly better than the conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN) methods, and is comparable to histogram equalization, a non-parametric method.
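
The CDF-based normalization described in the abstract can be illustrated with a minimal sketch for a single feature dimension. This is not the authors' implementation: the function names `fit_two_gaussians`, `gmm_cdf`, and `normalize` are hypothetical, and fitting the two-Gaussian model with plain EM, the half-split initialization, and the clipping constant `eps` are all assumptions, since the abstract does not say how the model is estimated. The sketch only shows the core idea: estimate a two-component CDF, then map each value through the inverse standard normal CDF.

```python
import math
from statistics import NormalDist


def fit_two_gaussians(x, n_iter=50, var_floor=1e-6):
    """Fit a 1-D two-component GMM with plain EM (assumed estimator).

    Returns (weights, means, variances), each a list of length 2.
    """
    xs = sorted(x)
    n = len(xs)
    half = n // 2
    # Assumed initialisation: means of the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    mean_all = sum(xs) / n
    var_all = max(sum((v - mean_all) ** 2 for v in xs) / n, var_floor)
    var = [var_all, var_all]
    w = [0.5, 0.5]

    for _ in range(n_iter):
        comps = [NormalDist(mu[k], math.sqrt(var[k])) for k in range(2)]
        # E-step: posterior probability of component 0 for every sample.
        r0 = []
        for v in x:
            p0 = w[0] * comps[0].pdf(v)
            p1 = w[1] * comps[1].pdf(v)
            total = p0 + p1
            r0.append(p0 / total if total > 0.0 else 0.5)
        resp = [r0, [1.0 - r for r in r0]]
        # M-step: re-estimate weights, means, and (floored) variances.
        for k in range(2):
            nk = max(sum(resp[k]), 1e-12)
            w[k] = nk / n
            mu[k] = sum(r * v for r, v in zip(resp[k], x)) / nk
            sq = sum(r * (v - mu[k]) ** 2 for r, v in zip(resp[k], x))
            var[k] = max(sq / nk, var_floor)
    return w, mu, var


def gmm_cdf(v, w, mu, var):
    """CDF of the two-Gaussian model: weighted sum of component CDFs."""
    return sum(
        w[k] * NormalDist(mu[k], math.sqrt(var[k])).cdf(v) for k in range(2)
    )


def normalize(x, w, mu, var, eps=1e-6):
    """Map each value to a standard Gaussian via z = Phi^{-1}(F_gmm(x))."""
    std_normal = NormalDist()
    out = []
    for v in x:
        # Clip so that inv_cdf never receives exactly 0 or 1.
        u = min(max(gmm_cdf(v, w, mu, var), eps), 1.0 - eps)
        out.append(std_normal.inv_cdf(u))
    return out
```

In use, one would presumably call `fit_two_gaussians` on the values of one feature dimension (for example, over an utterance) and then pass the same values to `normalize`, applying the identical procedure to training and test data; the choice of estimation window is likewise an assumption, not something the abstract specifies.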