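Summary:
Two improvements are made to speaker recognition systems operating in noisy environments. First, to improve robustness, the noisy signal is decomposed into different frequency bands using wavelet bases at different scales, and denoising is then performed separately in each band with the TEO (Teager energy operator). In view of the characteristics of speaker recognition, the wavelet coefficients are weighted at the wavelet reconstruction stage. The outputs of the individual bands are then combined by wavelet reconstruction to recover the signal. Finally, a Mel filter bank converts the wavelet coefficients into MFCC. Second, to further improve recognition performance and training speed, a modified OGMM (orthogonal Gaussian mixture model) is used at the recognition stage: the orthogonal transform is moved ahead of the EM algorithm, so it no longer has to be recomputed in every EM iteration. Experiments show that the proposed DWT-TEO parameters work well for speaker recognition, and that the modified OGMM further improves recognition performance and training speed.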
Abstract:
Two modifications to a speaker recognition system for noisy environments are described. First, to improve the robustness of the system, noisy speech is decomposed into several frequency bands, and de-noising is carried out in each band with the TEO (Teager energy operator). The wavelet coefficients are weighted according to the characteristics of speaker recognition and are then transformed into MFCC. Second, to improve recognition performance and training speed, a modified OGMM (orthogonal Gaussian mixture model), in which the orthogonal transform is performed before the EM algorithm, is applied at the recognition stage. As a result, the orthogonal operation no longer has to be repeated in every EM iteration. The experimental results show that the proposed parameters perform well for speaker recognition and that the modified OGMM further improves recognition performance and training speed.
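
For readers unfamiliar with the TEO, the following is a minimal sketch of the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. It shows only the operator itself. How the operator output is used to suppress noise in each wavelet band is not specified in this abstract and is not reproduced here. The function name `teager_energy` and the test tone parameters are illustrative assumptions, not taken from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, i.e. len(x) - 2 values.
    (Hypothetical helper name; boundary samples are simply dropped.)
    """
    if len(x) < 3:
        return []
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (assumed values): a pure tone A * cos(omega * n).
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]

# For a pure tone the operator output is constant: A^2 * sin(omega)^2.
energy = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure tone the operator output is the constant A^2 sin^2(omega), so it tracks both amplitude and frequency. In a band-wise scheme like the one described above, such an operator would presumably be applied to the coefficients of each frequency band separately.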