An Orthogonal Laplacian Language Recognition Approach

Yang Xu-Kui (杨绪魁); Qu Dan (屈丹); Zhang Wen-Lin (张文林)

doi:10.3724/SP.J.1004.2014.01812

cstr: 32138.14.SP.J.1004.2014.01812

1.
Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou 450002

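Funds:

Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA011603), the National Natural Science Foundation of China (61175017), and the Military Science Postgraduate Research Project of the PLA (2010JY0258-144)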

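Details

Author biography:
Qu Dan. Associate professor at the Institute of Information System Engineering, PLA Information Engineering University. Received a Ph.D. degree from PLA Information Engineering University in 2005. Research interests include speech signal processing and pattern recognition. E-mail: qudanqudan@sina.com

Corresponding author:
Yang Xu-Kui. Master's student at the Institute of Information System Engineering, PLA Information Engineering University. Research interests include language recognition, continuous speech recognition, and machine learning. E-mail: gzyangxk@163.com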

Publication history
- Received: 2013-03-12
- Revised: 2013-08-28
- Published: 2014-08-20


Abstract

An orthogonal Laplacian language recognition approach is proposed. After the i-vector of an utterance is extracted, an orthogonal locality preserving projection (OLPP) maps it from the total signal space into a subspace that carries language and channel information. Channel compensation is then applied to the mapped vector, and a support vector machine performs the final recognition. Although the i-vector preserves as much of the acoustic information in the speech as possible, it does not reveal the intrinsic structure of that information. OLPP removes information irrelevant to the acoustics and, on that basis, uncovers the intrinsic structure of the acoustic features, which effectively improves their discriminability. Experiments on the NIST LRE 2003 evaluation corpus show that the new approach reduces the average detection cost by 28.91% compared with the baseline system.

Key words: factor analysis, identity vector (i-vector), manifold learning, orthogonal locality preserving projection, language recognition
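
The projection and channel-compensation steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the synthetic 8-dimensional vectors stand in for real i-vectors, the graph uses k nearest neighbours with a heat kernel whose width is set by a median heuristic, and within-class covariance normalization (WCCN) is taken as the channel-compensation step, which the abstract does not specify. Each OLPP direction is found by solving the usual locality-preserving objective restricted to the orthogonal complement of the directions already chosen. The final SVM classifier is omitted.

```python
import numpy as np


def knn_heat_weights(X, k=5):
    """Symmetric k-NN affinity matrix with heat-kernel weights (rows of X are samples)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    t = np.median(d2[d2 > 0])          # kernel width; an assumed heuristic
    np.fill_diagonal(d2, np.inf)       # a sample is never its own neighbour
    W = np.zeros_like(d2)
    for i in range(X.shape[0]):
        nbrs = np.argsort(d2[i])[:k]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    return np.maximum(W, W.T)          # keep an edge if either end selected it


def olpp(X, n_components, k=5, reg=1e-6):
    """Return a (dim, n_components) projection matrix with orthonormal columns.

    Column j minimises a^T X^T L X a / a^T X^T D X a among vectors
    orthogonal to columns 0..j-1.
    """
    Xc = X - X.mean(axis=0)
    W = knn_heat_weights(Xc, k)
    D = np.diag(W.sum(axis=1))
    L = D - W                                   # graph Laplacian
    dim = Xc.shape[1]
    S = Xc.T @ D @ Xc + reg * np.eye(dim)       # regularised to stay positive definite
    T = Xc.T @ L @ Xc
    A = np.zeros((dim, 0))
    for _ in range(n_components):
        if A.shape[1] == 0:
            Q = np.eye(dim)
        else:                                   # orthonormal basis of the complement of span(A)
            U = np.linalg.svd(A, full_matrices=True)[0]
            Q = U[:, A.shape[1]:]
        Sq, Tq = Q.T @ S @ Q, Q.T @ T @ Q
        C_inv = np.linalg.inv(np.linalg.cholesky(Sq))
        M = C_inv @ Tq @ C_inv.T                # symmetric form of Tq u = lambda Sq u
        _, V = np.linalg.eigh((M + M.T) / 2)    # eigenvalues in ascending order
        a = Q @ (C_inv.T @ V[:, 0])             # eigenvector of the smallest eigenvalue
        A = np.hstack([A, (a / np.linalg.norm(a))[:, None]])
    return A


def wccn(Y, labels, reg=1e-6):
    """Return B with B B^T equal to the inverse within-class covariance of Y."""
    dim = Y.shape[1]
    Wc = np.zeros((dim, dim))
    classes = np.unique(labels)
    for c in classes:
        Yc = Y[labels == c] - Y[labels == c].mean(axis=0)
        Wc += Yc.T @ Yc / len(Yc)
    Wc = Wc / len(classes) + reg * np.eye(dim)
    return np.linalg.cholesky(np.linalg.inv(Wc))


# Synthetic stand-in data: 3 hypothetical languages, 20 vectors each, 8 dimensions.
rng = np.random.default_rng(0)
means = rng.normal(scale=3.0, size=(3, 8))
labels = np.repeat(np.arange(3), 20)
X = means[labels] + rng.normal(scale=0.5, size=(60, 8))

A = olpp(X, n_components=4)            # orthogonal projection matrix
Y = (X - X.mean(axis=0)) @ A           # projected vectors
B = wccn(Y, labels)                    # channel-compensation transform (assumed WCCN)
Z = Y @ B                              # vectors that would be passed to the SVM
```

The rows of `Z` are what a classifier such as an SVM would be trained on. Restricting each eigenproblem to the complement of the earlier directions keeps every step symmetric, so `numpy.linalg.eigh` can be used and the resulting basis is orthonormal by construction.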

