An Orthogonal Laplacian Language Recognition Approach

YANG Xu-Kui, QU Dan, ZHANG Wen-Lin

Citation: YANG Xu-Kui, QU Dan, ZHANG Wen-Lin. An Orthogonal Laplacian Language Recognition Approach. ACTA AUTOMATICA SINICA, 2014, 40(8): 1812-1818. doi: 10.3724/SP.J.1004.2014.01812

doi: 10.3724/SP.J.1004.2014.01812
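Funds:

Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA011603), the National Natural Science Foundation of China (61175017), and the Research of the Military Science Graduate of PLA (2010JY0258-144)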

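Details
    Author biography:

    QU Dan  Associate professor at the School of Information System Engineering, PLA Information Engineering University. Received a Ph.D. degree from PLA Information Engineering University in 2005. Main research interests: speech signal processing and pattern recognition. E-mail: qudanqudan@sina.com

    Corresponding author:

    YANG Xu-Kui  Master's student at the School of Information System Engineering, PLA Information Engineering University. Main research interests: language recognition, continuous speech recognition, and machine learning. E-mail: gzyangxk@163.com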

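  • Abstract: An orthogonal Laplacian language recognition approach is proposed. After the i-vector of an utterance is extracted, orthogonal locality preserving projection (OLPP) is used for subspace mapping, taking the overall signal space to a subspace that contains language information plus channel information. Channel compensation is then applied to the projected vectors, and a support vector machine (SVM) performs the final recognition. Although the i-vector retains as much of the acoustic information in speech as possible, it does not reveal the intrinsic structure of that information. Beyond removing acoustically irrelevant information, OLPP also uncovers the intrinsic structure of the acoustic features, which effectively improves their discriminability. In experiments on the NIST LRE 2003 test data, the new approach reduces the average cost by 28.91% relative to the baseline system.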
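The processing chain described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under several assumptions, not the authors' implementation: synthetic Gaussian vectors stand in for real i-vectors; plain locality preserving projection followed by QR orthonormalisation stands in for the iterative OLPP procedure of Cai et al.; channel compensation is assumed to be within-class covariance normalization (WCCN, Hatch et al.), which the abstract does not name; and the final SVM stage is left out. All function names and parameter values are hypothetical.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbour graph from cosine similarity (rows = samples)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)            # a sample is not its own neighbour
    idx = np.argsort(-sim, axis=1)[:, :k]     # k most similar samples per row
    W = np.zeros_like(sim)
    rows = np.repeat(np.arange(X.shape[0]), k)
    W[rows, idx.ravel()] = 1.0
    return np.maximum(W, W.T)                 # symmetrise the graph


def lpp_orthonormal_basis(X, dim, k=5, reg=1e-6):
    """LPP directions, then QR-orthonormalised.

    Assumption: this is a simplification, not the iterative OLPP algorithm.
    """
    W = knn_affinity(X, k)
    D = np.diag(W.sum(axis=1))
    L = D - W                                 # graph Laplacian
    A = X.T @ L @ X                           # locality term
    B = X.T @ D @ X + reg * np.eye(X.shape[1])
    C = np.linalg.cholesky(B)                 # B = C C^T
    Ci = np.linalg.inv(C)
    M = Ci @ A @ Ci.T                         # A a = lambda B a as a symmetric problem
    M = (M + M.T) / 2
    _, evecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    P = Ci.T @ evecs[:, :dim]                 # smallest eigenvalues preserve locality
    Q, _ = np.linalg.qr(P)                    # orthonormal columns spanning the same subspace
    return Q


def within_class_cov(Y, labels):
    """Average of the per-class covariance matrices."""
    classes = np.unique(labels)
    S = np.zeros((Y.shape[1], Y.shape[1]))
    for c in classes:
        Yc = Y[labels == c]
        Yc = Yc - Yc.mean(axis=0)
        S += Yc.T @ Yc / len(Yc)
    return S / len(classes)


def wccn_matrix(Y, labels, reg=1e-6):
    """Return B with B B^T = S^{-1}, where S is the within-class covariance."""
    S = within_class_cov(Y, labels) + reg * np.eye(Y.shape[1])
    return np.linalg.cholesky(np.linalg.inv(S))


# Synthetic stand-in for i-vectors: 3 "languages", 40 utterances each, 20 dimensions.
rng = np.random.default_rng(0)
n_lang, per_lang, d = 3, 40, 20
labels = np.repeat(np.arange(n_lang), per_lang)
means = rng.normal(scale=3.0, size=(n_lang, d))
X = means[labels] + rng.normal(size=(labels.size, d))

Q = lpp_orthonormal_basis(X, dim=8)           # subspace mapping
Y = X @ Q                                     # projected vectors
B = wccn_matrix(Y, labels)                    # assumed channel compensation
Z = Y @ B                                     # vectors that would be passed to an SVM
print(Z.shape)  # (120, 8)
```

In this sketch the projection basis has orthonormal columns, and after compensation the within-class covariance of the projected vectors is close to the identity, which is the property WCCN is meant to provide before a linear classifier is trained.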
Publication history
  • Received:  2013-03-12
  • Revised:  2013-08-28
  • Published:  2014-08-20
