Manifold Regularized Extreme Learning Machine for Language Recognition

XU Jia-Ming, ZHANG Wei-Qiang, YANG Deng-Zhou, LIU Jia, XIA Shan-Hong

Citation: XU Jia-Ming, ZHANG Wei-Qiang, YANG Deng-Zhou, LIU Jia, XIA Shan-Hong. Manifold Regularized Extreme Learning Machine for Language Recognition. ACTA AUTOMATICA SINICA, 2015, 41(9): 1680-1685. doi: 10.16383/j.aas.2015.c140916


doi: 10.16383/j.aas.2015.c140916
Funding:

Supported by the National Natural Science Foundation of China (61273268, 61370034, 61403224)

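Details
    About the authors:

    ZHANG Wei-Qiang  Associate researcher at the Department of Electronic Engineering, Tsinghua University. Research interests: speech signal processing and machine learning. E-mail: wqzhang@tsinghua.edu.cn

    YANG Deng-Zhou  Ph.D. candidate at the Institute of Electronics, Chinese Academy of Sciences. Research interests: speech signal processing and machine learning. E-mail: yangdengzhou@sina.com

    LIU Jia  Professor at the Department of Electronic Engineering, Tsinghua University. Research interests: speech recognition and signal processing. E-mail: liuj@tsinghua.edu.cn

    XIA Shan-Hong  Researcher at the Institute of Electronics, Chinese Academy of Sciences. Research interests: signal processing and sensing technology. E-mail: shxia@mail.ie.ac.cn

    Corresponding author:

    XU Jia-Ming  Ph.D. candidate at the Institute of Electronics, Chinese Academy of Sciences. Research interests: speech signal processing and machine learning. Corresponding author of this paper. E-mail: xujiaming09@sina.com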


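  • Abstract: Support vector machines (SVMs) have played an important role in language recognition. In recent years, extreme learning machines (ELMs) have been applied successfully in many fields. Compared with SVM, the main advantages of ELM are that it is very easy to implement and fast to train, and that it usually achieves recognition performance close to, or even better than, that of SVM. In view of these properties, this paper introduces ELM into language recognition and, to address the potential problems caused by the random initialization of ELM model parameters, proposes the manifold regularized extreme learning machine (MRELM) algorithm. Experimental results show that, in the Gaussian supervector (GSV) feature space, the algorithm clearly improves recognition performance on 30-second utterances relative to the SVM baseline system. The algorithm can also be applied successfully in the i-vector feature space, where it achieves recognition performance close to that of current mainstream scoring algorithms.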
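
To make the idea in the abstract concrete, here is a minimal sketch of an ELM classifier with an optional manifold (graph Laplacian) penalty. It is not the authors' MRELM formulation, which this page does not give. It assumes a commonly used closed form, beta = (I/C + HᵀH + λ·HᵀLH)⁻¹ HᵀT, where H is the output of a random hidden layer, T holds ±1 class targets, and L is the Laplacian of a k-nearest-neighbour graph. The names `SketchELM` and `knn_graph_laplacian`, the tanh activation, the binary edge weights, and the toy 2-D data are all illustrative assumptions; the paper works on GSV and i-vector features.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Return the unnormalized Laplacian L = D - W of a symmetrized k-NN graph."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :k]                # indices of the k nearest neighbours
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0   # binary edge weights (assumption)
    W = np.maximum(W, W.T)                            # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


class SketchELM:
    """Illustrative ELM; lam > 0 adds a graph Laplacian penalty (hypothetical helper)."""

    def __init__(self, n_hidden=50, C=1.0, lam=0.0, k=5, seed=0):
        self.n_hidden, self.C, self.lam, self.k, self.seed = n_hidden, C, lam, k, seed

    def _hidden(self, X):
        # Random, untrained hidden layer with tanh activation (assumption).
        return np.tanh(X @ self.W_in + self.b_in)

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        n, d = X.shape
        self.W_in = rng.standard_normal((d, self.n_hidden))  # fixed after initialization
        self.b_in = rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        T = -np.ones((n, int(y.max()) + 1))                  # one-vs-rest +/-1 targets
        T[np.arange(n), y] = 1.0
        A = np.eye(self.n_hidden) / self.C + H.T @ H         # ridge-regularized normal matrix
        if self.lam > 0:
            # Penalize output differences between neighbouring training points.
            A = A + self.lam * (H.T @ knn_graph_laplacian(X, self.k) @ H)
        self.beta = np.linalg.solve(A, H.T @ T)              # only output weights are learned
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


# Toy usage on two synthetic 2-D classes (not language recognition data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (40, 2)), rng.normal(2.0, 0.5, (40, 2))])
y = np.repeat([0, 1], 40)
for lam in (0.0, 0.1):
    acc = np.mean(SketchELM(lam=lam).fit(X, y).predict(X) == y)
    print(f"lam={lam}: training accuracy {acc:.2f}")
```

Only the output weights `beta` are solved for, in a single linear system, which is why ELM training is fast. The Laplacian term is one way to limit the dependence on the random hidden layer by asking nearby inputs to produce similar outputs.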
Publication history
  • Received: 2015-01-05
  • Revised: 2015-06-07
  • Published: 2015-09-20
