
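Manifold Regularized Extreme Learning Machine for Language Recognition

XU Jia-Ming, ZHANG Wei-Qiang, YANG Deng-Zhou, LIU Jia, XIA Shan-Hong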

Citation: XU Jia-Ming, ZHANG Wei-Qiang, YANG Deng-Zhou, LIU Jia, XIA Shan-Hong. Manifold Regularized Extreme Learning Machine for Language Recognition. ACTA AUTOMATICA SINICA, 2015, 41(9): 1680-1685. doi: 10.16383/j.aas.2015.c140916

doi: 10.16383/j.aas.2015.c140916
Funds:

Supported by National Natural Science Foundation of China (61273268, 61370034, 61403224)

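Details
    Author biographies:

    ZHANG Wei-Qiang  Associate researcher at the Department of Electronic Engineering, Tsinghua University. Research interests: speech signal processing and machine learning. E-mail: wqzhang@tsinghua.edu.cn

    YANG Deng-Zhou  Ph.D. candidate at the Institute of Electronics, Chinese Academy of Sciences. Research interests: speech signal processing and machine learning. E-mail: yangdengzhou@sina.com

    LIU Jia  Professor at the Department of Electronic Engineering, Tsinghua University. Research interests: speech recognition and signal processing. E-mail: liuj@tsinghua.edu.cn

    XIA Shan-Hong  Professor at the Institute of Electronics, Chinese Academy of Sciences. Research interests: signal processing and sensing technology. E-mail: shxia@mail.ie.ac.cn

    Corresponding author:

    XU Jia-Ming  Ph.D. candidate at the Institute of Electronics, Chinese Academy of Sciences. Research interests: speech signal processing and machine learning. Corresponding author of this paper. E-mail: xujiaming09@sina.com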

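  • Abstract: Support vector machines (SVMs) have played an important role in language recognition. In recent years, the extreme learning machine (ELM) has been applied successfully in many fields. Compared with the SVM, the main advantages of the ELM are that it is very easy to implement and fast to train, and it usually achieves recognition performance close to, or even better than, that of the SVM. In view of these properties, this paper introduces the ELM into language recognition. To address the potential problems caused by the random initialization of the ELM model parameters, it proposes the manifold regularized extreme learning machine (MRELM) algorithm. Experimental results show that, in the Gaussian supervector (GSV) feature space, the algorithm clearly improves recognition performance on 30-second utterances relative to the SVM baseline system. The algorithm can also be applied successfully in the i-vector feature space, where it achieves recognition performance close to that of current mainstream scoring algorithms.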
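
The abstract names the MRELM algorithm but does not state its equations, so the following is only a minimal NumPy sketch of the general idea, not the authors' formulation. It assumes a single hidden layer with random, untrained sigmoid units (the usual ELM setup) and a graph-Laplacian penalty in the style of manifold regularization, giving the objective (1/2)||β||² + (C/2)||Hβ − T||² + (λ/2) tr(βᵀHᵀLHβ), which has a closed-form solution. The names `knn_laplacian`, `ManifoldRegularizedELM`, `C`, `lam`, and `k` are hypothetical and chosen for illustration.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized graph Laplacian L = D - W of a symmetrized k-NN graph.

    Assumption: binary edge weights; the paper may use a different graph.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # no self-neighbours
    nbrs = np.argsort(d2, axis=1)[:, :k]              # k nearest per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                            # make the graph symmetric
    return np.diag(W.sum(axis=1)) - W


class ManifoldRegularizedELM:
    """Illustrative ELM whose output weights carry a Laplacian penalty."""

    def __init__(self, n_hidden=64, C=10.0, lam=0.1, k=5, seed=0):
        self.n_hidden, self.C, self.lam, self.k, self.seed = n_hidden, C, lam, k, seed

    def _hidden(self, X):
        # Sigmoid hidden layer; A and b are random and never trained.
        return 1.0 / (1.0 + np.exp(-(X @ self.A + self.b)))

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.classes_ = np.unique(y)
        # One-vs-rest targets in {-1, +1}, one column per class.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        self.A = rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        L = knn_laplacian(X, self.k)
        # Stationary point of the objective in the lead-in:
        # (I + C H^T H + lam H^T L H) beta = C H^T T
        M = np.eye(self.n_hidden) + self.C * (H.T @ H) + self.lam * (H.T @ L @ H)
        self.beta = np.linalg.solve(M, self.C * (H.T @ T))
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]
```

In a language recognition setting, each row of `X` would be one utterance-level feature vector (for example a GSV or an i-vector) and `y` the language label. Setting `lam=0` reduces the sketch to an ordinary ridge-regularized ELM, which makes it easy to compare the two variants on the same random hidden layer.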
Publication history
  • Received:  2015-01-05
  • Revised:  2015-06-07
  • Published:  2015-09-20
