Fast Language Model Look-ahead Algorithm Using Extended N-gram Model

SHAN Yu-Xiang, CHEN Xie, SHI Yong-Zhe, LIU Jia

Citation: SHAN Yu-Xiang, CHEN Xie, SHI Yong-Zhe, LIU Jia. Fast Language Model Look-ahead Algorithm Using Extended N-gram Model. ACTA AUTOMATICA SINICA, 2012, 38(10): 1618-1626. doi: 10.3724/SP.J.1004.2012.01618


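  • Abstract: For large-vocabulary continuous speech recognizers built on dynamic decoding networks, this paper proposes a fast language model (LM) look-ahead method that uses an extended N-gram model. The extended N-gram model unifies the representation and score computation of the language model and the LM look-ahead tree, which greatly simplifies the decoder implementation, substantially speeds up LM look-ahead, and makes high-order LM look-ahead feasible. The extended N-gram model is generated offline before decoding; the generation process exploits the sparsity of N-grams to accelerate computation, and uses word-end node pushing and score quantization to compress the model's storage size. Experiments show that, compared with the traditional method of computing LM look-ahead scores by dynamic programming on the fly during decoding, the proposed method speeds up the whole recognition system by a factor of 5 to 9 at the same character error rate, and that high-order LM look-ahead yields better decoding speed and accuracy than low-order look-ahead.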
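The abstract does not describe the extended N-gram data structure itself, so the following is only a minimal sketch of the quantity being accelerated, not the authors' method. It assumes the usual definition of an LM look-ahead score (the best LM log-probability of any word reachable below a node of the pronunciation prefix tree) and computes it bottom-up, which is the dynamic-programming baseline the abstract refers to. It also shows a generic uniform quantizer as one possible reading of "score quantization". The `Node` layout, the function names, the toy lexicon, and the 8-bit range are all hypothetical, and homophones (several words ending at one node) are ignored for brevity.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Node of a pronunciation prefix tree (hypothetical layout)."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    word: Optional[str] = None  # word that ends at this node, if any


def build_tree(lexicon):
    """Build a prefix tree from {word: [phone, ...]}."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.children.setdefault(phone, Node())
        node.word = word
    return root


def lookahead_scores(node, logp, prefix=(), table=None):
    """Map each phone prefix to the best log-prob of any word below it.

    logp holds log P(word | history) for one fixed history; a decoder
    would need one such table per active history.
    """
    if table is None:
        table = {}
    best = logp[node.word] if node.word is not None else -math.inf
    for phone, child in node.children.items():
        child_prefix = prefix + (phone,)
        lookahead_scores(child, logp, child_prefix, table)
        best = max(best, table[child_prefix])
    table[prefix] = best
    return table


def quantize(score, lo, hi, bits=8):
    """Uniformly quantize a log-prob in [lo, hi] to an integer code."""
    levels = (1 << bits) - 1
    clipped = min(max(score, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, bits=8):
    """Recover an approximate log-prob from its integer code."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)


# Toy data, invented for illustration only.
lexicon = {"cat": ["k", "ae", "t"], "cab": ["k", "ae", "b"], "dog": ["d", "ao", "g"]}
logp = {"cat": -1.0, "cab": -3.0, "dog": -2.0}

table = lookahead_scores(build_tree(lexicon), logp)
codes = {p: quantize(s, lo=-10.0, hi=0.0) for p, s in table.items()}
```

Computing `table` once per history ahead of time, instead of during the search, is the general idea behind moving look-ahead work offline; how the paper organizes and compresses those tables is not recoverable from the abstract.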
Publication history
  • Received: 2012-01-01
  • Revised: 2012-03-22
  • Published: 2012-10-20
