A Survey on Low-Resource Neural Machine Translation

LI Hong-Zheng, FENG Chong, HUANG He-Yan

Citation: LI Hong-Zheng, FENG Chong, HUANG He-Yan. A survey on low-resource neural machine translation. Acta Automatica Sinica, 2020, 45(x): 1−15. doi: 10.16383/j.aas.c200103

doi: 10.16383/j.aas.c200103
Funds: Supported by China Postdoctoral Science Foundation (2018M640069), National Natural Science Foundation of China (61902024, 61732005), and National Key R&D Program of China (2018YFC0832104)
Author biographies:

    LI Hong-Zheng: Postdoctoral researcher and assistant research fellow at the School of Computer Science, Beijing Institute of Technology. He received his Ph.D. from the Institute of Chinese Information Processing, Beijing Normal University, in 2018. His research interests include natural language processing and machine translation. E-mail: lihongzheng@bit.edu.cn

    FENG Chong: Associate research fellow at the School of Computer Science, Beijing Institute of Technology. He received his Ph.D. from the Department of Computer Science, University of Science and Technology of China, in 2005. His research interests include natural language processing, information extraction, and machine translation. E-mail: fengchong@bit.edu.cn

    HUANG He-Yan: Professor at the School of Computer Science, Beijing Institute of Technology. She received her Ph.D. in computer science and technology from the Institute of Computing Technology, Chinese Academy of Sciences, in 1989. Her research interests include natural language processing, machine translation, social networks, information retrieval, and intelligent processing systems. Corresponding author of this paper. E-mail: hhy63@bit.edu.cn

Abstract: Neural machine translation (NMT), the current mainstream approach to machine translation, has achieved major breakthroughs, and translation quality for many resource-rich languages continues to improve; for low-resource languages, however, translation performance remains unsatisfactory. Low-resource machine translation is one of the major research hotspots in the field and has attracted wide attention at home and abroad in recent years. This paper presents a fairly comprehensive review of research on low-resource machine translation. It first briefly introduces the academic activities and datasets related to low-resource translation, then surveys the main research methods and some of their findings, and summarizes the characteristics of each class of methods. On this basis, it outlines the relations among the different methods and analyzes the current state of research. Finally, it discusses possible future research trends and directions for low-resource machine translation and offers related suggestions.

Fig. 1  Pivot-based method (a) and MELE method (b)
Fig. 2  Pivot-based method (a) and "teacher-student" method (b)
Fig. 3  Transfer learning, multilingual transfer learning, and meta learning
Fig. 4  Unsupervised NMT
Fig. 5  Data augmentation framework, where 1) and 2) are traditional data augmentation methods and 3) and 4) are newly proposed ones
Fig. 6  Relations between the translation methods
Fig. 7  Main methods and techniques in WMT2019
Fig. 8  Advantages and limitations of the translation methods
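
Figs. 1 and 2 refer to pivot-based translation, which bridges a low-resource source-target pair through a resource-rich pivot language (typically English). A minimal sketch of that idea only, in Python; the train_nmt and translate helpers are hypothetical stand-ins for illustration, not the API of any system surveyed here:

    # Pivot-based translation (cf. Figs. 1-2): chain a source->pivot model
    # with a pivot->target model. train_nmt/translate are dummy stand-ins.
    def train_nmt(pairs):
        # Stand-in "training": a real system would fit an encoder-decoder here.
        return {"memory": dict(pairs)}

    def translate(model, sentence):
        # Stand-in "decoding": look the sentence up, else return it unchanged.
        return model["memory"].get(sentence, sentence)

    def pivot_translate(sentence, src_pivot_pairs, pivot_tgt_pairs):
        src2pivot = train_nmt(src_pivot_pairs)   # rich source-pivot data
        pivot2tgt = train_nmt(pivot_tgt_pairs)   # rich pivot-target data
        return translate(pivot2tgt, translate(src2pivot, sentence))

    # Example with English as the pivot:
    print(pivot_translate("hola", [("hola", "hello")], [("hello", "bonjour")]))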

Table 1  Data resources related to low-resource machine translation

    Dataset: description
    WMT data: English-low-resource-language datasets released by WMT; currently the most widely used datasets in this research area.
    IWSLT data: low-resource translation datasets released by the IWSLT spoken language translation campaign.
    WAT data: translation datasets for low-resource Asian languages released by WAT.
    LORELEI data: low-resource-language-English bilingual datasets developed by DARPA.
    JW300 [13]: a corpus covering bilingual data for more than 300 languages.
    WikiMatrix [14]: a corpus built by Facebook, containing parallel Wikipedia corpora for 85 languages.
    FLORES: English-Nepali and English-Sinhala bilingual datasets developed by Facebook.
    Indian Language Corpora Initiative (ILCI) corpus [15]: parallel corpora between 11 Indian languages and English.
    Asian Language Treebank [16]: parallel corpora between English and 9 Southeast Asian languages, including Indonesian and Lao.

Table 2  Literature using more than one MT method

    References: methods used
    [84-87]: multilingual, transfer learning
    [89]: multilingual, back-translation, domain transfer
    [18], [23], [49]: multilingual, pivot-based methods
    [102]: multilingual, unsupervised methods
    [41-45], [58]: back-translation, semi-supervised methods
    [68], [71]: data augmentation, pivot-based methods
    [56]: data augmentation, multi-task methods
    [39]: transfer learning, semi-supervised methods
    [36]: transfer learning, pivot-based methods

Table 3  Usage of the methods in WMT2019

    Method: frequency
    Back-translation: 45
    Iterative back-translation: 19
    Transfer learning and fine-tuning: 24
    Additional languages (including pivot languages and multilingual models): 12
    Unsupervised methods: 9
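
Back-translation, the most frequent technique in Table 3, enlarges a small parallel corpus with synthetic pairs produced by a reverse-direction model. A minimal sketch of the idea, again with hypothetical train_nmt/translate stand-ins rather than any surveyed system's interface:

    # Back-translation (cf. Table 3): train a target->source model on the
    # small parallel corpus, use it to turn target-side monolingual text
    # into synthetic sources, then train the forward model on both.
    def train_nmt(pairs):
        return {"memory": dict(pairs)}  # stand-in "model"

    def translate(model, sentences):
        return [model["memory"].get(s, s) for s in sentences]  # stand-in

    def back_translate(parallel, tgt_monolingual):
        reverse = train_nmt([(t, s) for s, t in parallel])    # target -> source
        synthetic_src = translate(reverse, tgt_monolingual)   # pseudo-sources
        synthetic = list(zip(synthetic_src, tgt_monolingual))
        return train_nmt(parallel + synthetic)                # forward model

    forward = back_translate([("hello", "bonjour")], ["bonjour", "monde"])
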
References

    [1] Kalchbrenner N, Blunsom P. Recurrent continuous translation models. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Seattle, USA. 2013. 1700−1709
    [2] Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks. In: Proceedings of Advances in Neural Information Processing Systems. Montreal, Canada. 2014. 3104−3112
    [3] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. In: 3rd International Conference on Learning Representations. San Diego, USA. 2015
    [4] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, et al. Attention is all you need. In: Proceedings of 31st Conference on Neural Information Processing Systems. Long Beach, USA. 2017. 5998−6008
[5] Liu Yang. Recent advances in neural machine translation. Journal of Computer Research and Development, 2017, 54(6): 1144−1149 doi: 10.7544/issn1000-1239.2017.20160805 (in Chinese)
    [6] Li Ya-Chao, Xiong De-Yi, Zhang Min. A survey of neural machine translation. Chinese Journal of Computers, 2018, 41(12): 2734−2755 doi: 10.11897/SP.J.1016.2018.02734 (in Chinese)
    [7] Lin Qian, Liu Qing, Su Jin-Song, Lin Huan, Yang Jing, Luo Bin. Focuses and frontiers tendency in neural machine translation research. Journal of Chinese Information Processing, 2019, 33(11): 1−14 doi: 10.3969/j.issn.1003-0077.2019.11.001 (in Chinese)
    [8] Zhao Yang, Zhou Long, Wang Qian, Ma Cong, Liu Yu-Chen, Wang Yi-Ning, et al. The study on ethnic-to-Chinese scarce-resource neural machine translation. Journal of Jiangxi Normal University (Natural Science Edition), 2019, 43(6): 630−637 (in Chinese)
    [9] Bojar O, Chatterjee R, Federmann C, Graham Y, Haddow B, Huck M, et al. Findings of the 2016 conference on machine translation. In: Proceedings of the 1st Conference on Machine Translation. Berlin, Germany. 2016. 131−198
    [10] Bojar O, Chatterjee R, Federmann C, Graham Y, Haddow B, Huang S, et al. Findings of the 2017 conference on machine translation. In: Proceedings of the 2nd Conference on Machine Translation. Copenhagen, Denmark. 2017. 169−214
    [11] Bojar O, Federmann C, Fishel M, Graham Y, Haddow B, Koehn P, et al. Findings of the 2018 conference on machine translation. In: Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Brussels, Belgium. 2018. 272–303
    [12] Barrault L, Bojar O, Costa-jussa M, Federmann C, Fishel M, Graham Y, et al. Findings of the 2019 conference on machine translation. In: Proceedings of the 4th Conference on Machine Translation: Shared Task Papers. Florence, Italy. 2019. 1−61
    [13] Agić Ž, Vulić I. JW300: a wide-coverage parallel corpus for low-resource languages. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 3204–3210
    [14] Schwenk H, Chaudhary V, Sun S, Gong H, Guzmán F. WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia. arXiv preprint arXiv: 1907.05791, 2019
    [15] Jha G N. The TDIL program and the Indian language corpora initiative (ILCI). In: Proceedings of the 7th International Conference on Language Resources and Evaluation. Valletta, Malta. 2010. 982−985
    [16] Thu Y K, Pa W P, Utiyama M, Finch A, Sumita E. Introducing the Asian language treebank (ALT). In: Proceedings of the 10th International Conference on Language Resources and Evaluation. Portorož, Slovenia. 2016. 1574−1578
    [17] Ahmadnia B, Serrano J, Haffari G. Persian-Spanish low-resource statistical machine translation through English as pivot language. In: Proceedings of the International Conference Recent Advances in Natural Language Processing. Varna, Bulgaria. 2017. 24−30
    [18] Johnson M, Schuster M, Le Q V, Krikun M, Wu Y H, Chen Z, et al. Google’s multilingual neural machine translation system: enabling zero-shot translation. Transactions of the Association for Computational Linguistics, 2017, 5: 339−351 doi: 10.1162/tacl_a_00065
    [19] Cheng Y, Yang Q, Liu Y, Sun M S, Xu W. Joint training for pivot-based neural machine translation. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne, Australia. 2017. 3974−3980
    [20] Zheng H, Cheng Y, Liu Y. Maximum expected likelihood estimation for zero-resource neural machine translation. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne, Australia. 2017. 4251−4257
    [21] Chen Y, Liu Y, Cheng Y, Li V O. A teacher-student framework for zero-resource neural machine translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada. 2017. 1925−1935
    [22] Ren S, Chen W, Liu S, Li M, Zhou M, Ma S. Triangular architecture for rare language translation. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia. 2018. 56−65
    [23] Lakew S M, Lotito Q F, Negri M, Turchi M, Federico M. Improving zero-shot translation of low-resource languages. In: Proceedings of International Workshop on Spoken Language Translation. Tokyo, Japan. 2017. 113−119
    [24] Nakayama H, Nishida N. Zero-resource machine translation by multimodal encoder–decoder network with multimedia pivot. Machine Translation, 2017, 31: 49−64 doi: 10.1007/s10590-017-9197-z
    [25] Chowdhury K D, Hasanuzzaman M, Liu Q. Multimodal neural machine translation for low-resource language pairs using synthetic data. In: Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP. Melbourne, Australia. 2018. 33−42
    [26] Pan S J, Yang Q. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345−1359
    [27] Ruder S. Neural Transfer Learning for Natural Language Processing [Ph.D. dissertation], National University of Ireland, Galway, 2019
    [28] Zoph B, Yuret D, May J, Knight K. Transfer learning for low-resource neural machine translation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas. 2016. 1568−1575
    [29] Nguyen T Q, Chiang D. Transfer learning across low-resource, related languages for neural machine translation. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing. Taipei, China. 2017. 296−301
    [30] Dabre R, Nakagawa T, Kazawa H. An empirical study of language relatedness for transfer learning in neural machine translation. In: Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation. Cebu City, Philippines. 2017. 282−286
    [31] Kocmi T, Bojar O. Trivial transfer learning for low-resource neural machine translation. In: Proceedings of the Third Conference on Machine Translation, Brussels, Belgium. 2018. 244−252
    [32] Gu J T, Hassan H, Devlin J, Li V O. Universal neural machine translation for extremely low resource languages. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. New Orleans, USA. 2018. 344−354
    [33] Gu J T, Wang Y, Chen Y, Cho K, Li V O. Meta-learning for low-resource neural machine translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 3622–3631
    [34] Li R M, Wang X, Yu H. MetaMT, a meta learning method leveraging multiple domain data for low resource machine translation. In: 34th AAAI Conference on Artificial Intelligence. New York, USA. 2020
    [35] Kim Y, Gao Y B, Ney H. Effective cross-lingual transfer of neural machine translation models without shared vocabularies. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 1246–1257
    [36] Kim Y, Petrov P, Petrushkov P, Khadivi S, Ney H. Pivot-based transfer learning for neural machine translation between non-English languages. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong, China. 2019. 866–876
    [37] Ji B J, Zhang Z R, Duan X Y, Zhang M, Chen B X, Luo W H. Cross-lingual pre-training based transfer for zero-shot neural machine translation. In: 34th AAAI Conference on Artificial Intelligence. New York, USA. 2020
    [38] Cheng Y, Xu W, He Z H, He W, Wu H, Sun M S, et al. Semi-supervised learning for neural machine translation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany. 2016. 1965−1974
    [39] Skorokhodov I, Rykachevskiy A, Emelyanenko D, Slotin S, Ponkratov A. Semi-supervised neural machine translation with language models. In: Proceedings of AMTA 2018 Workshop on Technologies for MT of Low Resource Languages. Boston, USA. 2018. 37–44
    [40] Gulcehre C, Firat O, Xu K, Cho K, Barrault L, Lin H C, et al. On integrating a language model into neural machine translation. Computer Speech and Language, 2017, 45: 137−148 doi: 10.1016/j.csl.2017.01.014
    [41] Zheng Z X, Zhou H, Huang S J, Li L, Dai X Y, Chen J J. Mirror-generative neural machine translation. In: 8th International Conference on Learning Representations. Addis Ababa, Ethiopia. 2020
    [42] Lample G, Conneau A, Denoyer L, Ranzato M A. Unsupervised machine translation using monolingual corpora only. In: 6th International Conference on Learning Representations. Vancouver, Canada. 2018
    [43] Lample G, Ott M, Conneau A, Denoyer L, Ranzato MA. Phrase-based & neural unsupervised machine translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 5039–5049
    [44] Artetxe M, Labaka G, Agirre E, Cho K. Unsupervised neural machine translation. In: 6th International Conference on Learning Representations. Vancouver, Canada. 2018
    [45] Artetxe M, Labaka G, Agirre E. An effective approach to unsupervised machine translation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 194−203
    [46] Artetxe M, Labaka G, Agirre E. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia. 2018. 789−798
    [47] Artetxe M, Labaka G, Agirre E. Generalizing and improving bilingual word embedding mappings with a multi-step framework of linear transformations. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, USA. 2018. 5012−5019
    [48] Yang Z, Chen W, Wang F, Xu B. Unsupervised neural machine translation with weight sharing. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia. 2018. 46−55
    [49] Gu J T, Wang Y, Cho K, Li V O. Improved zero-shot neural machine translation via ignoring spurious correlations. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 1258−1268
    [50] Liu Y H, Gu J T, Goyal N, Li X, Edunov S, Ghazvininejad M, et al. Multilingual denoising pre-training for neural machine translation. arXiv preprint arXiv: 2001.08210, 2020
    [51] Artetxe M, Labaka G, Agirre E. Learning bilingual word embeddings with (almost) no bilingual data. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada. 2017. 451−462
    [52] Conneau A, Lample G. Cross-lingual language model pretraining. In: Proceedings of 33rd Conference on Neural Information Processing Systems. Vancouver, Canada. 2019. 1−11
    [53] Pourdamghani N, Aldarrab N, Ghazvininejad M, Knight K, May J. Translating translationese: a two-step approach to unsupervised machine translation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 3057−3062
    [54] Leng Y C, Tan X, Qin T, Li X Y, Liu T Y. Unsupervised pivot translation for distant languages. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 175−183
    [55] Sennrich R, Zhang B. Revisiting low-resource neural machine translation: a case study. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 211−221
    [56] Zhang J J, Zong C Q. Exploiting source-side monolingual data in neural machine translation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas. 2016. 1535–1545
    [57] Gibadullin I, Valeev A, Khusainova A, Khan A. A survey of methods to leverage monolingual data in low-resource neural machine translation. In: The International Conference on Advanced Technologies and Humanitarian Sciences. Rabat, Morocco. 2019
    [58] Sennrich R, Haddow B, Birch A. Improving neural machine translation models with monolingual data. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany. 2016. 86−96
    [59] Park J, Na B, Yoon S. Building a neural machine translation system using only synthetic parallel data. arXiv preprint arXiv: 1704.00253, 2017
    [60] Poncelas A, Shterionov D, Way A, Wenniger G, Passban P. Investigating backtranslation in neural machine translation. In: Proceedings of the 21st Annual Conference of the European Association for Machine Translation. Alacant, Spain. 2018. 249−258
    [61] Poncelas A, Popovic M, Shterionov D, Wenniger G M, Way A. Combining SMT and NMT back-translated data for efficient NMT. In: Proceedings of Recent Advances in Natural Language Processing. Varna, Bulgaria. 2019. 922−931
    [62] Edunov S, Ott M, Auli M, Grangier D. Understanding back-translation at scale. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 489–500
    [63] Fadaee M, Monz C. Back-translation sampling by targeting difficult words in neural machine translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 436–446
    [64] Hoang C, Koehn P, Haffari G, Cohn T. Iterative back-translation for neural machine translation. In: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Melbourne, Australia. 2018. 18–24
    [65] Imankulova A, Dabre R, Fujita A, Imamura K. Exploiting out-of-domain parallel data through multilingual transfer learning for low-resource neural machine translation. In: Proceedings of Machine Translation Summit XVII. Dublin, Ireland. 2019. 128−139
    [66] Imankulova A, Sato T, Komachi M. Improving low-resource neural machine translation with filtered pseudo-parallel corpus. In: Proceedings of the 4th Workshop on Asian Translation. Taipei, China. 2017. 70–78
    [67] Wu J W, Wang X, Wang W Y. Extract and edit: an alternative to back-translation for unsupervised neural machine translation. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis, Minnesota. 2019. 1173−1183
    [68] Currey A, Heafield K. Zero-resource neural machine translation with monolingual pivot data. In: Proceedings of the 3rd Workshop on Neural Generation and Translation. Hong Kong, China. 2019. 99–107
    [69] Fadaee M, Bisazza A, Monz C. Data augmentation for low-resource neural machine translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada. 2017. 567–573
    [70] Wang X Y, Pham H, Dai Z H, Neubig G. Switchout: an efficient data augmentation algorithm for neural machine translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 856–861
    [71] Xia M Z, Kong X, Anastasopoulos A, Neubig G. Generalized data augmentation for low-resource translation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 5786–5796
    [72] Zhu J F, Gao F, Wu L J, Xia Y C, Qin T, Zhou W G, et al. Soft contextual data augmentation for neural machine translation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 5539–5544
    [73] Zhou C T, Ma X Z, Hu J J, Neubig G. Handling syntactic divergence in low-resource machine translation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Hong Kong, China. 2019.1388−1394
    [74] Currey A, Barone AVM, Heafield K. Copied monolingual data improves low-resource neural machine translation. In: Proceedings of the Second Conference on Machine Translation. Copenhagen, Denmark. 2017. 148−156
    [75] Li G L, Liu L M, Huang G P, Zhu C H, Zhao T J. Understanding data augmentation in neural machine translation: two perspectives towards generalization. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Hong Kong, China. 2019. 5689–5695
    [76] Firat O, Cho K, Bengio Y. Multi-way, multilingual neural machine translation with a shared attention mechanism. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego, California. 2016. 866−875
    [77] Dabre R, Chu C H, Kunchukuttan A. A comprehensive survey of multilingual neural machine translation. ACM Computing Surveys, to be published
    [78] Tan X, Ren Y, He D, Qin T, Zhao Z, Liu T Y. Multilingual neural machine translation with knowledge distillation. In: 7th International Conference on Learning Representations. New Orleans, USA. 2019
    [79] Tan X, Chen J L, He D, Xia Y C, Qin T, Liu T Y. Multilingual neural machine translation with language clustering. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong, China. 2019. 963–973
    [80] Wang X Y, Pham H, Arthur P, Neubig G. Multilingual neural machine translation with soft decoupled encoding. In: 7th International Conference on Learning Representations. New Orleans, USA. 2019
    [81] Platanios E A, Sachan M, Neubig G, Mitchell T. Contextual parameter generation for universal neural machine translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 425−435
    [82] Arivazhagan N, Bapna A, Firat O, Lepikhin D, Johnson M, Krikun M, et al. Massively multilingual neural machine translation in the wild: findings and challenges. arXiv preprint arXiv: 1907.05019, 2019
    [83] Firat O, Sankaran B, Al-onaizan Y, Vural F Y, Cho K. Zero-resource translation with multi-lingual neural machine translation. In: Proceedings of the 2016 conference of Empirical Methods in Natural Language Processing. Austin, Texas. 2016. 268−277
    [84] Zhou Z, Sperber M, Waibel A. Massively parallel cross-lingual learning in low-resource target language translation. In: Proceedings of third Conference of Machine Translation. Brussels, Belgium. 2018. 232−243
    [85] Maimaiti M, Liu Y, Luan H B, Sun M S. Multi-round transfer learning for low-resource NMT using multiple high-resource languages. ACM Transactions on Asian and Low-Resource Language Information Processing, 2019, 18(4): Article 38, 1−26
    [86] Wang X Y, Neubig G. Target conditioned sampling: optimizing data selection for multilingual neural machine translation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 5823−5828
    [87] Dabre R, Fujita A, Chu C H. Exploiting multilingualism through multistage fine-tuning for low-resource neural machine translation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Hong Kong, China. 2019. 1410–1416
    [88] Murthy R, Kunchukuttan A, Bhattacharyya P. Addressing word-order divergence in multilingual neural machine translation for extremely low resource languages. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis, Minnesota. 2019. 3868−3873
    [89] Imankulova A, Sato T, Komachi M. Filtered pseudo-parallel corpus improves low-resource neural machine translation. ACM Transactions on Asian and Low-Resource Language Information Processing, 2019, 19(2): 24−40
    [90] Neubig G, Hu J. Rapid adaptation of neural machine translation to new languages. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. 2018. 875–880
    [91] Lu Y C, Keung P, Ladhak F, Bhardwaj V, Zhang S N, Sun J. A neural interlingua for multilingual machine translation. In: Proceedings of the Third Conference on Machine Translation. Brussels, Belgium. 2018. 84–92
    [92] Sestorain L, Ciaramita M, Buck C. Zero-shot dual machine translation. arXiv preprint arXiv: 1805.10338, 2018
    [93] Wang Y N, Zhou L, Zhang J J, Zhai F F, Xu J F, Zong C Q. A compact and language-sensitive multilingual translation method. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. 2019. 1213–1223
    [94] Kiperwasser E, Ballesteros M. Scheduled multi-task learning: from syntax to translation. Transactions of the Association for Computational Linguistics, 2018, 6: 225−240 doi: 10.1162/tacl_a_00017
    [95] Zaremoodi P, Buntine W, Haffari G. Adaptive knowledge sharing in multi-task learning: improving low-resource neural machine translation. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia. 2018. 656−661
    [96] Zaremoodi P, Haffari G. Adaptively scheduled multitask learning: the case of low-resource neural machine translation. In: Proceedings of the 3rd Workshop on Neural Generation and Translation. Hong Kong, China. 2019. 177–186
    [97] Zoph B, Knight K. Multi-source neural translation. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego, California. 2016. 30–34
    [98] He D, Xia Y C, Qin T, Wang L W, Yu N H, Liu T Y, et al. Dual learning for machine translation. In: Proceedings of Advances in Neural Information Processing Systems. Barcelona, Spain. 2016. 820–828
    [99] He T Y, Chen J L, Tan X, Qin T. Language graph distillation for low-resource machine translation. arXiv preprint arXiv: 1908.06258, 2019
    [100] Ostling R, Tiedemann J. Neural machine translation for low-resource languages. arXiv preprint arXiv: 1708.05729, 2017
    [101] Nishimura Y, Sudoh K, Neubig G, Nakamura S. Multi-source neural machine translation with data augmentation. In: Proceedings of the 15th International Workshop on Spoken Language Translation. Bruges, Belgium. 2018. 48−53
    [102] Garcia X, Foret P, Sellam T, Parikh A P. A multilingual view of unsupervised machine translation. arXiv preprint arXiv: 2002.02955, 2020
    [103] Anastasopoulos A, Chiang D. Leveraging translations for speech transcription in low-resource settings. In: Proceedings of Interspeech 2018. Hyderabad, India. 2018. 1279−1283
    [104] Stoian M C, Bansal S, Goldwater S. Analyzing ASR pretraining for low-resource speech-to-text translation. In: Proceedings of 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Barcelona, Spain. 2020. 7909−7913
    [105] Bansal S, Kamper H, Livescu K, Lopez A, Goldwater S. Low-resource speech-to-text translation. In: Proceedings of Interspeech 2018. Hyderabad, India. 2018. 1298−1302
    [106] Anastasopoulos A, Chiang D, Duong L. An unsupervised probability model for speech-to-translation alignment of low-resource languages. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas. 2016. 1255−1263
    [107] Erdmann A, Habash N, Taji D, Bouamor H. Low resourced machine translation via morpho-syntactic modeling: the case of dialectal Arabic. In: Proceedings of Machine Translation Summit XVI. Nagoya, Japan. 2017. 185−200
    [108] Honnet P E, Popescu-Belis A, Musat C, Baeriswyl M. Machine translation of low-resource spoken dialects: strategies for normalizing Swiss German. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation. Miyazaki, Japan. 2018. 3781−3788
    [109] Musleh A, Durrani N, Temnikova I, Nakov P, Vogel S, Alsaad O. Enabling medical translation for low-resource languages. In: Proceedings of International Conference on Intelligent Text Processing and Computational Linguistics. Konya, Turkey. 2016. 3−16
    [110] Ngo T V, Ha T L, Nguyen P T, Nguyen L M. Overcoming the rare word problem for low-resource language pairs in neural machine translation. In: Proceedings of the 6th Workshop on Asian Translation. Hong Kong, China. 2019. 207−214
    [111] Wang Zhuo, Yu Zheng-Tao, Wen Yong-Hua, Gao Sheng-Xiang, Wu Fei. Chinese-Vietnamese neural machine translation integrated with lexicon probability. Journal of Kunming University of Science and Technology (Natural Science), 2019, 44(1): 54−60 (in Chinese)
    [112] Che Wan-Jin, Yu Zheng-Tao, Guo Jun-Jun, Wen Yong-Hua, Yu Zhi-Qiang. Unknown words processing methods for Chinese-Vietnamese neural machine translation based on hybrid network integrating classification dictionaries. Journal of Chinese Information Processing, 2019, 33(12): 67−75 doi: 10.3969/j.issn.1003-0077.2019.12.009 (in Chinese)
    [113] Xu Yu, Lai Hua, Yu Zheng-Tao, Gao Sheng-Xiang, Wen Yong-Hua. Chinese-Vietnamese neural machine translation based on deep separable convolution. Journal of Xiamen University (Natural Science), 2020, 59(2): 220−224 (in Chinese)
    [114] Jia Cheng-Xun, Lai Hua, Yu Zheng-Tao, Wen Yong-Hua, Yu Zhi-Qiang. Pseudo-parallel corpus generation for Chinese-Vietnamese neural machine translation based on pivot language. Computer Engineering & Science, 2020 [Online], available: http://kns.cnki.net/kcms/detail/43.1258.TP.20200429.1040.004.html, May 7, 2020 (in Chinese)
    [115] Yu Zhi-Qiang, Yu Zheng-Tao, Huang Yu-Xin, Guo Jun-Jun, Gao Sheng-Xiang. Improving semi-supervised neural machine translation with variational information bottleneck. Acta Automatica Sinica, 2020 [Online], available: http://kns.cnki.net/kcms/detail/11.2109.tp.20200309.1839.002.html, May 7, 2020 (in Chinese)
    [116] Devlin J, Chang M W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis, Minnesota. 2019. 4171–4186
    [117] Clinchant S, Jung K W, Nikoulina V. On the use of BERT for neural machine translation. In: Proceedings of the 3rd Workshop on Neural Generation and Translation. Hong Kong, China. 2019. 108–117
    [118] Zhu J H, Xia Y C, Wu L J, He D, Qin T, Zhou W G, et al. Incorporating BERT into neural machine translation. In: 8th International Conference on Learning Representations. Addis Ababa, Ethiopia. 2020
    [119] Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training [Online], available: https://openai.com/blog/language-unsupervised/, April 4, 2020
Publication history
  • Received: 2020-03-03
  • Accepted: 2020-05-07
