Robust Cross-Lingual Dialogue System Based on Multi-Granularity Adversarial Training

Xiang Lu, Zhu Jun-Nan, Zhou Yu, Zong Cheng-Qing

Citation: Xiang Lu, Zhu Jun-Nan, Zhou Yu, Zong Cheng-Qing. Robust cross-lingual dialogue system based on multi-granularity adversarial training. Acta Automatica Sinica, 2021, 47(x): 1−12 doi: 10.16383/j.aas.c200764

doi: 10.16383/j.aas.c200764
Funds: Supported by National Key Research and Development Program of China (No. 2017YFB1002103)
Author information:

    Xiang Lu: Ph.D. candidate at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. Main research interests include human-machine dialogue systems, text generation, and natural language processing. E-mail: lu.xiang@nlpr.ia.ac.cn

    Zhu Jun-Nan: Assistant professor at the Institute of Automation, Chinese Academy of Sciences. Main research interests include automatic summarization, text generation, and natural language processing. E-mail: junna.zhu@nlpr.ia.ac.cn

    Zhou Yu: Professor at the Institute of Automation, Chinese Academy of Sciences. Main research interests include automatic summarization, machine translation, and natural language processing. E-mail: yzhou@nlpr.ia.ac.cn

    Zong Cheng-Qing: Professor at the Institute of Automation, Chinese Academy of Sciences, adjunct professor at the University of Chinese Academy of Sciences, and Fellow of the China Computer Federation and the Chinese Association for Artificial Intelligence. Main research interests include natural language processing and machine translation. Author of the monographs "Statistical Natural Language Processing" and "Text Data Mining", and of over 200 papers. E-mail: cqzong@nlpr.ia.ac.cn

  • Abstract: Cross-lingual dialogue systems are a focus of current international research and remain a difficult problem. Practical systems usually rely on a machine translation engine as the bridge between languages. However, translation engines are built from training samples whose domains and linguistic characteristics often differ substantially from what the dialogue system actually needs, leaving the overall system brittle and degrading its responses. Strengthening the robustness of cross-lingual dialogue systems is therefore essential to making them practical. This paper proposes a method for building robust cross-lingual dialogue systems based on multi-granularity adversarial training. The method first constructs multi-granularity noisy data oriented toward machine translation, generating adversarial examples at the word, phrase, and sentence levels. It then trains adversarially on both the noisy and the clean data to update the dialogue system's parameters, guiding the system to learn noise-invariant hidden representations and ultimately improving cross-lingual performance. Experiments in two languages on public dialogue datasets show that the proposed method significantly improves the performance, and especially the robustness, of cross-lingual dialogue systems.
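To make the word-level perturbations concrete, a minimal sketch of generating a word-level adversarial example (random adjacent swaps, stop-word insertion, synonym substitution) is given below. The synonym table, stop-word list, and probabilities here are illustrative assumptions; the paper derives its noise from real machine-translation behavior and lexical resources, not from a toy dictionary.

```python
import random

# Toy resources for illustration only; the paper uses real lexical
# resources and MT-oriented noise, not hand-written lists like these.
SYNONYMS = {"cheap": ["inexpensive"], "restaurant": ["eatery", "diner"]}
STOP_WORDS = ["um", "well", "please"]

def word_level_noise(tokens, p=0.15, rng=None):
    """Generate a word-level adversarial example by randomly swapping
    adjacent tokens, substituting a synonym, or inserting a stop word."""
    rng = rng or random.Random(0)
    out = []
    i = 0
    while i < len(tokens):
        r = rng.random()
        if r < p and i + 1 < len(tokens):          # swap with next token
            out.extend([tokens[i + 1], tokens[i]])
            i += 2
        elif r < 2 * p and tokens[i] in SYNONYMS:  # synonym substitution
            out.append(rng.choice(SYNONYMS[tokens[i]]))
            i += 1
        elif r < 3 * p:                            # stop-word insertion
            out.extend([rng.choice(STOP_WORDS), tokens[i]])
            i += 1
        else:                                      # keep token unchanged
            out.append(tokens[i])
            i += 1
    return out

noisy = word_level_noise("i want a cheap restaurant in the south".split())
```

Phrase- and sentence-level examples follow the same pattern but perturb aligned phrase pairs and whole translated sentences instead of single tokens.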
  • Fig. 1  Machine-translation-based cross-lingual dialogue system

    Fig. 2  The TSCP framework

    Fig. 3  The framework of word-level and phrase-level adversarial example generation

    Fig. 4  An example of multi-granularity adversarial examples

    Fig. 5  The structure of adversarial training

    Fig. 6  Two kinds of test

    Table 1  Statistics of datasets

    Dataset    | Size (train / dev / test) | Domains
    CamRest676 | 408 / 136 / 136           | restaurant booking
    KVRET      | 2425 / 302 / 302          | calendar scheduling, weather query, navigation

    Table 2  Experimental results on CamRest676
    (Cross-Test and Mono-Test columns: BLEU / Entity match rate / Success F1 / Combined score)

    # | Adversarial examples | Cross-Test                        | Mono-Test
    0 | Baseline             | 0.1731 / 0.4776 / 0.6485 / 0.7361 | 0.2001 / 0.9328 / 0.8204 / 1.0767
    1 | Random swap          | 0.1759 / 0.4851 / 0.6599 / 0.7484 | 0.2159 / 0.9104 / 0.7639 / 1.0530
    2 | Stop words           | 0.1692 / 0.5000 / 0.6347 / 0.7365 | 0.2300 / 0.9179 / 0.7803 / 1.0791
    3 | Synonyms             | 0.1805 / 0.4403 / 0.7051 / 0.7532 | 0.2159 / 0.9030 / 0.7824 / 1.0586
    4 | Word-level           | 0.1941 / 0.4552 / 0.7503 / 0.7969 | 0.2056 / 0.8955 / 0.8227 / 1.0647
    5 | Phrase-level         | 0.2017 / 0.4478 / 0.7602 / 0.8057 | 0.2215 / 0.8507 / 0.7992 / 1.0465
    6 | Sentence-level       | 0.1937 / 0.4925 / 0.7662 / 0.8231 | 0.2127 / 0.8731 / 0.8121 / 1.0553
    7 | Multi-granularity    | 0.2178 / 0.5149 / 0.7925 / 0.8715 | 0.2343 / 0.8881 / 0.8269 / 1.0918
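The combined score columns in the result tables are consistent with the TSCP-style metric of BLEU plus half the sum of entity match rate and success F1; this formula is inferred from the reported numbers, not stated in this excerpt. A minimal check:

```python
def combined_score(bleu: float, match: float, success_f1: float) -> float:
    """Combined score as BLEU + 0.5 * (entity match rate + success F1).

    NOTE: this formula is an assumption inferred from the table values
    (TSCP-style evaluation); it is not quoted from the paper text.
    """
    return bleu + 0.5 * (match + success_f1)

# Baseline Mono-Test row of Table 2: 0.2001, 0.9328, 0.8204
print(round(combined_score(0.2001, 0.9328, 0.8204), 4))  # -> 1.0767
```

The same arithmetic reproduces the combined scores of the other rows to within rounding.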

    Table 3  Experimental results on KVRET
    (Cross-Test and Mono-Test columns: BLEU / Entity match rate / Success F1 / Combined score)

    # | Adversarial examples | Cross-Test                        | Mono-Test
    0 | Baseline             | 0.1737 / 0.4218 / 0.7073 / 0.7382 | 0.2096 / 0.7929 / 0.7948 / 1.0034
    1 | Random swap          | 0.1751 / 0.4436 / 0.7122 / 0.7531 | 0.2056 / 0.8400 / 0.8033 / 1.0273
    2 | Stop words           | 0.1676 / 0.4327 / 0.7183 / 0.7431 | 0.1961 / 0.8109 / 0.8016 / 1.0023
    3 | Synonyms             | 0.1680 / 0.4145 / 0.7234 / 0.7370 | 0.1944 / 0.8109 / 0.7898 / 0.9947
    4 | Word-level           | 0.1805 / 0.4436 / 0.7696 / 0.7871 | 0.2095 / 0.8109 / 0.8202 / 1.0251
    5 | Phrase-level         | 0.1933 / 0.4727 / 0.7603 / 0.8097 | 0.2219 / 0.8255 / 0.8170 / 1.0431
    6 | Sentence-level       | 0.1803 / 0.4727 / 0.7843 / 0.8088 | 0.1965 / 0.8218 / 0.8136 / 1.0142
    7 | Multi-granularity    | 0.1762 / 0.5236 / 0.7859 / 0.8309 | 0.1944 / 0.8218 / 0.8235 / 1.0171

    Table 4  Cross-lingual experimental results from German to English on KVRET
    (Cross-Test and Mono-Test columns: BLEU / Entity match rate / Success F1 / Combined score)

    # | Adversarial examples | Cross-Test                        | Mono-Test
    0 | Baseline             | 0.1641 / 0.3964 / 0.7162 / 0.7203 | 0.2096 / 0.7927 / 0.7948 / 1.0034
    1 | Word-level           | 0.1661 / 0.4618 / 0.7596 / 0.7768 | 0.2011 / 0.8109 / 0.8163 / 1.0147
    2 | Phrase-level         | 0.1842 / 0.5127 / 0.7922 / 0.8367 | 0.1975 / 0.8218 / 0.8248 / 1.0208
    3 | Sentence-level       | 0.1844 / 0.5091 / 0.7774 / 0.8277 | 0.2067 / 0.8291 / 0.8095 / 1.0260
    4 | Multi-granularity    | 0.1743 / 0.5309 / 0.8117 / 0.8456 | 0.1905 / 0.7964 / 0.8186 / 0.9981

    Table 5  Examples of cross-lingual experimental results

    CamRest676 example
      Cross-Test input:  User: 请推荐给我一些位于南方的价位适中的餐厅. (Please recommend me some moderately priced restaurants in the south.)
                         MT:   please recommend me some restaurant at the appropriate price in the south.
      Mono-Test input:   User: please suggest me some moderately priced restaurant in the south.
      Gold_bspan:    moderate south
      Gold_response: i found two moderately priced restaurants in the south part of town. would you prefer food_SLOT or food_SLOT food?
      Baseline (Cross-Test):          generated_bspan: south | generated_response: there are no restaurants in the south part of town.
      Baseline (Mono-Test):           generated_bspan: moderate south | generated_response: there are two restaurants in the south part of town.
      Multi-granularity (Cross-Test): generated_bspan: moderate south | generated_response: there are two moderately priced restaurants in the south part of town. would you like more information?
      Multi-granularity (Mono-Test):  generated_bspan: moderate south | generated_response: there are two moderately priced restaurants in the south part of town. would you like more information?

    KVRET example 1
      Cross-Test input:  Driver: 最近的停车场怎么走? (How do I get to the nearest parking garage?)
                         MT:     how can the latest car park go?
      Mono-Test input:   Driver: what are the direction to the closest parking garage?
      Gold_bspan:    parking garage
      Gold_response: the closest parking garage is poi_SLOT, located distance_SLOT away at address_SLOT
      Baseline (Cross-Test):          generated_bspan: ITEM_3 | generated_response: there is a ITEM_0 one and poi_SLOT, distance_SLOT away with traffic_info_SLOT.
      Baseline (Mono-Test):           generated_bspan: parking garage | generated_response: the nearest parking garage is poi_SLOT, distance_SLOT away.
      Multi-granularity (Cross-Test): generated_bspan: parking garage | generated_response: the closest parking garage is distance_SLOT away.
      Multi-granularity (Mono-Test):  generated_bspan: parking garage | generated_response: the closest parking garage is poi_SLOT

    KVRET example 2
      Cross-Test input:  Driver: 安排5日上午11点与副会长会面. (Schedule a meeting with the vice president at 11 am on the 5th.)
                         MT:     arranged for him to meet with the vice president at 11 : 00 am on the 5th.
      Mono-Test input:   Driver: schedule a conference with the vice president on the 5th at 11am.
      Gold_bspan:    5th; vice president; conference; 11 am
      Gold_response: conference scheduled for the 5th at 11am with the vice president.
      Baseline (Cross-Test):          generated_bspan: 5th; vice president; vice president; | generated_response: the 5th is set the president to discus the president.
      Baseline (Mono-Test):           generated_bspan: 5th; vice president; conference; 11am | generated_response: the 5th is with a vice president for the 5th at address_SLOT with a high of the 5th of the 5th of the 5th of the 5th of the 5th of the 5th is with the 5th.
      Multi-granularity (Cross-Test): generated_bspan: 5th; vice president; conference; 11am | generated_response: your conference with vice president at this 5th at with the vice president at the 5th.
      Multi-granularity (Mono-Test):  generated_bspan: 5th; vice president; conference; 11am | generated_response: you have a conference with the vice president on the 5th at this 5th.
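Placeholders such as poi_SLOT and distance_SLOT in Table 5 are delexicalized slots that are filled from knowledge-base values at response time. A minimal sketch of that lexicalization step follows; the KB entry and the `lexicalize` helper are illustrative assumptions, not code from the paper:

```python
import re

# Illustrative KB entry; the field names mirror the slot names in Table 5.
kb_entry = {"poi": "Stanford Express Care",
            "distance": "3 miles",
            "address": "214 El Camino Real"}

def lexicalize(template: str, entry: dict) -> str:
    """Replace each name_SLOT placeholder with the KB value for `name`,
    leaving unknown slots untouched."""
    return re.sub(r"(\w+)_SLOT",
                  lambda m: entry.get(m.group(1), m.group(0)), template)

print(lexicalize("the closest parking garage is poi_SLOT, "
                 "located distance_SLOT away at address_SLOT", kb_entry))
# -> the closest parking garage is Stanford Express Care,
#    located 3 miles away at 214 El Camino Real
```

Delexicalizing during training and lexicalizing at response time keeps the generator independent of specific KB values, which is why the tables evaluate entity match rate separately from BLEU.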

    Table 6  Categories of translation phenomena

    Category 1
      Original monolingual sentence: I am looking for a moderately priced restaurant in the south part of town.
      Chinese test set: 你知道镇北部有什么价格适中的餐馆吗? (Do you know of any moderately priced restaurants in the north part of town?)
      MT: I'm looking for a cheap restaurant in the south of the town.
    Category 2
      Original monolingual sentence: A restaurant in the moderately priced range, please.
      Chinese test set: 请给我一家中等价位的餐馆 (Please give me a moderately priced restaurant.)
      MT: Please give me a mid-priced restaurant.
    Category 3
      Original monolingual sentence: I would like a cheap restaurant that serves greek food.
      Chinese test set: 我想要一家供应希腊食物的便宜餐馆. (I want a cheap restaurant that serves Greek food.)
      MT: I'd like a cheap restaurant to supply greek food.

    Table 7  Noise type analysis of MT

    Translation category | Number of turns
    Category 1           | 27
    Category 2           | 72
    Category 3           | 23
    Category 4           | 55

    Table 8  Experimental results on four translation phenomena
    (each cell: BLEU / Entity match rate / Success F1)

    Baseline
      Category 1: Cross-Test 0.1229 / 0.2632 / 0.3548 | Mono-Test 0.1987 / 1.0000 / 0.6571
      Category 2: Cross-Test 0.1672 / 0.2879 / 0.4234 | Mono-Test 0.2093 / 0.9394 / 0.6239
      Category 3: Cross-Test 0.1429 / 0.3500 / 0.5538 | Mono-Test 0.1588 / 0.8500 / 0.6757
      Category 4: Cross-Test 0.1640 / 0.5909 / 0.5629 | Mono-Test 0.1891 / 0.8864 / 0.6595
    Multi-granularity
      Category 1: Cross-Test 0.1706 / 0.4737 / 0.5135 | Mono-Test 0.2301 / 1.0000 / 0.6835
      Category 2: Cross-Test 0.2327 / 0.5000 / 0.6748 | Mono-Test 0.2594 / 0.8939 / 0.6935
      Category 3: Cross-Test 0.1607 / 0.3000 / 0.5352 | Mono-Test 0.1801 / 0.7000 / 0.5278
      Category 4: Cross-Test 0.2066 / 0.5909 / 0.5989 | Mono-Test 0.1924 / 0.8182 / 0.6448

    Table 9  Cross-lingual experimental results using other monolingual dialogue systems on CamRest676
    (Cross-Test and Mono-Test columns: BLEU / Entity match rate / Success F1 / Combined score)

    # | System / Adversarial examples | Cross-Test                        | Mono-Test
    0 | SEDST, baseline               | 0.1671 / 0.6455 / 0.7294 / 0.8545 | 0.2107 / 0.9545 / 0.8120 / 1.0940
    1 | SEDST, multi-granularity      | 0.2093 / 0.8333 / 0.8193 / 1.0356 | 0.2292 / 0.9259 / 0.8378 / 1.1111
    2 | LABES-S2S, baseline           | 0.1910 / 0.7450 / 0.7260 / 0.9265 | 0.2350 / 0.9640 / 0.7990 / 1.1165
    3 | LABES-S2S, multi-granularity  | 0.2300 / 0.8150 / 0.8290 / 1.0520 | 0.2400 / 0.9440 / 0.8580 / 1.1410
  • [1] Li X J, Chen Y N, Li L H, Gao J F, Celikyilmaz A. End-to-end task-completion neural dialogue systems. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing. Taipei, Taiwan, China: Asian Federation of Natural Language Processing, 2017. 733−743.
    [2] Liu B, Lane I. End-to-end learning of task-oriented dialogs. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop. New Orleans, Louisiana, USA: Association for Computational Linguistics, 2018. 67−73.
    [3] Wen T H, Vandyke D, Mrkšić N, Gašić M, Rojas-Barahona L M, Su P H, et al. A network-based end-to-end trainable task-oriented dialogue system. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Valencia, Spain: Association for Computational Linguistics, 2017. 438−449.
    [4] Wang W K, Zhang J J, Li Q, Zong C Q, Li Z F. Are you for real? detecting identity fraud via dialogue interactions. In: Proceedings of 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing. Hong Kong, China: Association for Computational Linguistics, 2019. 1762−1771.
    [5] Wang W K, Zhang J J, Li Q, Hwang M Y, Zong C Q, Li Z F. Incremental learning from scratch for task-oriented dialogue systems. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, 2019. 3710−3720.
    [6] Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. In: Proceedings of the 3rd International Conference on Learning Representations, 2015. 1−11.
    [7] Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I J, et al. Intriguing properties of neural networks. arXiv preprint arXiv: 1312.6199, 2013.
    [8] Dong Yin-Peng, Su Hang, Zhu Jun. Towards interpretable deep neural networks by leveraging adversarial examples. Acta Automatica Sinica, 2020, 45(x): 1−12
    [9] Kong Rui, Cai Jia-Chun, Huang Gang. Defense to adversarial attack with generative adversarial network. Acta Automatica Sinica, 2020, 41(x): 1−17
    [10] Young S, Gasic M, Thomson B, Williams J D. POMDP-based statistical spoken dialog systems: a review. Proceedings of the IEEE, 2013, 101(5): 1160−1179 doi: 10.1109/JPROC.2012.2225812
    [11] Williams J D, Young S. Partially observable markov decision processes for spoken dialog systems. Computer Speech & Language, 2007, 21(2): 393−422
    [12] Mesnil G, Dauphin Y, Yao K, Bengio Y, Zweig G. Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Transactions on Audio Speech & Language Processing, 2015, 23(3): 530−539
    [13] Bai H, Zhou Y, Zhang J J, Zong C Q. Memory consolidation for contextual spoken language understanding with dialogue logistic inference. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, 2019. 5448−5453.
    [14] Lee S, Stent A. Task lineages: dialog state tracking for flexible interaction. In: Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Los Angeles: Association for Computational Linguistics, 2016. 11−21.
    [15] Zhong V, Xiong C, Socher R. Global-locally self-attentive encoder for dialogue state tracking. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: Association for Computational Linguistics, 2018. 1458−1467.
    [16] Wang W K, Zhang J J, Zhang H, Hwang M Y, Zong C Q, Li Z F. A teacher-student framework for maintainable dialog manager. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium: Association for Computational Linguistics, 2018. 3803−3812.
    [17] Sharma S, He J, Suleman K, Schulz H, Bachman P. Natural language generation in dialogue using lexicalized and delexicalized data. In: Proceedings of the International Conference on Learning Representations Workshop, 2017. 1−6.
    [18] Eric M, Manning C D. Key-value retrieval networks for task-oriented dialogue. In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue. Saarbrücken, Germany: Association for Computational Linguistics, 2017. 37−49.
    [19] Madotto A, Wu C S, Fung P. Mem2seq: effectively incorporating knowledge bases into end-to-end task-oriented dialog systems. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: Association for Computational Linguistics, 2018. 1468−1478.
    [20] Wu C S, Socher R, Xiong C. Global-to-local memory pointer networks for task-oriented dialogue. In: Proceedings of the 7th International Conference on Learning Representations, 2019. 1−19.
    [21] Lei W Q, Jin X S, Kan M Y, Ren Z C, He X N, Yin D W. Sequicity: simplifying task-oriented dialogue systems with single sequence-to-sequence architectures. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: Association for Computational Linguistics, 2018. 1437−1447.
    [22] García F, Hurtado L F, Segarra E, Sanchis E, Riccardi G. Combining multiple translation systems for spoken language understanding portability. In: 2012 IEEE Spoken Language Technology Workshop (SLT). Miami, FL, USA: IEEE, 2012. 194−198.
    [23] Calvo M, García F, Hurtado L F, Jiménez S, Sanchis E. Exploiting multiple hypotheses for multilingual spoken language understanding. In: Proceedings of the Seventeenth Conference on Computational Natural Language Learning. Sofia, Bulgaria: Association for Computational Linguistics, 2013. 193−201.
    [24] Calvo M, Hurtado L F, Garcia F, Sanchis E, Segarra E. Multilingual Spoken Language Understanding using graphs and multiple translations. Computer Speech & Language, 2016, 38: 86−103
    [25] Bai H, Zhou Y, Zhang J J, Zhao L, Hwang M Y, Zong C Q. Source critical reinforcement learning for transferring spoken language understanding to a new language. In: Proceedings of the 27th International Conference on Computational Linguistics. Santa Fe, New Mexico, USA: Association for Computational Linguistics, 2018. 3597−3607.
    [26] Chen W H, Chen J S, Su Y, Wang X, Yu D, Yan X F, et al. Xl-nbt: a cross-lingual neural belief tracking framework. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium: Association for Computational Linguistics, 2018. 414−424.
    [27] Schuster S, Gupta S, Shah R, Lewis M. Cross-lingual transfer learning for multilingual task oriented dialog. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis, Minnesota: Association for Computational Linguistics, 2019. 3795−3805.
    [28] Ebrahimi J, Rao A, Lowd D, Dou D J. HotFlip: white-box adversarial examples for text classification. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: Association for Computational Linguistics, 2018. 31−36.
    [29] Miyato T, Dai A M, Goodfellow I J. Adversarial training methods for semi-supervised text classification. In: Proceedings of the 5th International Conference on Learning Representations, 2017. 1−11.
    [30] Belinkov Y, Bisk Y. Synthetic and natural noise both break neural machine translation. In: Proceedings of the 5th International Conference on Learning Representations, 2018. 1−13.
    [31] Cheng Y, Jiang L, Macherey W. Robust neural machine translation with doubly adversarial inputs. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, 2019. 4324−4333.
    [32] Cheng Y, Tu Z P, Meng F D, Zhai J J, Liu Y. Towards robust neural machine translation. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: Association for Computational Linguistics, 2018. 1756−1766.
    [33] Li J W, Monroe W, Shi T L, Jean S, Ritter A, Jurafsky D. Adversarial Learning for Neural Dialogue Generation. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen, Denmark: Association for Computational Linguistics, 2017. 2157−2169.
    [34] Niu T, Bansal M. Adversarial over-sensitivity and over-stability strategies for dialogue models. In: Proceedings of the 22nd Conference on Computational Natural Language Learning. Brussels, Belgium: Association for Computational Linguistics, 2018. 486−496.
    [35] Gu J T, Lu Z D, Li H, Li V O K. Incorporating copying mechanism in sequence-to-sequence learning. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany: Association for Computational Linguistics, 2016. 1631−1640.
    [36] Och F J, Ney H. A systematic comparison of various statistical alignment models. Computational Linguistics, 2003, 29(1): 19−51 doi: 10.1162/089120103321337421
    [37] Kingma D, Ba J. Adam: A method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations, 2015. 1−15.
    [38] Mehri S, Srinivasan T, Eskenazi M. Structured fusion networks for dialog. In: Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. Stockholm, Sweden: Association for Computational Linguistics, 2019. 165−177.
    [39] Jin X S, Lei W Q, Ren Z C, Chen H S, Liang S S, Zhao Y H, et al. Explicit State Tracking with Semi-Supervision for Neural Dialogue Generation. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. New York, NY, USA: Association for Computing Machinery, 2018. 1403−1412.
    [40] Zhang Y C, Ou Z J, Wang H X, Feng J L. A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. Online: Association for Computational Linguistics, 2020. 9207−9219.
Publication history
  • Received: 2020-09-16
  • Accepted: 2021-01-15
  • Published online: 2021-02-02
