Word Sense Disambiguation with Graph Model Based on Domain Knowledge

LU Wen-Peng, HUANG He-Yan, WU Hao

Citation: LU Wen-Peng, HUANG He-Yan, WU Hao. Word Sense Disambiguation with Graph Model Based on Domain Knowledge. ACTA AUTOMATICA SINICA, 2014, 40(12): 2836-2850. doi: 10.3724/SP.J.1004.2014.02836

doi: 10.3724/SP.J.1004.2014.02836
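Funds:

Supported by National Basic Research Program of China (973 Program) (2013CB329303), National Natural Science Foundation of China (61132009), and Shandong Province Higher Educational Science and Technology Program (J12LN09)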

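Details
    Author biography:

    HUANG He-Yan  Professor at Beijing Institute of Technology. Main research interests: natural language processing and machine translation. E-mail: hhy63@bit.edu.cn

    Corresponding author:

    LU Wen-Peng  Ph.D. candidate at the School of Computer Science, Beijing Institute of Technology, and associate professor at the School of Science, Qilu University of Technology. Main research interest: word sense disambiguation. Corresponding author of this paper. E-mail: luwpeng@bit.edu.cn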


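  • Abstract: How thoroughly domain knowledge is mined and exploited directly affects the performance of word sense disambiguation (WSD) for a specific domain. This paper proposes a graph-model WSD method based on domain knowledge. The method mines domain knowledge in two forms: it collects text domain related words for the target domain as text domain knowledge, and it obtains sense domain annotations for each sense of the target ambiguous word as sense domain knowledge. It builds a disambiguation graph from the text domain related words and the context words of the sentence, then adjusts the graph according to the sense domain knowledge. An improved graph scoring method scores the importance of each sense node in the disambiguation graph, and the correct sense is selected from these scores. The method effectively integrates domain knowledge into the graph model and, on the Koeling dataset, achieves the best disambiguation results among comparable studies. The paper also improves several graph-model scoring methods and reports detailed comparative experiments.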
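The pipeline outlined in the abstract (build a disambiguation graph from context words and domain related words, adjust it with sense domain knowledge, score the sense nodes, pick the best one) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how the graph is built or how the scoring is improved, so plain weighted PageRank is swapped in as the scoring step. The function names, the `relatedness` table, the `sense_domain_weight` factors, and the toy data for the word "coach" are all hypothetical.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Plain PageRank by power iteration on an undirected, weighted graph."""
    neighbors = defaultdict(dict)
    for u, v, w in edges:
        neighbors[u][v] = w
        neighbors[v][u] = w
    nodes = list(neighbors)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            # Each neighbor m passes on its score in proportion to edge weight.
            incoming = sum(
                score[m] * w / sum(neighbors[m].values())
                for m, w in neighbors[n].items()
            )
            new_score[n] = (1 - damping) / len(nodes) + damping * incoming
        score = new_score
    return score


def disambiguate(senses, context_words, domain_words, relatedness,
                 sense_domain_weight):
    """Link senses to context and domain words, rescale edges by a
    per-sense domain factor (assumed form of the graph adjustment),
    then return the highest-scoring sense and all node scores."""
    edges = []
    for sense in senses:
        for word in context_words + domain_words:
            w = relatedness.get((sense, word), 0.0)
            if w > 0:
                factor = sense_domain_weight.get(sense, 1.0)
                edges.append((sense, word, w * factor))
    scores = pagerank(edges)
    best = max(senses, key=lambda s: scores.get(s, 0.0))
    return best, scores


# Hypothetical toy data: the word "coach" in a sports-domain sentence.
senses = ["coach#trainer", "coach#bus"]
context_words = ["hired", "season"]
domain_words = ["team", "match"]  # stand-in for text domain related words
relatedness = {
    ("coach#trainer", "team"): 0.9,
    ("coach#trainer", "match"): 0.8,
    ("coach#trainer", "season"): 0.7,
    ("coach#trainer", "hired"): 0.6,
    ("coach#bus", "hired"): 0.3,
    ("coach#bus", "team"): 0.2,
}
# Stand-in for sense domain knowledge: boost senses labeled with the target domain.
sense_domain_weight = {"coach#trainer": 1.5, "coach#bus": 0.5}

best, scores = disambiguate(senses, context_words, domain_words,
                            relatedness, sense_domain_weight)
print(best)  # coach#trainer
```

In this sketch the domain knowledge enters twice: domain related words add extra nodes that can support a sense, and the per-sense factor rescales that sense's edges before scoring. How the paper itself weights these two sources is not stated in the abstract.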
Publication history
  • Received:  2014-01-21
  • Revised:  2014-05-01
  • Published:  2014-12-20
