Extraction of Comparative Elements Using Conditional Random Fields
Abstract: With the rapid growth of subjective, evaluative text on the Web, sentiment analysis has attracted wide research attention. Extracting comparative elements is one of the key tasks in the sentiment analysis of comparative sentences, because the sentiment result for a comparative sentence is only meaningful when it is tied to the elements being compared. To improve extraction performance, this paper introduces several linguistic features into a conditional random fields (CRFs) model: shallow parsing features, comparative word candidates, and heuristic position information. The method requires no additional domain knowledge, improves the system's precision and F1-score, and handles sentences that contain multiple comparative relations. Experimental results show that adding the proposed features to the CRFs model improves every performance measure for comparative element extraction. In addition, compared with the best results of the 2012 Chinese opinion analysis evaluation, the proposed method scores higher on every measure, which further supports its effectiveness.
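The three feature families named in the abstract can be illustrated with a small per-token feature extractor of the kind usually fed to a CRF toolkit. This is only a sketch under assumptions, not the paper's feature template: the lexicon `COMPARATIVE_CANDIDATES`, the function names `position_feature` and `sentence_features`, the POS and chunk tags in the example, and the before/at/after encoding of position are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical lexicon of comparative-word candidates (an assumption,
# not the candidate list used in the paper).
COMPARATIVE_CANDIDATES = {"比", "不如", "than"}

# Each token is (word, POS tag, shallow-parse chunk tag).
Token = Tuple[str, str, str]


def position_feature(index: int, keyword_positions: Sequence[int]) -> str:
    """Encode where a token sits relative to the nearest comparative-word candidate."""
    if not keyword_positions:
        return "none"
    nearest = min(keyword_positions, key=lambda k: abs(k - index))
    if index < nearest:
        return "before"
    if index > nearest:
        return "after"
    return "at"


def sentence_features(tokens: Sequence[Token]) -> List[Dict[str, str]]:
    """Build one feature dict per token for a single sentence."""
    keyword_positions = [
        i for i, (word, _, _) in enumerate(tokens) if word in COMPARATIVE_CANDIDATES
    ]
    features = []
    for i, (word, pos, chunk) in enumerate(tokens):
        features.append({
            "word": word,
            "pos": pos,
            "chunk": chunk,  # shallow parsing feature
            "is_candidate": str(word in COMPARATIVE_CANDIDATES),  # comparative word candidate
            "position": position_feature(i, keyword_positions),  # heuristic position
        })
    return features


# Example sentence: "Phone A is cheaper than phone B" (tags are illustrative).
example = [("手机A", "NN", "B-NP"), ("比", "P", "B-PP"),
           ("手机B", "NN", "B-NP"), ("便宜", "VA", "B-VP")]
for token_features in sentence_features(example):
    print(token_features)
```

A list of such dicts per sentence, paired with a label sequence marking the compared entities and attributes, is the input shape accepted by CRF toolkits that take per-token feature dictionaries. Because the position feature is computed against the nearest candidate, a sentence with several comparative words gives each token a position relative to its own local comparison, which is one plausible way to cope with multiple comparative relations.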