Temporal Summarization Based on Biterm Dirichlet Process

XI Yao-Yi, LI Bi-Cheng, LI Tian-Cai, HUANG Shan-Qi

Citation: XI Yao-Yi, LI Bi-Cheng, LI Tian-Cai, HUANG Shan-Qi. Temporal Summarization Based on Biterm Dirichlet Process. ACTA AUTOMATICA SINICA, 2015, 41(8): 1452-1460. doi: 10.16383/j.aas.2015.c150001

doi: 10.16383/j.aas.2015.c150001
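Details
    Author biography:

    LI Bi-Cheng  Professor at the Institute of Information System Engineering, PLA Information Engineering University. Main research interests: text analysis and understanding, speech processing and recognition, image/video processing and recognition, and information fusion. E-mail: lbclm@gmail.com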

Funds: Supported by the National Social Science Foundation of China (14BXW028)

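  • Abstract: Temporal summarization generates summaries in chronological order to outline how a topic evolves and develops. Existing studies either ignore the sub-topic information implicit in sentences or fail to identify it accurately. To address this problem, this paper builds a new topic model, the biterm Dirichlet process, and proposes a temporal summarization method based on it. First, the sub-topic distribution of each sentence is obtained through model inference. Next, this distribution is used to compute the relevance and novelty of each sentence. Finally, sentences that are relevant to the topic and highly novel are extracted in chronological order to form the temporal summary. Experimental results show that the proposed method generates higher-quality temporal summaries than current representative methods.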
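
The selection stage described in the abstract (score each sentence by relevance and novelty from its sub-topic distribution, then extract in chronological order) can be illustrated with a minimal Python sketch. This is not the paper's formulation: the abstract does not give the scoring formulas, so cosine similarity, the linear `weight` mix, the `min_novelty` threshold, and every function and parameter name below are assumptions made for illustration. The sub-topic distributions are taken as given inputs; the biterm Dirichlet process inference that would produce them is not shown.

```python
import math
from typing import List, Sequence, Tuple

# (time slot, sentence text, sub-topic distribution); hypothetical input format
Sentence = Tuple[int, str, Sequence[float]]


def cosine(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine similarity between two distributions (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def novelty(dist: Sequence[float], selected: List[Sequence[float]]) -> float:
    """Assumed novelty: 1 minus the highest similarity to any selected sentence."""
    return 1.0 - max((cosine(dist, d) for d in selected), default=0.0)


def temporal_summary(
    sentences: List[Sentence],
    topic_dist: Sequence[float],
    per_slot: int = 1,
    min_novelty: float = 0.2,
    weight: float = 0.5,
) -> List[Tuple[int, str]]:
    """Pick up to per_slot relevant, novel sentences per time slot, in time order."""
    summary: List[Tuple[int, str]] = []
    selected: List[Sequence[float]] = []
    for slot in sorted({s[0] for s in sentences}):
        candidates = [s for s in sentences if s[0] == slot]
        for _ in range(per_slot):
            # Drop candidates that repeat sub-topics already in the summary.
            candidates = [
                s for s in candidates if novelty(s[2], selected) >= min_novelty
            ]
            if not candidates:
                break
            # Assumed score: linear mix of relevance to the topic and novelty.
            best = max(
                candidates,
                key=lambda s: weight * cosine(s[2], topic_dist)
                + (1 - weight) * novelty(s[2], selected),
            )
            summary.append((best[0], best[1]))
            selected.append(best[2])
            candidates.remove(best)
    return summary


# Toy data: three sub-topics, two time slots (all values are made up).
sentences = [
    (1, "A", [0.8, 0.1, 0.1]),
    (1, "B", [0.1, 0.8, 0.1]),
    (2, "C", [0.8, 0.1, 0.1]),  # same sub-topic mix as "A", so not novel
    (2, "D", [0.1, 0.1, 0.8]),
]
topic_dist = [0.5, 0.3, 0.2]
print(temporal_summary(sentences, topic_dist))  # → [(1, 'A'), (2, 'D')]
```

In the toy run, sentence "C" is skipped in the second slot because its distribution matches the already selected "A", which is the behaviour the novelty term is meant to capture. A greedy per-slot selection is only one possible design; the thresholds would need tuning on real data.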
Publication history
  • Received: 2015-01-04
  • Revised: 2015-04-08
  • Published: 2015-08-20
