Data Stream Clustering Algorithm Based on Density and Affinity Propagation Techniques

ZHANG Jian-Peng, CHEN Fu-Cai, LI Shao-Mei, LIU Li-Xiong

Citation: ZHANG Jian-Peng, CHEN Fu-Cai, LI Shao-Mei, LIU Li-Xiong. Data Stream Clustering Algorithm Based on Density and Affinity Propagation Techniques. Acta Automatica Sinica, 2014, 40(2): 277-288. doi: 10.3724/SP.J.1004.2014.00277

doi: 10.3724/SP.J.1004.2014.00277
Funding: 

Supported by the National High Technology Research and Development Program of China (863 Program) (2011AA010603, 2011AA010605)

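Author information:

    CHEN Fu-Cai  Researcher at the National Digital Switching System Engineering and Technological Research Center. Main research interest: information security control in telecommunication networks. E-mail: chenfucai@ndsc.com.cn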


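  • Abstract: Existing data stream clustering algorithms suffer from low clustering accuracy, poor handling of outliers, and an inability to detect changes in the stream in real time. To address these shortcomings, this paper proposes a data stream clustering algorithm that combines density-based techniques with affinity propagation. The algorithm adopts a two-phase online/offline processing framework. It introduces a decayed micro-cluster density to accurately reflect how the data stream evolves, and it maintains and prunes micro-clusters dynamically online, so that the model better matches the intrinsic characteristics of the original stream. When a new cluster pattern is detected in the model, an improved weighted affinity propagation clustering algorithm (Weighted and hierarchical affinity propagation, WAP) rebuilds the model, so the algorithm can detect changes in the data stream in real time and return clustering results at any time. Experiments on real and synthetic data sets show that the algorithm has good applicability, effectiveness, and scalability, and achieves good clustering quality.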
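The online part of the abstract (decayed micro-cluster density with dynamic maintenance and pruning) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The fading function 2^(-lam * dt) is borrowed from common density-based stream clustering practice, and the class `MicroCluster`, the function `update`, and the parameters `lam`, `radius`, and `min_weight` are hypothetical names chosen for illustration. The offline WAP rebuilding step is not shown.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MicroCluster:
    """Summary of nearby points with an exponentially decayed weight (assumed form)."""
    center: List[float]
    weight: float = 1.0
    last_update: float = 0.0

    def decay(self, now: float, lam: float) -> None:
        # Assumed fading function 2^(-lam * dt); the paper's exact form may differ.
        self.weight *= 2.0 ** (-lam * (now - self.last_update))
        self.last_update = now

    def absorb(self, point: List[float], now: float, lam: float) -> None:
        # Age the existing weight first, then add the new point with weight 1.
        self.decay(now, lam)
        self.weight += 1.0
        # Weighted mean: shift the centre by the new point's share of the total weight.
        self.center = [c + (p - c) / self.weight for c, p in zip(self.center, point)]


def update(clusters: List[MicroCluster], point: List[float], now: float,
           lam: float = 0.25, radius: float = 1.0,
           min_weight: float = 0.1) -> List[MicroCluster]:
    """Online step: absorb the point or open a new micro-cluster, then prune."""
    nearest: Optional[MicroCluster] = min(
        clusters, key=lambda mc: math.dist(mc.center, point), default=None)
    if nearest is not None and math.dist(nearest.center, point) <= radius:
        nearest.absorb(point, now, lam)
    else:
        clusters.append(MicroCluster(center=list(point), last_update=now))
    # Age every micro-cluster to the current time and drop the faded ones.
    for mc in clusters:
        mc.decay(now, lam)
    clusters[:] = [mc for mc in clusters if mc.weight >= min_weight]
    return clusters
```

Calling `update(clusters, point, now)` once per arriving point keeps the summary current. A micro-cluster that stops receiving points loses weight over time and is eventually removed, which is one simple way to let the model follow an evolving stream.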
Publication history
  • Received:  2013-01-16
  • Revised:  2013-05-03
  • Published:  2014-02-20