An Overlapping Semantic Community Structure Detecting Algorithm by Label Propagation
-
摘要: 语义社会网络(Semantic social network, SSN)是一种由信息节点及链接关系构成的新型复杂网络, 为此以节点邻接关系为挖掘对象的传统社会网络社区发现算法无法有效处理语义社会网络重叠社区发现问题. 由此提出标签传播的语义重叠社区发现算法, 该算法以标签传播算法(Latent Dirichlet allocation, LDA)模型为语义信息模型, 利用Gibbs取样法建立节点语义信息到语义空间的量化映射; 提出可度量节点间相似性的主成分 (Semantic coherent neighborhood propinquity, SCNP)模型和语义影响力(Semantic impact, SI)模型; 以SCNP作为标签传播的权重, 以SI 作为截断值的参数, 提出一种改进的Semantic-LPA (Semantic label propagation algorithm)算法; 提出可度量语义社区发现结果的语义模块度模型, 并通过实验分析, 验证了算法及语义模块度模型的有效性及可行性.Abstract: Since the semantic social network (SSN) is a new kind of complex networks consisting of information nodes and link relationships, the traditional community detection algorithms which depend on the adjacency in social networks are not efficient in the SSN. To solve this problem, an overlapping community structure detecting method in semantic social network is proposed based on label propagation. Firstly, the algorithm utilizes the Gibss sampling method to establish the quantization mapping by which semantic information in nodes can be mapped into the semantic space, with the latent Dirichlet allocation (LDA) as the semantic model. Secondly, a principal component SCNP model is proposed which could measure the propinquity between nodes and the semantic impact model. Thirdly, an improved semantic label propagation algorithm is put forward, with SCNP as the weight of propagation and SI as the parameter of threshold. Finally, a semantic model by which the community structure of SSN can be measured is presented. The efficiency and feasibility of the algorithm and the semantic modularity are verified by experimental analysis.
-
[1] Yang Bo, Liu Da-You. Complex network clustering algorithms. Journal of Software, 2009, 20(1): 54-66(杨博, 刘大有. 复杂网络聚类方法. 软件学报, 2009, 20(1): 54-66) [2] [2] Girvan M, Newman M E J. Community structure in social and biological networks. Proceedings of National Academy of Science of the United States of America, 2002, 9(12): 7921-7826 [3] [3] Newman M E J. Fast algorithm for detecting community structure in networks. Physical Review E, 2004, 69(6): 066133 [4] [4] Palla G, Derenyi I, Farkas I, Vicsde T. Uncovering the overlapping community structures of complex networks in nature and societ. Nature, 2005, 435(7043): 814-818 [5] [5] Shen H W, Cheng X Q, Cai K, Hu M B. Detect overlapping and hierarchical community structure in networks. Physica A, 2009, 388(8): 1706-1712 [6] [6] Lancichinetti A, Fortunato S, Kertesz J. Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics, 2009, 11(3): 033015 [7] [7] Gregory S. Finding overlapping communities in networks by label propagation. New Journal of Physics, 2010, 12(10): 103018 [8] [8] Jin D, Yang B, Baquero C, Liu D Y, He D X, Liu J. Markov random walk under constraint for discovering overlapping communities in complex networks. Journal of Statistical Mechanics: Theory and Experiment, 2011, P05031 [9] Jin Di, Yang Bo, Liu Jie, Liu Da-You, He Dong-Xiao. Ant colony optimization based on random walk for community detection in complex networks. Journal of Software, 2012, 23(3): 451-464(金弟, 杨博, 刘杰, 刘大有, 何东晓. 复杂网络簇结构探测-基于随机游走的蚁群算法. 软件学报, 2012,23(3): 451-464) [10] Gan Wen-Yan, He Nan, Li De-Yi. Community discovery method in networks based on topological potential. Journal of Software, 2009, 20(8): 2241-2254(淦文燕, 赫南, 李德毅. 一种基于拓扑势的网络社区发现方法. 软件学报, 2009, 20(8): 2241-2254) [11] Jin Di, Liu Jie, Yang Bo, He Dong-Xiao, Liu Da-You. Genetic algorithm with local search for community detection in large-scale complex networks. Acta Automatica Sinica, 2011, 37(7): 873-882(金弟, 刘杰, 杨博, 何东晓, 刘大有. 局部搜索与遗传算法结合的大规模复杂网络社区探测. 自动化学报, 2011, 37(7): 873-882) [12] He Dong-Xiao, Zhou Xu, Wang Zuo, Zhou Chun-Guang, Wang Zhe, Jin Di. Community mining in complex networks-clustering combination based genetic algorithm. Acta Automatica Sinica, 2010, 36(8): 1160-1170(何东晓, 周栩, 王佐, 周春光, 王喆, 金弟. 复杂网络社区挖掘-基于聚类融合的遗传算法. 自动化学报, 2010,36(8): 1160-1170) [13] Yang Bo, Liu Jie, Liu Da-You. A random network ensemble model based generalized network community mining algorithm. Acta Automatica Sinica, 2012, 38(5): 812-822(杨博, 刘杰, 刘大有. 基于随机网络集成模型的广义网络社区挖掘算法. 自动化学报, 2012, 38(5): 812-822) [14] Blei D M, Ng A Y, Jordan M I. Latent dirichlet allocation. Journal of Machine Learning Research, 2003, 3: 993-1022 [15] Zhang H Z, Qiu B J, Giles C L, Foley H C, Yen J. An LDA-based community structure discovery approach for large-scale social networks. In: Proceedings of the 2007 IEEE Intelligence and Security Informatics. New Brunswick, NJ: IEEE, 2007. 200-207 [16] Kemp C, Tenenbaum J B, Griffiths T L, Yamada Y, Ueda N. Learning systems of concepts with an infinite relational model. In: Proceedings of the 21st National Conference on Artificial Intelligence. Boston, MA: AAAI Press, 2006. 381-388 [17] Henderson K, Eliassi R T. Applying latent Dirichlet allocation to group discovery in large graphs. In: Proceedings of the 2009 ACM symposium on Applied Computing. New York: ACM, 2009. 1456-1461 [18] Henderson K, Eliassi-Rad T, Papadimitriou S, Faloutsos C. HCDF: A hybrid community discovery framework. In: Proceedings of the 2010 SIAM International Conference on Data Mining. Columbus, OH: SIAM, 2010. 754-765 [19] Zhang H, Giles C L, Foley H C, Yen J. Probabilistic community discovery using hierarchical latent Gaussian mixture model. In: Proceedings of the 22nd National Conference on Artificial Intelligence. Boston, MA: AAAI Press, 2007, 7: 663-668 [20] Zhang H Z, Li W, Wang X R, Giles C L. HSN-PAM: Finding hierarchical probabilistic 2007 groups from large-scale networks. In: Proceedings of the 2007 IEEE International Conference on Data Mining Workshops. Omaha, NE: IEEE, 2007. 27-32 [21] Steyvers M, Smyth P, Rosen-Zvi M, Groffiths T. Probabilistic author-topic models for information discovery. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2004. 306-315 [22] McCallum A, Corrada-Emmanuel A, Wang X R. Topic and role discovery in social networks. Computer Science Department Faculty Publication Series, 2005. 3 [23] McCallum A, Wang X, Corrada-Emmanuel A. Topic and role discovery in social networks with experiments on Enron and academic email. Journal of Artificial Intelligence Research, 2007, 30(1): 249-272 [24] Zhou D, Manavoglu E, Li J, Lee C L, Zha H Y. Probabilistic models for discovering e-communities. In: Proceedings of the 15th International Conference on World Wide Web. New York: ACM, 2006. 173-182 [25] Cha Y, Cho J. Social-network analysis using topic models. In: Proceedings of the 35th international ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2012. 565-574 [26] Wang X, Mohanty N, McCallum A. Group and topic discovery from relations and text. In: Proceedings of the 3rd International Workshop on Link Discovery. New York: ACM, 2005. 28-35 [27] Pathak N, DeLong C, Banerjee A, Erickson K. Social topic models for community extraction. In: Proceedings of the 2nd SNA-KDD Workshop. Las Vegas, Nevada, USA: ACM, 2008. 8 [28] Mei Q, Cai D, Zhang D, Zhai C X. Topic modeling with network regularization. In: Proceedings of the 17th International Conference on World Wide Web. New York: ACM, 2008. 101-110 [29] Sachan M, Contractor D, Faruquie T, Subramaniam V. Probabilistic model for discovering topic based communities in social networks. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management. New York: ACM, 2011. 2349-2352 [30] Sachan M, Contractor D, Faruquie T, Subramaniam V. Using content and interactions for discovering communities in social networks. In: Proceedings of the 21st International Conference on World Wide Web. New York: ACM, 2012. 331-340 [31] Yin Z J, Cao L L, Gu Q Q, Han J W. Latent community topic analysis: integration of community discovery with topic modeling. ACM Transactions on Intelligent Systems and Technology, 2012, 3(4): Article No. 63, DOI: 10.1145/2 337542.2337548 [32] Zhang Y Z, Wang J Y, Wang Y, Zhou L Z. Parallel community detection on large networks with propinquity dynamics. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2009. 997-1006 [33] Lou H, Li S H, Zhao Y X. Detecting community structure using label propagation with weighted coherent neighborhood propinquity. Physica A: Statistical Mechanics and Its Applications, 2013, 392(14): 3095-3105 [34] Zhu X J Ghahramani Z, Lafferty J. Semi-supervised learning using gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning (ICML-2003). Piscataway, N J: IEEE, 2003. 912-919 [35] Xie J R, Szymanski B K, Liu X M. SLPA: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In: Proceedings of the 11th IEEE International Conference of Data Mining Workshops. Washington, CD: IEEE, 2011. 344-349
点击查看大图
计量
- 文章访问数: 1813
- HTML全文浏览量: 60
- PDF下载量: 1968
- 被引次数: 0