2.845

2023影响因子

(CJCR)

  • 中文核心
  • EI
  • 中国科技核心
  • Scopus
  • CSCD
  • 英国科学文摘

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

一种学习稀疏BN最优结构的改进K均值分块学习算法

高晓光 王晨凤 邸若海

高晓光, 王晨凤, 邸若海. 一种学习稀疏BN最优结构的改进K均值分块学习算法. 自动化学报, 2020, 46(5): 923-933. doi: 10.16383/j.aas.c180837
引用本文: 高晓光, 王晨凤, 邸若海. 一种学习稀疏BN最优结构的改进K均值分块学习算法. 自动化学报, 2020, 46(5): 923-933. doi: 10.16383/j.aas.c180837
GAO Xiao-Guang, WANG Chen-Feng, DI Ruo-Hai. A Block Learning Algorithm With Improved K-means Algorithm for Learning Sparse BN Optimal Structure. ACTA AUTOMATICA SINICA, 2020, 46(5): 923-933. doi: 10.16383/j.aas.c180837
Citation: GAO Xiao-Guang, WANG Chen-Feng, DI Ruo-Hai. A Block Learning Algorithm With Improved K-means Algorithm for Learning Sparse BN Optimal Structure. ACTA AUTOMATICA SINICA, 2020, 46(5): 923-933. doi: 10.16383/j.aas.c180837

一种学习稀疏BN最优结构的改进K均值分块学习算法

doi: 10.16383/j.aas.c180837
基金项目: 

国家自然科学基金 61573285

详细信息
    作者简介:

    王晨凤  西北工业大学电子信息学院硕士研究生. 2017年获得西北工业大学学士学位.主要研究方向为贝叶斯网络和数据挖掘. E-mail: chen-cc@mail.nwpu.edu.cn

    邸若海 西北工业大学电子信息学院博士后. 2016年获得西北工业大学系统工程专业博士学位.主要研究方向为小数据集条件下贝叶斯网络结构学习和参数学习. E-mail: diruohai@nwpu.edu.cn

    通讯作者:

    高晓光 西北工业大学电子信息学院教授. 1989年获得西北工业大学飞行器导航与控制系统博士学位.主要研究方向为贝叶斯网络和复杂系统建模与分析.本文通信作者. E-mail: cxg2012@nwpu.edu.cn

A Block Learning Algorithm With Improved K-means Algorithm for Learning Sparse BN Optimal Structure

Funds: 

National Natural Science Foundation of China 61573285

More Information
    Author Bio:

    WANG Chen-Feng Master student at the School of Electronics and Information, Northwestern Polytechnical University. She received her bachelor degree from Northwestern Polytechnical University in 2017. Her research interest covers Bayes networks and data mining

    DI Ruo-Hai Postdoctoral at the School of Electronics and Information, Northwestern Polytechnical University. He received his Ph. D. degree in system engineering from Northwestern Polytechnical University in 2016. His research interest covers structure and parameter learning of Bayesian networks from small data set

    Corresponding author: GAO Xiao-Guang Professor at the School of Electronics and Information, Northwestern Polytechnical University. She received her Ph. D. degree in aircraft navigation and control system from Northwestern Polytechnical University in 1989. Her research interest covers Bayesian networks, modeling and analysis of complex systems. Corresponding author of this paper
  • 摘要: 目前贝叶斯网络(Bayesian networks, BN)的传统结构学习算法在处理高维数据时呈现出计算负担过大、在合理时间内难以得到期望精度结果的问题.为了在高维数据下学习稀疏BN的最优结构, 本文提出了一种学习稀疏BN最优结构的改进K均值分块学习算法.该算法采用分而治之的策略, 首先采用互信息作为节点间距离度量, 利用融合互信息的改进K均值算法对网络分块; 其次, 使用MMPC (Max-min parent and children)算法得到整个网络的架构, 根据架构找到块间所有边的可能连接方向, 从而找到所有可能的图结构; 之后, 对所有图结构依次进行结构学习; 最终利用评分找到最优BN.实验证明, 相比现有分块结构学习算法, 本文提出的算法不仅习得了网络的精确结构, 且学习速度有一定提高; 相比非分块经典结构学习算法, 本文提出的算法在保证精度基础上, 学习速度大幅提高, 解决了非分块经典结构学习算法无法在合理时间内处理高维数据的难题.
    Recommended by Associate Editor HU Qing-Hua
    1)  本文责任编委 胡清华
  • 图  1  提出的改进$K$均值算法的流程图

    Fig.  1  Diagram of the improved $K$-means algorithm

    图  2  Merge函数运行前网络结构示意图

    Fig.  2  An example of a network structure before Merge function

    图  3  四种可能的图结构

    Fig.  3  Four possible graph structures

    图  4  节点$X_1$的原始父节点图

    Fig.  4  The original parentGraph of the node $X_1$

    图  5  节点$X_1$的修剪后父节点图

    Fig.  5  The modified parentGraph of the node $X_1$

    图  6  四个节点的节点序图

    Fig.  6  The orderGraph of four nodes

    图  7  四种可能的图结构

    Fig.  7  Four possible graph structures

    图  8  两个算法的$Q$值比较

    Fig.  8  The $Q$ of the two algorithms

    图  9  两个算法的运行时间比较

    Fig.  9  The run time of the two algorithms

    图  10  IKM算法的稳定性

    Fig.  10  The stability of the IKM algorithm

    图  11  OS-IKM算法与现有分块算法的运行时间对比

    Fig.  11  The comparison of the run time between OS-IKM algorithm and other block algorithms

    图  12  OS-IKM算法与其他非分块经典结构学习算法的运行时间对比

    Fig.  12  The comparison of the run time between OS-IKM algorithm and other traditional algorithms

    表  1  实验中用到的BN的参数

    Table  1  Details of Bayesian networks in the experiments

    网络 节点数 边数
    Sachs 11 17
    Mybnet1 18 27
    Boerlage92 23 36
    Insurance 27 52
    Mybnet2 32 42
    Alarm 37 46
    下载: 导出CSV

    表  2  汉明距离对比

    Table  2  The Hamming distances of the seven algorithms

    OS-IKM算法 OS-FN算法 SAR算法 MMHC based on graph partitioning算法 GS based on graph partitioning算法 动态规划算法 HC算法
    (a) Sachs网络
    $A(N_G^\ast)$ 0 0 0 0 0 1 1
    $M(N_G^\ast)$ 3 3 0 3 3 0 0
    $I(N_G^\ast)$ 4 3 2 6 6 3 4
    $H(N_G^\ast)$ 7 6 2 9 9 4 5
    (b) Mybnet1网络
    $A(N_G^\ast)$ 0 0 0 0 0 3 4
    $M(N_G^\ast)$ 2 3 1 2 2 2 2
    $I(N_G^\ast)$ 6 8 3 6 11 12 6
    $H(N_G^\ast)$ 8 11 4 8 13 17 12
    (c) Boerlage92网络
    $A(N_G^\ast)$ 0 0 3 0 0 7 0
    $M(N_G^\ast)$ 9 9 6 10 9 3 10
    $I(N_G^\ast)$ 5 7 7 8 8 8 9
    $H(N_G^\ast)$ 14 16 16 18 17 18 19
    (d) Mybnet2网络
    $A(N_G^\ast)$ 0 0 1 0 0 0
    $M(N_G^\ast)$ 5 6 5 7 7 8
    $I(N_G^\ast)$ 9 9 9 11 9 6
    $H(N_G^\ast)$ 14 15 15 18 16 14
    (e) Alarm网络
    $A(N_G^\ast)$ 0 0 7 0 0 8
    $M(N_G^\ast)$ 9 11 4 8 11 5
    $I(N_G^\ast)$ 6 8 4 14 9 12
    $H(N_G^\ast)$ 15 19 15 22 20 25
    下载: 导出CSV

    表  3  BIC评分对比($\times 10^4$)

    Table  3  The BIC scores of the seven algorithms ($\times 10^4$)

    OS-IKM算法 OS-FN算法 MMHC based on Graph Partitioning算法 GS based on Graph Partitioning算法 动态规划算法 HC算法
    Sachs $-3.7004$ $-3.7907$ $-3.7038$ $-3.7813$ $-3.6616$ $-3.6753$
    Mybnet1 $-5.0766$ $-5.0823$ $-5.0833$ $-5.0853$ $-5.0946$ $-5.0773$
    Boerlage92 $-5.0474$ $-5.0754$ $-5.0853$ $-5.0773$ $-5.065$ $-5.0758$
    Mybnet2 $-9.1036$ $-9.1300$ $-9.1478$ $-9.1459$ $-9.0947$
    Alarm $-6.521$ $-6.7179$ $-6.7889$ $-6.7486$ $-6.8439$
    下载: 导出CSV
  • [1] Peal J. Probabilistic Reasoning in Intelligent Systems. Massachusetts: Morgan Kaufmann, 1988.
    [2] Soli R, Kaabi B, Barhoumi M, Maktouf C, Ahmed S. Bayesian phylogenetic analysis of the influenza-A virus genomes isolated in Tunisia and determination of potential recombination events. Molecular Phylogenetics And Evolution, 2019, 134: 253-268 doi: 10.1016/j.ympev.2019.01.019
    [3] Stahl E A, Wegmann D, Trynka G, Gutierrez-Achury J, Do R, Voight B F, et al. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nature Genetics, 2012, 44(5): 483-489 doi: 10.1038/ng.2232
    [4] Ehsani-Moghaddam B, Queenan J A, MacKenzie J, Birtwhistle R V. Mucopolysaccharidosis type Ⅱ detection by naive Bayes classifier: An example of patient classification for a rare disease using electronic medical records from the Canadian Primary Care Sentinel Surveillance Network. Plos One, 2018, 13(12): e0209018 doi: 10.1371/journal.pone.0209018
    [5] 郑文博, 王坤峰, 王飞跃.基于贝叶斯生成对抗网络的背景消减算法.自动化学报, 2018, 44(5): 878-890 doi: 10.16383/j.aas.2018.c170562

    Zheng Wen-Bo, Wang Kun-Feng, Wang Fei-Yue. Background subtraction algorithm with bayesian generative adversarial networks. Acta Automatica Sinica, 2018, 44(5): 878-890 doi: 10.16383/j.aas.2018.c170562
    [6] 顾晓清, 蒋亦樟, 王士同.用于不平衡数据分类的0阶TSK型模糊系统.自动化学报, 2017, 43(10): 1773-1788 doi: 10.16383/j.aas.2017.c160200

    Gu Xiao-Qing, Jiang Yi-Zhang, Wang Shi-Tong. Zero-order TSK-type fuzzy system for imbalanced data classiflcation. Acta Automatica Sinica, 2017, 43(10): 1773-1788 doi: 10.16383/j.aas.2017.c160200
    [7] 王松涛, 周真, 曲寒冰, 李彬. RGB-D图像的贝叶斯显著性检测.自动化学报, 2017, 43(10): 1810-1828 doi: 10.16383/j.aas.2017.e160141

    Wang Song-Tao, Zhou Zhen, Qu Han-Bing, Li Bin. Bayesian saliency detection for RGB-D images. Acta Automatica Sinica, 2017, 43(10): 1810-1828 doi: 10.16383/j.aas.2017.e160141
    [8] 韩绍金, 李建勋.基于密度核估计的贝叶斯网络结构学习算法.计算机工程与应用, 2014, 50(15): 107-112 doi: 10.3778/j.issn.1002-8331.1307-0003

    Han Shao-Jin, Li Jian-Xun. Structure learning algorithm for bayesian network based on probability density kernel estimation. Computer Engineering & Applications, 2014, 50(15): 107-112 doi: 10.3778/j.issn.1002-8331.1307-0003
    [9] Malone B, Yuan C, Hansen E A. Memory-efficient dynamic programming for learning optimal bayesian networks. In: Proceedings of the Twenty-Fifth Association for the Advance of Artificial Intelligence Conference. AAAI Press, 2011. 1057-1062
    [10] Yuan C, Malone B. Learning optimal bayesian networks: a shortest path perspective. Journal of Artificial Intelligence Research, 2013, 48: 23-65 doi: 10.1613/jair.4039
    [11] Bartlett M, Cussens J. Integer linear programming for the bayesian network structure learning problem. Artificial Intelligence, 2017, 244: 258-271 doi: 10.1016/j.artint.2015.03.003
    [12] Singh A P, Moore A W. Finding optimal bayesian networks by dynamic programming. Technical Report CMU-CALD-05-106, School of Computer Science, Carnegie Mellon University, USA, 2005.
    [13] Koller D, Friedman N. Probabilistic Graphical Models. Cambridge, MA: MIT Press, 2009.
    [14] Li S H, Zhang J, Huang K H, Gao C X. A graph partitioning approach for bayesian network structure learning. In: Proceedings of the 33rd Chinese Control Conference. Nanjing, China: IEEE, 2014
    [15] Liu H, Zhou S G, Lam W, Guan J H. A new hybrid method for learning Bayesian networks: separation and reunion. Knowledge-Based Systems, 2017, 121: 185-197 doi: 10.1016/j.knosys.2017.01.029
    [16] Newman M E J, Girvan M. Finding and evaluating community structure in networks. Physical Review E, 2004, 69(2): 026113 doi: 10.1103/PhysRevE.69.026113
    [17] Newman M E J. Modularity and community structure in networks. In: Proceedings of the 2006 American Physical Society March Meeting. American Physical Society, 2006. 8577-8582
    [18] Zhang W, Zhang G X, Chen X H, Liu Y Q, Zhou X M, Zhou J L. DHC: a distributed hierarchical clustering algorithm for large datasets. Journal of Circuits, Systems and Computers, 2019, 28(4): 1950065.1-1950065.26 http://d.old.wanfangdata.com.cn/Periodical/dzkjdxxb201705020
    [19] Zhou Zhi-Hua. Machine Learning. Tsinghua University Press, 2016.
    [20] Birant D, Kut A. ST-DBSCAN: an algorithm for clustering spatial-temporal data. Data and Knowledge Engineering, 2007, 60(1): 208-221 http://d.old.wanfangdata.com.cn/Periodical/xddzjs201921029
    [21] Viswanath P, Pinkesh R. I-DBSCAN: a fast hybrid density based clustering method. In: Proceedings of IEEE 18th International Conference on Pattern Recognition International Conference on Pattern Recognition. Hong Kong, China: IEEE Computer Society, 2006. 912-915
    [22] Newman M E J. Fast algorithm for detecting community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys, 2004, 69(6): 066133. doi: 10.1103/PhysRevE.69.066133
    [23] Vinh N X, Chetty M, Coppel R, Wangikar P P. GlobalMIT: learning globally optimal dynamic bayesian network with the mutual information test criterion. Bioinformatics, 2011, 27(19): 2765-2766 doi: 10.1093/bioinformatics/btr457
    [24] Konstantinos T, Vincenzo L, Sofia T, Tsamardinos I. On scoring maximal ancestral graphs with the max-min hill climbing algorithm. International Journal of Approximate Reasoning, 2018 http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=0d5df1ec3afbd3527a261d532e769283
    [25] Tsamardinos I. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 2006, 65(1): 31-78 doi: 10.1007-s10994-006-6889-7/
  • 加载中
图(12) / 表(3)
计量
  • 文章访问数:  2102
  • HTML全文浏览量:  374
  • PDF下载量:  54
  • 被引次数: 0
出版历程
  • 收稿日期:  2018-12-18
  • 录用日期:  2019-04-15
  • 刊出日期:  2020-06-01

目录

    /

    返回文章
    返回