2015, Vol. 30, Issue (4): 713-724. doi: 10.1007/s11390-015-1556-8

Special Topic: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining

• Special Section on Selected Papers from NPC 2011 •


A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge

Jun-Gang Xu1(徐俊刚), Member, CCF, ACM, IEEE, Yue Zhao1(赵越), Jian Chen2(陈健), Member, CCF, ACM, IEEE, Chao Han2(韩超)   

  1. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 101408, China;
    2. School of Software Engineering, South China University of Technology, Guangzhou 510006, China
  • Received: 2015-01-31; Revised: 2015-03-25; Online: 2015-07-05; Published: 2015-07-05
  • About author: Jun-Gang Xu is an associate professor in the School of Computer and Control Engineering (SCCE) at the University of Chinese Academy of Sciences (UCAS), Beijing, where he joined as a faculty member in 2005. He received his B.S. and M.S. degrees in mechanical engineering from Southwest Petroleum University, Chengdu, and Shandong University, Jinan, in 1996 and 1999 respectively, and his Ph.D. degree in computer science from the Graduate University of Chinese Academy of Sciences, Beijing, in 2003. From 2003 to 2005, he was a postdoctoral researcher at Tsinghua University, Beijing. His current research interests include data mining, information retrieval, big data management, and deep learning.
  • Supported by:

    This work was supported by the National Natural Science Foundation of China under Grant No. 61372171 and the National Key Technology Research and Development Program of China under Grant No. 2012BAH23B03.

Abstract: A Bayesian network is an important probabilistic graphical model that represents a set of random variables and their conditional dependencies. Bayesian networks have been increasingly used in a wide range of applications, such as natural language processing, hardware diagnostics, bioinformatics, statistical physics, and econometrics. Learning structure from data is one of the most important fundamental tasks of Bayesian network research. In particular, learning the optimal structure of a Bayesian network is an NP-hard problem. To solve this problem, many heuristic algorithms have been proposed, some of which learn the structure with the help of different types of prior knowledge. However, the existing algorithms place restrictions on the prior knowledge, such as restrictions on its quality and on how it can be used, which makes it difficult to exploit prior knowledge well. In this paper, we introduce prior knowledge into the Markov chain Monte Carlo (MCMC) algorithm and propose an algorithm called Constrained MCMC (C-MCMC) to learn the structure of a Bayesian network. Three types of prior knowledge are defined: the existence of a parent node, the absence of a parent node, and distribution knowledge, which includes the conditional probability distribution (CPD) of edges and the probability distribution (PD) of nodes. All three types of prior knowledge can easily be used by the algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of C-MCMC, using three metrics. The first metric, ADI, measures the reconstruction of the original structure from the learned structure with three sub-metrics: the number of arcs added (A), deleted (D), and inverted (I). The second metric is the Kullback-Leibler (KL) divergence, which is commonly used to evaluate a learned Bayesian network structure; the smaller the KL divergence, the better the learned structure. The third metric is the BDeu score, which guides the local search of the MCMC algorithm. The experimental results show that even a small amount of prior knowledge can improve the structure learning process. Owing to its ease of use and flexibility, we believe that C-MCMC can be widely used in real-world applications.
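The following is a minimal, self-contained Python sketch of the idea described above: the two hard types of prior knowledge (existence and absence of a parent) filter the proposals of a structure-MCMC sampler. The node names, the toy score, and the acceptance rule are illustrative assumptions only, not the authors' implementation; the soft "distribution knowledge" prior would reweight the acceptance probability rather than act as a hard filter, and is omitted here.

import random

# Prior knowledge expressed as (parent, child) edge constraints (illustrative).
REQUIRED = {("Smoking", "Cancer")}   # parent existence: these edges must be present
FORBIDDEN = {("Cancer", "Smoking")}  # parent absence: these edges must never appear

def violates_constraints(edges):
    """Return True if an edge set breaks any prior-knowledge constraint."""
    return (not REQUIRED.issubset(edges)) or bool(FORBIDDEN & edges)

def creates_cycle(edges, parent, child):
    """Check whether adding parent->child would close a directed cycle."""
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(c for p, c in edges if p == node)
    return False

def propose(edges, nodes):
    """Propose a neighbouring DAG by adding or deleting a single edge."""
    new = set(edges)
    parent, child = random.sample(nodes, 2)
    if (parent, child) in new:
        new.discard((parent, child))
    elif not creates_cycle(new, parent, child):
        new.add((parent, child))
    return new

def score(edges, data=None):
    """Placeholder for a decomposable score such as BDeu (assumption)."""
    return -len(edges)  # toy score, only to make the sketch runnable

nodes = ["Smoking", "Cancer", "Dyspnoea"]
current = {("Smoking", "Cancer")}
for _ in range(1000):
    candidate = propose(current, nodes)
    # C-MCMC idea: proposals that contradict the prior knowledge are rejected.
    if violates_constraints(candidate):
        continue
    # Toy Metropolis-style acceptance on the placeholder score.
    if score(candidate) >= score(current) or random.random() < 0.1:
        current = candidate
print(current)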

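For reference, the KL divergence used as the second evaluation metric compares the joint distribution encoded by the original network with that of the learned network. In its standard form (the exact estimator used in the paper is not given on this page):

\mathrm{KL}(P \parallel Q) = \sum_{x} P(x)\,\log \frac{P(x)}{Q(x)}

where P denotes the joint distribution of the original network, Q that of the learned network, and the sum runs over all joint configurations x of the variables; a smaller value indicates a learned structure closer to the original one.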