|
›› 2015,Vol. 30 ›› Issue (4): 713-724.doi: 10.1007/s11390-015-1556-8
所属专题: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining
• Special Section on Selected Paper from NPC 2011 • 上一篇 下一篇
Jun-Gang Xu1(徐俊刚), Member, CCF, ACM, IEEE, Yue Zhao1(赵越), Jian Chen2(陈健), Member, CCF, ACM, IEEE, Chao Han2(韩超)
Jun-Gang Xu1(徐俊刚), Member, CCF, ACM, IEEE, Yue Zhao1(赵越), Jian Chen2(陈健), Member, CCF, ACM, IEEE, Chao Han2(韩超)
贝叶斯网络是一种重要的概率图模型,它表现为随机节点的集合以及它们之间的依赖。贝叶斯网络已经应用到很多领域,如自然语言处理、硬件诊断、生物信息学、统计物理、计量经济学,等等。从现有数据中学习得到贝叶斯网结构是贝叶斯网研究最重要的研究任务之一。特别是,学习贝叶斯网选结构是一个非确定性NP难问题。为了解决这一问题,人们提出了很多启发式算法,其中一些算法借助不同类型的先验知识来学习贝叶斯网结构。然而,现有算法对先验知识有很多限制,如对先验知识的质量有限制,对先验知识的使用有限制,等等。这导致在这些算法中很难很好地使用先验知识。本文中,我们将先验知识引入到马尔科夫蒙特卡洛方法(Markov Chain Monte Carlo,MCMC)中,提出了一种称作约束马尔科夫蒙特卡洛(Constrained MCMC, C-MCMC)的算法来学习贝叶斯网结构。定义了三种先验知识:父节点存在性,父节点的不存在性和分布知识,包括边的条件概率分布和节点的概率分布。所有这些先验知识很容易应用到这个算法中。我们进行了实验,验证了C-MCMC算法的可行性和有效性。实验中,我们使用三个度量来评价C-MCMC算法的性能。第一个度量是ADI,它用三个子度量来度量从学习所得结构重构初始结构的过程,这三个度量包括:增加弧的数量(A),减少弧的数量(D)和插入弧的数量(I)。第二个度量是Kullback-Leibler(KL) 距离,它通常用来评估学习得到的贝叶斯网结构,KL距离越小,学习得到的贝叶斯网结构越好。第三个度量是BDeu函数得分,它用来指导MCMC算法的局部搜索方法。实验结果表明,即使少量的先验知识也能帮助改进贝叶斯网结构学习的学习过程。由于具有很好的易用性和灵活性,我们相信C-MCMC算法能够广泛应用到实际应用中。
[1] Jensen F V. Introduction to Bayesian Networks. Secaucus, USA: Springer-Verlag, 1996.[2] Chickering D M, Heckerman D, Meek C. Large-sample learning of Bayesian networks is NP-hard. Journal of Machine Learning Research, 2004, 5: 1287-1330.[3] Heckerman D, Geiger D, Chickering D M. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 1995, 20(3): 197-243.[4] Chickering D M. Learning equivalence classes of Bayesiannetwork structures. Journal of Machine Learning Research, 2002, 2: 445-498.[5] Pernkopf F, Bilmes J. Efficient heuristics for discriminative structure learning of Bayesian network classifiers. Journal of Machine Learning Research, 2010, 11: 2323–2360.[6] Giudici P, Castelo R. Improving Markov chain Monte Carlo model search for data mining. Machine Learning, 2003, 50 (1/2): 127-158.[7] Koivisto M, Sood K. Exact Bayesian structure discovery in Bayesian networks. Journal of Machine Learning Research, 2004, 5: 549-573.[8] Silander T, Myllymaki P. A simple approach for finding the globally optimal Bayesian network structure. In Proc. the 22nd Conference on Uncertainty in Artificial Intelligence, July 2006.[9] Perrier E, Imoto S, Miyano S. Finding optimal Bayesian network given a super-structure. Journal of Machine Learning Research, 2008, 9: 2251-2286.[10] Kojima K, Perrier E, Imoto S, Miyano S. Optimal search on clustered structural constraint for learning Bayesian network structure. Journal of Machine Learning Research, 2010, 11: 285-310.[11] de Campos C P, Ji Q. Efficient structure learning of Bayesian networks using constraints. Journal of Machine Learning Research, 2011, 12: 663-689.[12] Ehlers R S. Computational tools for comparing asymmetric GARCH models via Bayes factors. Mathematics and Computers in Simulation, 2012, 82(5): 858-867.[13] Grzegorczyk M, Husmeier D. Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move. Machine Learning, 2008, 71(2/3): 265-305.[14] Corander J, Ekdahl M, Koski T. Parallel interacting MCMC for learning of topologies of graphical models. Data Mining and Knowledge Discovery, 2008, 17(3): 431-456.[15] Teyssier M, Koller D. Ordering-based search: A simple and effective algorithm for learning Bayesian networks. In Proc. the 21st Conference on Uncertainty in Artificial Intelligence, June 2005, pp.548-549.[16] Cano A, Masegosa A R, Moral S. A method for integrating expert knowledge when learning Bayesian networks from data. IEEE Transaction on Systems, Man, and Cybernetics, Part B: Cybernetics, 2011, 41(5): 1382-1394.[17] de Campos L M, Castellano J G. Bayesian network learning algorithms using structural restrictions. International Journal of Approximate Reasoning, 2007, 45(2): 233-254.[18] Heckerman D, Geiger D, Chickering D M. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 1995, 20(3): 197-243.[19] Lauritzen S L, Spiegelhalter D J. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B (Methodological), 1988, 50(2): 157-224.[20] Binder J, Koller D, Russell S, Kanazawa K. Adaptive probabilistic networks with hidden variables. Machine Learning, 1997, 29(2/3): 213-244.[21] Beinlich I A, Suermondt H J, Chavez RM, Cooper G F. The ALARM monitoring system: A case study with two probabilistic inference techniques for belief networks. In Proc. the 2nd European Conference on Artificial Intelligence in Medicine, August 1989, pp.247-256.[22] Forbes J, Huang T, Kanazawa K, Russell S. The BATmobile: Towards a Bayesian automated taxi. In Proc. the 14th International Joint Conference on Artificial Intelligence, August 1995, pp.1878-1885.[23] Acid S, de Campos L M. Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs. Journal of Artificial Intelligence Research, 2003, 18: 445-490. |
No related articles found! |
|
版权所有 © 《计算机科学技术学报》编辑部 本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn 总访问量: |