一种基于先验知识的贝叶斯网结构学习算法
A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge
-
摘要: 贝叶斯网络是一种重要的概率图模型,它表现为随机节点的集合以及它们之间的依赖。贝叶斯网络已经应用到很多领域,如自然语言处理、硬件诊断、生物信息学、统计物理、计量经济学,等等。从现有数据中学习得到贝叶斯网结构是贝叶斯网研究最重要的研究任务之一。特别是,学习贝叶斯网选结构是一个非确定性NP难问题。为了解决这一问题,人们提出了很多启发式算法,其中一些算法借助不同类型的先验知识来学习贝叶斯网结构。然而,现有算法对先验知识有很多限制,如对先验知识的质量有限制,对先验知识的使用有限制,等等。这导致在这些算法中很难很好地使用先验知识。本文中,我们将先验知识引入到马尔科夫蒙特卡洛方法(Markov Chain Monte Carlo,MCMC)中,提出了一种称作约束马尔科夫蒙特卡洛(Constrained MCMC, C-MCMC)的算法来学习贝叶斯网结构。定义了三种先验知识:父节点存在性,父节点的不存在性和分布知识,包括边的条件概率分布和节点的概率分布。所有这些先验知识很容易应用到这个算法中。我们进行了实验,验证了C-MCMC算法的可行性和有效性。实验中,我们使用三个度量来评价C-MCMC算法的性能。第一个度量是ADI,它用三个子度量来度量从学习所得结构重构初始结构的过程,这三个度量包括:增加弧的数量(A),减少弧的数量(D)和插入弧的数量(I)。第二个度量是Kullback-Leibler(KL) 距离,它通常用来评估学习得到的贝叶斯网结构,KL距离越小,学习得到的贝叶斯网结构越好。第三个度量是BDeu函数得分,它用来指导MCMC算法的局部搜索方法。实验结果表明,即使少量的先验知识也能帮助改进贝叶斯网结构学习的学习过程。由于具有很好的易用性和灵活性,我们相信C-MCMC算法能够广泛应用到实际应用中。Abstract: Bayesian Network is an important probabilistic graphical model that represents a set of random nodes and their conditional dependencies. Bayesian Networks have been increasingly used in a wide range of applications, such as natural language processing, hardware diagnostics, bioinformatics, statistical physics, econometrics, etc. Learning structure from data is one of the most important fundamental tasks of Bayesian network research. Particularly, learning optional structure of Bayesian network is a Non-deterministic Polynomial-time (NP) hard problem. To solve this problem, many heuristic algorithms have been proposed, some of them learn Bayesian network structure with the help of different types of prior knowledge. However, the existing algorithms have some restrictions on the prior knowledge, such as quality restriction of prior knowledge, use restriction of prior knowledge, etc. This makes it difficult to use the prior knowledge well in these algorithms. In this paper, we introduce the prior knowledge into Markov Chain Monte Carlo (MCMC) algorithm and proposed an algorithm called Constrained MCMC (C-MCMC) algorithm to learn the structure of the Bayesian network. Three types of prior knowledge are defined: existence of parent node, absence of parent node and distribution knowledge including the Conditional Probability Distribution (CPD) of edges and the Probability Distribution (PD) of nodes. All of these types of prior knowledge are easily used in this algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed method C-MCMC. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed method C-MCMC.