We use cookies to improve your experience with our site.

给定基数的连通诱导子图枚举算法

Novel Algorithms for Efficient Mining of Connected Induced Subgraphs of a Given Cardinality

  • 摘要:
    研究背景 子图枚举问题在芯片设计、生物信息学、化学分子结构分析、通信网络及人工智能等多个领域都有着重要的应用。子图枚举问题是图论领域中的一个重要和前沿问题,旨在从给定的无向图中枚举出所有满足给定条件的子图。已有研究表明,从网络(或图)中枚举出所有具有k个节点的连通诱导子图是子图枚举问题研究中的热点和难点,具有极高计算复杂度的任务。
    目的 本文旨在提出新的算法来提升子图枚举的效率,并给出算法的正确性、时间复杂度和空间复杂度的理论证明。
    方法 面向给定节点数的连通诱导子图枚举问题,我们提出了两种高效的算法。其中,所提出的自底向上算法通过添加当前子图的邻居节点(添加的邻居节点不能在记录已经访问的邻居节点的集合中)来递增地枚举连通诱导子图。通过这种方式,每个连通诱导子图均被枚举且只被枚举一次。不同于已有的算法采用自底向上或基于超图的方式,本文提出的自顶向下算法通过从大的连通子图中逐渐删除所定义的可删除节点(非关节点),直到当前子图的大小为k
    结果 在真实的图数据集上的实验结果表明,在枚举小基数(k)的连通诱导子图方面,本文提出的自底向上算法优于作者已知的最好的算法。在枚举大基数k的连通诱导子图方面,所提出的自顶向下算法可以实现比当前最快的算法高达19倍的加速。从理论角度来看,在基数k较大的情况下,本文提出的自顶向下算法的时间复杂度优于已有算法的时间复杂度。
    结论 本文提出了分别以自底向上和自顶向下的方式枚举给定基数的连通诱导子图的两种高效算法。在不同大小的真实世界网络图上进行的实验,与当前最优算法相比,本文提出的算法具有更高的效率。从应用的角度来看,所提出的算法具有较好的实用性,可应用于生物信息学、信息检索和芯片设计等领域的子图挖掘任务中。未来的工作将给出自顶向下算法的延迟及其证明,并给出和证明基数为k的连通诱导子图的数量的上限。

     

    Abstract: Mining subgraphs with interesting structural properties from networks (or graphs) is a computationally challenging task. In this paper, we propose two algorithms for enumerating all connected induced subgraphs of a given cardinality from networks (or connected undirected graphs in networks). The first algorithm is a variant of a previous well-known algorithm. The algorithm enumerates all connected induced subgraphs of cardinality k in a bottom-up manner. The data structures that lead to unit time element checking and linear space are presented. Different from previous algorithms that work in either a bottom-up manner or a reverse search manner, an algorithm that enumerates all connected induced subgraphs of cardinality k in a top-down manner is proposed. The correctness and complexity of the top-down algorithm are theoretically analyzed and proven. In the experiments, we evaluate the efficiency of the algorithms using a set of real-world networks from various fields. Experimental results show that the variant bottom-up algorithm outperforms the state-of-the-art algorithms for enumerating connected induced subgraphs of small cardinality, and the top-down algorithm can achieve an order of magnitude speedup over the state-of-the-art algorithms for enumerating connected induced subgraphs of large cardinality.

     

/

返回文章
返回