K部K一致(超)网络中的社区发现

刘欣

doi:10.1007/s11390-011-0177-0

摘要: 1．本文的创新点
社区发现，即自动挖掘出网络中连接紧密的节点集合，能有助于我们更深刻的理解和认识网络。虽然人们近年来提出了多种不同的社区发现方法，它们往往局限于一部网络。现实世界中的很多复杂系统由多种不同类型的实体组成，这些系统更适合被表示成由不同类型节点组成的异质网络，比如二部网络或三部超网络。
在二部网络中存在两类节点，每条边连接两个不同类型的节点。二部网络广泛存在于自然界，比如作者-论文网，顾客-商品网，演员-电影网。在三部超网络中存在三类节点，每条超边连接三个不同类型的节点。三部超网络适合于描述社会化标签系统，其中三类节点分别代表系统中的用户，标签和资源，每条超边表示某某用户用某个标签描述了某个资源。更一般的，二部网络和三部超网络可以被抽象成一个K部K一致(超)网络，其中包含K类节点，每条(超)边连接K个不同类型的节点（二部网络和三部超网络分别对应于K=2和3的情况）。
现有的针对二部网络或三部超网络的社区发现方法包括：1）扩展的模块度优化方法，2）网络约减方法（即把二部网络或三部超网络约减成一部网络，再应用一部网络中的已有方法）。这两类方法的最大不足是，它们仅能处理社区是一对一对应的情况。然而，现实中的社区往往具有多对多对应的情况。比如，一个研究团队可能会从事多项研究课题，并在多个领域发表论文。于是，在作者-论文网中，该研究团队社区就会对应于多个不同主题的论文社区。
本文提出了一个框架性的方法，用于求解K部K一致(超)网络中的社区发现，该方法既能处理社区是一对一对应的情况，也能处理社区是多对多对应的情况。 2．实现方法
2007年Rosvall等人在PNAS上发表了基于信息压缩思想的求解一部网络中社区发现问题的方法，该思想的精髓是把网络的社区结构看作网络结构的一种高效信息压缩。我们把Rosvall的思想扩展到K部K一致(超)网络，提出了一种全新的评价社区结构的客观函数，从而将社区发现问题转化成优化问题，进而提出了一个启发式的求解算法。 3．结论及未来待解决的问题
在deep South二部网络以及人工随机三部超网络中的实验表明，和已有的方法相比，我们的新方法具有如下优点：
1) 广泛性：它既能够处理社区是一对一对应的情况，也能够处理社区是多对多对应的情况。
2) 准确性：即使是针对社区为一对一对应的情况，它也比已有的方法更加准确。
3) 无参化：它不依赖于任何参数，能够自动分析出社区结构。
4) 高速化：它速度快，能够应用于现实世界中的大规模网络。

Abstract: In social tagging systems such as Delicious and Flickr, users collaboratively manage tags to annotate resources. Naturally, a social tagging system can be modeled as a (user, tag, resource) hypernetwork, where there are three different types of nodes, namely users, resources and tags, and each hyperedge has three end nodes, connecting a user, a resource and a tag that the user employs to annotate the resource. Then how can we automatically cluster related users, resources and tags, respectively? This is a problem of community detection in a 3-partite, 3-uniform hypernetwork. More generally, given a K-partite K-uniform (hyper)network, where each (hyper)edge is a K-tuple composed of nodes of K different types, how can we automatically detect communities for nodes of different types? In this paper, by turning this problem into a problem of finding an efficient compression of the (hyper)network's structure, we propose a quality function for measuring the goodness of partitions of a K-partite K-uniform (hyper)network into communities, and develop a fast community detection method based on optimization. Our method overcomes the limitations of state of the art techniques and has several desired properties such as comprehensive, parameter-free, and scalable. We compare our method with existing methods in both synthetic and real-world datasets.

K部K一致(超)网络中的社区发现

Detecting Communities in K-Partite K-Uniform (Hyper)Networks