›› 2011,Vol. 26 ›› Issue (5): 778-791.doi: 10.1007/s11390-011-0177-0

• • 上一篇    下一篇

K部K一致(超)网络中的社区发现

Xin Liu (刘欣) and Tsuyoshi Murata, Member, ACM, IEEE   

  • 收稿日期:2010-10-01 修回日期:2011-07-04 出版日期:2011-09-05 发布日期:2011-09-05
  • 作者简介:Xin Liu is a Ph.D. candidate in the Department of Computer Science, Tokyo Institute of Technology. He received the B.S. degree in computing and information science from Wuhan University of Technology in 2004, and the M.S. degree in computer science from Wuhan University in 2007. His research interests includeWeb mining and social network analysis.
    Tsuyoshi Murata is an associate professor in the Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology. He obtained his doctor's degree in computer science at Tokyo Institute of Technology in 1997, on the topic of machine discovery of geometrical theorems. At Tokyo Institute of Technology, he conducts research on Web mining, artificial intelligence, and social network analysis. He is a member of IEEE, AAAI, ACM, JSAI, IPSJ and JSSST.

Detecting Communities in K-Partite K-Uniform (Hyper)Networks

Xin Liu (刘欣) and Tsuyoshi Murata, Member, ACM, IEEE   

  1. Department of Computer Science, Tokyo Institute of Technology, Tokyo 152-8552, Japan
  • Received:2010-10-01 Revised:2011-07-04 Online:2011-09-05 Published:2011-09-05
  • Contact: Xin Liu E-mail:tsinllew@ai.cs.titech.ac.jp; murata@cs.titech.ac.jp
  • About author:Xin Liu is a Ph.D. candidate in the Department of Computer Science, Tokyo Institute of Technology. He received the B.S. degree in computing and information science from Wuhan University of Technology in 2004, and the M.S. degree in computer science from Wuhan University in 2007. His research interests includeWeb mining and social network analysis.
    Tsuyoshi Murata is an associate professor in the Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology. He obtained his doctor's degree in computer science at Tokyo Institute of Technology in 1997, on the topic of machine discovery of geometrical theorems. At Tokyo Institute of Technology, he conducts research on Web mining, artificial intelligence, and social network analysis. He is a member of IEEE, AAAI, ACM, JSAI, IPSJ and JSSST.
  • Supported by:

    This work was supported in part by JSPS Grant-in-Aid under Grant No. 22300049 and IBM Ph.D. Fellowship.

1.本文的创新点
社区发现,即自动挖掘出网络中连接紧密的节点集合,能有助于我们更深刻的理解和认识网络。虽然人们近年来提出了多种不同的社区发现方法,它们往往局限于一部网络。现实世界中的很多复杂系统由多种不同类型的实体组成,这些系统更适合被表示成由不同类型节点组成的异质网络,比如二部网络或三部超网络。
在二部网络中存在两类节点,每条边连接两个不同类型的节点。二部网络广泛存在于自然界,比如作者-论文网,顾客-商品网,演员-电影网。在三部超网络中存在三类节点,每条超边连接三个不同类型的节点。三部超网络适合于描述社会化标签系统,其中三类节点分别代表系统中的用户,标签和资源,每条超边表示某某用户用某个标签描述了某个资源。更一般的,二部网络和三部超网络可以被抽象成一个K部K一致(超)网络,其中包含K类节点,每条(超)边连接K个不同类型的节点(二部网络和三部超网络分别对应于K=2和3的情况)。
现有的针对二部网络或三部超网络的社区发现方法包括:1)扩展的模块度优化方法,2)网络约减方法(即把二部网络或三部超网络约减成一部网络,再应用一部网络中的已有方法)。这两类方法的最大不足是,它们仅能处理社区是一对一对应的情况。然而,现实中的社区往往具有多对多对应的情况。比如,一个研究团队可能会从事多项研究课题,并在多个领域发表论文。于是,在作者-论文网中,该研究团队社区就会对应于多个不同主题的论文社区。
本文提出了一个框架性的方法,用于求解K部K一致(超)网络中的社区发现,该方法既能处理社区是一对一对应的情况,也能处理社区是多对多对应的情况。 2.实现方法
2007年Rosvall等人在PNAS上发表了基于信息压缩思想的求解一部网络中社区发现问题的方法,该思想的精髓是把网络的社区结构看作网络结构的一种高效信息压缩。我们把Rosvall的思想扩展到K部K一致(超)网络,提出了一种全新的评价社区结构的客观函数,从而将社区发现问题转化成优化问题,进而提出了一个启发式的求解算法。 3.结论及未来待解决的问题
在deep South二部网络以及人工随机三部超网络中的实验表明,和已有的方法相比,我们的新方法具有如下优点:
1) 广泛性:它既能够处理社区是一对一对应的情况,也能够处理社区是多对多对应的情况。
2) 准确性:即使是针对社区为一对一对应的情况,它也比已有的方法更加准确。
3) 无参化:它不依赖于任何参数,能够自动分析出社区结构。
4) 高速化:它速度快,能够应用于现实世界中的大规模网络。

Abstract: In social tagging systems such as Delicious and Flickr, users collaboratively manage tags to annotate resources. Naturally, a social tagging system can be modeled as a (user, tag, resource) hypernetwork, where there are three different types of nodes, namely users, resources and tags, and each hyperedge has three end nodes, connecting a user, a resource and a tag that the user employs to annotate the resource. Then how can we automatically cluster related users, resources and tags, respectively? This is a problem of community detection in a 3-partite, 3-uniform hypernetwork. More generally, given a K-partite K-uniform (hyper)network, where each (hyper)edge is a K-tuple composed of nodes of K different types, how can we automatically detect communities for nodes of different types? In this paper, by turning this problem into a problem of finding an efficient compression of the (hyper)network's structure, we propose a quality function for measuring the goodness of partitions of a K-partite K-uniform (hyper)network into communities, and develop a fast community detection method based on optimization. Our method overcomes the limitations of state of the art techniques and has several desired properties such as comprehensive, parameter-free, and scalable. We compare our method with existing methods in both synthetic and real-world datasets.

[1] Fortunato S. Community detection in graphs. Physics Reports, 2010, 486: 75-174.

[2] Danon L, Duch L, Guilera A D, Arenas A. Comparing community structure identification. J. Stat. Mech, 2005, 9: P09008.

[3] Lancichinetti A, Fortunato S. Community detection algorithms: A comparative analysis. Phys. Rev. E, 2009, 80(5): 056117.

[4] Leskovec J, Lang K J, Mahoney M W. Empirical comparison of algorithms for network community detection. In Proc. the 19th International Conference on World Wide Web, Raleigh, USA, Apr. 26-30, 2010, pp.631-640.

[5] Shen H, Cheng X. Spectral methods for the detection of network community structure: A comparative analysis. J. Stat. Mech., 2010, 10: P10020.

[6] Guimerà R, Pardo M S, Amaral L A N. Module identification in bipartite and directed networks. Phys. Rev. E, 2007, 76(3): 036102.

[7] Zlati? V, Ghoshal G, Caldarelli G. Hypergraph topological quantities for tagged social networks. Phys. Rev. E, 2009, 80(3): 036118.

[8] Neubauer N, Obermayer K. Towards community detection in k-partite k-uniform hypergraphs. In Workshop on Analyzing Networks and Learning with Graphs, Whistler, BC, Canada, Dec. 11, 2009.

[9] Lu C, Chen X, Park E K. Exploit the tripartite network of social tagging for web clustering. In Proc. the 18th ACM Conference on Information and Knowledge Management, Hong Kong, China, Nov. 2-6, 2009, pp.1545-1548.

[10] Zhou T, Ren J, Medo M, Zhang Y C. Bipartite network projection and personal recommendation. Phys. Rev. E, 2007, 76(4): 046115.

[11] Barber M J. Modularity and community detection in bipartite network. Phys. Rev. E, 2007, 76(6): 066102.

[12] Murata T, Ikeya T. A new modularity for detecting one-tomany correspondence of communities in bipartite networks. Advances in Complex Systems, 2010, 13(1): 19-31.

[13] Suzuki K,Wakita K. Extracting multi-facet community structure from bipartite networks. In Proc. International Conference on Computational Science and Engineering, Vancouver, BC, Canada, Aug. 29-31, 2009, pp.312-319.

[14] Murata T. Detecting communities from tripartite networks. In Proc. the 19th International Conference on World Wide Web, Raleigh, USA, Apr. 26-30, 2010, pp.1159-1160.

[15] Murata T. Modularity for heterogeneous networks. In Proc. the 21st ACM Conference on Hypertext and Hypermedia, Toronto, Canada, Jun. 13-16, 2010, pp.129-134.

[16] Lin Y R, Sun J, Castro P, Konuru R, Sundaram H, Kelliher A. Metafac: Community discovery via relational hypergraph factorization. In Proc. the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, Jun. 28-Jul. 1, 2009, pp.527-535.

[17] Dhillon I S, Mallela S, Modha D S. Information-theoretic coclustering. In Proc. the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,Washington DC, USA, Aug. 24-27, 2003, pp.89-98.

[18] Li T. A general model for clustering binary data. In Proc. the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, USA, Aug. 21-24, 2005, pp.188-197.

[19] Banerjee A, Dhillon I, Ghosh J, Merugu S, Modha D S. A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. Journal of Machine Learning Research, 2007, 8: 1919-1986.

[20] Long B, Zhang Z, Yu P S. A probabilistic framework for relational clustering. In Proc. the 13th ACM International Conference on Knowledge Discovery and Data Mining, San Jose, USA, Aug. 12-15, 2007, pp.470-479.

[21] Newman M E J. Networks: An Introduction. New York: Oxford University Press, 2010.

[22] Rosvall M, Bergstrom C T. An information-theoretic framework for resolving community structure in complex networks. Proc. Natl. Acad. Sci. USA, 2007, 104(18): 7327-7331.

[23] Kernighan B, Lin S. An efficient heuristic procedure to partition graphs. Bell Syst. Tech. J., 1970, 49(2): 291-307.

[24] Scott J. Social Network Analysis: A Handbook. Second Edition, Sage Publications, Newberry Park, CA, 2000.

[25] Newman M E J, Girvan M. Finding and evaluating community structure in networks. Phys. Rev. E, 2004, 69(2): 026113.

[26] Newman M E J. Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA, 2006, 103(23): 8577- 8582.

[27] Fortunato S, Barthélemy M. Resolution limit in community detection. Proc. Natl. Acad. Sci. USA, 2007, 104(1): 36-41.

[28] Shen H, Cheng X, Fang B. Covariance, correlation matrix, and the multiscale community structure of networks. Phys. Rev. E, 2010, 82(1): 016114.

[29] Newman M E J. Fast algorithm for detecting community structure in networks. Phys. Rev. E, 2004, 69(6): 066133.

[30] Clauset A, Newman M E J, Moore C. Finding community structure in very large networks. Phys. Rev. E, 2004, 70(6): 066111.

[31] Duch L, Arenas A. Community detection in complex networks using extremal optimization. Phys. Rev. E, 2005, 72(2): 027104.

[32] Medus A, Acuna G, Dorso C O. Detection of community structures in networks via global optimization. Physica A, 2005, 358(2-4): 593-604.

[33] Newman M E J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E, 2006, 74(3): 036104.

[34] Schuetz P, Caflisch A. Efficient modularity optimization by multistep greedy algorithm and vertex refinement. Phys. Rev. E, 2008, 77(4): 046112.

[35] Blondel V D, Guillaume J L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J. Stat. Mech., 2008, 10: P10008.

[36] Zhang X S, Wang R S, Wang Y, Wang J, Qiu Y, Wang L, Chen L. Modularity optimization in community detection of complex networks. Europhys. Lett., 2009, 87(3): 38002.

[37] Liu X, Murata T. Advanced modularity-specialized label propagation algorithm for detecting communities in networks. Physica A, 2010, 389(7): 1493-1500.

[38] Gao B, Liu T Y, Zheng X, Cheng Q S, Ma W Y. Consistent bipartite graph co-partitioning for star-structured high-order heterogeneous data co-clustering. In Proc. the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, USA, Aug. 21-24, 2005, pp.41-50.

[39] Greco G, Guzzo A, Pontieri L. An information-theoretic framework for high-order co-clustering of heterogeneous objects. In Proc. the 15th Italian Symposium on Advanced Database Systems, Torre Canne, Italy, Jun. 17-20, 2007, pp.397-404.

[40] Long B, Zhang Z F, Wu X Y, Yu P S. Spectral clustering for multi-type relational data. In Proc. the 23rd International Conference on Machine Learning, Pittsburgh, USA, Jun. 25- 29, 2006, pp.585-592.

[41] Long B, Wu X, Zhang Z, Yu P S. Unsupervised learning on k-partite graphs. In Proc. the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, USA, Aug. 20-23, 2006, pp.317-326.

[42] Singh A P, Gordon G J. Relational learning via collective matrix factorization. In Proc. the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.650—658, Las Vegas, USA, Aug. 24-27, 2008.

[43] Singh A P, Gordon G J. A unified view of matrix factorization models. In Proc. the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Antwerp, Belgium, Sept. 15-19, 2008, pp.358-373.

[44] Cattuto C, Schmitz C, Baldassarri A, Servedio V D P, Loreto V, Hotho A, Grahl M, Stumme G. Network properties of folksonomies. AI Communications, 2007, 20(4): 245-262.

[45] Halpin H, Robu V, Shepherd H. The complex dynamics of collaborative tagging. In Proc. the 16th International Conference on World Wide Web, Banff, Canada, May 8-12, 2007, pp.211-220.

[46] Long B, Wu X, Zhang Z, Yu P S. Community learning by graph approximation. In Proc. the 7th IEEE International Conference on Data Mining, Omaha, USA, Oct. 28-31, 2007, pp.232-241.

[47] Long B, Zhang Z, Yu P S, Xu T. Clustering on complex graphs. In Proc. the 23rd National Conference on Artificial Intelligence, Chicago, USA, Jul. 13-17, 2008, pp.659-664.

[48] Rissanen J. Modelling by shortest data description. Automatica, 1978, 14(5): 465-471.

[49] Brandes U, Delling D, Gaertler M, GÄorke R, Hoefer M, Nikolski Z, Wagner D. On modularity—np-completeness and beyond. Technical Report 2006-19, ITI Wagner, Faculty of Informatics, UniversitÄat Karlsruhe, 2006.

[50] Raghavan U N, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E, 2007, 76(3): 036106.

[51] Barber M J, Clark J W. Detecting network communities by propagating labels under constraints. Phys. Rev. E, 2009, 80(2): 026129.

[52] Arenas A, Duch J, Fernández A, Gómez S. Size reduction of complex networks preserving modularity. New Journal of Physics, 2007, 9: 176.

[53] Davis A, Gardner B B, Gardner M R. Deep South. Chicago: University of Chicago Press, IL, 1941.

[54] Breiger R, Carley K, Pattison P (eds.) Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers, Washington, DC: The National Academics Press, USA, 2003.

[55] Jaccard P. Étude comparative de la distribution florale dans une portion des alpes et des jura. Bull. Soc. Vaudoise Sci. Nat., 1901, 37: 547-579.

[56] LÄu L, Zhou T. Link prediction in complex networks: A survey. Physica A, 2011, 390: 1150-1170.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Jose K- Raphel; Siu Cheung Hui; Angela Goh;. Class Based Contextual Logic for DOOD[J]. , 1996, 11(2): 161 -170 .
[2] 庄旗铭; 蒋定安; 杨庆;. A Comparative Analysis of Different Arbitration Protocols for Multiple-Bus Multiprocessors[J]. , 1996, 11(3): 313 -325 .
[3] 马宗民; Yan Li;. Using Multivalued Logic in Relational Database Containing Null Value[J]. , 1996, 11(4): 421 -426 .
[4] . 不相容多流体的动画模拟[J]. , 2007, 22(1): 156 -160 .
[5] . 一种新的公钥加密方案[J]. , 2007, 22(1): 95 -02 .
[6] Wei Hu (胡伟), Zhao Dong (董朝), and Guo-Dong Yuan (袁国栋). 基于边缘保持滤波的编辑传播[J]. , 2012, 27(4): 830 -840 .
[7] Bin Xiao, Yi-Fu Zhang, Yan-Ping Gao, Liang Yang, Dong-Mei Wu, and Bao-Xia Fan. 一种健壮和高能效的65nm SOC物理实现方法[J]. , 2013, 28(4): 682 -688 .
[8] Yu-Xin Ma, Jia-Yi Xu, Di-Chao Peng, Ting Zhang, Cheng-Zhe Jin, Hua-Min Qu, Wei C. 基于可视分析的多上下文移动社交网络社区发现[J]. , 2013, 28(5): 797 -809 .
[9] Zhi-Ping Jiang, Wei Xi, Xiangyang Li, Shaojie Tang, Ji-Zhong Zhao, Jin-Song Han,. 通信既众包:带有基于CSI测速功能的Wi-Fi室内定位系统[J]. , 2014, 29(4): 589 -604 .
[10] Chao Deng, Yi-Ci Cai, Qiang Zhou. 用于低功耗时钟树综合的寄存器结群方法[J]. , 2015, 30(2): 391 -403 .
版权所有 © 《计算机科学技术学报》编辑部
本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn
总访问量: