Journal of Computer Science and Technology ›› 2020, Vol. 35 ›› Issue (2): 320-337.doi: 10.1007/s11390-020-9957-8

• Special Section on Learning and Mining in Dynamic Environments • Previous Articles     Next Articles

Finding Communities by Decomposing and Embedding Heterogeneous Information Network

Yue Kou, Member, CCF, De-Rong Shen, Senior Member, CCF, Dong Li*, Tie-Zheng Nie, Member, CCF, Ge Yu, Senior Member, CCF, Member, ACM, IEEE        

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received:2019-08-20 Revised:2020-01-03 Online:2020-03-05 Published:2020-03-18
  • Contact: Dong Li E-mail:lidongmason@163.com
  • About author:Yue Kou is an associate professor at School of Computer Science and Engineering, Northeastern University, Shenyang. She received her Ph.D. degree in computer software and theory from Northeastern University, Shenyang, in 2009. She is a member of CCF. Her main research interests include Web data management and social networks analysis.
  • Supported by:
    The work was supported by the National Key Research and Development Program of China under Grant No. 2018YFB1003404 and the National Natural Science Foundation of China under Grant Nos. 61672142, U1435216 and 61602103.

Community discovery is an important task in social network analysis. However, most existing methods for community discovery rely on the topological structure alone. These methods ignore the rich information available in the content data. In order to solve this issue, in this paper, we present a community discovery method based on heterogeneous information network decomposition and embedding. Unlike traditional methods, our method takes into account topology, node content and edge content, which can supply abundant evidence for community discovery. First, an embedding-based similarity evaluation method is proposed, which decomposes the heterogeneous information network into several subnetworks, and extracts their potential deep representation to evaluate the similarities between nodes. Second, a bottom-up community discovery algorithm is proposed. Via leader nodes selection, initial community generation, and community expansion, communities can be found more efficiently. Third, some incremental maintenance strategies for the changes of networks are proposed. We conduct experimental studies based on three real-world social networks. Experiments demonstrate the effectiveness and the efficiency of our proposed method. Compared with the traditional methods, our method improves normalized mutual information (NMI) and the modularity by an average of 12% and 37% respectively.

Key words: community discovery; heterogeneous information network; decomposition; embedding; incremental maintenance;

[1] Nowicki K, Snijders T A B. Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 2001, 96(455):1077-1087.
[2] Airoldi E M, Blei D M, Fienberg S E, Xing E P, Jaakkola T. Mixed membership stochastic block models for relational data with application to protein-protein interactions. In Proc. the International Biometrics Society Annual Meeting, July 2006.
[3] Hofman J M, Wiggins C H. A Bayesian approach to network modularity. Phy. Rev. Letters, 2008, 100(25):Article No. 258701.
[4] Ren W, Yan G, Liao X, Xiao L. Simple probabilistic algorithm for detecting community structure. Phys. Rev. E Stat. Nonlin Soft Matter Phys., 2009, 79(3):Article No. 036111.
[5] Zhang Z, Cui P, Pei J, Wang X, Zhu W. TIMERS:Errorbounded SVD restart on dynamic networks. In Proc. the 32nd AAAI Conference on Artificial Intelligence, February 2018, pp.224-231.
[6] Yang Z, Hao T, Dikmen O, Chen X, Oja E. Clustering by nonnegative matrix factorization using graph random walk. In Proc. the 26th International Conference on Neural Information Processing Systems, December 2012, pp.1088-1096.
[7] Qiao S, Han N, Gao Y, Li R, Huang J, Guo J, Gutierrez L, Wu X. A fast parallel community discovery model on complex networks through approximate optimization. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(9):1638-1651.
[8] Nguyen N P, Dinh T N, Xuan Y, Thai M. Adaptive algorithms for detecting community structure in dynamic social networks. In Proc. the 30th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, April 2011, pp.2282-2290.
[9] Hou B, Wang Z, Chen Q, Suo B, Fang C, Li Z, lves Z. Efficient maximal clique enumeration over graph data. Data Science and Engineering, 2016, 1(4):219-230.
[10] Rossetti G, Cazabet R. Community discovery in dynamic networks:A survey. ACM Computing Surveys, 2018, 51(2):Article No. 35.
[11] Ruan Y, Fuhry D, Parthasarathy S. Efficient community detection in large networks using content and links. In Proc. the 22nd International Conference on World Wide Web, May 2013, pp.1089-1098.
[12] Yang J, McAuley J, Leskovec J. Community detection in networks with node attributes. In Proc. the 13th IEEE International Conference on Data Mining, December 2013, pp.1151-1156.
[13] Tian F, Gao B, Cui Q, Chen E, Liu T. Learning deep representations for graph clustering. In Proc. the 28th AAAI Conference on Artificial Intelligence, July 2014, pp.1293-1299.
[14] Wang X, Jin D, Cao X, Yang L, Zhang W. Semantic community identification in large attribute networks. In Proc. the 30th AAAI Conference on Artificial Intelligence, February 2016, pp.265-271.
[15] He D, Feng Z, Jin D, Wang X, Zhang W. Joint identification of network communities and semantics via integrative modeling of network topologies and node contents. In Proc. the 31st AAAI Conference on Artificial Intelligence, February 2017, pp.116-124.
[16] Pei Y, Chakraborty N, Sycara K. Nonnegative matrix tri-factorization with graph regularization for community detection in social networks. In Proc. the 24th International Joint Conference on Artificial Intelligence, July 2015, pp.2083-2089.
[17] Zhang G, Jin D, Gao J, Jiao P, Fogelman-Soulié F F, Huang X. Finding communities with hierarchical semantics by distinguishing general and specialized topic. In Proc. the 27th International Joint Conference on Artificial Intelligence, July 2018, pp.3648-3654.
[1] Xiao-Wei Feng, Xiang-Yu Kong, Chuan He, and Dong-Hui Xu. On the Discrete-Time Dynamics of Cross-Coupled Hebbian Algorithm [J]. Journal of Computer Science and Technology, 2022, 37(1): 252-265.
[2] Dan-Hao Zhu, Xin-Yu Dai, Jia-Jun Chen. Pre-Train and Learn: Preserving Global Information for Graph Neural Networks [J]. Journal of Computer Science and Technology, 2021, 36(6): 1420-1430.
[3] Chao Kong, Bao-Xiang Chen, Li-Ping Zhang. DEM: Deep Entity Matching Across Heterogeneous Information Networks [J]. Journal of Computer Science and Technology, 2020, 35(4): 739-750.
[4] Chun-Yang Ruan, Ye Wang, Jiangang Ma, Yanchun Zhang, Xin-Tian Chen. Adversarial Heterogeneous Network Embedding with Metapath Attention Mechanism [J]. Journal of Computer Science and Technology, 2019, 34(6): 1217-1229.
[5] Chun-Yang Ling, Yan-Zhen Zou, Ze-Qi Lin, Bing Xie. Graph Embedding Based API Graph Search and Recommendation [J]. Journal of Computer Science and Technology, 2019, 34(5): 993-1006.
[6] Da-Wei Cheng, Yi Tu, Zhen-Wei Ma, Zhi-Bin Niu, Li-Qing Zhang. BHONEM: Binary High-Order Network Embedding Methods for Networked-Guarantee Loans [J]. Journal of Computer Science and Technology, 2019, 34(3): 657-669.
[7] Wei-Bei Fan, Jian-Xi Fan, Cheng-Kuan Lin, Yan Wang, Yue-Juan Han, Ru-Chuan Wang. Optimally Embedding 3-Ary n-Cubes into Grids [J]. Journal of Computer Science and Technology, 2019, 34(2): 372-387.
[8] Hanen Ameur, Salma Jamoussi, Abdelmajid Ben Hamadou. A New Method for Sentiment Analysis Using Contextual Auto-Encoders [J]. Journal of Computer Science and Technology, 2018, 33(6): 1307-1319.
[9] Lei Guo, Yu-Fei Wen, Xin-Hua Wang. Exploiting Pre-Trained Network Embeddings for Recommendations in Social Networks [J]. , 2018, 33(4): 682-696.
[10] Guang-Hao Ma, Ming-Li Zhang, Xue-Mei Li, Cai-Ming Zhang. Image Smoothing Based on Image Decomposition and Sparse High Frequency Gradient [J]. , 2018, 33(3): 502-510.
[11] Ji-Zhao Zhu, Yan-Tao Jia, Jun Xu, Jian-Zhong Qiao, Xue-Qi Cheng. Modeling the Correlations of Relations for Knowledge Graph Embedding [J]. , 2018, 33(2): 323-334.
[12] Cui-Cui Zhang, Zhi-Lei Liu. Prior-Free Dependent Motion Segmentation Using Helmholtz-Hodge Decomposition Based Object-Motion Oriented Map [J]. , 2017, 32(3): 520-535.
[13] Dong-Fang Zhou, Jian-Xi Fan, Cheng-Kuan Lin, Bao-Lei Cheng, Jing-Ya Zhou, Zhao Liu. Optimal Path Embedding in the Exchanged Crossed Cube [J]. , 2017, 32(3): 618-629.
[14] Chun Liao, Chong Feng, Sen Yang, He-Yan Huang. A Hybrid Method of Domain Lexicon Construction for Opinion Targets Extraction Using Syntax and Semantics [J]. , 2016, 31(3): 595-603.
[15] Fei Tian, Bin Gao, En-Hong Chen, Tie-Yan Liu. Learning Better Word Embedding by Asymmetric Low-Rank Projection of Knowledge Graph [J]. , 2016, 31(3): 624-634.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Zhou Di;. A Recovery Technique for Distributed Communicating Process Systems[J]. , 1986, 1(2): 34 -43 .
[2] Chen Shihua;. On the Structure of Finite Automata of Which M Is an(Weak)Inverse with Delay τ[J]. , 1986, 1(2): 54 -59 .
[3] Li Wanxue;. Almost Optimal Dynamic 2-3 Trees[J]. , 1986, 1(2): 60 -71 .
[4] C.Y.Chung; H.R.Hwa;. A Chinese Information Processing System[J]. , 1986, 1(2): 15 -24 .
[5] Sun Zhongxiu; Shang Lujun;. DMODULA:A Distributed Programming Language[J]. , 1986, 1(2): 25 -31 .
[6] Pan Qijing;. A Routing Algorithm with Candidate Shortest Path[J]. , 1986, 1(3): 33 -52 .
[7] Zhang Cui; Zhao Qinping; Xu Jiafu;. Kernel Language KLND[J]. , 1986, 1(3): 65 -79 .
[8] Wang Jianchao; Wei Daozheng;. An Effective Test Generation Algorithm for Combinational Circuits[J]. , 1986, 1(4): 1 -16 .
[9] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[10] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved