计算机科学技术学报 ›› 2019,Vol. 34 ›› Issue (1): 170-184.doi: 10.1007/s11390-019-1905-0

所属专题: Artificial Intelligence and Pattern Recognition Data Management and Data Mining

• Data Management and Data Mining • 上一篇    下一篇

邀请谁来参加:社交网络中规模受限的k核问题研究

Yu-Liang Ma1, Member, CCF, Ye Yuan1, Member, CCF, ACM, IEEE, Fei-Da Zhu2, Member, ACM, IEEE Guo-Ren Wang3, Member, CCF, ACM, IEEE, Jing Xiao4, and Jian-Zong Wang4, Member, CCF   

  1. 1 School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China;
    2 School of Information Systems, Singapore Management University, Singapore 188065, Singapore;
    3 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China;
    4 Ping An Technology (Shenzhen) Co., Ltd, Shenzhen 518048, China
  • 收稿日期:2017-11-28 修回日期:2018-09-20 出版日期:2019-01-05 发布日期:2019-01-12
  • 作者简介:Yu-Liang Ma received his B.S. degree in computer science from the College of Computer Science and Engineering, Northeastern University, Shenyang, in 2013. Currently, he is a Ph.D. candidate of Northeastern University, Shenyang. His main research interests include graph databases, location-based social networks, and social network analysis.
  • 基金资助:
    This research is partially supported by the National Research Foundation, Prime Ministers Office, Singapore, under its International Research Centres in Singapore Funding Initiative and Pinnacle Lab for Analytics at Singapore Management University, the National Natural Science Foundation of China under Grant Nos. 61572119, 61622202, 61732003, 61729201, 61702086, and U1401256, and the Fundamental Research Funds for the Central Universities of China under Grant No. N150402005.

Who Should Be Invited to My Party: A Size-Constrained k-Core Problem in Social Networks

Yu-Liang Ma1, Member, CCF, Ye Yuan1, Member, CCF, ACM, IEEE, Fei-Da Zhu2, Member, ACM, IEEE Guo-Ren Wang3, Member, CCF, ACM, IEEE, Jing Xiao4, and Jian-Zong Wang4, Member, CCF   

  1. 1 School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China;
    2 School of Information Systems, Singapore Management University, Singapore 188065, Singapore;
    3 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China;
    4 Ping An Technology (Shenzhen) Co., Ltd, Shenzhen 518048, China
  • Received:2017-11-28 Revised:2018-09-20 Online:2019-01-05 Published:2019-01-12
  • About author:Yu-Liang Ma received his B.S. degree in computer science from the College of Computer Science and Engineering, Northeastern University, Shenyang, in 2013. Currently, he is a Ph.D. candidate of Northeastern University, Shenyang. His main research interests include graph databases, location-based social networks, and social network analysis.
  • Supported by:
    This research is partially supported by the National Research Foundation, Prime Ministers Office, Singapore, under its International Research Centres in Singapore Funding Initiative and Pinnacle Lab for Analytics at Singapore Management University, the National Natural Science Foundation of China under Grant Nos. 61572119, 61622202, 61732003, 61729201, 61702086, and U1401256, and the Fundamental Research Funds for the Central Universities of China under Grant No. N150402005.

通过结合社交网络中用户的亲密度和网络拓扑结构,本文提出一种规模受限的k-core群组查询(size constrained k-core group query,SCCGQ)。具体地,SCCGQ旨在查找一个规模大小为h的用户集合,使得该用户集合具有最高的用户亲密度,同时其诱导子图能构成k-core结构。SCCGQ问题可以广泛应用于事件规划、任务调度以及社会分析等其它领域。传统的k-core问题主要找出给定社交网络中的k-core结构,而SCCGQ问题不仅考虑k-core结构,同时还考虑用户群组的规模。传统的k-core问题具有多项式时间的解决算法,而SCCGQ问题具有指数级的复杂度。为了解决SCCGQ问题,本文提出一种辐散算法(blast scatter,BS)。BS以查询节点为中心进行向外扩展搜索。在每一次向外扩展中,BS采用贪心策略进行中心节点的更新,然后一次选取多个邻居节点进行扩展。为了进一步的加速查询处理,本文提出一种有界限制的搜索算法(Bounded Extension,BE)。BE采用一种有效的社会距离剪枝策略和紧密的亲密度上界来降低搜索空间。此外,为了加速查询处理,本文提出一种离线的社交感知索引。最后,本文在真实的社交网络数据集上进行大量的实验来表明本文所提出算法的性能。

关键词: 群组查询, k-核, 社会分析, 社交网络

Abstract: In this paper, we investigate the problem of a size-constrained k-core group query (SCCGQ) in social networks, taking both user closeness and network topology into consideration. More specifically, SCCGQ intends to find a group of h users that has the highest social closeness while being a k-core. SCCGQ can be widely applied to event planning, task assignment, social analysis, and many other fields. In contrast to existing work on the k-core detection problem, which aims to find a k-core in a social network, SCCGQ not only focuses on k-core detection but also takes size constraints into consideration. Although the conventional k-core detection problem can be solved in linear time, SCCGQ has a higher complexity. To solve the problem of SCCGQ, we propose a Blast Scatter (BS) algorithm, which appoints the query node as the center to begin outward expansions via breadth search. In each outward expansion, BS finds a new center through a greedy strategy and then selects multiple neighbors of the center. To speed up the BS algorithm, we propose an advanced search algorithm, called Bounded Extension (BE). Specifically, BE combines an effective social distance pruning strategy and a tight upper bound of social closeness to prune the search space considerably. In addition, we propose an offline social-aware index to accelerate the query processing. Finally, our experimental results demonstrate the efficiency and effectiveness of our proposed algorithms on large real-world social networks.

Key words: group query, k-core, social analysis, social network

[1] Allen J. Event Planning:The Ultimate Guide to Successful Meetings, Corporate Events, Fundraising Galas, Conferences, Conventions, Incentives and Other Special Events (2nd edition). Wiley, 2008.
[2] She J, Tong Y, Chen L, Cao C C. Conflict-aware eventparticipant arrangement and its variant for online setting. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(9):2281-2295.
[3] She J, Tong Y, Chen L, Song T. Feedback-aware social event-participant arrangement. In Proc. ACM SIGMOD International Conference on Management of Data, May 2017, pp.851-865.
[4] Tong Y, She J, Meng R. Bottleneck-aware arrangement over event-based social networks:The max-min approach. World Wide Web, 2016, 19(6):1151-1177.
[5] Sinnen O, Sousa L A. Communication contention in task scheduling. IEEE Transactions on Parallel and Distributed Systems, 2005, 16(6):503-515.
[6] Tong Y, Wang L, Zhou Z, Ding B, Chen L, Ye J, Xu K. Flexible online task assignment in real-time spatial data. Proceedings of the VLDB Endowment 2017, 10(11):1334-1345.
[7] Tong Y, She J, Ding B, Wang L, Chen L. Online mobile micro-task allocation in spatial crowdsourcing. In Proc. the 32nd International Conference on Data Engineering, May 2016, pp.49-60.
[8] She J, Tong Y, Chen L. Utility-aware social eventparticipant planning. In Proc. ACM SIGMOD International Conference on Management of Data, May 2015, pp.1629-1643.
[9] Tong Y, Chen L, Zhou Z, Jagadish H V, Shou L, Lv W. SLADE:A smart large-scale task decomposer in crowdsourcing. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(8):1588-1601.
[10] Scott J. Social Network Analysis (3rd edition). Sage, 2012.
[11] Knoke D, Yang S. Social Network Analysis (2nd edition). Sage Publishers, 2007
[12] Downey R G, Fellows M R. Fixed parameter tractability and completeness Ⅱ:On completeness for W[1]. Theoretical Computer Science, 1995, 141(1/2):109-131.
[13] Tan P N, Steinbach M, Kumar V. Introduction to Data Mining (1st edition). Pearson India, 2006.
[14] Guha S, Rastogi R, Shim K. ROCK:A robust clustering algorithm for categorical attributes. Information Systems, 2001, 25(5):345-366.
[15] Spitzer F. Principles of Random Walk (2nd edition). Springer, 2001
[16] Seidman S B. Network structure and minimum degree. Social Networks, 1983, 5(3):269-287.
[17] Khaouid W, Barsky M, Srinivasan V, Thomo A. K-core decomposition of large networks on a single PC. Proceedings of the VLDB Endowment, 2015, 9(1):13-23.
[18] West D B. Introduction to Graph Theory (2nd edition). Pearson, 2000.
[19] Chakrabarti D, Zhan Y, Faloutsos C. RMAT:A recursive model for graph mining. In Proc. the 4th SIAM International Conference on Data Mining, April 2004, pp.442-446.
[20] Yuan Y, Lian X, Chen L, Yu J X, Wang G, Sun Y. Keyword search over distributed graphs with compressed signature. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(6):1212-1225.
[21] Hanneman R A, Riddle M. Introduction to Social Network Methods. University of California, 2005.
[22] Luce R D, Perry A D. A method of matrix analysis of group structure. Psychometrika, 1949, 14(2):95-116.
[23] Bron C. Algorithm 457:Finding all cliques of an undirected graph. Communications of the ACM, 1973, 16(9):575-576.
[24] Cheng J, Ke Y, Fu A W C, Yu J X, Zhu L. Finding maximal cliques in massive networks by h*-graph. In Proc. ACM SIGMOD International Conference on Management of Data, June 2010, pp.447-458.
[25] Luce R D. Connectivity and generalized cliques in sociometric group structure. Psychometrika, 1950, 15(2):169-190.
[26] Seidman S B, Foster B L. A graph-theoretic generalization of the clique concept*. Journal of Mathematical Sociology, 1978, 6(1):139-154.
[27] Dorogovtsev S, Goltsev A V, Mendes J F. K-core organization of complex networks. Physical Review Letters, 2006, 96(4):040601.
[28] Alvarez-Hamelin J I, Dall'Asta L, Barrat A, Vespignani A. k-core decomposition:A tool for the visualization of large scale networks. arXiv:0504107, 2005. https://arxiv.org/abs/cs/0504107, May, 2018.
[29] Zhang H, Zhao H, Cai W, Liu J, Zhou W. Using the k-core decomposition to analyze the static structure of large-scale software systems. The Journal of Supercomputing, 2010, 53(2):352-369.
[30] Saríyüce A E, Gedik B, Jacques-Silva G, Wu K L, Catalyürek Ü V. Streaming algorithms for k-core decomposition. Proceedings of the VLDB Endowment, 2013, 6(6):433-444.
[31] Cheng J, Ke Y, Chu S, Özsu M T. Efficient core decompo-sition in massive networks. In Proc. the 27th International Conference on Data Engineering, April 2011, pp.51-62.
[32] Wen D, Qin L, Zhang Y, Lin X. I/O efficient core graph decomposition at web scale. In Proc. the 32nd International Conference on Data Engineering, May 2016, pp.133-144.
[33] Zhang F, Zhang Y, Qin L, Zhang W, Lin X. When engagement meets similarity:Efficient (k, r)-core computation on social networks. Proceedings of the VLDB Endowment, 2017, 10(10):998-1009.
[34] Cohen J. Graph twiddling in a MapReduce world. Computing in Science & Engineering, 2009, 11(4):29-41.
[35] Wang J, Cheng J. Truss decomposition in massive networks. Proceedings of the VLDB Endowment, 2012, 5(9):812-823.
[36] Huang X, Cheng H, Qin L, Tian W, Yu J X. Querying k-truss community in large and dynamic graphs. In Proc. ACM SIGMOD International Conference on Management of Data, June 2014, pp.1311-1322.
[37] Chen P L, Chou C K, Chen M S. Distributed algorithms for k-truss decomposition. In Proc. the 2014 IEEE International Conference on Big Data, October 2015, pp.471-480.
[38] Deng K, Sadiq S, Zhou X, Xu H, Fung G P C, Lu Y. On group nearest group query processing. IEEE Transactions on Knowledge and Data Engineering, 2012, 24(2):295-308.
[39] Li Y, Chen R, Xu J, Huang Q, Hu H, Choi B. Geo-social k-cover group queries for collaborative spatial computing. IEEE Transactions on Knowledge and Data Engineering, 2015, 27(10):2729-2742.
[40] Yang D N, Chen Y L, Lee W C, Chen M S. On socialtemporal group query with acquaintance constraint. Proceedings of the VLDB Endowment, 2011, 4(6):397-408.
[41] Yang D N, Shen C Y, Lee W C, Chen M S. On sociospatial group query for location-based social networks. In Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2012, pp.949-957.
[1] Ming Chen, Wen-Zhong Li, Lin Qian, Sang-Lu Lu, Dao-Xu Chen. 基于循环神经网络位置兴趣挖掘的下一跳兴趣点推荐[J]. 计算机科学技术学报, 2020, 35(3): 603-616.
[2] Jing-Ya Zhou, Jian-Xi Fan, Cheng-Kuan Lin, Bao-Lei Cheng. 在线社交网络中一种经济有效的用户数据存储方法[J]. 计算机科学技术学报, 2019, 34(1): 234-252.
[3] Tie-Yun Qian, Bei Liu, Liang Hong, Zhen-Ni You. 基于位置社交网络中时间地点感知的POI推荐[J]. 计算机科学技术学报, 2018, 33(6): 1219-1230.
[4] Peng-Peng Zhao, Hai-Feng Zhu, Yanchi Liu, Zi-Ting Zhou, Zhi-Xu Li, Jia-Jie Xu. 一种地理社交组推荐的生成模型方法[J]. , 2018, 33(4): 727-738.
[5] Mehdi Azaouzi, Lotfi Ben Romdhane. 一种有效使用社交行为计算社交网络中影响力节点的两相模型[J]. , 2018, 33(2): 286-304.
[6] Rong Wang, Tian-Lei Hu, Gang Chen. 本地达人会去何处?一种基于用户位置-话题相似度的Top-N地点推荐方法[J]. , 2018, 33(1): 190-206.
[7] Jing Jiang, Zi-Fei Shan, Xiao Wang, Li Zhang, Ya-Fei Dai. 真实环境下虚假团体的测量研究[J]. , 2015, 30(6): 1344-1357.
[8] Huai-Yu Wan, Zhi-Wei Wang You-Fang Lin, Xu-Guang Jia, Yuan-Wei Zhou. 旅客社交网络中的家庭团体发现研究[J]. , 2015, 30(5): 1141-1153.
[9] Hui Li, Jiang-Tao Cui, Jian-Feng Ma. 一个三层次的在线网络社交影响研究综述[J]. , 2015, 30(1): 184-199.
[10] Mingxuan Yuan, Lei Chen, Philip S. YU, Hong Mei. 比空白保护得更多:社交网络中用户敏感信息的反学习[J]. , 2014, 29(5): 762-776.
[11] Rafael Messias Martins, Gabriel Faria Andery, Henry Heberle, Fernando Vieira Pau. 基于多维投影的社交网络可视化分析[J]. , 2012, 27(4): 791-810.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 沈一栋;. A Fixpoint Semantics for Stratified Databases[J]. , 1993, 8(2): 12 -21 .
[2] 吴信东;. Inductive Learning[J]. , 1993, 8(2): 22 -36 .
[3] 马军; 马绍汉;. Efficient Parallel Algorithms for Some Graph Theory Problems[J]. , 1993, 8(4): 76 -80 .
[4] 马军; 马绍汉;. An O(k~2n~2) Algorithm to Find a k-Partition in a k-Connected Graph[J]. , 1994, 9(1): 86 -91 .
[5] 金国华; 陈福接. On the Problem of Optimizing Parallel Programs for Complex Memory Hierarchies[J]. , 1994, 9(1): 1 -26 .
[6] 向东; 魏道政;. GLOBAL: A Design for Random Testability Algorithm[J]. , 1994, 9(2): 182 -192 .
[7] 周建强; 谢立; 戴非; 孙钟秀;. Adaptive Memory Coherence Algorithms in DSVM[J]. , 1994, 9(4): 365 -372 .
[8] 马小虎; 潘志庚; 张福炎;. The Automatic Generation of Chinese Outline Font Based on Stroke Extraction[J]. , 1995, 10(1): 42 -52 .
[9] 张松懋;. Weak Precedence Story Parsing Grammar[J]. , 1995, 10(1): 53 -64 .
[10] 唐志敏; 夏培肃;. A Maximum Time Difference Pipelined Arithmetic Unit Based on CMOS Gate Array[J]. , 1995, 10(2): 97 -103 .
版权所有 © 《计算机科学技术学报》编辑部
本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn
总访问量: