›› 2015, Vol. 30 ›› Issue (4): 799-809.doi: 10.1007/s11390-015-1561-y

Special Issue: Data Management and Data Mining

• Special Section on Data Management and Data Mining • Previous Articles     Next Articles

Zip: An Algorithm Based on Loser Tree for Common Contacts Searching in Large Graphs

Hong Tang1(唐宏), Shuai Mu2(牟帅), Jin Huang3(黄晋), Member, CCF, Jia Zhu3*(朱佳), Member, CCF, Jian Chen2(陈健), Member, CCF, ACM, IEEE, Rui Ding3(丁蕊), Member, CCF, ACM   

  1. 1. School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510006, China;
    2. School of Software Engineering, South China University of Technology, Guangzhou 510006, China;
    3. School of Computer Science, South China Normal University, Guangzhou 510631, China
  • Received:2015-02-01 Revised:2015-05-24 Online:2015-07-05 Published:2015-07-05
  • Contact: Jia Zhu is currently an associate professor in the School of Computer Science at South China Normal University, Guangzhou,after finishing his postdoctoral research at United Nations University, Tokyo. E-mail:jzhu@m.scnu.edu.cn
  • About author:Hong Tang is a Ph.D. candidate in Sun Yat-sen University, Guangzhou. He received his B.S. and M.S. degrees from Huazhong University of Science and Technology, Wuhan, in 1997, and Jinan University, Guangzhou, in 2006, respectively. His research interests include cloud computing and large scale data mining.
  • Supported by:

    This work was supported by the Youth Teacher Startup Fund of South China Normal University of China under Grant No. 14KJ18 and the National High Technology Research and Development 863 Program of China under Grant No. 2013AA01A212.

The problem of k-hop reachability between two vertices in a graph has received considerable attention in recent years. A substantial number of algorithms have been proposed with the goal of improving the searching efficiency of the k-hop reachability between two vertices in a graph. However, searching and traversing are challenging tasks, especially in large-scale graphs. Furthermore, the existing algorithms propounded by different scholars are not satisfactory in terms of feasibility and scalability when applied to different kinds of graphs. In this work we propose a new algorithm, called Zip, in an attempt to efficiently determine the common contacts between any two random vertices in a large-scale graph. First, we describe a novel algorithm for constructing the graph index via binary searching which maintains the adjacent list of each vertex in order. Second, we present the ways to achieve a sequential k-hop contact set by using the loser tree, a merge sorting algorithm. Finally, we develop an efficient algorithm for querying common contacts and an optimized strategy for k-hop contact set serialization. Experimental results on synthetic and real datasets show that the proposed Zip algorithm outperforms existing state-of-the-art algorithms (e.g., breadth-first Searching, GRAIL, Graph Stratification algorithm).

[1] Cheng J, Shang Z, Cheng H et al. Efficient processing of khop reachability queries. The VLDB Journal, 2014, 23(2): 227-252.

[2] Cheng J, Shang Z, Cheng H et al. K-Reach: Who is in your small world. Proceedings of the VLDB Endowment, 2012, 5(11): 1292-1303.

[3] Yildirim H, Chaoji V, Zaki M J. GRAIL: Scalable reachability index for large graphs. Proceedings of the VLDB Endowment, 2010, 3(1/2): 276-284.

[4] Yildirim H, Zaki M J. Graph indexing for reachability queries. In Proc. the 29th International Conference on Data Engineering Workshops (ICDEW), Mar. 2010, pp.321-324.

[5] Fu A W C,Wu H, Cheng J, Chu S, Wong R. IS-LABEL: An independent-set based labeling scheme for point-to-point distance querying on large graphs. Proceedings of the VLDB Endowment, 2013, 6(6): 457-468.

[6] Akiba T, Iwata Y, Yoshida Y. Fast exact shortestpath distance queries on large networks by pruned landmark labeling. arXiv: 1304.4661, 2013. http://arxiv.org/- abs/1304.4661, May 2015.

[7] Wei F. TEDI: Efficient shortest path query answering on graphs. ACM SIGMOD International Conference on Management of Data, Jun. 2010, pp.99-110.

[8] Hasanzadeh F, Naghibzadeh M. GRU: Efficient reachability answering for large graphs using united interval labeling. In Proc. the 3rd International Advance Computing Conference, Feb. 2013, pp.1011-1017.

[9] Jin R, Ruan N, You B, Wang H. Hub-accelerator: Fast and exact shortest path computation in large social networks. arXiv: 1305.0507, 2013. http://arxiv.org/abs/1305.0507, May 2015.

[10] Beamer S, Asanovic K, Patterson D. Direction-optimizing breadth-first search. In Proc. International SC Conference for High Performance Computing, Networking, Storage and Analysis, Nov. 2012, Article No. 12.

[11] Jagadish H V. A compression technique to materialize transitive closure. ACM Transactions on Database Systems, 1990, 15(4): 558-598.

[12] Zhao X, Sala A, Zheng H et al. Efficient shortest paths on massive social graphs. In Proc. the 7th Int. Conf. Applications and Worksharing, Oct. 2011, pp.77-86.

[13] Chen Y, Chen Y. An efficient algorithm for answering graph reachability queries. In Proc. the 24th Int. Conf. Data Engineering, Apr. 2008, pp.893-902.

[14] Knuth D E. Optimum binary search trees. Acta Informatica, 1971, 1(1): 14-25.

[15] Nowak R. Generalized binary search. In Proc. the 46th IEEE Annual Allerton Conference on Communication, Control, and Computing, Sept. 2008, pp.568-574.

[16] Caselles V, Meinhardt E, Monasse P. Constructing the tree of shapes of an image by fusion of the trees of connected components of upper and lower level sets. Positivity, 2008, 12(1): 55-73.

[17] Jin R, Xiang Y, Ruan N et al. 3-HOP: A high-compression indexing scheme for reachability query. In Proc. SIGMOD Conference, June 29-July 2, 2009, pp.813-836.

[18] Jin R, Xiang Y, Ruan N et al. Efficiently answering reachability queries on very large directed graphs. In Proc. ACM SIGMOD, Jun. 2008, pp.595-608.

[19] Agrawal R, Borgida A, Jagadish H V. Efficient management of transitive relationships in large data and knowledge bases. In Proc. ACM Symp. Management of Data, May 31- June 2, 1989, pp.253-262.

[20] Cohen E, Halperin E, Kaplan H et al. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 2002, 32(5): 1338-1355.

[21] Shang H, Kitsuregawa M. Efficient breadth-first search on large graphs with skewed degree distributions. In Proc. the 16th EDBT, Mar. 2013, pp.311-322.

[22] Xiao Y, Wu W, Pei J, Wang W, He Z. Efficiently indexing shortest paths by exploiting symmetry in graphs. In Proc. the 12th EDBT, Mar. 2009, pp.493-504.

[23] Potamias M, Bonchi F, Castillo C et al. Fast shortest path distance estimation in large networks. In Proc. the 18th ACM CIKM, Nov. 2009, pp.867-876.
No related articles found!
Full text



[1] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] Chen Shihua;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] Tang Tonggao; Zhao Zhaokeng;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] Min Yinghua;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] Zhu Hong;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
[10] Li Minghui;. CAD System of Microprogrammed Digital Systems[J]. , 1987, 2(3): 226 -235 .

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved