Journal of Computer Science and Technology ›› 2019, Vol. 34 ›› Issue (6): 1185-1202.doi: 10.1007/s11390-019-1969-x

Special Issue: Data Management and Data Mining

• Data Management and Data Mining • Previous Articles     Next Articles

An Efficient Framework for Multiple Subgraph Pattern Matching Models

Jiu-Ru Gao1, Wei Chen1, Jia-Jie Xu1, Member, CCF, ACM, An Liu1, Member, CCF, ACM, Zhi-Xu Li1, Member, CCF, ACM, Hongzhi Yin2, Member, CCF, ACM, Lei Zhao1,*, Member, CCF, ACM   

  1. 1 School of Computer Science and Technology, Soochow University, Suzhou 215006, China;
    2 School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane 4072, Australia
  • Received:2018-12-14 Revised:2019-09-12 Online:2019-11-16 Published:2019-11-16
  • Contact: Lei Zhao E-mail:zhaol@suda.edu.cn
  • About author:Jiu-Ru Gao received her B.S. degree in computer science and technology from Changshu Institute of Technology, Changshu, in 2016. She is currently a M.S. candidate in Soochow University, Suzhou. Her research interests include data mining, data analysis and graph computing.
  • Supported by:
    This work is supported by the National Natural Science Foundation of China under Grant No. 61572335, and the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20151223.

With the popularity of storing large data graph in cloud, the emergence of subgraph pattern matching on a remote cloud has been inspired. Typically, subgraph pattern matching is defined in terms of subgraph isomorphism, which is an NP-complete problem and sometimes too strict to find useful matches in certain applications. And how to protect the privacy of data graphs in subgraph pattern matching without undermining matching results is an important concern. Thus, we propose a novel framework to achieve the privacy-preserving subgraph pattern matching in cloud. In order to protect the structural privacy in data graphs, we firstly develop a k-automorphism model based method. Additionally, we use a cost-model based label generalization method to protect label privacy in both data graphs and pattern graphs. During the generation of the k-automorphic graph, a large number of noise edges or vertices might be introduced to the original data graph. Thus, we use the outsourced graph, which is only a subset of a k-automorphic graph, to answer the subgraph pattern matching. The efficiency of the pattern matching process can be greatly improved in this way. Extensive experiments on real-world datasets demonstrate the high efficiency of our framework.

Key words: subgraph pattern matching; k-automorphism; label generalization;

[1] Lu H, Chang Y. Mining disease transmission networks from health insurance claims. In Proc. the 2017 International Conference on Smart Health, June 2017, pp.268-273.
[2] Ray B, Ghedin E, Chunara R. Network inference from multimodal data:A review of approaches from infectious disease transmission. Journal of Biomedical Informatics, 2016, 64:44-54.
[3] Balsa E, Pérez-Solà C, Díaz C. Towards inferring communication patterns in online social networks. ACM Trans. Internet Techn., 2017, 17(3):Article No. 32.
[4] Yin H, Zhou X, Cui B, Wang H, Zheng K, Hung N Q V. Adapting to user interest drift for POI recommendation. IEEE Trans. Knowl. Data Eng., 2016, 28(10):2566-2581.
[5] Yin H, Hu Z, Zhou X, Wang H, Zheng K, Hung N Q V, Sadiq S W. Discovering interpretable geo-social communities for user behavior prediction. In Proc. the 32nd IEEE International Conference on Data Engineering, May 2016, pp.942-953.
[6] Xie M, Yin H, Wang H, Xu F, Chen W, Wang S. Learning graph-based POI embedding for location-based recommendation. In Proc. the 25th ACM International Conference on Information and Knowledge Management, October 2016, pp.15-24.
[7] Yin H, Wang W, Wang H, Chen L, Zhou X. Spatial-aware hierarchical collaborative deep learning for POI recommendation. IEEE Trans. Knowl. Data Eng., 2017, 29(11):2537-2551.
[8] Aggarwal C C, Wang H. Managing and Mining Graph Data. Spring, 2010.
[9] Gallagher B. Matching structure and semantics:A survey on graph-based pattern matching. In Proc. the 2006 AAAI Fall Symposium on Capturing and Using Patterns for Evidence Detection, October 2006, pp.45-53.
[10] Henzinger M R, Henzinger T A, Kopke P W. Computing simulations on finite and infinite graphs. In Proc. the 36th Annual Symposium on Foundations of Computer Science, October 1995, pp.453-462.
[11] Fan W, Li J, Ma S, Tang N, Wu Y, Wu Y. Graph pattern matching:From intractable to polynomial time. Proceedings of the VLDB Endowment, 2010, 3(1):264-275.
[12] Brynielsson J, Högberg J, Kaati L, Mårtenson C, Svenson P. Detecting social positions using simulation. In Proc. the 2010 International Conference on Advances in Social Networks Analysis and Mining, August 2010, pp.48-55.
[13] Ma S, Cao Y, Fan W, Huai J, Wo T. Strong simulation:Capturing topology in graph pattern matching. ACM Trans. Database Syst., 2014, 39(1):Article No. 4.
[14] Fard A, Nisar M U, Ramaswamy L, Miller J A, Saltz M. A distributed vertex-centric approach for pattern matching in massive graphs. In Proc. the 2013 IEEE International Conference on Big Data, October 2013, pp.403-411.
[15] Fard A, Nisar M U, Miller J A, Ramaswamy L. Distributed and scalable graph pattern matching:Models and algorithms. International Journal of Big Data, 2014, 1(1):1-14.
[16] Fan W, Li J, Ma S, Tang N, Wu Y. Adding regular expressions to graph reachability and pattern queries. In Proc. the 27th International Conference on Data Engineering, April 2011, pp.39-50.
[17] Chang Z, Zou L, Li F. Privacy preserving subgraph matching on large graphs in cloud. In Proc. the 2016 International Conference on Management of Data, June 2016, pp.199-213.
[18] Zou L, Chen L, Özsu M T. K-automorphism:A general framework for privacy preserving network publication. Proceedings of the VLDB Endowment, 2009, 2(1):946-957.
[19] Tai C, Tseng P, Yu P S, Chen M. Identity protection in sequential releases of dynamic networks. IEEE Trans. Knowl. Data Eng., 2014, 26(3):635-651.
[20] Liu K, Terzi E. Towards identity anonymization on graphs. In Proc. the ACM SIGMOD International Conference on Management of Data, June 2008, pp.93-106.
[21] Zhou B, Pei J. Preserving privacy in social networks against neighborhood attacks. In Proc. the 24th International Conference on Data Engineering, April 2008, pp.506-515.
[22] Li J, Xiong J, Wang X. The structure and evolution of large cascades in online social networks. In Proc. the 4th International Conference on Computational Social Networks, August 2015, pp. 273-284.
[23] Hay M, Miklau G, Jensen D et al. Resisting structural reidentification in anonymized social networks. VLDB, 2010, 19(6):797-823.
[24] Cheng J, Fu A W, Liu J. K-isomorphism:Privacy preserving network publication against structural attacks. In Proc. the ACM SIGMOD International Conference on Management of Data, June 2010, pp. 459-470.
[25] Wu W, Xiao Y, Wang W et al. K-symmetry model for identity anonymization in social networks. In Proc. the 13th International Conference on Extending Database Technology, March 2010, pp.111-122.
[26] Liu K, Terzi E. Towards identity anonymization on graphs. In Proc. the ACM SIGMOD International Conference on Management of Data, June 2008, pp.93-106.
[27] Tai C H, Yu P S, Yang D H, Chen M S. Privacy preserving social network publication against friendship attacks. In Proc. the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2011, pp.1262-1270.
[28] Chen S, Zhou S. Recursive mechanism:Towards node differential privacy and unrestricted joins. In Proc. the ACM SIGMOD International Conference on Management of Data, June 2013, pp.653-664.
[29] Zhu T, Li G, Zhou W, Yu P S. Differentially private data publishing and analysis:A survey. IEEE Trans. Knowl. Data Eng., 2017, 29(8):1619-1638.
[30] Qian Q, Li Z, Zhao P et al. Publishing graph node strength histogram with edge differential privacy. In Proc. the 23rd International Conference Database Systems for Advanced Applications, May 2018, pp.75-91.
[31] Sala A, Zhao X, Wilson C et al. Sharing graphs using differentially private graph models. In Proc. the 11th ACM SIGCOMM Internet Measurement Conference, November 2011, pp.81-98.
[32] Chen R, Fung B C, Yu P S et al. Correlated network data publication via differential privacy. VLDB, 2014, 23(4):653-676.
[33] Zhang J, Cormode G, Procopiuc C M et al. Private release of graph statistics using ladder functions. In Proc. the 2015 ACM SIGMOD International Conference on Management of Data, May 2015, pp.731-745.
[34] Ullmann J R. An algorithm for subgraph isomorphism. J. ACM, 1976, 23(1):31-42.
[35] Yuan Y, Wang G, Chen L et al. Efficient subgraph similarity search on large probabilistic graph databases. VLDB, 2012, 5(9):800-811.
[36] Yuan Y, Wang G, Xu J Y et al. Efficient distributed subgraph similarity matching. VLDB, 2015, 24(3):369-394.
[37] Gao J, Xu J, Liu G et al. A privacy-preserving framework for subgraph pattern matching in cloud. In Proc. the 23rd International Conference on Database Systems for Advanced Applications, May 2018, pp.307-322.
[38] Karypis G, Kumar V. Analysis of multilevel graph partitioning. In Proc. the 1995 ACM/IEEE Conference on Supercomputing, December 1995.
[39] Karypis G, Kumar V. Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed computing, 1998, 48(1):96-129.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] Chen Shihua;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] Tang Tonggao; Zhao Zhaokeng;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] Min Yinghua;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] Zhu Hong;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
[10] Li Minghui;. CAD System of Microprogrammed Digital Systems[J]. , 1987, 2(3): 226 -235 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved