Journal of Computer Science and Technology ›› 2019, Vol. 34 ›› Issue (3): 690-706.doi: 10.1007/s11390-019-1936-6

Special Issue: Computer Architecture and Systems

• Computer Architecture and Systems • Previous Articles    

Cacheap: Portable and Collaborative I/O Optimization for Graph Processing

Peng Zhao1,2, Student Member, CCF, Chen Ding3, Member, ACM, IEEE, Lei Liu1,*, Member, CCF, Jiping Yu4, Wentao Han4, Member, CCF, ACM, IEEE, Xiao-Bing Feng1,2, Member, CCF, ACM, IEEE   

  1. 1 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences Beijing 100190, China;
    2 University of Chinese Academy of Sciences, Beijing 100049, China;
    3 Department of Computer Science, University of Rochester, Rochester 14623, U.S.A.;
    4 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Received:2018-05-09 Revised:2019-03-19 Online:2019-05-05 Published:2019-05-06
  • Contact: Lei Liu
  • About author:Peng Zhao received his B.S. degree in computer science and technology from Jilin University, Changchun, in 2013. Currently he is a Ph.D. candidate of Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), Beijing. His research interests include parallel and distributed programming models and graph processing systems.
  • Supported by:
    This work is supported by the National Key Research and Development Program of China under Grant No. 2017YFB1003103, the National Natural Science Foundation of China under Grant Nos. 61432018, 61432016, 61332009, and 61521092, the National Science Foundation of USA under Contract Nos. CCF-1717877 and CCF-1629376, and an IBM CAS Faculty Fellowship.

Increasingly there is a need to process graphs that are larger than the available memory on today's machines. Many systems have been developed with graph representations that are efficient and compact for out-of-core processing. A necessary task in these systems is memory management. This paper presents a system called Cacheap which automatically and efficiently manages the available memory to maximize the speed of graph processing, minimize the amount of disk access, and maximize the utilization of memory for graph data. It has a simple interface that can be easily adopted by existing graph engines. The paper describes the new system, uses it in recent graph engines, and demonstrates its integer factor improvements in the speed of large-scale graph processing.

Key words: out-of-core graph processing system; I/O optimization; memory cache; graph analytics; locality;

[1] Coffman T, Greenblatt S, Marcus S. Graph-based technologies for intelligence analysis. Communications of the ACM, 2004, 47(3):45-47.
[2] Han W, Miao Y, Li K, Wu M, Yang F, Zhou L, Prabhakaran V, Chen W, Chen E. Chronos:A graph engine for temporal graph analysis. In Proc. the 9th Eurosys Conference, April 2014, Article No. 1.
[3] Jeong H, Mason P S, Barabasi A L, Oltvai N Z. Lethality and centrality in protein networks. Nature, 2001, 411(6833):41-42.
[4] Xiang L, Yuan Q, Zhao S, Chen L, Zhang X, Yang Q, Sun J. Temporal recommendation on graphs via long- and shortterm preference fusion. In Proc. the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2010, pp.723-732.
[5] Cheng J, Liu Q, Li Z, Fan W, Lui C S J, He C. VENUS:Vertex-centric streamlined graph computation on a single PC. In Proc. the 31st IEEE International Conference on Data Engineering, April 2015, pp.1131-1142.
[6] Chi Y, Dai G, Wang Y, Sun G, Li G, Yang H. NXgraph:An efficient graph processing system on a single machine. In Proc. the 32nd IEEE International Conference on Data Engineering, May 2016, pp.409-420.
[7] Han W, Lee S, Park K, Lee J, Kim M, Kim J, Yu H. TurboGraph:A fast parallel graph engine handling billion-scale graphs in a single PC. In Proc. the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2013, pp.77-85.
[8] Maass S, Min C, Kashyap S, Kang W, Kumar M, Kim T. Mosaic:Processing a trillion-edge graph on a single machine. In Proc. the 12th European Conference on Computer Systems, April 2017, pp.527-543.
[9] Roy A, Mihailovic I, Zwaenepoel W. X-Stream:Edgecentric graph processing using streaming partitions. In Proc. the 24th ACM SIGOPS Symposium of Operating Systems Principles, November 2013, pp.472-488.
[10] Vora K, Xu G Q, Gupta R. Load the edges you need:A generic I/O optimization for disk-based graph processing.In Proc. the 2016 USENIX Annual Technical Conference, June 2016, pp.507-522.
[11] Zhang Y, Liao X, Jin H, Gu L, Tan G, Zhou B. HotGraph:Efficient asynchronous processing for real-world graphs. IEEE Transactions on Computers, 2017, 66(5):799-809.
[12] Zhu X, Han W, Chen W. GridGraph:Large-scale graph processing on a single machine using 2-level hierarchical partitioning. In Proc. the 2015 USENIX Annual Technical Conference, July 2015, pp.375-386.
[13] Kyrola A, Blelloch E G, Guestrin C. GraphChi:Large-scale graph computation on just a PC. In Proc. the 10th USENIX Symposium on Operating Systems Design and Implementation, October 2012, pp.31-46.
[14] Zheng D, Mhembere D, Burns C R, Vogelstein T J, Priebe E C, Szalay S A. FlashGraph:Processing billion-node graphs on an array of commodity SSDs. In Proc. the 13th USENIX Conference on File and Storage Technologies, February 2015, pp.45-58.
[15] Nguyen D, Lenharth A, Pingali K. A lightweight infrastructure for graph analytics. In Proc. the 24th ACM SIGOPS Symposium on Operating Systems Principles, November 2013, pp.456-471.
[16] Chhugani J, Satish N, Kim C, Sewall J, Dubey P. Fast and efficient graph traversal algorithm for CPUs:Maximizing single-node efficiency. In Proc. the 26th IEEE International Parallel and Distributed Processing Symposium, May 2012, pp.378-389.
[17] Gonzalez E J, Low Y, Gu H, Bickson D, Guestrin C. PowerGraph:Distributed graph-parallel computation on natural graphs. In Proc. the 10th USENIX Symposium on Operating Systems Design and Implementation, October 2012, pp.17-30.
[18] Gonzalez E J, Xin S R, Dace A, Crankshaw D, Franklin J M, Stoica I. GraphX:Graph processing in distributed dataflow framework. In Proc. the 11th USENIX Symposium on Operating Systems Design and Implementation, October 2014, pp.599-613.
[19] Liu H, Huang H H. Graphene:Fine-grained IO management for graph computing. In Proc. the 15th USENIX Conference on File and Storage Technologies, February 2017, pp.285-300.
[20] Malexicz G, Austern H M, Bik J C A, Dehnert C J, Horn I, Leiser N, Czajkowski G. Pregel:A system for largescale graph processing. In Proc. the ACM SIGMOD International Conference on Management of Data, June 2010, pp.135-146.
[21] Roy A, Bindschaedler L, Malicevic J, Zwaenepoel W. Chaos:Scale-out graph processing from secondary storage. In Proc. the 25th Symposium on Operating Systems Principles, October 2015, pp.410-424.
[22] Kwak H, Lee C, Park H, Moon B S. What is Twitter, a social network or a news media? In Proc. the 19th International Conference on World Wide Web, April 2010, pp.591-600.
[23] Boldi P, Vigna S. The WebGraph framework I:Compression techniques. In Proc. the 13th International World Wide Web Conference, May 2004, pp.595-602.
[24] Belady A L. A study of replacement algorithms for a virtualstorage computer. IBM Systems Journal, 1966, 5(2):78- 101.
[25] Mattson L R, Gecsei J, Slutz D, Traiger L I. Evaluation techniques for storage hierarchies. IBM Systems Journal, 1970, 9(2):78-117.
[26] Faloutsos M, Faloutsos P, Faloutsos C. On power-law relationships of the Internet topology. In Proc. the 1999 ACM SIGCOMM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, August 1999, pp.251-262.
[27] Wang K, Xu H G, Su Z, Liu D Y. GraphQ:Graph query processing with abstraction refinement:Scalable and programmable analytics over very large graphs on a single PC. In Proc. the 2015 USENIX Annual Technical Conference, July 2015, pp.387-401.
[28] Wu M, Yang F, Xue J, Xiao W, Miao Y, Wei L, Lin H, Dai Y, Zhou L. GraM:Scaling graph computation to the trillions. In Proc. the 6th ACM Symposium on Cloud Computing, August 2016, pp.408-421.
[1] Da-Wei Sun (孙大为), Student Member, CCF, ACM, Gui-Ran Chang (常桂然), Shang Gao (高尚), Li-Zhong Jin (靳立忠), and Xing-Wei Wang, (王兴伟), Senior Member, CCF, ACM. Modeling a Dynamic Data Replication Strategy to Increase System Availability in Cloud Computing Environments [J]. , 2012, (2): 256-272.
[2] Wei Mi, Xiao-Bing Feng, Member, CCF, ACM, Yao-Cang Jia, Li Chen, Member, CCF, ACM, and Jing-Ling Xue, Senior Member, IEEE. PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization [J]. , 2009, 24(6): 1086-1097.
[3] Yan Zhang, Zhi-Feng Chen, and Yuan-Yuan Zhou. Efficient Execution of Multiple Queries on Deep Memory Hierarchy [J]. , 2007, 22(2): 273-279 .
[4] Zhou Jianqiang; Xie Li; Dai Fei; Sun Zhongxiu;. Adaptive Memory Coherence Algorithms in DSVM [J]. , 1994, 9(4): 365-372.
Full text



[1] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] Chen Shihua;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] Tang Tonggao; Zhao Zhaokeng;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] Min Yinghua;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] Zhu Hong;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
[10] Li Minghui;. CAD System of Microprogrammed Digital Systems[J]. , 1987, 2(3): 226 -235 .

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
  Copyright ©2015 JCST, All Rights Reserved