
Dong Yan, Xu-Sen Yin, Cheng Lian, Xiang Zhong, Xin Zhou, Gan-Sha Wu. Using Memory in the Right Way to Accelerate Big Data Processing[J]. Journal of Computer Science and Technology, 2015, 30(1): 30-41. DOI: 10.1007/s11390-015-1502-9

Using Memory in the Right Way to Accelerate Big Data Processing

More Information
  • Author Bio:

    Dong Yan is a Ph.D. candidate at Tsinghua University. He received his B.S. degree in computer science and technology from Wuhan University in 2009. His research interests include operating system kernels, debugging, and big data.

  • Received Date: July 13, 2014
  • Revised Date: December 15, 2014
  • Published Date: January 04, 2015
  • Big data processing is becoming a prominent part of data center computation. However, recent research has indicated that big data workloads cannot make full use of modern memory systems. We find that the dramatic inefficiency of big data processing stems from the enormous number of cache misses and from stalls caused by dependent memory accesses. In this paper, we introduce two optimizations to tackle these problems. The first is the slice-and-merge strategy, which reduces the cache miss rate of the sort procedure. The second is direct-memory-access, which reworks the data structure used in key/value storage. These optimizations are evaluated with both micro-benchmarks and the real-world benchmark HiBench. The micro-benchmark results clearly demonstrate the effectiveness of our optimizations in terms of hardware event counts, and the HiBench results show a 1.21X average speedup at the application level. Both results illustrate that careful hardware/software co-design improves the memory efficiency of big data processing. Our work has already been integrated into the Intel Distribution for Apache Hadoop.
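
The abstract names the slice-and-merge strategy without showing it. As a rough illustration only, the general idea of sorting small slices independently and then merging the sorted runs might be sketched as follows. This is not the authors' implementation: the function name `slice_and_merge_sort`, the `Record` type, and the `slice_size` parameter are all hypothetical, and in a real system the slice size would be tuned so that one slice fits in a CPU cache level.

```python
import heapq
from typing import List, Tuple

# Hypothetical record layout: a (key, value) pair of byte strings.
Record = Tuple[bytes, bytes]


def slice_and_merge_sort(records: List[Record], slice_size: int = 4096) -> List[Record]:
    """Sort records by key: sort fixed-size slices, then k-way merge them.

    slice_size is an assumed tuning knob; the intent is that one slice is
    small enough to stay cache-resident while it is being sorted.
    """
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")

    # Step 1: sort each slice on its own, so the sort's random accesses
    # stay within a small working set.
    slices = [
        sorted(records[i:i + slice_size], key=lambda r: r[0])
        for i in range(0, len(records), slice_size)
    ]

    # Step 2: k-way merge of the sorted slices; each slice is read
    # sequentially from front to back.
    return list(heapq.merge(*slices, key=lambda r: r[0]))
```

In Python the cache effect itself is not observable in any meaningful way; the sketch only shows the control flow. A native implementation would measure the benefit with hardware event counters, as the abstract describes.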
