Using Memory in the Right Way to Accelerate Big Data Processing
Abstract: Big data processing accounts for a growing share of data center workloads. However, recent research has shown that big data workloads make poor use of modern memory systems. We find that this inefficiency stems from two sources: a high last-level cache miss rate and stalls caused by dependent memory accesses. In this paper, we introduce two optimizations to address these problems. The first is a slice-and-merge strategy, which substantially reduces the last-level cache miss rate of the sort procedure. The second is direct memory access, which redesigns the storage format of key/value pairs to eliminate dependent memory accesses. We evaluate both optimizations with micro-benchmarks and with HiBench, a suite of eight representative big data applications. The micro-benchmark results, measured through CPU hardware event counters, clearly demonstrate the effectiveness of the optimizations, and the HiBench results show an average application-level speedup of 1.21X. Together, these results illustrate that careful hardware/software co-design can significantly improve the memory efficiency of big data processing. Our work has already been integrated into the Intel Distribution for Apache Hadoop.
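The abstract gives no implementation details, but a minimal sketch may help illustrate the slice-and-merge idea: the input is split into slices small enough to fit in the last-level cache, each slice is sorted independently, and the sorted slices are then merged. The slice size constant and class name below are illustrative assumptions, not the paper's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/**
 * Illustrative sketch of a slice-and-merge sort: sort cache-sized slices
 * first, then merge them, so the random accesses of the sort phase stay
 * within the last-level cache. The slice size is a made-up placeholder.
 */
public class SliceAndMergeSort {

    // Assumed slice size (in elements), chosen so one slice fits in the LLC.
    private static final int SLICE_SIZE = 1 << 18;

    public static int[] sort(int[] input) {
        // Phase 1: sort each cache-sized slice independently.
        List<int[]> slices = new ArrayList<>();
        for (int start = 0; start < input.length; start += SLICE_SIZE) {
            int end = Math.min(start + SLICE_SIZE, input.length);
            int[] slice = Arrays.copyOfRange(input, start, end);
            Arrays.sort(slice);
            slices.add(slice);
        }

        // Phase 2: k-way merge of the sorted slices using a min-heap.
        // Heap entries are {value, sliceIndex, positionInSlice}.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < slices.size(); i++) {
            if (slices.get(i).length > 0) {
                heap.add(new int[]{slices.get(i)[0], i, 0});
            }
        }

        int[] output = new int[input.length];
        int out = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            output[out++] = top[0];
            int sliceIdx = top[1];
            int next = top[2] + 1;
            int[] slice = slices.get(sliceIdx);
            if (next < slice.length) {
                heap.add(new int[]{slice[next], sliceIdx, next});
            }
        }
        return output;
    }
}
```

Because each slice fits in cache during phase 1, and phase 2 streams through the slices sequentially, the working set never forces a flood of last-level cache misses.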
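Similarly, a minimal sketch of the direct-memory-access idea, assuming key/value pairs are stored in one flat byte buffer addressed by offset and length rather than as linked Java objects, shows how pointer-chasing (and thus dependent memory accesses) can be avoided. The class and method names here are hypothetical and only meant to illustrate the layout.

```java
import java.nio.ByteBuffer;

/**
 * Illustrative flat key/value layout: keys and values are appended into one
 * contiguous ByteBuffer and addressed by (offset, length), so reading a
 * record needs no chain of object-pointer dereferences. Names are hypothetical.
 */
public class FlatKeyValueBuffer {

    private final ByteBuffer data;
    // Parallel arrays of per-record metadata: offsets and lengths of key and value.
    private final int[] keyOffset, keyLen, valOffset, valLen;
    private int count = 0;

    public FlatKeyValueBuffer(int capacityBytes, int maxRecords) {
        data = ByteBuffer.allocateDirect(capacityBytes); // off-heap backing store
        keyOffset = new int[maxRecords];
        keyLen = new int[maxRecords];
        valOffset = new int[maxRecords];
        valLen = new int[maxRecords];
    }

    /** Appends a key/value pair; the bytes are copied into the flat buffer. */
    public void put(byte[] key, byte[] value) {
        keyOffset[count] = data.position();
        keyLen[count] = key.length;
        data.put(key);
        valOffset[count] = data.position();
        valLen[count] = value.length;
        data.put(value);
        count++;
    }

    /** Copies the i-th key out of the flat buffer. */
    public byte[] getKey(int i) {
        byte[] key = new byte[keyLen[i]];
        ByteBuffer view = data.duplicate();
        view.position(keyOffset[i]);
        view.get(key);
        return key;
    }

    public int size() {
        return count;
    }
}
```

Because the metadata lives in plain int arrays and the payload is contiguous, a scan over records issues independent, predictable loads instead of a chain of dependent pointer loads, which is the kind of stall the paper's direct-memory-access optimization targets.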