Journal of Computer Science and Technology, 2021, Vol. 36, Issue (1): 110-122. doi: 10.1007/s11390-020-0776-8

Special Issue: Computer Architecture and Systems

• Special Section on Memory-Centric System Research for High-Performance Computing •

Usage Scenarios for Byte-Addressable Persistent Memory in High-Performance and Data Intensive Computing

Michèle Weiland¹ and Bernhard Homölle²

  ¹ EPCC, The University of Edinburgh, Edinburgh EH8 9BT, U.K.
  ² System Vertrieb Alexander GmbH, Paderborn 33106, Germany
  • Received: 2020-07-03; Revised: 2020-11-09; Online: 2021-01-05; Published: 2021-01-23
  • About the author: Michèle Weiland is a Senior Research Fellow at EPCC, the supercomputing centre at the University of Edinburgh, Edinburgh. She specializes in novel technologies for extreme-scale parallel computing and leads EPCC's technical work in the ASiMoV Strategic Prosperity Partnership with Rolls-Royce. She was responsible for managing the technical work of the EU H2020 project NEXTGenIO. She is the EPCC principal investigator on a number of research grants, including the EC Horizon 2020 projects HPC-WE and SAGE2.
  • Supported by:
    The NEXTGenIO (Next Generation I/O for the Exascale) project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 671951.

Byte-addressable persistent memory (B-APM) presents a new opportunity to bridge the performance gap between main memory and storage. In this paper, we present usage scenarios for this new technology, based on the capabilities of Intel's DCPMM. We outline some of the basic performance characteristics of DCPMM, and explain how it can be configured and used to address the needs of memory and I/O intensive applications in the HPC (high-performance computing) and data intensive domains. Two decision trees are presented to advise on the configuration options for B-APM; their use is illustrated with two examples. We show that the flexibility of the technology has the potential to be truly disruptive, not only because of the performance improvements it can deliver, but also because it allows systems to cater for a wider range of applications on homogeneous hardware.

Key words: byte-addressable; data intensive; memory intensive; persistent;

