Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (1): 140-157.doi: 10.1007/s11390-020-9871-0

Special Issue: Computer Architecture and Systems

• Computer Architecture and Systems • Previous Articles     Next Articles

Revisiting Persistent Indexing Structures on Intel Optane DC Persistent Memory

Heng Bu, Ming-Kai Dong, Ji-Fei Yi, Bin-Yu Zang, Distinguished Member, CCF, Member, ACM, IEEE, and Hai-Bo Chen*, Distinguished Member, CCF, ACM        

  1. Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received:2019-07-21 Revised:2020-02-15 Online:2021-01-05 Published:2021-01-23
  • Contact: Hai-Bo Chen E-mail:haibocheng@sjtu.edu.cn
  • About author:Heng Bu received his B.S. degree in software engineering from Nanjing University, Nanjing, in 2017. He is a Master student of the Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University, Shanghai. His research interests include non-volatile memory, file systems and concurrent data structures.
  • Supported by:
    This work is supported in part by the National Key Research and Development Program under Grant No. 2016YFB1000104, the National Natural Science Foundation of China under Grant No. 61672345, and the HighTech Support Program from Shanghai Committee of Science and Technology under Grant No. 19511121100.

Persistent indexing structures are proposed in response to emerging non-volatile memory (NVM) to provide high performance yet durable indexes. However, due to the lack of real NVM hardware, many prior persistent indexing structures were evaluated via emulation, which varies a lot across different setups and differs from the real deployment. Recently, Intel has released its Optane DC Persistent Memory Module (PMM), which is the first production-ready NVM. In this paper, we revisit popular persistent indexing structures on PMM and conduct comprehensive evaluations to study the performance differences among persistent indexing structures, including persistent hash tables and persistent trees. According to the evaluation results, we find that Cacheline-Conscious Extendible Hashing (CCEH) achieves the best performance among all evaluated persistent hash tables, and Failure-Atomic ShifT B+-Tree (FAST) and Write Optimal Radix Tree (WORT) perform better than other trees. Besides, we find that the insertion performance of hash tables is heavily influenced by data locality, while the insertion latency of trees is dominated by the flush instructions. We also uncover that no existing emulation methods accurately simulate PMM for all the studied data structures. Finally, we provide three suggestions on how to fully utilize PMM for better performance, including using clflushopt/clwb with sfence instead of clflush, flushing continuous data in a batch, and avoiding data access immediately after it is flushed to PMM.

Key words: persistent memory; data structure; performance; Intel Optane DC persistent memory module (PMM);

[1] Kannan S, Bhat N, Gavrilovska A, Arpaci-Dusseau A, Arpaci-Dusseau R. Redesigning LSMs for nonvolatile memory with NoveLSM. In Proc. the 2018 USENIX Annual Technical Conference, July 2018, pp.993-1005.
[2] Liu S, Wei Y, Zhao J, Kolli A, Khan S. PMTest:A fast and flexible testing framework for persistent memory programs. In Proc. the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 2019, pp.411-425. DOI:10.1145/3297858.3304015.
[3] Xu J, Kim J, Memaripour A, Swanson S. Finding and fixing performance pathologies in persistent memory software stacks. In Proc. the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 2019, pp.427-439. DOI:10.1145/3297858.3304077.
[4] Xia F, Jiang D J, Xiong J, Sun N H. A survey of phase change memory systems. Journal of Computer Science and Technology, 2015, 30(1):121-144. DOI:10.1007/s11390- 015-1509-2.
[5] Zuo P, Hua Y. A write-friendly hashing scheme for nonvolatile memory systems. In Proc. the 33rd International Conference on Massive Storage Systems and Technology, May 2017.
[6] Zuo P, Hua Y, Wu J. Write-optimized and highperformance hashing index scheme for persistent memory. In Proc. the 13th USENIX Symposium on Operating Systems Design and Implementation, Oct. 2018, pp.461-476.
[7] Nam M, Cha H, Choi Y R, Noh S H, Nam B. Write-optimized dynamic hashing for persistent memory. In Proc. the 17th USENIX Conference on File and Storage Technologies, Feb. 2019, pp.31-44. DOI:10.5555/3323298.3323302.
[8] Yang J, Wei Q, Chen C, Wang C, Yong K L, He B. NV-Tree:Reducing consistency cost for NVM-based single level systems. In Proc. the 13th USENIX Conference on File and Storage Technologies, Feb. 2015, pp.167-181. DOI:10.5555/2750482.2750495.
[9] Chen S Jin Q. Persistent B+-trees in non-volatile main memory. Proceedings of the VLDB Endowment, 2015, 8(7):786-797. DOI:10.14778/2752939.2752947.
[10] Lee S K, Lim K H, Song H, Nam B, Noh S H. WORT:Write optimal radix tree for persistent memory storage systems. In Proc. the 15th USENIX Conference on File and Storage Technologies, Feb. 2017, pp.257-270.
[11] Hwang D, Kim W H, Won Y, Nam B. Endurable transient inconsistency in byte-addressable persistent B+-tree. In Proc. the 16th USENIX Conference on File and Storage Technologies, Feb. 2018, pp.187-200.
[12] Burr G W, Breitwisch M J, Franceschini M et al. Phase change memory technology. Journal of Vacuum Science & Technology B, 2010, 28(2):223-262. DOI:10.1116/1.3301579.
[13] Apalkov D, Khvalkovskiy A, Watts S et al. Spintransfer torque magnetic random access memory (STTMRAM). ACM Journal on Emerging Technologies in Computing Systems, 2013, 9(2):Article No. 13. DOI:10.1145/2463585.2463589.
[14] Yang J J, Williams R S. Memristive devices in computing system:Promises and challenges. ACM Journal on Emerging Technologies in Computing Systems, 2013, 9(2):Article No. 11. DOI:10.1145/2463585.2463587.
[15] Zuo P, Hua Y, Zhao M, Zhou W, Guo Y. Improving the performance and endurance of encrypted non-volatile main memory through deduplicating writes. In Proc. the 51st Annual IEEE/ACM International Symposium on Microarchitecture, Oct. 2018, pp.442-454. DOI:10.1109/MICRO.2018.00043.
[16] Rudoff A. Persistent memory programming. Login, 2017, 42(2):34-40.
[17] Xu J, Swanson S. NOVA:A log-structured file system for hybrid volatile/non-volatile main memories. In Proc. the 14th USENIX Conference on File and Storage Technologies, Feb. 2016, pp.323-338.
[18] Rao D S, Kumar S, Keshavamurthy A, Lantz P, Reddy D, Sankaran R, Jackson J. System software for persistent memory. In Proc. the 9th European Conference on Computer Systems, Apr. 2014, Article No. 15. DOI:10.1145/2592798.2592814.
[19] Dong M, Chen H. Soft updates made simple and fast on non-volatile memory. In Proc. the 2017 USENIX Annual Technical Conference, Jul. 2017, pp.719-731.
[20] Kwon Y, Fingler H, Hunt T, Peter S, Witchel E, Anderson T. Strata:A cross media file system. In Proc. the 26th Symposium on Operating Systems Principles, Oct. 2017, pp.460-477. DOI:10.1145/3132747.3132770.
[21] Volos H, Nalli S, Panneerselvam S, Varadarajan V, Saxena P, Swift M M. Aerie:Flexible file-system interfaces to storage-class memory. In Proc. the 9th European Conference on Computer Systems, Apr. 2014, Article No. 14. DOI:10.1145/2592798.2592810.
[22] Dong M, Bu H, Yi J, Dong B, Chen H. Performance and protection in the ZoFS user-space NVM file system. In Proc. the 27th ACM Symposium on Operating Systems Principles, Oct. 2019, pp.478-493. DOI:10.1145/3341301.3359637.
[23] Bhandari K, Chakrabarti D R, Boehm H J. Makalu:Fast recoverable allocation of non-volatile memory. ACM SIGPLAN Notices, 2016, 51(10):677-694. DOI:10.1145/3022671.2984019.
[24] Flajolet P, Poblete P, Viola A. On the analysis of linear probing hashing. Algorithmica, 1998, 22(4):490-515. DOI:10.1007/PL00009236.
[25] Pagh R, Rodler F F. Cuckoo hashing. Journal of Algorithms, 2004, 51(2):122-144. DOI:10.1016/j.jalgor.20- 03.12.002.
[26] Ellis C S. Extendible hashing for concurrent operations and distributed data. In Proc. the 2nd ACM SIGACTSIGMOD Symposium on Principles of Database Systems, March 1983, pp.106-116. DOI:10.1145/588058.588072.
[27] Oukid I, Lasperas J, Nica A, Willhalm T, Lehner W. FPTree:A hybrid SCM-DRAM persistent and concurrent Btree for storage class memory. In Proc. the 2016 International Conference on Management of Data, June 2016, pp.371-386. DOI:10.1145/2882903.2915251.
[28] Sha E H M, Jiang W, Dong H, Ma Z, Zhang R, Chen X, Zhuge Q. Towards the design of efficient and consistent index structure with minimal write activities for non-volatile memory. IEEE Transactions on Computers, 2017, 67(3):432-448. DOI:10.1109/TC.2017.2754381.
[29] Volos H, Magalhaes G, Cherkasova L, Li J. Quartz:A lightweight performance emulator for persistent memory software. In Proc. the 16th Annual Middleware Conference, Nov. 2015, pp.37-49. DOI:10.1145/2814576.2814806.
[30] Leis V, Kemper A, Neumann T. The adaptive radix tree:ARTful indexing for main-memory databases. In Proc. the 29th International Conference on Data Engineering, April 2013, pp.38-49. DOI:10.1109/ICDE.2013.6544812.
[31] Evans J. A scalable concurrent malloc (3) implementation for FreeBSD. In Proc. the 2006 BSDCan Conference, May 2006.
[32] Viswanathan K. Intel corporation. Intel memory latency checker v3.7. https://software.intel.com/en-us/articles/intelr-memory-latency-checker, Feb. 2020.
[33] Izraelevitz J, Yang J, Zhang L et al. Basic performance measurements of the Intel Optane DC persistent memory module. arXiv:1903.05714, 2019. https://arxiv.org/pdf/1903.05714v3.pdf, Oct. 2020.
[1] Tian-Ni Xu, Hai-Feng Sun, Di Zhang, Xiao-Ming Zhou, Xiu-Feng Sui, Sa Wang, Qun Huang, and Yun-Gang Bao. NfvInsight: A Framework for Automatically Deploying and Benchmarking VNF Chains [J]. Journal of Computer Science and Technology, 2022, 37(3): 680-698.
[2] Zhou Zhang, Pei-Quan Jin, Xiao-Liang Wang, Yan-Qi Lv, Shou-Hong Wan, Xi-Ke Xie. COLIN: A Cache-Conscious Dynamic Learned Index with High Read/Write Performance [J]. Journal of Computer Science and Technology, 2021, 36(4): 721-740.
[3] Songjie Niu, Shimin Chen. TransGPerf: Exploiting Transfer Learning for Modeling Distributed Graph Computation Performance [J]. Journal of Computer Science and Technology, 2021, 36(4): 778-791.
[4] Hai-Kun Liu, Di Chen, Hai Jin, Xiao-Fei Liao, Binsheng He, Kan Hu, Yu Zhang. A Survey of Non-Volatile Main Memory Technologies: State-of-the-Arts, Practices, and Future Directions [J]. Journal of Computer Science and Technology, 2021, 36(1): 4-32.
[5] Jian-Bin Fang, Xiang-Ke Liao, Chun Huang, De-Zun Dong. Performance Evaluation of Memory-Centric ARMv8 Many-Core Architectures: A Case Study with Phytium 2000+ [J]. Journal of Computer Science and Technology, 2021, 36(1): 33-43.
[6] Jason Liu, Pedro Espina, Xian-He Sun. A Study on Modeling and Optimization of Memory Systems [J]. Journal of Computer Science and Technology, 2021, 36(1): 71-89.
[7] Lan Huang, Da-Lin Li, Kang-Ping Wang, Teng Gao, Adriano Tavares. A Survey on Performance Optimization of High-Level Synthesis Tools [J]. Journal of Computer Science and Technology, 2020, 35(3): 697-720.
[8] Hong-Mei Wei, Jian Gao, Peng Qing, Kang Yu, Yan-Fei Fang, Ming-Lu Li. MPI-RCDD: A Framework for MPI Runtime Communication Deadlock Detection [J]. Journal of Computer Science and Technology, 2020, 35(2): 395-411.
[9] Wen-Yan Chen, Ke-Jiang Ye, Cheng-Zhi Lu, Dong-Dai Zhou, Cheng-Zhong Xu. Interference Analysis of Co-Located Container Workloads: A Perspective from Hardware Performance Counters [J]. Journal of Computer Science and Technology, 2020, 35(2): 412-417.
[10] Zheng-Hao Jin, Haiyang Shi, Ying-Xin Hu, Li Zha, Xiaoyi Lu. CirroData: Yet Another SQL-on-Hadoop Data Analytics Engine with High Performance [J]. Journal of Computer Science and Technology, 2020, 35(1): 194-208.
[11] André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef. Ad Hoc File Systems for High-Performance Computing [J]. Journal of Computer Science and Technology, 2020, 35(1): 4-26.
[12] Yu-Tong Lu, Peng Cheng, Zhi-Guang Chen. Design and Implementation of the Tianhe-2 Data Storage and Management System [J]. Journal of Computer Science and Technology, 2020, 35(1): 27-46.
[13] Qi Chen, Kang Chen, Zuo-Ning Chen, Wei Xue, Xu Ji, Bin Yang. Lessons Learned from Optimizing the Sunway Storage System for Higher Application I/O Performance [J]. Journal of Computer Science and Technology, 2020, 35(1): 47-60.
[14] Marc-André Vef, Nafiseh Moti, Tim Süß, Markus Tacke, Tommaso Tocci, Ramon Nou, Alberto Miranda, Toni Cortes, André Brinkmann. GekkoFS—A Temporary Burst Buffer File System for HPC Applications [J]. Journal of Computer Science and Technology, 2020, 35(1): 72-91.
[15] Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Mochi: Composing Data Services for High-Performance Computing Environments [J]. Journal of Computer Science and Technology, 2020, 35(1): 121-144.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Zhou Di;. A Recovery Technique for Distributed Communicating Process Systems[J]. , 1986, 1(2): 34 -43 .
[2] Chen Shihua;. On the Structure of Finite Automata of Which M Is an(Weak)Inverse with Delay τ[J]. , 1986, 1(2): 54 -59 .
[3] Sun Zhongxiu; Shang Lujun;. DMODULA:A Distributed Programming Language[J]. , 1986, 1(2): 25 -31 .
[4] Wang Jianchao; Wei Daozheng;. An Effective Test Generation Algorithm for Combinational Circuits[J]. , 1986, 1(4): 1 -16 .
[5] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[6] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[7] Zheng Guoliang; Li Hui;. The Design and Implementation of the Syntax-Directed Editor Generator(SEG)[J]. , 1986, 1(4): 39 -48 .
[8] Shen Li; Stephen Y.H.Su;. Generalized Parallel Signature Analyzers with External Exclusive-OR Gates[J]. , 1986, 1(4): 49 -61 .
[9] Huang Xuedong; Cai Lianhong; Fang Ditang; Chi Bianjin; Zhou Li; Jiang Li;. A Computer System for Chinese Character Speech Input[J]. , 1986, 1(4): 75 -83 .
[10] Xu Xiaoshu;. Simplification of Multivalued Sequential SULM Network by Using Cascade Decomposition[J]. , 1986, 1(4): 84 -95 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved