|
计算机科学技术学报 ›› 2021,Vol. 36 ›› Issue (1): 140-157.doi: 10.1007/s11390-020-9871-0
所属专题: Computer Architecture and Systems
Heng Bu, Ming-Kai Dong, Ji-Fei Yi, Bin-Yu Zang, Distinguished Member, CCF, Member, ACM, IEEE, and Hai-Bo Chen*, Distinguished Member, CCF, ACM
Heng Bu, Ming-Kai Dong, Ji-Fei Yi, Bin-Yu Zang, Distinguished Member, CCF, Member, ACM, IEEE, and Hai-Bo Chen*, Distinguished Member, CCF, ACM
非易失性内存因其支持字节寻址访问、高性能、低延迟等特性,正在学术界获得着广泛的关注。由于高性能且非易失性,非易失性内存可以被用来替代或加速固态硬盘和机械硬盘等设备持久地保存数据。作为快速查找数据的关键,非易失性内存上的存储持久索引结构成为了比较热门的研究方向之一。
在非易失性内存出现之前,哈希表、基数树是在内存中比较常用的索引结构,B/B+树则为块存储设备而设计,常用于保存持久化数据。由于非易失性内存同时具备内存可字节寻址特性和存储的持久化能力,哈希表、基数树和B/B+树均被重新设计和实现为非易失性内存上的索引结构(即持久索引结构)。但由于当时非易失性内存商品尚未发布,这些对持久索引结构的研究均使用模拟的方法进行性能测试。模拟方法大多基于现有的内存硬件,通过增加额外的延迟来模拟非易失性内存略低的读写性能,包括在CLFLUSH指令后插入延迟来模拟写延迟、根据CACHE MISS数量估算读的次数从而插入设定的延迟来模拟读延迟、以及基于NUMA访问特性来增大读写延迟等。
2019年,英特尔发布了傲腾持久化内存模块(Intel® OptaneTM Persistent Memory Module)。真实的非易失性内存产品出现后,此前研究中的持久索引结构在真实硬件上的性能到底如何,成为了一个很重要的问题。因此,我们对先前研究中的持久索引结构在傲腾持久内存上进行重新测试,并尝试回答以下问题:
1. 现有研究中的持久索引结构在实际傲腾持久内存硬件上性能表现如何?
2. 此前研究中提出和使用的模拟方式与实际硬件的性能是否有差距?差距有多大?
3. 持久索引结构的性能在非易失内存上是否以及如何被进一步提升?
为了回答上述问题,我们首先对傲腾持久化内存进行了测试,来揭示其基础的性能及特性。接着我们在傲腾持久内存上对持久索引结构重新进行了性能测试,分别展示了其插入和搜索的性能,并对测试结果进行了详细的分析。我们还使用现有的模拟方法对这些持久索引结构进行了测试,并将测试结果与实际傲腾持久内存上的测试结果进行比对,验证它们的各种模拟方法的准确性。最后,我们对持久索引结构中的刷回指令进行了比对测试,展示了它们对索引结构性能的影响,并总结了三个傲腾持久内存的使用和优化建议。
根据以上测试结果,我们获得了以下发现:
1. 傲腾持久化内存的读写粒度为256B;
2. 在所测试的常见持久哈希表中,CCEH表现最优,相比于调用的CLFLUSH指令的数量,数据局部性对性能的影响更大,数据局部性越差的哈希表性能越差;相比于硬件性能,哈希表本身采取的锁机制对其扩展性更具有决定性意义;
3. 树形持久索引结构中,FAST和WORT取得了相近的插入和查询延迟,插入延迟与插入过程中调用的CLFLUSH指令近似成正比;
4. 所有的模拟方式都无法针对所有的索引结构提供统一的配置同时正确地模拟出非易失内存。
并总结出了如下三个性能优化建议:
1. 尽量使用CLFLUSHOPT/CLWB + SFENCE替代CLFLUSH;
2. 尽量成批刷回连续的非易失内存区域;
3. 避免在数据刚被刷回之后访问数据;
本文是第一个在实际非易失内存上对索引结构进行性能测试与分析的工作。我们基于傲腾持久内存对不同的持久索引结构进行了性能测试,并给出了不同类型的持久索引结构中表现最佳的索引。同时,我们还根据测试结果分析了各个索引结构设计与实现的优劣。另一方面,我们验证了现有的非易失性内存模拟方式的准确性,使得读者能更好地理解以往基于这些模拟方式的测试结果。最后,我们提出了三个傲腾持久内存的使用和优化建议,来提高非易失性内存系统的性能。
[1] Kannan S, Bhat N, Gavrilovska A, Arpaci-Dusseau A, Arpaci-Dusseau R. Redesigning LSMs for nonvolatile memory with NoveLSM. In Proc. the 2018 USENIX Annual Technical Conference, July 2018, pp.993-1005. [2] Liu S, Wei Y, Zhao J, Kolli A, Khan S. PMTest:A fast and flexible testing framework for persistent memory programs. In Proc. the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 2019, pp.411-425. DOI:10.1145/3297858.3304015. [3] Xu J, Kim J, Memaripour A, Swanson S. Finding and fixing performance pathologies in persistent memory software stacks. In Proc. the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 2019, pp.427-439. DOI:10.1145/3297858.3304077. [4] Xia F, Jiang D J, Xiong J, Sun N H. A survey of phase change memory systems. Journal of Computer Science and Technology, 2015, 30(1):121-144. DOI:10.1007/s11390- 015-1509-2. [5] Zuo P, Hua Y. A write-friendly hashing scheme for nonvolatile memory systems. In Proc. the 33rd International Conference on Massive Storage Systems and Technology, May 2017. [6] Zuo P, Hua Y, Wu J. Write-optimized and highperformance hashing index scheme for persistent memory. In Proc. the 13th USENIX Symposium on Operating Systems Design and Implementation, Oct. 2018, pp.461-476. [7] Nam M, Cha H, Choi Y R, Noh S H, Nam B. Write-optimized dynamic hashing for persistent memory. In Proc. the 17th USENIX Conference on File and Storage Technologies, Feb. 2019, pp.31-44. DOI:10.5555/3323298.3323302. [8] Yang J, Wei Q, Chen C, Wang C, Yong K L, He B. NV-Tree:Reducing consistency cost for NVM-based single level systems. In Proc. the 13th USENIX Conference on File and Storage Technologies, Feb. 2015, pp.167-181. DOI:10.5555/2750482.2750495. [9] Chen S Jin Q. Persistent B+-trees in non-volatile main memory. Proceedings of the VLDB Endowment, 2015, 8(7):786-797. DOI:10.14778/2752939.2752947. [10] Lee S K, Lim K H, Song H, Nam B, Noh S H. WORT:Write optimal radix tree for persistent memory storage systems. In Proc. the 15th USENIX Conference on File and Storage Technologies, Feb. 2017, pp.257-270. [11] Hwang D, Kim W H, Won Y, Nam B. Endurable transient inconsistency in byte-addressable persistent B+-tree. In Proc. the 16th USENIX Conference on File and Storage Technologies, Feb. 2018, pp.187-200. [12] Burr G W, Breitwisch M J, Franceschini M et al. Phase change memory technology. Journal of Vacuum Science & Technology B, 2010, 28(2):223-262. DOI:10.1116/1.3301579. [13] Apalkov D, Khvalkovskiy A, Watts S et al. Spintransfer torque magnetic random access memory (STTMRAM). ACM Journal on Emerging Technologies in Computing Systems, 2013, 9(2):Article No. 13. DOI:10.1145/2463585.2463589. [14] Yang J J, Williams R S. Memristive devices in computing system:Promises and challenges. ACM Journal on Emerging Technologies in Computing Systems, 2013, 9(2):Article No. 11. DOI:10.1145/2463585.2463587. [15] Zuo P, Hua Y, Zhao M, Zhou W, Guo Y. Improving the performance and endurance of encrypted non-volatile main memory through deduplicating writes. In Proc. the 51st Annual IEEE/ACM International Symposium on Microarchitecture, Oct. 2018, pp.442-454. DOI:10.1109/MICRO.2018.00043. [16] Rudoff A. Persistent memory programming. Login, 2017, 42(2):34-40. [17] Xu J, Swanson S. NOVA:A log-structured file system for hybrid volatile/non-volatile main memories. In Proc. the 14th USENIX Conference on File and Storage Technologies, Feb. 2016, pp.323-338. [18] Rao D S, Kumar S, Keshavamurthy A, Lantz P, Reddy D, Sankaran R, Jackson J. System software for persistent memory. In Proc. the 9th European Conference on Computer Systems, Apr. 2014, Article No. 15. DOI:10.1145/2592798.2592814. [19] Dong M, Chen H. Soft updates made simple and fast on non-volatile memory. In Proc. the 2017 USENIX Annual Technical Conference, Jul. 2017, pp.719-731. [20] Kwon Y, Fingler H, Hunt T, Peter S, Witchel E, Anderson T. Strata:A cross media file system. In Proc. the 26th Symposium on Operating Systems Principles, Oct. 2017, pp.460-477. DOI:10.1145/3132747.3132770. [21] Volos H, Nalli S, Panneerselvam S, Varadarajan V, Saxena P, Swift M M. Aerie:Flexible file-system interfaces to storage-class memory. In Proc. the 9th European Conference on Computer Systems, Apr. 2014, Article No. 14. DOI:10.1145/2592798.2592810. [22] Dong M, Bu H, Yi J, Dong B, Chen H. Performance and protection in the ZoFS user-space NVM file system. In Proc. the 27th ACM Symposium on Operating Systems Principles, Oct. 2019, pp.478-493. DOI:10.1145/3341301.3359637. [23] Bhandari K, Chakrabarti D R, Boehm H J. Makalu:Fast recoverable allocation of non-volatile memory. ACM SIGPLAN Notices, 2016, 51(10):677-694. DOI:10.1145/3022671.2984019. [24] Flajolet P, Poblete P, Viola A. On the analysis of linear probing hashing. Algorithmica, 1998, 22(4):490-515. DOI:10.1007/PL00009236. [25] Pagh R, Rodler F F. Cuckoo hashing. Journal of Algorithms, 2004, 51(2):122-144. DOI:10.1016/j.jalgor.20- 03.12.002. [26] Ellis C S. Extendible hashing for concurrent operations and distributed data. In Proc. the 2nd ACM SIGACTSIGMOD Symposium on Principles of Database Systems, March 1983, pp.106-116. DOI:10.1145/588058.588072. [27] Oukid I, Lasperas J, Nica A, Willhalm T, Lehner W. FPTree:A hybrid SCM-DRAM persistent and concurrent Btree for storage class memory. In Proc. the 2016 International Conference on Management of Data, June 2016, pp.371-386. DOI:10.1145/2882903.2915251. [28] Sha E H M, Jiang W, Dong H, Ma Z, Zhang R, Chen X, Zhuge Q. Towards the design of efficient and consistent index structure with minimal write activities for non-volatile memory. IEEE Transactions on Computers, 2017, 67(3):432-448. DOI:10.1109/TC.2017.2754381. [29] Volos H, Magalhaes G, Cherkasova L, Li J. Quartz:A lightweight performance emulator for persistent memory software. In Proc. the 16th Annual Middleware Conference, Nov. 2015, pp.37-49. DOI:10.1145/2814576.2814806. [30] Leis V, Kemper A, Neumann T. The adaptive radix tree:ARTful indexing for main-memory databases. In Proc. the 29th International Conference on Data Engineering, April 2013, pp.38-49. DOI:10.1109/ICDE.2013.6544812. [31] Evans J. A scalable concurrent malloc (3) implementation for FreeBSD. In Proc. the 2006 BSDCan Conference, May 2006. [32] Viswanathan K. Intel corporation. Intel memory latency checker v3.7. https://software.intel.com/en-us/articles/intelr-memory-latency-checker, Feb. 2020. [33] Izraelevitz J, Yang J, Zhang L et al. Basic performance measurements of the Intel Optane DC persistent memory module. arXiv:1903.05714, 2019. https://arxiv.org/pdf/1903.05714v3.pdf, Oct. 2020. |
[1] | 徐天妮, 孙海锋, 张笛, 周小明, 隋秀峰, 王卅, 黄群, 包云岗. 服务链自动化部署与测试框架[J]. 计算机科学技术学报, 2022, 37(3): 680-698. |
[2] | Zhou Zhang, Pei-Quan Jin, Xiao-Liang Wang, Yan-Qi Lv, Shou-Hong Wan, Xi-Ke Xie. COLIN:一种具有高读写性能的缓存感知的动态学习索引[J]. 计算机科学技术学报, 2021, 36(4): 721-740. |
[3] | Songjie Niu, Shimin Chen. TransGPerf:利用迁移学习建模分布式图计算性能[J]. 计算机科学技术学报, 2021, 36(4): 778-791. |
[4] | Jason Liu, Pedro Espina, Xian-He Sun. 关于储存系统建模和优化的综述[J]. 计算机科学技术学报, 2021, 36(1): 71-89. |
[5] | Kai Wu, Dong Li. Unimem: 用于高性能计算的基于非易失性内存的异构主内存上的运行时系统数据管理[J]. 计算机科学技术学报, 2021, 36(1): 90-109. |
[6] | Hai-Kun Liu, Di Chen, Hai Jin, Xiao-Fei Liao, Binsheng He, Kan Hu, Yu Zhang. 非易失性内存技术综述:现状,实践和展望[J]. 计算机科学技术学报, 2021, 36(1): 4-32. |
[7] | Jian-Bin Fang, Xiang-Ke Liao, Chun Huang, De-Zun Dong. 以主存为中心的ARMv8众核体系结构性能评估:以飞腾2000+为例[J]. 计算机科学技术学报, 2021, 36(1): 33-43. |
[8] | Lan Huang, Da-Lin Li, Kang-Ping Wang, Teng Gao, Adriano Tavares. 一个关于高级综合工具性能优化的综述[J]. 计算机科学技术学报, 2020, 35(3): 697-720. |
[9] | Wen-Yan Chen, Ke-Jiang Ye, Cheng-Zhi Lu, Dong-Dai Zhou, Cheng-Zhong Xu. 混部容器负载干扰分析:从硬件性能计数器的视角[J]. 计算机科学技术学报, 2020, 35(2): 412-417. |
[10] | Hong-Mei Wei, Jian Gao, Peng Qing, Kang Yu, Yan-Fei Fang, Ming-Lu Li. MPI-RCDD:一种MPI运行时的通信死锁检测框架[J]. 计算机科学技术学报, 2020, 35(2): 395-411. |
[11] | Zheng-Hao Jin, Haiyang Shi, Ying-Xin Hu, Li Zha, Xiaoyi Lu. CirroData:另一个基于SQL-on-Hadoop的高性能数据分析引擎[J]. 计算机科学技术学报, 2020, 35(1): 194-208. |
[12] | André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef. 高性能计算专用文件系统[J]. 计算机科学技术学报, 2020, 35(1): 4-26. |
[13] | Yu-Tong Lu, Peng Cheng, Zhi-Guang Chen. Tianhe-2数据存储与管理系统设计与实现[J]. 计算机科学技术学报, 2020, 35(1): 27-46. |
[14] | Qi Chen, Kang Chen, Zuo-Ning Chen, Wei Xue, Xu Ji, Bin Yang. 神威存储系统面向应用I/O性能提升的优化介绍[J]. 计算机科学技术学报, 2020, 35(1): 47-60. |
[15] | Marc-André Vef, Nafiseh Moti, Tim Süß, Markus Tacke, Tommaso Tocci, Ramon Nou, Alberto Miranda, Toni Cortes, André Brinkmann. GekkoFS—一种用于高性能计算应用的临时突发缓冲文件系统[J]. 计算机科学技术学报, 2020, 35(1): 72-91. |
版权所有 © 《计算机科学技术学报》编辑部 本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn 总访问量: |