Nie SQ, Zhang C, Wu WG. DIR: Dynamic request interleaving for improving the read performance of aged solid-state drives. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 39(1): 82−98 Jan. 2024. DOI: 10.1007/s11390-023-1601-y.

DIR: Dynamic Request Interleaving for Improving the Read Performance of Aged Solid-State Drives

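  • Summary: NAND flash-based solid-state drives (SSDs) have become the primary storage medium in modern computer systems and are widely used in mobile devices, data centers, and similar settings. To lower the cost per bit and meet the capacity and cost demands of modern applications, flash vendors have introduced SSDs built on high-density NAND flash. SSDs based on TLC NAND flash chips, a widely used class of high-density NAND flash devices, typically show a high bit error rate after many erase operations and long retention times. Because the margins between the adjacent voltage levels that encode different bit values inside a TLC cell are very narrow, TLC cells are more susceptible to error sources such as program disturb. TLC NAND flash-based SSDs increasingly adopt LDPC codes that approach the error-correction limit to improve reliability, but for aged SSDs with a high raw bit error rate (RBER), the extra flash sensing operations and read retries triggered by soft decoding during LDPC decoding are very time-consuming. A TLC cell stores three bits, called the least significant bit (LSB), the central significant bit (CSB), and the most significant bit (MSB). The reliability of the three bits in the same cell differs considerably: even with the same number of erase cycles and the same retention time, their RBERs are quite different. This imbalance usually leads to unequal read and write latencies across the three page types. For a read I/O request that consists of several sub-requests, the completion time is determined by the slowest sub-request, and the time of each sub-request is determined by its queuing delay and its service time. Existing work optimizes the queuing delay and reduces the service time to improve SSD performance. For aged SSDs that use LDPC codes, however, reading a physical page with a high RBER causes multiple read retries. Page reads account for a large share of the I/O service time, so page-level read optimization remains important. Early in the life of an SSD, the read latency of an LSB page is typically only 2/3 that of an MSB page, but once the SSD has aged, the read latencies of LSB and MSB pages differ by several times. Work related to the strategy proposed in this paper includes bit-level data layout optimization. Such strategies place the data of each logical page into the three page types of the same physical word line (WL). Because physical pages are accessed sequentially, integrating these strategies directly into the flash translation layer (FTL) causes severe read/write amplification and, in turn, severe performance degradation. This paper proposes the Dynamic Request Interleaving (DIR) strategy to improve the read performance of aged TLC NAND flash-based SSDs. The main contributions and results are as follows: 1) An analysis of the logical pages accessed by requests in existing traces shows that many adjacent logical pages tend to be read and updated together. This locality allows the data of consecutive logical pages to be grouped and interleaved without causing severe read/write amplification; 2) Exploiting this access locality, the data of adjacent logical pages is split into segments and interleaved across physical pages of different types at different locations, which balances data reliability across flash pages at the page level. The proposed data layout effectively reduces the number of read retries while preserving access parallelism. A matching DIR-friendly flash page allocation scheme is also designed. It divides the SSD into multi-plane regions, suited to storing interleaved multi-page data, and single-plane regions, suited to storing single-page data. Within a multi-plane region, interleaved data can be read and written in parallel with parallel read/write commands, which further reduces the impact of read amplification; 3) Finally, the proposed DIR strategy is compared with existing strategies. The experimental results show that adopting DIR in aged SSDs exploits nearly 33% of the locality in I/O requests and reduces read latency by 43% on average compared with existing strategies.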
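
The latency model described in the summary above (a read request finishes when its slowest sub-request finishes, and each sub-request costs its queuing delay plus its service time) can be sketched as follows. This is only an illustration: the `SubRequest` class, the function name, and all timing values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class SubRequest:
    queue_us: float    # queuing delay in microseconds (hypothetical value)
    service_us: float  # flash read plus LDPC decoding time (hypothetical value)

    @property
    def latency_us(self) -> float:
        # Time of one sub-request = queuing delay + service time.
        return self.queue_us + self.service_us


def request_completion_us(subs):
    """A multi-page read completes only when its slowest sub-request does."""
    return max(s.latency_us for s in subs)


# Three sub-requests; the last one stands in for a high-RBER page
# whose service time is inflated by read retries.
subs = [SubRequest(10.0, 60.0), SubRequest(5.0, 90.0), SubRequest(0.0, 300.0)]
print(request_completion_us(subs))  # 300.0
```

Under this model, shortening the two fast sub-requests does nothing for the request; only reducing the retries of the slowest page helps.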

     

    Abstract: Triple-level cell (TLC) NAND flash is increasingly adopted to build solid-state drives (SSDs) for modern computer systems. While TLC NAND flash effectively improves storage density, it faces severe reliability issues; in particular, different page types exhibit different raw bit error rates (RBERs). Integrating a strong low-density parity-check (LDPC) code helps to improve reliability, but read latency then grows in proportion to the number of read retries, which is highest for the least reliable pages. A straightforward idea is to disperse page-size data across several pages of different types, which achieves a lower average RBER and reduces the read latency. However, implementing this idea directly in the flash translation layer (FTL) induces read amplification, because a logical page that resides in more than one physical page requires several read operations. In this paper, we propose the Dynamic Request Interleaving (DIR) technique for improving the performance of TLC NAND flash-based SSDs, in particular aged ones with large RBERs. DIR exploits the observation that, queuing time aside, the latency of an I/O request is determined by the access to the slowest device page, i.e., the page with the highest RBER. By grouping consecutive logical pages that have high locality and interleaving their encoded data across device pages of different types with different RBERs, DIR effectively reduces the number of LDPC read retries with limited read amplification. To meet the requirement of allocating hybrid page types for interleaved data, we also design a page-interleaving-friendly page allocation scheme, which splits the planes into multi-plane regions for storing interleaved data and single-plane regions for storing normal data. Pages in a multi-plane region can be read and written in parallel with the proposed multi-plane command, which avoids the read amplification issue.
Based on the DIR scheme and the proposed page allocation scheme, we build a DIR-enabled FTL, which integrates the proposed schemes into the FTL with some modifications. Our experimental results show that adopting DIR in aged SSDs exploits nearly 33% of the locality in I/O requests and, on average, reduces read latency by 43% compared with conventional aged SSDs.
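
The interleaving idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: the per-page-type error counts, the retry thresholds, the rotation rule `(i + s) mod 3`, and all function names are assumptions chosen only to show why spreading a codeword over LSB, CSB, and MSB pages lowers the worst-case number of read retries.

```python
PAGE_TYPES = ("LSB", "CSB", "MSB")

# Hypothetical raw bit errors per 10,000 bits for each page type of an aged word line.
RBER = {"LSB": 20, "CSB": 60, "MSB": 100}

# Hypothetical thresholds: each threshold exceeded costs one extra read retry.
RETRY_THRESHOLDS = (40, 80)


def conventional_layout(logical_pages):
    """Each logical page occupies one whole physical page of a single type."""
    return {lp: [pt] * len(PAGE_TYPES) for lp, pt in zip(logical_pages, PAGE_TYPES)}


def interleaved_layout(logical_pages):
    """Assumed rotation: segment s of logical page i goes to page type (i + s) mod 3."""
    n = len(PAGE_TYPES)
    return {lp: [PAGE_TYPES[(i + s) % n] for s in range(n)]
            for i, lp in enumerate(logical_pages)}


def retries(segment_types):
    """Read retries for one codeword, modeled from the mean RBER of its segments."""
    mean_rber = sum(RBER[t] for t in segment_types) / len(segment_types)
    return sum(mean_rber > th for th in RETRY_THRESHOLDS)


def worst_retries(layout):
    """The request is as slow as the logical page that needs the most retries."""
    return max(retries(types) for types in layout.values())


group = ["A", "B", "C"]  # three consecutive logical pages read together
print(worst_retries(conventional_layout(group)))  # 2
print(worst_retries(interleaved_layout(group)))   # 1
```

In this toy model every logical page in the interleaved layout touches all three physical pages, so reading a single logical page alone would cost three page reads. That is the read amplification the abstract refers to, and it is why interleaving is only attractive for groups of pages that tend to be accessed together.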

     
