Journal of Computer Science and Technology 2018, Vol. 33, Issue 1: 58-78    DOI: 10.1007/s11390-018-1808-5
Endurable SSD-Based Read Cache for Improving the Performance of Selective Restore from Deduplication Systems
Jian Liu1,2,3, Yun-Peng Chai2,3,*, Member, CCF, Xiao Qin4, Senior Member, IEEE, Yao-Hong Liu2,3
1 Division of Computer Science and Engineering, Louisiana State University, Baton Rouge, LA 70803, U.S.A;
2 Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education of China, Beijing 100872, China;
3 School of Information, Renmin University of China, Beijing 100872, China;
4 Shelby Center for Engineering Technology, Department of Computer Science and Software Engineering, Samuel Ginn College of Engineering, Auburn University, Auburn, AL 36849-5347, U.S.A.

Abstract Deduplication has been commonly used in both enterprise storage systems and cloud storage. To overcome the performance challenge of selective restore operations in deduplication systems, a solid-state-drive-based (SSD-based) read cache can be deployed to speed up restores by dynamically caching popular restore contents. Unfortunately, the frequent data updates induced by classical cache schemes (e.g., LRU and LFU) significantly shorten SSDs' lifetime while slowing down I/O processing in SSDs. To address this problem, we propose a new solution, LOP-Cache, which greatly improves the write durability of SSDs as well as I/O performance by enlarging the proportion of long-term popular (LOP) data among the data written into the SSD-based cache. LOP-Cache keeps LOP data in the SSD cache for a long period of time to decrease the number of cache replacements. Furthermore, it prevents unpopular or unnecessary data in deduplication containers from being written into the SSD cache. We implemented LOP-Cache in a prototype deduplication system to evaluate its performance. Our experimental results indicate that LOP-Cache shortens the latency of selective restore by an average of 37.3% at the cost of a small SSD-based cache with a capacity of only 5.56% of the deduplicated data. Importantly, LOP-Cache improves SSDs' lifetime by a factor of 9.77. The evidence shows that LOP-Cache offers a cost-efficient SSD-based read cache solution to boost the performance of selective restore for deduplication systems.
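The effect of admitting only repeatedly requested data into an SSD cache can be illustrated with a toy simulation. The sketch below is not the LOP-Cache algorithm, whose details the abstract does not give; it assumes a hypothetical admission rule (admit a chunk only on its k-th request) as a crude stand-in for detecting long-term popular data. The function name `simulate`, the parameter `admit_after`, and the synthetic trace are all illustrative assumptions.

```python
from collections import Counter, OrderedDict


def simulate(trace, capacity, admit_after=1):
    """Replay `trace` through an LRU cache; return (hits, ssd_writes).

    admit_after=1 behaves like plain LRU: every miss is written to the cache.
    admit_after=k admits a chunk only on its k-th request, a simplified
    stand-in for "long-term popular" detection (an assumption, not the
    paper's actual criterion).
    """
    cache = OrderedDict()  # chunk id -> None, ordered from oldest to newest
    seen = Counter()       # number of requests observed per chunk id
    hits = writes = 0
    for chunk in trace:
        seen[chunk] += 1
        if chunk in cache:
            hits += 1
            cache.move_to_end(chunk)  # refresh recency on a hit
            continue
        if seen[chunk] < admit_after:
            continue  # served from disk; nothing is written to the SSD
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the least recently used chunk
        cache[chunk] = None
        writes += 1
    return hits, writes


# Synthetic trace: two hot chunks interleaved with one-off cold chunks.
trace = []
for i in range(100):
    trace += ["a", "b", f"cold{i}"]

print(simulate(trace, capacity=2, admit_after=1))  # (0, 300)
print(simulate(trace, capacity=2, admit_after=2))  # (196, 2)
```

On this artificial trace, plain LRU writes every requested chunk to the cache and never gets a hit, because each cold chunk evicts a hot one. The filtered variant writes only the two hot chunks and then serves them from the cache. Real restore workloads are far less regular, so the numbers only show the direction of the effect.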
Keywords: data deduplication, solid state drive (SSD), flash, cache, endurance
Received 2016-12-06

This work is supported by the Natural Science Foundation of Beijing under Grant No. 4172031, the Fundamental Research Funds for the Central Universities of China, and the Research Funds of Renmin University of China under Grant No. 16XNLQ02. Xiao Qin's work is supported by the U.S. National Science Foundation under Grant Nos. IIS-1618669, CCF-0845257 (CAREER), CNS-0917137, CNS-0757778, CCF-0742187, CNS-0831502, CNS-0855251, and OCI-0753305. Xiao Qin's study is also supported by the Programme of Introducing Talents of Discipline to Universities (111 Project) in China under Grant No. B07038.

Corresponding Author: Yun-Peng Chai, Email: ypchai@ruc.edu.cn
About the author: Jian Liu received his B.E. degree in electronic information engineering from China Agricultural University, Beijing, in 2012, and his M.E. degree in computer science and technology from National Computer System Engineering Research Institute of China, Beijing, in 2015. He is currently a Ph.D. student majoring in computer science at Louisiana State University, Baton Rouge. His research interests include data deduplication and SSD-based storage systems.
Jian Liu, Yun-Peng Chai, Xiao Qin, Yao-Hong Liu. Endurable SSD-Based Read Cache for Improving the Performance of Selective Restore from Deduplication Systems[J]. Journal of Computer Science and Technology, 2018, 33(1): 58-78
