Peng-Fei Li, Yu Hua, Qin Cao. An Enhanced Physical-Locality Deduplication System for Space Efficiency[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-023-2646-7

An Enhanced Physical-Locality Deduplication System for Space Efficiency

  • Large volumes of data are generated by embedded devices, applications, and systems, and require cost-efficient storage services. Data deduplication removes duplicate chunks and has become an important technique for improving the space efficiency of storage systems. However, the stored unique chunks become heavily fragmented, which degrades restore performance and incurs high garbage collection overheads. Existing schemes fail to achieve an efficient trade-off among deduplication, restore, and garbage collection performance because they do not explore and exploit the physical locality of different chunks. In this paper, we trace the storage patterns of fragmented chunks in backup systems and propose a high-performance deduplication system called HiDeStore. The main insight is to enhance the physical locality of new backup versions during the deduplication phase by identifying hot chunks and storing them in active containers. Chunks that do not appear in new backups become cold and are gathered together in archival containers. Moreover, we remove expired data with an isolated container deletion scheme, which avoids the high overhead of detecting expired data. Compared with state-of-the-art schemes, HiDeStore improves deduplication and restore performance by up to 1.4× and 1.6×, respectively, without decreasing the deduplication ratio or incurring high garbage collection overheads.
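
The hot/cold separation described in the abstract can be illustrated with a small toy model. The sketch below is not HiDeStore's implementation: the class `ToyStore`, its methods, the use of SHA-1 fingerprints, and the use of plain dictionaries in place of on-disk containers are all assumptions made for illustration. As a further simplification, it looks up duplicates only in the active set, so a chunk that reappears after being archived would be stored again; that is a property of this toy, not a claim about the paper's design.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint of a chunk (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(chunk).hexdigest()


class ToyStore:
    """Hypothetical toy model of hot/cold chunk separation."""

    def __init__(self) -> None:
        self.active = {}    # fingerprint -> chunk, "hot" chunks
        self.archival = {}  # fingerprint -> chunk, "cold" chunks

    def backup(self, chunks):
        """Deduplicate one backup version and return its recipe (fingerprint list)."""
        recipe = []
        seen = set()
        for chunk in chunks:
            fp = fingerprint(chunk)
            recipe.append(fp)
            seen.add(fp)
            if fp not in self.active:
                # Unique with respect to the hot set: store it as a hot chunk.
                self.active[fp] = chunk
        # Chunks absent from this version become cold and are moved together.
        for fp in list(self.active):
            if fp not in seen:
                self.archival[fp] = self.active.pop(fp)
        return recipe

    def restore(self, recipe):
        """Rebuild a version from its recipe, checking hot chunks first."""
        return [
            self.active[fp] if fp in self.active else self.archival[fp]
            for fp in recipe
        ]


store = ToyStore()
v1 = store.backup([b"a", b"b", b"c"])
v2 = store.backup([b"b", b"c", b"d"])
print(sorted(store.active.values()))    # hot chunks after the second backup
print(sorted(store.archival.values()))  # cold chunks after the second backup
```

In this toy run, the chunk `b"a"` appears only in the first version, so it is moved to the archival set when the second version is written, while the chunks of the newest version stay together in the active set. Restoring the newest version therefore touches only hot chunks, which is the locality benefit the abstract describes.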
