Hot Data Identification with Multiple Bloom Filters: Block-Level Decision vs I/O Request-Level Decision
Journal of Computer Science and Technology
Journal of Computer Science and Technology 2018, Vol. 33, Issue (1): 79-97. DOI: 10.1007/s11390-018-1809-4
Computer Architecture and Systems
Hot Data Identification with Multiple Bloom Filters: Block-Level Decision vs I/O Request-Level Decision
Dongchul Park1,2, Weiping He3, David H. C. Du4, Fellow, IEEE
1 Division of Computer and Electronic Systems Engineering, Hankuk University of Foreign Studies, Gyeonggi-do 17035, Korea;
2 Intel Corporation, Hillsboro, OR 97124, U.S.A.;
3 Dell Storage, Eden Prairie, MN 55344, U.S.A.;
4 Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, MN 55455, U.S.A.

Abstract: Hot data identification is crucial for many applications, yet few investigations have examined the subject. Existing studies focus almost exclusively on frequency. However, identifying hot data effectively requires considering recency and frequency equally. Moreover, previous studies make hot data decisions at the data block level. Such a fine-grained decision fits flash-based storage particularly well because its random access performance is comparable to its sequential access performance. Hard disk drives (HDDs), however, show a significant performance disparity between sequential and random access. Therefore, unlike flash-based storage, exploiting the asymmetric access performance of HDDs requires a coarse-grained decision. This paper proposes a novel hot data identification scheme that adopts multiple Bloom filters to efficiently characterize recency as well as frequency. Compared with a state-of-the-art scheme, it consumes 50% less memory and up to 58% less computational overhead, and it lowers false identification rates by up to 65%. Moreover, we apply the scheme to a next-generation HDD technology, Shingled Magnetic Recording (SMR), to verify its effectiveness. To this end, we design a new SMR drive based on hot data identification with a coarse-grained decision. The experiments demonstrate the importance and benefits of accurate hot data identification, which improves the performance of the proposed SMR drive by up to 42%.
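The abstract does not spell out the mechanics of the scheme, so the following Python sketch only illustrates the general idea of combining several Bloom filters to capture both frequency (how many filters contain a block address) and recency (how recently each filter was reset). The class names, the linear recency weights, the decay interval, the threshold, and the filter sizes are all assumptions chosen for illustration; they are not taken from the paper.

```python
import hashlib
from collections import deque


class BloomFilter:
    """A plain Bloom filter: a bit array probed by k hash functions."""

    def __init__(self, num_bits=2048, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive k positions by salting a SHA-256 digest with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))

    def clear(self):
        self.bits = bytearray(self.num_bits)


class MultiBloomHotData:
    """Hypothetical hot data identifier built from several Bloom filters.

    filters[0] is the most recently reset filter, filters[-1] the oldest.
    """

    def __init__(self, num_filters=4, decay_interval=512, threshold=2.0,
                 num_bits=2048, num_hashes=2):
        self.filters = deque(
            BloomFilter(num_bits, num_hashes) for _ in range(num_filters)
        )
        self.decay_interval = decay_interval  # writes between decays (assumed)
        self.threshold = threshold            # hotness cutoff (assumed)
        self.writes_since_decay = 0

    def record_write(self, key):
        # Frequency: each repeated write lands in one more filter,
        # starting from the newest filter that does not yet contain the key.
        for bf in self.filters:
            if key not in bf:
                bf.add(key)
                break
        self.writes_since_decay += 1
        if self.writes_since_decay >= self.decay_interval:
            self.decay()

    def decay(self):
        # Recency: forget the oldest filter and reuse it as the newest one,
        # which shifts every surviving filter to a lower weight.
        self.filters.rotate(1)
        self.filters[0].clear()
        self.writes_since_decay = 0

    def hotness(self, key):
        # Assumed linear weights: newest filter counts 1.0, oldest 1/V.
        v = len(self.filters)
        return sum((v - i) / v for i, bf in enumerate(self.filters) if key in bf)

    def is_hot(self, key):
        return self.hotness(key) >= self.threshold
```

In this sketch a block address written several times within one decay interval accumulates weight across filters and crosses the threshold, while the same address drifts back toward cold as decays push its filters to lower weights. Because Bloom filters admit false positives, a cold address can occasionally be reported as hot; sizing the filters and choosing the number of hash functions controls how often that happens.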
Keywords: hot data, Bloom filter, shingled magnetic recording (SMR)
Received: 2016-10-06
Fund: This work was supported by the Hankuk University of Foreign Studies Research Fund of Korea, and was partially supported by National Science Foundation (NSF) awards of the USA under Grant Nos. 1053533, 1439622, 1217569, 1305237, and 1421913.

About author: Dongchul Park is currently an assistant professor in the Division of Computer & Electronic Systems Engineering at Hankuk University of Foreign Studies (HUFS), Gyeonggi-do, South Korea. Before joining HUFS, he was a senior staff research engineer in the Storage Technology Group (STG) at Intel, Hillsboro, Oregon, USA, in 2017 and a senior research engineer in the Memory Solutions Laboratory (MSL) at Samsung Semiconductor Inc., San Jose, California, USA, from 2012 to 2016. He received his Ph.D. degree in computer science and engineering from the University of Minnesota-Twin Cities, Minneapolis, in 2012, where he was a member of the Center for Research in Intelligent Storage (CRIS) group, advised by Professor David H. C. Du. His research interests focus on storage system design and applications, including non-volatile memories, in-storage computing, big data processing, Hadoop MapReduce, data deduplication, key-value stores, cloud computing, and shingled magnetic recording (SMR) technology.
Cite this article:
Dongchul Park, Weiping He, David H. C. Du. Hot Data Identification with Multiple Bloom Filters: Block-Level Decision vs I/O Request-Level Decision[J]. Journal of Computer Science and Technology, 2018, V33(1): 79-97
URL: http://jcst.ict.ac.cn:8080/jcst/EN/10.1007/s11390-018-1809-4