2013, Vol. 28, Issue (6): 1012-1024. doi: 10.1007/s11390-013-1394-5

Topic: Data Management and Data Mining

• Special Section on Selected Paper from NPC 2011 •


Application-Aware Client-Side Data Reduction and Encryption of Personal Data in Cloud Backup Services

Yin-Jin Fu1 (付印金), Nong Xiao1, * (肖侬), Member, IEEE, Xiang-Ke Liao2 (廖湘科), Member, IEEE, and Fang Liu1 (刘芳), Member, CCF   

  • 1. State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China;
    2. School of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2012-12-10 Revised: 2013-05-06 Online: 2013-11-05 Published: 2013-11-05
  • About author: Yin-Jin Fu received his B.S. degree in mathematics from Nanjing University, China, in 2006 and his M.S. degree in computer science from the National University of Defense Technology (NUDT), Changsha, in 2008. He is now a Ph.D. candidate at the State Key Laboratory of High Performance Computing at NUDT. His research interests are data deduplication, cloud storage, and distributed file systems.
  • Supported by:

    This work was supported in part by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013201, and by the National Natural Science Foundation of China under Grant Nos. 61025009, 61232003, 61120106005, 61170288, and 61379146.


Abstract: Cloud backup has become increasingly important as large quantities of valuable data are stored on personal computing devices. Data reduction techniques, such as deduplication, delta encoding, and Lempel-Ziv (LZ) compression, performed at the client side before data transfer can ease cloud backup by saving network bandwidth and reducing cloud storage space. However, client-side data reduction in cloud backup services faces efficiency and privacy challenges. In this paper, we present Pangolin, a secure and efficient cloud backup service for personal data storage that exploits application awareness. It speeds up backup operations with an application-aware client-side data reduction technique, and mitigates data security risks by integrating selective encryption into data reduction for sensitive applications. Our experimental evaluation, based on a prototype implementation, shows that our scheme improves data reduction efficiency over state-of-the-art methods, shortening the backup window to 33%~75% of its original size, and that its security mechanism for sensitive applications has a negligible impact on backup window size.
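As a rough illustration of what application-aware client-side data reduction could look like, the sketch below picks a deduplication granularity and a compression decision from the file type before anything is uploaded. It is not Pangolin's implementation: the extension table `COMPRESSED_EXTS`, the fixed `CHUNK_SIZE`, the function `reduce_file`, and the in-memory `index` are all assumptions made for this example, and the selective encryption step is left out because it would need a cryptography library.

```python
import hashlib
import zlib

# Hypothetical policy table: file types assumed to be already compressed.
COMPRESSED_EXTS = {".jpg", ".mp3", ".mp4", ".zip"}
CHUNK_SIZE = 4096  # illustrative fixed chunk size, not a value from the paper


def reduce_file(name, data, index):
    """Deduplicate and compress one file on the client side.

    name:  file name, used only to look up the (assumed) policy.
    data:  file contents as bytes.
    index: set of fingerprints already stored; updated in place.
    Returns (recipe, upload): the ordered fingerprints needed to rebuild
    the file, and a dict of fingerprint -> payload for new chunks only.
    """
    ext = name[name.rfind("."):].lower() if "." in name else ""
    already_compressed = ext in COMPRESSED_EXTS

    if already_compressed:
        # Compressed media: treat the whole file as a single chunk.
        chunks = [data]
    else:
        # Other data: fixed-size chunking exposes sub-file redundancy.
        chunks = [data[i:i + CHUNK_SIZE]
                  for i in range(0, len(data), CHUNK_SIZE)]

    recipe, upload = [], {}
    for chunk in chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        recipe.append(fp)
        if fp not in index:
            index.add(fp)
            # Apply LZ-style compression (zlib/DEFLATE) only where it
            # is likely to help; skip it for already-compressed types.
            upload[fp] = chunk if already_compressed else zlib.compress(chunk)
    return recipe, upload
```

Calling `reduce_file` twice on identical content with the same `index` yields an empty `upload` the second time, which is the bandwidth saving that deduplication before transfer aims for. A real service would keep the fingerprint index persistent and would likely use content-defined chunking and delta encoding for some file types.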

Copyright © Editorial Office of Journal of Computer Science and Technology