2012, Issue (2): 256-272. doi: 10.1007/s11390-012-1221-4

• Architecture

Modeling a Dynamic Data Replication Strategy to Increase System Availability in Cloud Computing Environments

Da-Wei Sun¹ (孙大为), Student Member, CCF, ACM, Gui-Ran Chang² (常桂然), Shang Gao³ (高尚), Li-Zhong Jin¹ (靳立忠), and Xing-Wei Wang¹,* (王兴伟), Senior Member, CCF, ACM

    1. School of Information Science and Engineering, Northeastern University, Shenyang 110819, China;
    2. Computing Center, Northeastern University, Shenyang 110819, China;
    3. School of Engineering and Information Technology, Deakin University, Geelong, Victoria 3217, Australia
  • Received: 2011-06-17; Revised: 2012-01-29; Online: 2012-03-05; Published: 2012-03-05
  • Supported by: the National Natural Science Foundation of China under Grant Nos. 61070162, 71071028 and 70931001, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant Nos. 20110042110024 and 20100042110025, and the Fundamental Research Funds for the Central Universities of China under Grant Nos. N100604012, N090504003 and N090504006.

Failures are normal rather than exceptional in cloud computing environments. To improve system availability, replicating popular data to multiple suitable locations is an advisable choice, as users can then access the data from a nearby site. This is, however, not the case for replicas which must have a fixed number of copies on several locations. Deciding on a reasonable number of replicas and the right locations for them has become a challenge in cloud computing. In this paper, a dynamic data replication strategy is put forward, together with a brief survey of replication strategies suitable for distributed computing environments. The strategy includes: 1) analyzing and modeling the relationship between system availability and the number of replicas; 2) evaluating and identifying the popular data and triggering a replication operation when the popularity of the data passes a dynamic threshold; 3) calculating a suitable number of copies to meet a reasonable system byte effective rate requirement and placing replicas among data nodes in a balanced way; 4) designing the dynamic data replication algorithm in a cloud. Experimental results demonstrate the efficiency and effectiveness of the system improvements brought by the proposed strategy in a cloud.
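To make the link between availability and replica count in item 1) concrete, the following is a minimal sketch and not the model from the paper. It assumes that every replica is reachable independently with the same probability `p`, so that a data item is available whenever at least one copy is reachable. The function names `availability` and `min_replicas` and the `max_replicas` cap are hypothetical and introduced only for illustration.

```python
def availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` copies is reachable.

    Assumption (not from the paper): copies fail independently and each
    one is reachable with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # All copies are unreachable with probability (1 - p) ** replicas.
    return 1.0 - (1.0 - p) ** replicas


def min_replicas(p: float, target: float, max_replicas: int = 64) -> int:
    """Smallest replica count whose availability reaches `target`.

    A plain linear search is used instead of a closed-form logarithm to
    avoid floating-point rounding at the boundary.
    """
    for r in range(1, max_replicas + 1):
        if availability(p, r) >= target:
            return r
    raise ValueError("target not reachable within max_replicas copies")


if __name__ == "__main__":
    # Each copy reachable 90% of the time, desired availability 99.5%.
    print(min_replicas(0.9, 0.995))  # prints 3
```

Under these assumptions each extra copy multiplies the unavailability by `1 - p`, so the gain from one more replica shrinks quickly. That diminishing return is one reason a replication strategy would compute a suitable number of copies rather than replicate everywhere.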


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn