2012, Issue (2): 273-280. doi: 10.1007/s11390-012-1222-3

• Architecture

Checkpoint Management with Double Modular Redundancy Based on the Probability of Task Completion

Seong Woo Kwak¹, Kwan-Ho You², and Jung-Min Yang³

  1. Department of Electronic Engineering, Keimyung University, Daegu 704-701, Korea;
  2. School of Information & Communication Engineering, Sungkyunkwan University, Suwon 440-746, Korea;
  3. Department of Electrical Engineering, Catholic University of Daegu, Daegu 712-702, Korea
  • Received: 2011-04-12 Revised: 2011-09-27 Online: 2012-03-05 Published: 2012-03-05

This paper proposes a checkpoint rollback strategy for real-time systems with double modular redundancy. Without built-in fault detection or spare processors, the scheme can recover from both transient and permanent faults. Two comparisons are conducted at each checkpoint. First, the states stored at two consecutive checkpoints of one processor are compared to check the integrity of that processor. Second, the states of the two processors are compared to detect faults, and the system rolls back to the previous checkpoint whenever the logic of the proposed scheme requires it. The fault recovery scheme induces a Markov model, which is analyzed to obtain the probability that the task completes within its deadline. The number of checkpoints is then chosen to maximize this probability.
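The idea of choosing the number of checkpoints to maximize the probability of meeting the deadline can be illustrated with a much simpler model than the Markov model analyzed in the paper. The sketch below is an assumption-laden toy, not the authors' method: it considers only transient faults arriving as a Poisson process, assumes every fault is caught by the processor-to-processor comparison at the next checkpoint, and ignores permanent faults and the per-processor integrity check. All names and parameters (`task_time`, `deadline`, `overhead`, `fault_rate`, `max_n`) are hypothetical.

```python
import math


def completion_probability(n, task_time, deadline, overhead, fault_rate):
    """Probability of finishing by the deadline with n equal intervals.

    Simplified model (an assumption, not the paper's Markov model):
    an interval must be re-executed whenever either of the two
    processors suffers a transient fault during it.
    """
    slot = task_time / n + overhead              # one interval plus its checkpoint
    attempts = math.floor(deadline / slot + 1e-9)  # interval executions that fit
    if attempts < n:
        return 0.0                               # not even a fault-free run fits
    # An interval succeeds only if neither processor faults during the slot.
    p_ok = math.exp(-2.0 * fault_rate * slot)
    # The task completes if at least n of the available attempts succeed.
    return sum(
        math.comb(attempts, k) * p_ok**k * (1.0 - p_ok) ** (attempts - k)
        for k in range(n, attempts + 1)
    )


def best_checkpoint_count(task_time, deadline, overhead, fault_rate, max_n=50):
    """Return the n in 1..max_n that maximizes the completion probability."""
    return max(
        range(1, max_n + 1),
        key=lambda n: completion_probability(
            n, task_time, deadline, overhead, fault_rate
        ),
    )
```

For example, `best_checkpoint_count(10.0, 15.0, 0.1, 0.05)` searches the candidate counts for a task of 10 time units with a deadline of 15. The trade-off it exposes is the one the abstract describes: more checkpoints shorten the work lost to a rollback but add overhead that consumes the slack before the deadline.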


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved