Journal of Computer Science and Technology ›› 2023, Vol. 38 ›› Issue (2): 405-421.doi: 10.1007/s11390-022-2553-3

Special Issue: Computer Architecture and Systems

• Computer Architecture and Systems • Previous Articles     Next Articles

Parallel Software-Based Self-Testing with Bounded Model Checking for Kilo-Core Networks-on-Chip

Ying Zhang1 (张 颖), Member, CCF, IEEE, Peng-Fei Ji1 (季鹏飞), Pan-Wei Zhu1 (朱潘玮), Zebo Peng2, Senior Member, IEEE, Hua-Wei Li3 (李华伟), Fellow, CCF, and Jian-Hui Jiang1, * (江建慧), Senior Member, CCF   

  1. 1 School of Software Engineering, Tongji University, Shanghai 200092, China
    2 Department of Computer and Information Science, Linkoping University, Linkoping 58183, Sweden
    3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received:2022-06-01 Revised:2022-10-31 Accepted:2022-12-15 Online:2023-05-10 Published:2023-05-10
  • Contact: Jian-Hui Jiang E-mail:jhjiang@tongji.edu.cn
  • About author:Jian-Hui Jiang received his B.E., M.E. and Ph.D. degrees in traffic information engineering in Shanghai Railway University, Shanghai, in 1985, 1988, and 1999, respectively. He is currently a full professor of software engineering and a vice dean of the School of Software Engineering at Tongji University, Shanghai. His current research interests include dependable systems and networks, software reliability engineering, VLSI/SoC testing and fault-tolerance.
  • Supported by:
    This paper was supported in part by the National Key Research and Development Program of China under Grant No. 2020YFB 1600201, the National Natural Science Foundation of China (NSFC) under Grant Nos. 61974105, 62090024, and U20A20202, and the Zhejiang Lab under Grant No. 2021KC0AB01.

Online testing is critical to ensuring reliable operations of the next generation of supercomputers based on a kilo-core network-on-chip (NoC) interconnection fabric. We present a parallel software-based self-testing (SBST) solution that makes use of the bounded model checking (BMC) technique to generate test sequences and parallel packets. In this method, the parallel SBST with BMC derives the leading sequence for each router’s internal function and detects all functionally-testable faults related to the function. A Monte-Carlo simulation algorithm is then used to search for the approximately optimum configuration of the parallel packets, which guarantees the test quality and minimizes the test cost. Finally, a multi-threading technology is used to ensure that the Monte-Carlo simulation can reach the approximately optimum configuration in a large random space and reduce the generating time of the parallel test. Experimental results show that the proposed method achieves a high fault coverage with a reduced test overhead. Moreover, by performing online testing in the functional mode with SBST, it effectively avoids the over-testing problem caused by functionally untestable turns in kilo-core NoCs.

Key words: software-based self-testing (SBST); parallel test; kilo-core networks-on-chip (NoCs); online testing;

[1] Zhang Y, Hong X P, Chen Z S, Peng Z B, Jiang J H. A deterministic-path routing algorithm for tolerating many faults on very-large-scale network-on-chip. ACM Trans. Design Automation of Electronic Systems, 2021, 26(1): Article No. 8. DOI: 10.1145/3414060.
[2] Zhang Y, Chakrabarty K, Li H W, Jiang J H. Software-based online self-testing of network-on-chip using bounded model checking. In Proc. the 2017 IEEE International Test Conference, Oct. 31–Nov. 2, 2017. DOI: 10.1109/TEST.2017.8242037.
[3] Dongarra J. Report on the Sunway TaihuLight system. Tech Report UT-EECS-16-742, Oak Ridge National Laboratory, 2016. https://netlib.org/utk/people/JackDongarra/PAPERS/sunway-report-2016-old.pdf, Mar. 2023.
[4] Zhang L, Han Y H, Xu Q, Li X W, Li H W. On topology reconfiguration for defect-tolerant NoC-based homogeneous manycore systems. IEEE Trans. Very Large Scale Integration (VLSI) Systems, 2009, 17(9): 1173–1186. DOI: 10.1109/TVLSI.2008.2002108.
[5] Ouyang Y M, Da J, Wang X M, Han Q Q, Liang H G, Du G M. A TSV fault-tolerant scheme based on failure classification in 3D-NoC. Journal of Circuits, Systems and Computers, 2017, 26(4): 1750059. DOI: 10.1142/S0218126617500591.
[6] Chen Z S, Zhang Y, Peng Z B, Jiang J H. A deterministic-path routing algorithm for tolerating many faults on wafer-level NoC. In Proc. the 2019 Design, Automation and Test in Europe Conference & Exhibition, Mar. 2019, pp.1337–1342. DOI: 10.23919/DATE.2019.8714948.
[7] Lee D, Das S, Doppa J R, Pande P P, Chakrabarty K. Performance and thermal tradeoffs for energy-efficient monolithic 3D network-on-chip. ACM Trans. Design Automation of Electronic Systems, 2018, 23(5): Article No. 60. DOI: 10.1145/3223046.
[8] Liu W C, Yang L, Jiang W W, Feng L, Guan N, Zhang W, Dutt N. Thermal-aware task mapping on dynamically reconfigurable network-on-chip based multiprocessor system-on-chip. IEEE Trans. Computers, 2018, 67(12): 1818–1834. DOI: 10.1109/TC.2018.2844365.
[9] Xiang D, Chakrabarty K, Fujiwara H. Multicast-based testing and thermal-aware test scheduling for 3D ICs with a stacked network-on-chip. IEEE Trans. Computers, 2016, 65(9): 2767–2779. DOI: 10.1109/TC.2015.2493548.
[10] Wang L, Wang X H, Leung H F, Mak T. A non-minimal routing algorithm for aging mitigation in 2D-mesh NoCs. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2019, 38(7): 1373–1377. DOI: 10.1109/TCAD.2018.2855149.
[11] Xiang D, Shen K L, Bhattacharya B B, Wen X Q, Lin X J. Thermal-aware small-delay defect testing in integrated circuits for mitigating overkill. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2016, 35(3): 499–512. DOI: 10.1109/TCAD.2015.2474365.
[12] Biere A, Cimatti A, Clarke E M, Strichman O, Zhu Y S. Bounded model checking. Advances in Computers, 2003, 58: 117–148. DOI: 10.1016/S0065-2458(03)58003-2.
[13] Clarke E, Biere A, Raimi R, Zhu Y S. Bounded model checking using satisfiability solving. Formal Methods in System Design, 2001, 19(1): 7–34. DOI: 10.1023/A:1011276507260.
[14] Cota E, Liu C. Constraint-driven test scheduling for NoC-based systems. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2006, 25(11): 2465–2478. DOI: 10.1109/TCAD.2006.881331.
[15] Yuan F, Huang L, Xu Q. Re-examining the use of network-on-chip as test access mechanism. In Proc. the 2008 Conference on Design, Automation and Test in Europe, Mar. 2018, pp.808–811. DOI: 10.1145/1403375.1403571.
[16] Richter M, Chakrabarty K. Optimization of test pin-count, test scheduling, and test access for NoC-based multicore SoCs. IEEE Trans. Computers, 2014, 63(3): 691–702. DOI: 10.1109/TC.2013.82.
[17] Strano A, Gómez C, Ludovici D, Favalli M, Gómez M E, Bertozzi D. Exploiting network-on-chip structural redundancy for a cooperative and scalable built-in self-test architecture. In Proc. the 2011 Design, Automation and Test in Europe, Mar. 2011. DOI: 10.1109/DATE.2011.5763109.
[18] Wang J S, Ebrahimi M, Huang L T, Xie X, Li Q, Li G J, Jantsch A. Efficient design-for-test approach for networks-on-chip. IEEE Trans. Computers, 2019, 68(2): 198–213. DOI: 10.1109/TC.2018.2865948.
[19] Herve M B, Cota E, Kastensmidt F L, Lubaszewski M. NoC interconnection functional testing: Using boundary-scan to reduce the overall testing time. In Proc. the 10th Latin American Test Workshop, Mar. 2009. DOI: 10.1109/LATW.2009.4813801.
[20] Kakoee M R, Bertacco V, Benini L. At-speed distributed functional testing to detect logic and delay faults in NoCs. IEEE Trans. Computers, 2014, 63(3): 703–717. DOI: 10.1109/TC.2013.202.
[21] Kranitis N, Merentitis A, Theodorou G, Paschalis A, Gizopoulos D. Hybrid-SBST methodology for efficient testing of processor cores. IEEE Design & Test of Computers, 2008, 25(1): 64–75. DOI: 10.1109/MDT.2008.15.
[22] Collet J H, Zajac P, Psarakis M, Gizopoulos D. Chip self-organization and fault tolerance in massively defective multicore arrays. IEEE Trans. Dependable and Secure Computing, 2011, 8(2): 207–217. DOI: 10.1109/TDSC.2009.53.
[23] Dalirsani A, Imhof M E, Wunderlich H J. Structural software-based self-test of network-on-chip. In Proc. the 32nd VLSI Test Symposium, Apr. 2014. DOI: 10.1109/VTS.2014.6818754.
[24] Cheng K T, Krishnakumar A S. Automatic functional test generation using the extended finite state machine model. In Proc. the 30th International Design Automation Conference, June 1993, pp.86–91. DOI: 10.1145/157485.164585.
[25] Psarakis M, Gizopoulos D, Sanchez E, Reorda M S. Microprocessor software-based self-testing. IEEE Design & Test of Computers, 2010, 27(3): 4–19. DOI: 10.1109/MDT.2010.5.
[26] Ganai M K, Gupta A. Accelerating high-level bounded model checking. In Proc. the 2006 International Conference on Computer Aided Design, Nov. 2006, pp.794–801. DOI: 10.1109/ICCAD.2006.320122.
[27] Tupuri R S, Abraham J A. A novel functional test generation method for processors using commercial ATPG. In Proc. the 1997 International Test Conference, Nov. 1997, pp.743–752. DOI: 10.1109/TEST.1997.639687.
[28] Zhang Y, Li H W, Li X W. Automatic test program generation using uting-trace-based constraint extraction for embedded processors. IEEE Trans. Very Large Scale Integration (VLSI) Systems, 2013, 27(7): 1220–1233. DOI: 10.1109/TVLSI.2012.2208130.
[29] Negro V C, Goldstein L H. The generation of correlated multivariate samples for Monte Carlo simulation. IEEE Spectrum, 1968, 5(2): 5. DOI: 10.1109/MSPEC.1968.5214753.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Zhou Di;. A Recovery Technique for Distributed Communicating Process Systems[J]. , 1986, 1(2): 34 -43 .
[2] Li Wei;. A Structural Operational Semantics for an Edison Like Language(2)[J]. , 1986, 1(2): 42 -53 .
[3] Chen Shihua;. On the Structure of Finite Automata of Which M Is an(Weak)Inverse with Delay τ[J]. , 1986, 1(2): 54 -59 .
[4] Li Wanxue;. Almost Optimal Dynamic 2-3 Trees[J]. , 1986, 1(2): 60 -71 .
[5] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[6] Wang Xuan; Lü Zhimin; Tang Yuhai; Xiang Yang;. A High Resolution Chinese Character Generator[J]. , 1986, 1(2): 1 -14 .
[7] C.Y.Chung; H.R.Hwa;. A Chinese Information Processing System[J]. , 1986, 1(2): 15 -24 .
[8] Sun Zhongxiu; Shang Lujun;. DMODULA:A Distributed Programming Language[J]. , 1986, 1(2): 25 -31 .
[9] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[10] Jin Lan; Yang Yuanyuan;. A Modified Version of Chordal Ring[J]. , 1986, 1(3): 15 -32 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved