Journal of Computer Science and Technology ›› 2020, Vol. 35 ›› Issue (2): 395-411.doi: 10.1007/s11390-020-9701-4

• Special Section of ChinaSys 2019 • Previous Articles     Next Articles

MPI-RCDD: A Framework for MPI Runtime Communication Deadlock Detection

Hong-Mei Wei1, Jian Gao2,*, Peng Qing2, Kang Yu2, Yan-Fei Fang2, Ming-Lu Li1, Senior Member, CCF, IEEE, Member, ACM        

  1. 1 Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
    2 Jiangnan Institute of Computing Technology, Wuxi 214083, China
  • Received:2019-05-10 Revised:2020-01-07 Online:2020-03-05 Published:2020-03-18
  • Contact: Jian Gao E-mail:gaojian_whu@163.com
  • About author:Hong-Mei Wei received her Master's degree in computer science from Shanghai Jiao Tong University, Shanghai, in 2004. She is currently pursuing her Ph.D. degree in computer science in Shanghai Jiao Tong University, Shanghai. Her research interests include high-performance computing and parallel compilation.
  • Supported by:
    This work was supported by the National Key Research and Development Program of China under Grant No. 2017YFB0202003.

The message passing interface (MPI) has become a de facto standard for programming models of highperformance computing, but its rich and flexible interface semantics makes the program easy to generate communication deadlock, which seriously affects the usability of the system. However, the existing detection tools for MPI communication deadlock are not scalable enough to adapt to the continuous expansion of system scale. In this context, we propose a framework for MPI runtime communication deadlock detection, namely MPI-RCDD, which contains three kinds of main mechanisms. Firstly, MPI-RCDD has a message logging protocol that is associated with deadlock detection to ensure that the communication messages required for deadlock analysis are not lost. Secondly, it uses the asynchronous processing thread provided by the MPI to implement the transfer of dependencies between processes, so that multiple processes can participate in deadlock detection simultaneously, thus alleviating the performance bottleneck problem of centralized analysis. In addition, it uses an AND⊕OR model based algorithm named AODA to perform deadlock analysis work. The AODA algorithm combines the advantages of both timeout-based and dependency-based deadlock analysis approaches, and allows the processes in the timeout state to search for a deadlock circle or knot in the process of dependency transfer. Further, the AODA algorithm cannot lead to false positives and can represent the source of the deadlock accurately. The experimental results on typical MPI communication deadlock benchmarks such as Umpire Test Suit demonstrate the capability of MPIRCDD. Additionally, the experiments on the NPB benchmarks obtain the satisfying performance cost, which show that the MPI-RCDD has strong scalability.

Key words: high-performance computing; message passing interface (MPI); communication deadlock; deadlock detection; AND⊕OR model;

[1] Vakkalanka S. Efficient dynamic verification algorithms for MPI applications[Ph.D. Thesis]. School of Computing, The University of Utah, 2010.
[2] Luecke G R, Zou Y, Coyle J et al. Deadlock detection in MPI programs. Concurrency and Computation:Practice and Experience, 2002, 14(11):911-932.
[3] Krammer B, Bidmon K, Müller M S et al. MARMOT:An MPI analysis and checking tool. Advances in Parallel Computing, 2004, 13:493-500.
[4] Vetter J S, de Supinski B R. Dynamic software testing of MPI applications with Umpire. In Proc. the 2000 ACM/IEEE Conference on Supercomputing, November 2000, Article No. 51.
[5] Hilbrich T, Schulz M, de Supinski B R et al. MUST:A scalable approach to runtime error detection in MPI programs. In Proc. the 3rd International Workshop on Parallel Tools for High Performance Computing, September 2000, pp.53-66.
[6] Hilbrich T, Protze J, Schulz M et al. MPI runtime error detection with MUST:Advances in deadlock detection. Scientific Programming, 2013, 21(3/4):109-121.
[7] Do-Mai A T, Diep T D, Thoai N. Race condition and deadlock detection for large-scale applications. In Proc. the 15th International Symposium on Parallel and Distributed Computing, July 2016, pp.319-326.
[8] Forejt V, Joshi S, Kroening D et al. Precise predictive analysis for discovering communication deadlocks in MPI programs. ACM Transactions on Programming Languages and Systems, 2017, 39(4):Article No. 15.
[9] Alnemari R A, Fadel M A, Eassa F. Integrating static and dynamic analysis techniques for detecting dynamic errors in MPI programs. International Journal of Computer Science and Mobile Computing, 2018, 7(4):141-147.
[10] Alghamdi A M, Eassa F E. Software testing techniques for parallel systems:A survey. International Journal of Computer Science and Network Security, 2019, 19(4):176-186.
[11] Hilbrich T, de Supinski B R, Schulz M et al. A graph based approach for MPI deadlock detection. In Proc. the 23rd International Conference on Supercomputing, June 2009, pp.296-305.
[12] Siegel S F, Zirkel T K. FEVS:A functional equivalence verification suite for high-performance scientific computing. Mathematics in Computer Science, 2011, 5(4):427-435.
[13] Müller M, de Supinski B, Gopalakrishnan G et al. Dealing with MPI bugs at scale:Best practices, automatic detection, debugging, and formal verification. http://www.cs.utah.edu/fv/publications/sc11_with_handson.pptx,October 2019.
[14] Bailey D H, Barszcz E, Barton J T et al. The NAS parallel benchmarks. The International Journal of Supercomputing Applications, 1991, 5(3):63-73.
[1] André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef. Ad Hoc File Systems for High-Performance Computing [J]. Journal of Computer Science and Technology, 2020, 35(1): 4-26.
[2] Yu-Tong Lu, Peng Cheng, Zhi-Guang Chen. Design and Implementation of the Tianhe-2 Data Storage and Management System [J]. Journal of Computer Science and Technology, 2020, 35(1): 27-46.
[3] Marc-André Vef, Nafiseh Moti, Tim Süß, Markus Tacke, Tommaso Tocci, Ramon Nou, Alberto Miranda, Toni Cortes, André Brinkmann. GekkoFS—A Temporary Burst Buffer File System for HPC Applications [J]. Journal of Computer Science and Technology, 2020, 35(1): 72-91.
[4] Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Mochi: Composing Data Services for High-Performance Computing Environments [J]. Journal of Computer Science and Technology, 2020, 35(1): 121-144.
[5] Xu Tan, Xiao-Wei Shen, Xiao-Chun Ye, Da Wang, Dong-Rui Fan, Lunkai Zhang, Wen-Ming Li, Zhi-Min Zhang, Zhi-Min Tang. A Non-Stop Double Buffering Mechanism for Dataflow Architecture [J]. , 2018, 33(1): 145-157.
[6] Xiao-Wei Shen, Xiao-Chun Ye, Xu Tan, Da Wang, Lunkai Zhang, Wen-Ming Li, Zhi-Min Zhang, Dong-Rui Fan, Ning-Hui Sun. An Efficient Network-on-Chip Router for Dataflow Architecture [J]. , 2017, 32(1): 11-25.
[7] WANG Xiaodong; XU Ming; ZHOU Xingming;. Fast Multicast on Multistage Interconnection Networks Using Multi-Head Worms [J]. , 1999, 14(3): 250-258.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Zhou Di;. A Recovery Technique for Distributed Communicating Process Systems[J]. , 1986, 1(2): 34 -43 .
[2] Chen Shihua;. On the Structure of Finite Automata of Which M Is an(Weak)Inverse with Delay τ[J]. , 1986, 1(2): 54 -59 .
[3] C.Y.Chung; H.R.Hwa;. A Chinese Information Processing System[J]. , 1986, 1(2): 15 -24 .
[4] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[5] Wang Jianchao; Wei Daozheng;. An Effective Test Generation Algorithm for Combinational Circuits[J]. , 1986, 1(4): 1 -16 .
[6] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[7] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[8] Zheng Guoliang; Li Hui;. The Design and Implementation of the Syntax-Directed Editor Generator(SEG)[J]. , 1986, 1(4): 39 -48 .
[9] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[10] Huang Xuedong; Cai Lianhong; Fang Ditang; Chi Bianjin; Zhou Li; Jiang Li;. A Computer System for Chinese Character Speech Input[J]. , 1986, 1(4): 75 -83 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved