Yan YJ, Li HB, Zhao T et al. 10-million atoms simulation of first-principle package LS3DF. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 39(1): 45−62 Jan. 2024. DOI: 10.1007/s11390-023-3011-6.

10-Million Atoms Simulation of First-Principle Package LS3DF

  • The growing demand for semiconductor device simulation poses a major challenge for large-scale electronic structure calculations. Among the available methods, the linearly scaling three-dimensional fragment (LS3DF) method exhibits excellent scalability in large-scale simulations. Based on algorithmic and system-level optimizations, we propose a highly scalable and efficient implementation of LS3DF on a domestic heterogeneous supercomputer equipped with accelerators. At the algorithmic level, the original all-band conjugate gradient algorithm is refined to achieve faster convergence, and mixed-precision computing is adopted to increase overall efficiency. At the system level, the original two-layer parallel structure is replaced by a coarse-grained parallel method. Optimization strategies such as multi-stream execution, kernel fusion, and redundant computation removal are proposed to further increase utilization of the computational power provided by the heterogeneous machine. As a result, our optimized LS3DF scales to a system of 10 million silicon atoms, attaining a peak performance of 34.8 PFLOPS (21.2% of the peak). All the improvements can be adapted to next-generation supercomputers for larger simulations.
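
The performance figures quoted above can be put in context with a little arithmetic. The sketch below is only a back-of-the-envelope reading of the abstract: it takes "10-million" as exactly 10^7 atoms and treats the two quoted numbers as exact, both of which are assumptions, so the derived values are approximate. The variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract.
sustained_pflops = 34.8    # reported peak performance of the run, in PFLOPS
fraction_of_peak = 0.212   # "21.2% of the peak"
atoms = 10_000_000         # assumption: "10-million" read as exactly 1e7 atoms

# Theoretical peak implied by the two quoted figures.
implied_peak_pflops = sustained_pflops / fraction_of_peak

# Sustained rate per atom, in GFLOPS (1 PFLOPS = 1e6 GFLOPS).
gflops_per_atom = sustained_pflops * 1e6 / atoms

print(f"implied peak: {implied_peak_pflops:.1f} PFLOPS")
print(f"sustained rate per atom: {gflops_per_atom:.2f} GFLOPS")
```

Under these assumptions the implied theoretical peak is roughly 164 PFLOPS, and the sustained rate works out to about 3.5 GFLOPS per simulated atom.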