Journal of Computer Science and Technology ›› 2023, Vol. 38 ›› Issue (1): 128-145. DOI: 10.1007/s11390-023-2907-5

Special Issue: Computer Architecture and Systems

• Special Issue in Honor of Professor Kai Hwang’s 80th Birthday •

High Performance MPI over the Slingshot Interconnect

Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Member, ACM, IEEE, and Dhabaleswar K. Panda, Fellow, IEEE, Member, ACM        

  1. Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, U.S.A.
  • Received: 2022-10-16 Revised: 2022-10-29 Accepted: 2023-01-05 Online: 2023-02-28 Published: 2023-02-28
  • Contact: Kawthar Shafie Khorassani E-mail: shafiekhorassani.1@osu.edu
  • About author: Kawthar Shafie Khorassani is a Ph.D. student in the Department of Computer Science and Engineering at The Ohio State University, Columbus. She received her Bachelor’s degree in mathematics and computer science from Wayne State University in Detroit, MI. She currently works in the Network Based Computing Laboratory on the MVAPICH2-GDR project. Her research interests lie in high-performance computing (HPC) and in GPU communication and computation.
  • Supported by:
    This work is supported in part by the U.S. National Science Foundation under Grant Nos. 1818253, 1854828, 1931537, and 2007991, and by XRAC under Grant No. NCR-130002. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

The Slingshot interconnect designed by HPE/Cray is becoming increasingly relevant in high-performance computing with its deployment on upcoming exascale systems. In particular, it is the interconnect powering Frontier, the first exascale and highest-ranked supercomputer in the world. It offers features such as adaptive routing, congestion control, and workload isolation. The deployment of newer interconnects sparks interest in their performance, scalability, and potential bottlenecks, since the interconnect is a critical element in scaling applications across the nodes of these systems. In this paper, we delve into the challenges the Slingshot interconnect poses for current state-of-the-art MPI (message passing interface) libraries. In particular, we examine scalability when communicating over Slingshot across nodes. We present a comprehensive evaluation of various MPI and communication libraries, including Cray MPICH, OpenMPI + UCX, RCCL, and MVAPICH2, on the CPUs and GPUs of the Spock system, an early-access cluster deployed with Slingshot-10, AMD MI100 GPUs, and AMD EPYC Rome CPUs to emulate the Frontier system. We also evaluate preliminary CPU-based support in MPI libraries on the Slingshot-11 interconnect.

Key words: AMD GPU; interconnect technology; MPI (message passing interface); Slingshot
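
The scalability results described in the abstract rest on standard point-to-point measurements between nodes. The sketch below is a minimal inter-node ping-pong latency test in C/MPI in the spirit of micro-benchmark suites such as the OSU micro-benchmarks; it is not the benchmark code used in the paper, and the message-size range, iteration counts, and launch command are illustrative assumptions. A GPU-aware (ROCm-aware) run would allocate the buffers in device memory (e.g., with hipMalloc) rather than with malloc.

    /*
     * Minimal inter-node ping-pong latency sketch (illustrative only):
     * rank 0 and rank 1, placed on different nodes, exchange messages of
     * increasing size and report the averaged one-way latency.
     *
     * Illustrative build/run (launcher and flags vary by MPI library and system):
     *   mpicc -O2 pingpong.c -o pingpong
     *   srun -N 2 -n 2 --ntasks-per-node=1 ./pingpong
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) {
            if (rank == 0) fprintf(stderr, "Run with at least 2 ranks.\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        const int warmup = 100, iters = 1000;   /* assumed iteration counts */
        for (size_t bytes = 1; bytes <= (1 << 22); bytes *= 2) {
            char *buf = malloc(bytes);
            double start = 0.0;

            for (int i = 0; i < warmup + iters; i++) {
                if (i == warmup) {              /* discard warm-up iterations */
                    MPI_Barrier(MPI_COMM_WORLD);
                    start = MPI_Wtime();
                }
                if (rank == 0) {
                    MPI_Send(buf, (int)bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, (int)bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                } else if (rank == 1) {
                    MPI_Recv(buf, (int)bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    MPI_Send(buf, (int)bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }

            if (rank == 0) {
                /* one-way latency = half of the averaged round-trip time */
                double usec = (MPI_Wtime() - start) * 1e6 / iters / 2.0;
                printf("%10zu bytes  %10.2f us\n", bytes, usec);
            }
            free(buf);
        }

        MPI_Finalize();
        return 0;
    }

Because the program uses only standard MPI calls, the same source can be compiled and launched with each of the evaluated stacks (Cray MPICH, OpenMPI + UCX, MVAPICH2) by switching the compiler wrapper and job launcher, which is what makes such micro-benchmarks convenient for comparing libraries over the same Slingshot fabric.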
