Citation: Khorassani KS, Chen CC, Ramesh B et al. High performance MPI over the Slingshot interconnect. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 38(1): 128−145 Jan. 2023. DOI: 10.1007/s11390-023-2907-5.


High Performance MPI over the Slingshot Interconnect


Abstract: The Slingshot interconnect designed by HPE/Cray is becoming more relevant in high-performance computing with its deployment on upcoming exascale systems. In particular, it is the interconnect powering Frontier, the first exascale and highest-ranked supercomputer in the world. It offers features such as adaptive routing, congestion control, and workload isolation. The deployment of newer interconnects sparks interest in their performance, scalability, and potential bottlenecks, since the interconnect is a critical factor in scaling across nodes on these systems. In this paper, we delve into the challenges the Slingshot interconnect poses for current state-of-the-art MPI (message passing interface) libraries. In particular, we examine scalability when communicating over Slingshot across nodes. We present a comprehensive evaluation using various MPI and communication libraries, including Cray MPICH, Open MPI + UCX, RCCL, and MVAPICH2, on CPUs and GPUs on the Spock system, an early-access cluster deployed with Slingshot-10, AMD MI100 GPUs, and AMD EPYC Rome CPUs to emulate the Frontier system. We also evaluate preliminary CPU-based support of MPI libraries on the Slingshot-11 interconnect.
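To make concrete the kind of cross-node measurement such an evaluation relies on, the following is a minimal sketch of an MPI ping-pong latency test in the spirit of the OSU micro-benchmarks. It is illustrative only and is not the benchmark code used in the paper; the message size, iteration counts, and launch command are assumptions, and the launcher syntax varies between the MPI libraries compared above.

/* Minimal MPI ping-pong latency sketch (host buffers only), in the spirit of
 * the OSU micro-benchmarks. Illustrative example, not the paper's benchmark.
 * Example launch with two ranks on two nodes (Open MPI syntax; other MPI
 * libraries use their own launchers):
 *   mpirun -np 2 --map-by node ./pingpong
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE (1 << 20)   /* 1 MiB message (assumed for illustration) */
#define SKIP     100         /* warm-up iterations, not timed */
#define ITERS    1000        /* timed iterations */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "run with exactly 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    char *buf = malloc(MSG_SIZE);
    double t_start = 0.0;

    for (int i = 0; i < SKIP + ITERS; i++) {
        if (i == SKIP) {                   /* start timing after warm-up */
            MPI_Barrier(MPI_COMM_WORLD);
            t_start = MPI_Wtime();
        }
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t_end = MPI_Wtime();

    if (rank == 0) {
        /* one-way latency in microseconds: half the average round-trip time */
        double latency_us = (t_end - t_start) * 1e6 / (2.0 * ITERS);
        printf("%d bytes: %.2f us one-way latency\n", MSG_SIZE, latency_us);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

A GPU-aware variant of the same pattern would pass device buffers (e.g., allocated with hipMalloc on the AMD MI100 GPUs) directly to MPI_Send/MPI_Recv, which is one of the capabilities the paper's evaluation exercises across the different libraries.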
