We use cookies to improve your experience with our site.
Ruo-Han Wu, Xian-Yu Zhu, Jun-Shi Chen, Hong An. SwFormer: Enabling Faster Foundation Models on new Sunway Supercomputer via Holistic Kernel Tiling and Scheduling[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-025-4761-0
Citation: Ruo-Han Wu, Xian-Yu Zhu, Jun-Shi Chen, Hong An. SwFormer: Enabling Faster Foundation Models on new Sunway Supercomputer via Holistic Kernel Tiling and Scheduling[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-025-4761-0

SwFormer: Enabling Faster Foundation Models on new Sunway Supercomputer via Holistic Kernel Tiling and Scheduling

  • Deep learning's continuous evolution has driven the creation of increasingly large foundation models, such as GPT-3, which requires optimized performance on large-scale computing platforms. The new Sunway Supercomputer, equipped with numerous SW26010pro processors, supports AI workloads in both all-shared and single-CG modes. However, existing optimizations primarily target AI operators like Generalized Matrix Multiplication (GEMM) in the single-CG mode, leaving challenges in scaling performance across all 6 CGs in the all-shared mode. This paper introduces SwFormer, a framework designed to accelerate foundation models via intra-op tiling and inter-op scheduling. The intra-op tiling method breaks down operators into fine-grained tiled kernels and employs an offline profiling-based approach to determine the optimal tiling strategy. The inter-op scheduling method employs heuristic graph traversal algorithms to automatically reorder the computation of these tiled kernels, thereby maximizing hardware utilization. Compared with operator libraries for the all-shared mode such as SWDNNv2 and SWattention, SwFormer's intra-op tiling method accelerates end-to-end GPT-3 6.7B and 13B models training by up to 1.27x. Evaluated with GPT-style models, the inter-op scheduling approach further outperforms the intra-op tiling method by up to 1.32x.
  • loading

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return