Optimization of Generalized Eigensolver for Dense Symmetric Matrices on AMD GPU
Abstract
Accelerating eigensolvers on GPUs is attracting increasing attention because eigensolvers are ubiquitous in scientific and engineering applications. However, achieving high performance is challenging: the intricate computational patterns of eigensolvers lead to inefficient memory access and workload imbalance on GPUs. In this work, we propose a series of optimizations for the generalized dense symmetric eigenvalue problem on AMD GPUs, from both the system and the operator perspectives. First, we adjust the workload assignment between the CPU and the GPU and balance computational performance across the different levels of computation. Second, we propose a multi-level pre-aggregation strategy for the symmetric matrix-vector multiplication (SYMV) and general matrix-vector multiplication (GEMV) operators to address the performance problem caused by the lack of hardware support for atomic operations. Third, we optimize Cholesky decomposition and the symmetric rank-2k update (SYR2K) by adopting a better overlapping method and by exploiting symmetry to reduce computation. Experiments on AMD MI60 GPUs show that our optimized eigensolver outperforms the previous state of the art, with speedups of roughly 1.8x to 3.8x.