Decoupled Vector Processing Unit: Past, Present, and Future
Abstract
Vector architectures are widely employed in modern processors because they exploit data-level parallelism through the single instruction, multiple data (SIMD) paradigm with high performance and energy efficiency. The scalar core and the vector processing unit (VPU) can be organized in either an integrated or a decoupled fashion. The primary advantage of the decoupled vector architecture is independent operation: the VPU and the scalar core can execute concurrently and at different frequencies, which improves overall throughput. Decoupling also allows the VPU to be optimized specifically for long vectors and complex vector operations and to be power-managed separately, making it well suited to computation-intensive applications. This paper comprehensively reviews processors with decoupled VPUs, discussing their advantages and various implementations. Design challenges and corresponding potential solutions are also presented.