Decoupled Vector Processing Unit: Past, Present, and Future
Abstract
Vector architectures are widely employed in modern processors because they exploit data-level parallelism through the SIMD paradigm with high performance and energy efficiency. The scalar core and the vector processing unit (VPU) can be organized in either an integrated or a decoupled fashion. The main advantage of the decoupled organization is independent operation: the VPU and the scalar core can execute concurrently and at different frequencies, which improves overall throughput. Decoupling also allows the VPU to be optimized specifically for long vectors and complex vector operations and to be power-managed separately, making such designs well suited to computation-intensive applications. This paper comprehensively reviews processors with decoupled VPUs, discussing their advantages and various implementations, as well as design challenges and corresponding potential solutions.