Citation: Li T, Jiang HL, Mo H et al. Approximate processing element design and analysis for the implementation of CNN accelerators. Journal of Computer Science and Technology, 2023, 38(2): 309–327. DOI: 10.1007/s11390-023-2548-8.
As a primary computation unit, a processing element (PE) is key to the energy efficiency of a convolutional neural network (CNN) accelerator. Taking advantage of the inherent error tolerance of CNNs, approximate computing with high hardware efficiency has been considered for implementing the computation units of CNN accelerators. However, individual approximate designs such as multipliers and adders can achieve only limited accuracy and hardware improvements. In this paper, an approximate PE is designed specifically for CNN accelerators by jointly considering the data representation, multiplication, and accumulation. An approximate data format is defined for the weights using stochastic rounding. This data format enables a simple implementation of multiplication by using small lookup tables, an adder, and a shifter. Two approximate accumulators are further proposed for the product accumulation in the PE. Compared with the exact 8-bit fixed-point design, the proposed PE saves more than 29% and 20% in power-delay product for 3 × 3 and 5 × 5 sums of products, respectively. Also, compared with PEs built from state-of-the-art approximate multipliers, the proposed design shows a significantly smaller error bias with lower hardware overhead. Moreover, the application of the approximate PEs in CNN accelerators is analyzed by implementing a multi-task CNN for face detection and alignment. We conclude that 1) an approximate PE is more effective for face detection than for alignment, 2) an approximate PE with high statistically-measured accuracy does not necessarily result in good quality in face detection, and 3) properly increasing the number of PEs in a CNN accelerator can improve its power and energy efficiency.
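The two ingredients named above, unbiased stochastic rounding for weight quantization and multiplication reduced to shifts and an add, can be illustrated with a short sketch. This is not the paper's actual data format or lookup-table circuit: `quantize_weights` and `shift_add_multiply` are hypothetical software analogues, the latter assuming that only the two most-significant set bits of a weight are kept so that a product costs two shifts and one addition.

```python
import numpy as np

def stochastic_round(x, rng):
    """Round each element of x to an integer stochastically:
    the fractional part is the probability of rounding up,
    so the rounding error is zero in expectation (unbiased)."""
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac)

def quantize_weights(w, bits=8, rng=None):
    """Quantize a weight array to signed fixed-point integers
    using stochastic rounding (a hypothetical illustration,
    not the paper's approximate format). Assumes w is nonzero."""
    if rng is None:
        rng = np.random.default_rng(0)
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    q = stochastic_round(w / scale, rng)
    q = np.clip(q, -2 ** (bits - 1), 2 ** (bits - 1) - 1)
    return q.astype(np.int32), scale

def shift_add_multiply(a, w):
    """Approximate a*w by keeping only the two most-significant
    set bits of |w|, so the product becomes two shifts plus one
    add -- a hypothetical scheme mirroring the shifter+adder
    structure, with the error always toward zero."""
    sign = -1 if w < 0 else 1
    w = abs(w)
    result = 0
    for _ in range(2):          # keep at most two power-of-two terms
        if w == 0:
            break
        k = w.bit_length() - 1  # position of the leading set bit
        result += a << k
        w -= 1 << k
    return sign * result
```

For example, `shift_add_multiply(3, 7)` keeps 4 + 2 out of 7 and returns 18 instead of the exact 21, while weights with at most two set bits (e.g., 5 or 6) are multiplied exactly.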