
SIES: A Novel Implementation of Spiking Convolutional Neural Network Inference Engine on Field-Programmable Gate Array

Shu-Quan Wang, Lei Wang, Yu Deng, Zhi-Jie Yang, Sha-Sha Guo, Zi-Yang Kang, Yu-Feng Guo, Wei-Xia Xu

Citation: Shu-Quan Wang, Lei Wang, Yu Deng, Zhi-Jie Yang, Sha-Sha Guo, Zi-Yang Kang, Yu-Feng Guo, Wei-Xia Xu. SIES: A Novel Implementation of Spiking Convolutional Neural Network Inference Engine on Field-Programmable Gate Array[J]. Journal of Computer Science and Technology, 2020, 35(2): 475-489. DOI: 10.1007/s11390-020-9686-z
CSTR: 32374.14.s11390-020-9686-z


  • Abstract: Neuromorphic computing is considered to be the future of machine learning and offers a new approach to cognitive computing. Inspired by the strong performance of spiking neural networks (SNNs) in low-power and parallel computing, many groups have tried to simulate SNNs on hardware platforms. However, training SNNs with neuromorphic algorithms is still not efficient enough. To address this, Michael et al. proposed a DNN-to-SNN method, with which a well-trained deep neural network (DNN) can easily be converted into a spiking convolutional neural network (SCNN). So far, only limited work has focused on hardware acceleration of SCNNs. The motivation of this paper is to design an SNN processor that accelerates inference for SNNs obtained by this DNN-to-SNN method. We propose SIES (Spiking Neural Network Inference Engine for SCNN Accelerating). It uses a systolic array to compute membrane potential increments. It integrates an optional max-pooling hardware module to reduce additional data movement between the host and the SIES. We also design a hardware data setup mechanism for the convolutional layer on the SIES, which minimizes the time spent preparing input spikes. We implement the SIES on the XCVU440 FPGA. It supports up to 4,000 neurons and 256,000 synapses. The SIES runs at a working frequency of 200 MHz, and its peak performance is 1.5625 TOPS.
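    As a rough illustration of what "membrane potential increments computation" means, the sketch below steps one layer of integrate-and-fire neurons in software. It is not the SIES design: the function name `if_layer_step`, the threshold value, and the reset-by-subtraction rule are all assumptions made for this example. Because input spikes are binary, the increment is a spike-vector-times-weight-matrix product, which is the kind of regular multiply-accumulate workload that a systolic array is suited to.

    ```python
    import numpy as np

    def if_layer_step(spikes_in, weights, v_mem, v_thresh=1.0):
        """One time step of a hypothetical integrate-and-fire layer.

        spikes_in: bool array of shape (n_in,), the input spikes at this step
        weights:   float array of shape (n_in, n_out), the synaptic weights
        v_mem:     float array of shape (n_out,), the membrane potentials
        """
        # With binary spikes, this product sums the weights of the inputs that fired.
        dv = spikes_in.astype(weights.dtype) @ weights
        v_mem = v_mem + dv
        # A neuron fires when its potential reaches the threshold.
        spikes_out = v_mem >= v_thresh
        # Assumed reset rule: subtract the threshold from neurons that fired.
        v_mem = np.where(spikes_out, v_mem - v_thresh, v_mem)
        return spikes_out, v_mem

    # Three inputs feeding two neurons; inputs 0 and 1 spike at this step.
    weights = np.array([[0.6, 0.2],
                        [0.5, 0.1],
                        [0.3, 0.9]])
    v = np.zeros(2)
    out, v = if_layer_step(np.array([True, True, False]), weights, v)
    ```

    In this toy case only the first neuron accumulates enough potential to fire. A hardware engine would carry out the same accumulation in parallel across an array of processing elements instead of a single matrix product.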

Publication history
  • Received:  2019-04-30
  • Revised:  2020-02-12
  • Published:  2020-03-04
