Journal of Computer Science and Technology ›› 2020, Vol. 35 ›› Issue (2): 475-489. doi: 10.1007/s11390-020-9686-z

• Special Section of ChinaSys 2019 •

SIES: A Novel Implementation of Spiking Convolutional Neural Network Inference Engine on Field-Programmable Gate Array

Shu-Quan Wang, Lei Wang*, Yu Deng, Zhi-Jie Yang, Sha-Sha Guo, Zi-Yang Kang, Yu-Feng Guo, Wei-Xia Xu        

  1. College of Computer Science and Technology, National University of Defense Technology, Changsha 430041, China
  • Received: 2019-05-01 Revised: 2020-02-13 Online: 2020-03-05 Published: 2020-03-18
  • Contact: Lei Wang, E-mail: arrowya@gmail.com
  • About author: Shu-Quan Wang was a Master's student in the College of Computer Science and Technology, National University of Defense Technology, Changsha. He received his B.S. degree in computer science and technology in 2011. His research interests include computer architecture, asynchronous circuits, artificial intelligence, and neuromorphic computation.
  • Supported by:
    The work was supported by the HeGaoJi Program of China under Grant Nos. 2017ZX01028103-002 and 2017ZX01038104-002, and the National Natural Science Foundation of China under Grant No. 61472432.

Neuromorphic computing is considered to be the future of machine learning, and it provides a new way of cognitive computing. Inspired by the excellent performance of spiking neural networks (SNNs) in low-power consumption and parallel computing, many groups have tried to simulate SNNs on hardware platforms. However, training SNNs with neuromorphic algorithms is still not efficient enough. To address this, Michael et al. proposed a method that sidesteps the problem with the help of a deep neural network (DNN): a well-trained DNN can be converted directly into a spiking convolutional neural network (SCNN). So far, little work has focused on hardware acceleration of SCNNs. The motivation of this paper is to design an SNN processor that accelerates inference for SNNs obtained by this DNN-to-SNN method. We propose SIES (Spiking Neural Network Inference Engine for SCNN Accelerating). It uses a systolic array to compute the membrane potential increments. It integrates an optional max-pooling hardware module to reduce additional data movement between the host and the SIES. We also design a hardware data setup mechanism for the convolutional layer on the SIES, which minimizes the time spent preparing input spikes. We implement the SIES on an XCVU440 FPGA. It supports up to 4,000 neurons and 256,000 synapses. The SIES runs at a working frequency of 200 MHz, and its peak performance is 1.5625 TOPS.
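The membrane potential increment computation mentioned in the abstract can be illustrated with a minimal software sketch of one integrate-and-fire (I&F) time step. This is not the SIES data flow: the function name `if_layer_step`, the fully connected layout (rather than a convolutional one), the threshold of 1.0, and the reset-by-subtraction rule are all assumptions chosen for illustration. The sketch only shows why binary input spikes reduce the weighted sum to an accumulation of the weights of active inputs, which is the kind of work a systolic array can parallelize.

```python
from typing import List, Tuple


def if_layer_step(
    spikes: List[int],
    weights: List[List[float]],
    potentials: List[float],
    threshold: float = 1.0,  # assumed value, not taken from the paper
) -> Tuple[List[float], List[int]]:
    """One time step of a hypothetical fully connected I&F layer.

    spikes[i] is 0 or 1 for input neuron i.
    weights[i][j] is the synapse from input i to output neuron j.
    potentials[j] is the membrane potential of output neuron j.
    """
    new_potentials: List[float] = []
    out_spikes: List[int] = []
    for j, v in enumerate(potentials):
        # Membrane potential increment: spikes are binary, so the dot
        # product is just the sum of weights of the inputs that fired.
        increment = sum(weights[i][j] for i, s in enumerate(spikes) if s)
        v += increment
        if v >= threshold:
            out_spikes.append(1)
            v -= threshold  # reset by subtraction (an assumption)
        else:
            out_spikes.append(0)
        new_potentials.append(v)
    return new_potentials, out_spikes


# Usage: 3 input neurons, 2 output neurons, two time steps.
weights = [[0.5, 0.25], [0.5, 0.25], [0.25, 0.5]]
v1, s1 = if_layer_step([1, 1, 0], weights, [0.0, 0.0])
v2, s2 = if_layer_step([0, 1, 1], weights, v1)
```

In this example the first output neuron fires at the first step and the second fires at the second step, because its potential carries over between steps.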

Key words: spiking neural network (SNN); field-programmable gate array (FPGA); neuromorphic; systolic array; spiking convolutional neural network (SCNN); integrate and fire (I&F) model; hardware accelerator

ISSN 1000-9000 (Print), 1860-4749 (Online)
CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved