Journal of Computer Science and Technology, 2021, Vol. 36, Issue (5): 1155-1166. doi: 10.1007/s11390-021-0906-y

Special Issue: Computer Architecture and Systems

• Special Section of 2020 CCF Integrated Circuit Design and Automation Conference •

Area Efficient Pattern Representation of Binary Neural Networks on RRAM

Feng Wang1, Guo-Jie Luo1,*, Member, CCF, ACM, IEEE, Guang-Yu Sun1, Member, CCF, ACM, IEEE, Yu-Hao Wang2, Di-Min Niu2, and Hong-Zhong Zheng2

  • 1 Center for Energy-Efficient Computing and Applications, Peking University, Beijing 100871, China;
    2 Pingtouge, Alibaba Group, Hangzhou 310052, China
  • Received: 2020-08-14 Revised: 2021-08-26 Online: 2021-09-30 Published: 2021-09-30
  • About author: Feng Wang received his Ph.D. degree in computer science from Peking University, Beijing, in 2021. He is currently a research scientist at LEDA Technology. His research interests include processing-in-memory and EDA algorithms.
  • Supported by:
    This work is partly supported by the National Key Research and Development Program of China under Grant No. 2020AAA0130400, the Beijing Municipal Science and Technology Program of China under Grant No. Z201100004220007, the National Natural Science Foundation of China under Grant No. 62090021, the Beijing Academy of Artificial Intelligence (BAAI), and the Alibaba Innovative Research (AIR) Program.

Resistive random access memory (RRAM) has been demonstrated to implement multiply-and-accumulate (MAC) operations in a highly parallel analog fashion, which dramatically accelerates convolutional neural networks (CNNs). Since CNNs require considerable converters between analog crossbars and digital peripheral circuits, recent studies map binary neural networks (BNNs) onto RRAM and binarize the weights to {+1, -1}. However, the two mainstream representations for BNN weights introduce patterns of redundant 0s and 1s when dealing with negative weights. In this work, we reduce the area occupied by redundant 0s and 1s by proposing a BNN weight representation framework based on a novel pattern representation and a corresponding architecture. First, we split the weight matrix into several small matrices by clustering adjacent columns together. Second, we extract 1s' patterns, i.e., submatrices containing only 1s, from each small weight matrix, such that each final output can be represented by the sum of several patterns. Third, we map these patterns onto RRAM crossbars, including pattern computation crossbars (PCCs) and pattern accumulation crossbars (PACs). Finally, we compare the pattern representation with the two mainstream representations and adopt the more area-efficient one. The evaluation results demonstrate that our framework saves over 20% of crossbar area compared with the two mainstream representations.

Key words: binary neural network (BNN); pattern; resistive random access memory (RRAM)
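
The four steps in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm: the abstract does not say how patterns are extracted, so the sketch assumes one simple choice, namely grouping rows of each small matrix that share the same bit pattern, which yields all-1 submatrices. It also assumes a 0/1 matrix `B` has already been derived from the {+1, -1} weights, and it uses a plain 0/1 crossbar as a stand-in baseline, because the two mainstream representations are not specified here. All function names are hypothetical.

```python
def split_columns(B, k):
    """Step 1 (assumed form): split a 0/1 matrix into blocks of k adjacent columns."""
    n_cols = len(B[0])
    return [[row[c:c + k] for row in B] for c in range(0, n_cols, k)]


def extract_patterns(small):
    """Step 2 (toy rule): rows with the same nonzero bit pattern share a pattern.

    Returns (P, A) with small == P x A, where
      P[r][p] = 1 if row r belongs to pattern p,
      A[p][c] = 1 if pattern p covers column c.
    Each pattern is an all-1 submatrix: (rows using it) x (columns it covers).
    """
    index = {}  # bit tuple -> pattern id; dicts keep insertion order
    for row in small:
        key = tuple(row)
        if any(key) and key not in index:
            index[key] = len(index)
    P = [[0] * len(index) for _ in small]
    for r, row in enumerate(small):
        key = tuple(row)
        if any(key):
            P[r][index[key]] = 1
    A = [list(key) for key in index]
    return P, A


def crossbar(M, x):
    """Inputs x drive the rows of M; each column sums its selected inputs."""
    return [sum(x[r] * M[r][c] for r in range(len(M))) for c in range(len(M[0]))]


def pattern_output(small, x):
    """Step 3 (roles only): P plays the part of a PCC, A the part of a PAC."""
    P, A = extract_patterns(small)
    if not A:  # all-zero block: nothing to compute
        return [0] * len(small[0])
    pattern_sums = crossbar(P, x)     # one partial sum per pattern
    return crossbar(A, pattern_sums)  # accumulate patterns into outputs


def forward(B, x, k):
    """Concatenate the outputs of all column blocks."""
    out = []
    for small in split_columns(B, k):
        out.extend(pattern_output(small, x))
    return out


def area(small):
    """Step 4 (stand-in comparison): cell counts of plain vs. pattern form."""
    P, A = extract_patterns(small)
    plain = len(small) * len(small[0])
    pattern = len(P) * len(A) + len(A) * len(small[0])
    return plain, pattern


B = [[1, 1, 0, 1],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 1, 1, 1]]
x = [1, 0, 1, 1]
print(forward(B, x, 2))                        # [2, 2, 2, 3], same as crossbar(B, x)
print([area(s) for s in split_columns(B, 2)])  # [(8, 6), (8, 12)]
```

In this toy case the first block shrinks from 8 cells to 6, while the second grows from 8 to 12, which is consistent with the abstract's final step of keeping whichever representation is more area-efficient.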

ISSN 1000-9000(Print)

CN 11-2296/TP
