Journal of Computer Science and Technology ›› 2019, Vol. 34 ›› Issue (2): 305-317. doi: 10.1007/s11390-019-1912-1

Special Section: Artificial Intelligence and Pattern Recognition


Space Efficient Quantization for Deep Convolutional Neural Networks

Dong-Di Zhao1, Fan Li1,*, Member, CCF, ACM, IEEE, Kashif Sharif1, Member, CCF, ACM, IEEE, Guang-Min Xia1, Yu Wang2,*, Fellow, IEEE, Senior Member, ACM   

  1 School of Computer Science, Beijing Institute of Technology, Beijing 100081, China;
    2 Wireless Networking and Sensing Laboratory, Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC 28223, U.S.A.
  • Received: 2018-07-15  Revised: 2019-01-27  Online: 2019-03-05  Published: 2019-03-16
  • Contact: Fan Li, Yu Wang  E-mail: fli@bit.edu.cn; yu.wang@uncc.edu
  • About author:Dong-Di Zhao received his B.E. degree in the Internet of Things from the School of Computer Science, Beijing Institute of Technology, Beijing, in 2016. He is currently pursuing his Master's degree at Beijing Institute of Technology, Beijing. His research interests include mobile sensing, mobile computing, and deep learning.
  • Supported by:
    The work of Fan Li is partially supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61772077 and 61370192, and Beijing Natural Science Foundation of China under Grant No. 4192051. The work of Yu Wang is partially supported by NSFC under Grant Nos. 61428203 and 61572347.

Objective: We propose a space-efficient quantization scheme for deep convolutional neural networks (DCNNs) that represents network parameters with 8 bits or even lower precision. We also propose a weight preprocessing method that preserves as much of the model's performance as possible at lower precision, so that the model can recover high accuracy through a relatively easy fine-tuning or retraining process.

Innovations: 1) We propose a quantization method for the weights of DCNNs. Without any fine-tuning or retraining, it reduces the number of bits required to store the parameters of a DCNN model while preserving almost the same accuracy as the full-precision model. 2) We demonstrate the feasibility of combining the proposed quantization method with other DCNN compression methods. We introduce singular value decomposition (SVD) into the proposed framework to reduce both the number of parameters and the number of bits required. The integrated framework further compresses the storage of the network, especially for networks containing multiple fully-connected layers, such as VGGNets. 3) For quantizing DCNNs at lower precision, we propose a weight clipping method based on dynamic boundaries. It significantly improves the accuracy of DCNNs whose weights are represented by fixed-point numbers with fewer than 8 bits. The quantized low-precision network retains high performance and can recover its accuracy through simple fine-tuning, without iteratively repeating the quantization and fine-tuning process.

Methods: The proposed framework consists of two parts: weight quantization and parameter compression. The parameters of the convolutional and fully-connected layers of a DCNN mostly follow an approximately normal distribution; based on this empirical observation, we propose a linear quantization method with three main steps. 1) Scaling factor computation: the linear quantization maps the 32-bit weights to low-precision fixed-point numbers through a linear mapping; the scaling factor is computed from the largest absolute weight in each layer and the target precision, and maps the layer's weights into the approximate low-precision value space. 2) Value-space clipping: to prevent the scaled weights from exceeding the low-precision value space, the sparse values at both ends of the original value space are clipped. 3) Weight quantization: the scaled weights are approximated by low-precision fixed-point numbers through round-to-nearest, completing the quantization. To further compress the storage of the model, we introduce SVD into the proposed framework, aiming to verify the feasibility of combining the proposed quantization method with other compression methods: SVD decomposes the weight matrix of a fully-connected layer into three smaller matrices, significantly reducing the storage requirement. We further explore the feasibility of compressing DCNNs to precisions lower than 8 bits. Since the accuracy drops drastically when the model is compressed to lower precision, we propose a weight clipping method based on dynamic boundaries, which determines the optimal clipping boundary for each layer by minimizing the Kullback-Leibler (KL) divergence between the weight distributions before and after clipping. The parameter distribution of each layer is first divided into fine-grained windows; for each candidate clipping boundary in the two end windows, a reference distribution is generated and a candidate distribution is computed, and the boundary with the minimal KL divergence is finally selected. The weight clipping method significantly improves the performance of the model after quantization, and the accuracy can be recovered through a simple fine-tuning process.

Conclusions: We propose a space-efficient quantization method (SEQ) and integrate model compression into the proposed framework to reduce the huge storage requirement of DCNNs. SEQ achieves a high compression ratio while reaching performance close to that of full-precision DCNNs, without any fine-tuning or retraining. In addition, we propose a dynamic weight clipping method that significantly improves the performance of models quantized to lower precision; it helps the model keep high accuracy after quantization and recover its performance through a simple fine-tuning process. SEQ achieves up to 14x compression of a DCNN with a quantization process that takes only a few seconds and requires no fine-tuning or retraining. Its space efficiency enables effective deployment of DCNNs on embedded hardware or resource-limited devices.
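The three quantization steps described above can be sketched in NumPy. This is a minimal illustration under stated assumptions (a symmetric per-layer scheme with one sign bit, so an n-bit weight spans roughly [-(2^(n-1)-1), 2^(n-1)-1]); the function name `quantize_layer` and the NumPy formulation are illustrative, not the authors' implementation:

```python
import numpy as np

def quantize_layer(weights, bits=8):
    """Linear quantization sketch: scale, clip, round-to-nearest.

    `weights` is a layer's 32-bit float weight array; `bits` is the
    target fixed-point precision, including the sign bit.
    """
    # 1) Scaling factor: map the largest absolute weight in the layer
    #    onto the largest representable fixed-point magnitude.
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    scale = qmax / np.max(np.abs(weights))

    # 2) Value-space clipping: keep scaled weights inside the
    #    low-precision value space so outliers cannot escape it.
    scaled = np.clip(weights * scale, -qmax, qmax)

    # 3) Round to the nearest fixed-point integer.
    q = np.rint(scaled).astype(np.int32)

    # Dequantized weights for inference are recovered as q / scale.
    return q, scale
```

Note that the scaling factor is per layer, matching the description above: each layer's maximum absolute weight determines its own mapping into the low-precision value space.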


Abstract: Deep convolutional neural networks (DCNNs) have shown outstanding performance in the fields of computer vision, natural language processing, and complex system analysis. While deeper layers improve performance, they also incur higher computational complexity and larger storage requirements, making it extremely difficult to deploy DCNNs on resource-limited embedded systems (such as mobile devices or Internet of Things devices). Network quantization efficiently reduces the storage space required by DCNNs; however, their performance often drops rapidly as the quantization bit-width decreases. In this article, we propose a space-efficient quantization scheme which uses eight or fewer bits to represent the original 32-bit weights. We adopt the singular value decomposition (SVD) method to decrease the parameter size of fully-connected layers for further compression. Additionally, we propose a weight clipping method based on dynamic boundaries to improve the performance when using lower precision. Experimental results demonstrate that our approach achieves up to approximately 14x compression while preserving almost the same accuracy as the full-precision models. The proposed weight clipping method can also significantly improve the performance of DCNNs when lower precision is required.
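The dynamic-boundary weight clipping mentioned in the abstract can be sketched as a KL-divergence search over candidate boundaries. This is a hedged reconstruction: the bin counts, the merge-and-spread construction of the candidate distribution, and the function names are assumptions of this sketch (in the spirit of histogram-calibration methods), not the authors' exact procedure:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    # Normalize both histograms and compute KL(p || q);
    # eps guards against empty bins.
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def find_clip_boundary(weights, bits=8, num_bins=1024):
    """Pick the clipping threshold whose clipped-and-requantized
    weight distribution stays closest, in KL divergence, to the
    original distribution."""
    levels = 2 ** (bits - 1)                  # representable |w| levels
    hist, edges = np.histogram(np.abs(weights), bins=num_bins)
    hist = hist.astype(np.float64)
    best_kl, best_t = np.inf, edges[-1]
    for i in range(levels, num_bins + 1):
        # Reference distribution: bins beyond the candidate boundary
        # are clipped, with their mass folded into the last kept bin.
        ref = hist[:i].copy()
        ref[-1] += hist[i:].sum()
        # Candidate distribution: the kept bins merged down to the
        # number of representable levels, then spread back out over
        # the originally non-empty bins.
        cand = np.empty(i)
        start = 0
        for chunk in np.array_split(hist[:i], levels):
            nz = np.count_nonzero(chunk)
            avg = chunk.sum() / nz if nz else 0.0
            cand[start:start + len(chunk)] = np.where(chunk > 0, avg, 0.0)
            start += len(chunk)
        kl = kl_divergence(ref, cand)
        if kl < best_kl:
            best_kl, best_t = kl, edges[i]
    return best_t
```

In this sketch the search is run independently per layer; the returned threshold replaces the plain max-absolute-value boundary before the scaling factor is computed.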

Key words: convolutional neural network, memory compression, network quantization
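As a companion illustration of the SVD-based compression of fully-connected layers described above, here is a minimal sketch. The kept rank r is a free parameter chosen per layer, and the function names are illustrative; storage for an m-by-n weight matrix drops from m*n values to r*(m + n + 1):

```python
import numpy as np

def svd_compress_fc(W, rank):
    """Truncated SVD of a fully-connected layer's weight matrix.

    W (m x n) is replaced by three smaller factors:
    U (m x r), s (r,), Vt (r x n).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]

def svd_forward(x, U, s, Vt):
    # Low-rank forward pass: x @ W is approximated by
    # ((x @ U) * s) @ Vt, never materializing the full matrix.
    return (x @ U * s) @ Vt
```

When the weight matrix is close to low rank, the approximation error is small and the factors themselves can then be quantized by the same linear scheme.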
