一种基于动态量化编码的深度神经网络压缩方法

饶川 陈靓影 徐如意 刘乐元

引用本文: 饶川, 陈靓影, 徐如意, 刘乐元. 一种基于动态量化编码的深度神经网络压缩方法. 自动化学报, 2019, 45(10): 1960-1968. doi: 10.16383/j.aas.c180554
Citation: RAO Chuan, CHEN Jing-Ying, XU Ru-Yi, LIU Le-Yuan. A Dynamic Quantization Coding Based Deep Neural Network Compression Method. ACTA AUTOMATICA SINICA, 2019, 45(10): 1960-1968. doi: 10.16383/j.aas.c180554

一种基于动态量化编码的深度神经网络压缩方法

doi: 10.16383/j.aas.c180554
基金项目: 

湖北省自然科学基金 2017CFB504

湖北省创新研究团队 2017CFA007

国家自然科学基金 61807014

国家重点研发计划 2018YFB1004504

中央高校基本业务费 CCNU19Z02002

中国博士后科学基金 2018M632889

国家自然科学基金 61702208

详细信息
    作者简介:

    饶川  华中师范大学国家数字化学习工程技术研究中心硕士研究生.2015年获得湖北大学计算机与信息工程学院学士学位.主要研究方向为深度模型的压缩与加速.E-mail:raoguoc@163.com

    徐如意  华中师范大学国家数字化学习工程技术研究中心算法工程师.2008年获得武汉科技大学学士学位, 2016年获得华中科技大学硕士学位.主要研究方向为计算机视觉及多媒体应用.E-mail:86798653@qq.com

    刘乐元  华中师范大学国家数字化学习工程技术研究中心副教授.主要研究方向为计算机视觉, 模式识别, 多模态人机交互.E-mail:lyliu@mail.ccnu.edu.cn

    通讯作者:

    陈靓影   华中师范大学国家数字化学习工程技术研究中心教授.2001年获得南洋理工计算机科学与工程系博士学位.主要研究方向为图像处理, 计算机视觉, 模式识别, 多媒体应用.本文通信作者.E-mail:chenjy@mail.ccnu.edu.cn

A Dynamic Quantization Coding Based Deep Neural Network Compression Method

Funds: 

Natural Science Foundation of Hubei Province 2017CFB504

Foundation for Innovative Research Groups of Hubei Province 2017CFA007

National Natural Science Foundation of China 61807014

National Key Research and Development Program of China 2018YFB1004504

Fundamental Research Funds for the Central Universities CCNU19Z02002

China Postdoctoral Science Foundation 2018M632889

National Natural Science Foundation of China 61702208

More Information
    Author Bio:

      RAO Chuan  Master student at the National Engineering Research Center for E-Learning, Central China Normal University. He received his bachelor degree from Hubei University in 2015. His research interest covers deep neural network compression and acceleration

      XU Ru-Yi  Algorithm engineer at the National Engineering Research Center for E-Learning, Central China Normal University. He received his bachelor degree from Wuhan University of Science and Technology in 2008 and his master degree from Huazhong University of Science and Technology in 2016. His research interest covers computer vision and multimedia applications

      LIU Le-Yuan  Associate professor at the National Engineering Research Center for E-Learning, Central China Normal University. His research interest covers computer vision, pattern recognition, and multimodal human-computer interaction

    Corresponding author: CHEN Jing-Ying   Professor at National Engineering Research Center for E-Learning, Central China Normal University. She received her Ph.D. degree from the School of Computer Engineering, Nanyang Technological University, Singapore in 2001. Her research interest covers image processing, computer vision, pattern recognition, and multimedia applications. Corresponding author of this paper
  • Abstract: In recent years, deep neural networks (DNN) have stood out among machine learning methods and attracted broad interest and attention. However, mainstream deep neural network models contain millions of parameters and consume large amounts of computation and storage, which makes them difficult to deploy on mobile and embedded devices such as phones. To address this problem, this paper proposes a deep neural network compression method based on dynamic quantization coding (DQC). Unlike existing methods based on static quantization coding (SQC), the proposed method updates the quantization codebook during model training, so that the codebook reduces as much as possible the quantization error of the larger weight parameters. Extensive comparative experiments show that the proposed method outperforms existing model compression methods based on static coding. An illustrative sketch of the codebook-update step is given below.
    1)  Associate Editor in charge of this paper: HU Qing-Hua
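
The codebook-update idea summarized in the abstract can be sketched briefly. The snippet below is a minimal NumPy illustration, not the paper's published algorithm: weights are first mapped to their nearest codeword (the static quantization step), and the codebook is then re-fitted to the weights assigned to each codeword using a k-means-style update, standing in for the dynamic update rule described in the paper. The function names, the toy data, and the update rule itself are illustrative assumptions.

```python
import numpy as np

def quantize_to_codebook(weights, codebook):
    """Static quantization step: map each weight to its nearest codeword."""
    idx = np.argmin(np.abs(weights[:, None] - codebook[None, :]), axis=1)
    return codebook[idx], idx

def update_codebook(weights, idx, codebook):
    """Illustrative dynamic step: move each codeword to the mean of the
    weights assigned to it, shrinking the quantization error of the larger
    weights (the paper's exact update rule may differ)."""
    new_codebook = codebook.copy()
    for j in range(len(codebook)):
        assigned = weights[idx == j]
        if assigned.size > 0:
            new_codebook[j] = assigned.mean()
    return new_codebook

# Toy example: a 3-bit codebook (8 codewords) spanning the weight range.
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=1000).astype(np.float32)
codebook = np.linspace(weights.min(), weights.max(), 8).astype(np.float32)

for step in range(5):  # in practice this interleaves with normal training updates
    q_weights, idx = quantize_to_codebook(weights, codebook)
    mse = float(np.mean((weights - q_weights) ** 2))
    codebook = update_codebook(weights, idx, codebook)  # DQC: the codebook keeps adapting
    print(f"step {step}: quantization MSE = {mse:.6f}")
```

The contrast with SQC is that a static codebook stays fixed after initialization, so the loop above would never call update_codebook; letting the codebook adapt during training is what the abstract credits for reducing the error on the larger weights.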
  • 图  1  网络权值的量化规则

    Fig.  1  Quantization rules for network weights

    图  2  动态量化编码压缩方法的训练流程

    Fig.  2  The process of dynamic quantization coding

    图  3  码本中无0, SQC和DQC的量化比较

    Fig.  3  Quantization performance of SQC and DQC without 0 in the codebook

    图  4  码本中有0, SQC和DQC的量化效果比较

    Fig.  4  Quantization performance of SQC and DQC with 0 in the codebook

    表  1  LeNet在Softmax-loss下量化效果

    Table  1  Quantization performance of LeNet under Softmax-loss

    位宽 码本无0 码本有0
    3 99.29 % 99.22 %
    4 99.30 % 99.25 %
    5 99.35 % 99.32 %

    表  2  LeNet在Softmax-loss+L1下量化效果

    Table  2  Quantization performance of LeNet under Softmax-loss and L1

    位宽 码本无0 码本有0
    3 98.69 % 99.25 %
    4 99.09 % 99.25 %
    5 99.14 % 99.27 %

    表  3  LeNet在Softmax-loss+L2下量化效果

    Table  3  Quantization performance of LeNet under Softmax-loss and L2

    位宽 码本无0 码本有0
    3 99.26 % 99.29 %
    4 99.29 % 99.28 %
    5 99.36 % 99.28 %

    表  4  ResNet-20在不同码本下量化效果

    Table  4  Quantization performance of ResNet-20 under different codebook

    位宽 码本无0 码本有0
    3 90.07 % 90.78 %
    4 91.71 % 91.91 %
    5 92.63 % 92.82 %

    表  5  ResNet-32在不同码本下量化效果

    Table  5  Quantization performance of ResNet-32 under different codebook

    位宽 码本无0 码本有0
    3 91.44 % 92.11 %
    4 92.53 % 92.36 %
    5 92.87 % 92.33 %

    表  6  ResNet-44在不同码本下量化效果

    Table  6  Quantization performance of ResNet-44 under different codebook

    位宽 码本无0 码本有0
    3 92.68 % 92.53 %
    4 93.14 % 93.37 %
    5 93.28 % 93.14 %

    表  7  ResNet-56在不同码本下量化效果

    Table  7  Quantization performance of ResNet-56 under different codebook

    位宽 码本无0 码本有0
    3 92.72 % 92.69 %
    4 93.54 % 93.39 %
    5 93.21 % 93.24 %

    表  8  固定码本下量化效果

    Table  8  Quantization performance of SQC

    网络 3 bits码本 3 bits码本 3 bits码本 3 bits码本 3 bits码本 3 bits码本
    SQC无0 SQC有0 SQC无0 SQC有0 SQC无0 SQC有0
    ResNet-20 92.72 % 92.69 % 92.72 % 92.69 % 92.72 % 92.69 %
    ResNet-32 93.54 % 93.39 % 92.72 % 92.69 % 92.72 % 92.69 %
    ResNet-44 93.21 % 93.24 % 92.72 % 92.69 % 92.72 % 92.69 %
    ResNet-56 93.21 % 93.24 % 92.72 % 92.69 % 92.72 % 92.69 %

    表  9  Deep compression与DQC的实验比较

    Table  9  Comparison of deep compression and DQC

    压缩方法 位宽 准确率
    Deep compression 5 99.20 %
    DQC 5 99.70 %

    表  10  量化为5 bits时INQ和DQC在CIFAR-10上的准确率比较

    Table  10  Comparison of the accuracy of INQ and DQC on CIFAR-10 with 5 bits

    网络 INQ DQC码本无0 DQC码本有0
    ResNet-20 91.01 % 92.63 % 92.82 %
    ResNet-32 91.78 % 92.87 % 92.33 %
    ResNet-44 92.30 % 93.28 % 93.14 %
    ResNet-56 92.29 % 93.21 % 93.24 %
Publication history
  • Received: 2018-08-17
  • Accepted: 2019-01-02
  • Published: 2019-10-20
