
Fine-Tuning Channel-Pruned Deep Model via Knowledge Distillation

Abstract: Deep convolutional neural networks with high performance are hard to deploy in many real-world applications, since the computing resources of edge devices such as smartphones or embedded GPUs are limited. To alleviate this hardware limitation, compressing deep neural networks from the model side becomes important. As one of the most popular compression methods, channel pruning can effectively remove redundant convolutional channels from a CNN (convolutional neural network) without noticeably degrading the network's performance. Existing methods focus on the pruning design itself, evaluating the importance of different convolutional channels in the CNN model. However, the fine-tuning process after channel pruning is equally important, and a fast and effective fine-tuning method that restores accuracy after pruning is urgently needed. In this paper, we propose a fine-tuning method, KDFT (Knowledge Distillation Based Fine-Tuning), which introduces knowledge distillation by taking the deep models before and after channel pruning as the teacher and student models, respectively, and significantly improves the accuracy of the fine-tuned model with almost negligible training overhead. Extensive experimental results on benchmark datasets with representative CNN models show that, on top of channel pruning, KDFT improves the accuracy of the pruned model by up to 4.86% and saves up to 79% of the training time during fine-tuning.
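The idea summarized above is to fine-tune the pruned network with a distillation loss whose soft targets come from the original, unpruned network. The following is a minimal sketch, assuming PyTorch, of such a distillation-based fine-tuning objective; the temperature T, the weight alpha, and the function names kd_fine_tune_loss and fine_tune_step are illustrative assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def kd_fine_tune_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        # Hard-target term: standard cross-entropy against the ground-truth labels.
        ce = F.cross_entropy(student_logits, labels)
        # Soft-target term: KL divergence between the softened output distributions
        # of the unpruned (teacher) and pruned (student) models, scaled by T^2 as in
        # the standard knowledge-distillation formulation.
        kd = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
        return alpha * ce + (1.0 - alpha) * kd

    def fine_tune_step(pruned_model, original_model, images, labels, optimizer):
        # The unpruned model only supplies soft targets, so it needs no gradients.
        original_model.eval()
        with torch.no_grad():
            teacher_logits = original_model(images)
        student_logits = pruned_model(images)
        loss = kd_fine_tune_loss(student_logits, teacher_logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In this sketch only the pruned model is updated, so the extra cost over plain fine-tuning is a single forward pass of the teacher per batch, which is consistent with the "almost negligible training overhead" claimed in the abstract.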

     
