We study network pruning, which aims to remove redundant channels/kernels and accelerate the inference of deep networks. Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones. Both strategies have limitations: the former is computationally expensive and difficult to converge, while the latter optimizes the reconstruction error but ignores the discriminative power of channels. In this paper, we propose a discrimination-aware channel pruning (DCP) method to choose the channels that actually contribute to the discriminative power. Based on DCP, we further propose several techniques to improve the optimization efficiency. Note that the parameters of a channel (a 3D tensor) may contain redundant kernels (each a 2D matrix). To address this issue, we propose a discrimination-aware kernel pruning (DKP) method to select the kernels with promising discriminative power. Experiments on image classification and face recognition demonstrate the effectiveness of our methods. For example, on ILSVRC-12, the resulting ResNet-50 with a 30% reduction of channels even outperforms the baseline model by 0.36% in Top-1 accuracy. The pruned MobileNetV1 and MobileNetV2 achieve 1.93x and 1.42x inference acceleration on a mobile device, respectively, with negligible performance degradation.
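The difference in granularity between channel pruning and kernel pruning can be illustrated with a minimal NumPy sketch. It assumes a convolution weight stored as (out_channels, in_channels, kh, kw) and reads a "channel" as the 3D slice weight[:, k] made up of 2D kernels; this layout is an assumption, not something the abstract states. The L2-norm score is only a placeholder and is not the discrimination-aware criterion used by DCP or DKP; the function names channel_mask and kernel_mask are hypothetical.

```python
import numpy as np


def channel_mask(weight, keep_ratio):
    """Boolean mask over input channels, keeping those with the largest L2 norm.

    The norm is a placeholder importance score, NOT the DCP criterion.
    """
    n_in = weight.shape[1]
    n_keep = max(1, int(round(n_in * keep_ratio)))
    # One score per input channel: norm of the 3D slice weight[:, k].
    scores = np.sqrt((weight ** 2).sum(axis=(0, 2, 3)))
    keep = np.argsort(scores)[::-1][:n_keep]
    mask = np.zeros(n_in, dtype=bool)
    mask[keep] = True
    return mask


def kernel_mask(weight, keep_ratio):
    """Boolean (out, in) mask keeping the 2D kernels with the largest L2 norm.

    Again a placeholder score, NOT the DKP criterion.
    """
    n_out, n_in = weight.shape[:2]
    # One score per 2D kernel weight[o, k].
    scores = np.sqrt((weight ** 2).sum(axis=(2, 3)))
    n_keep = max(1, int(round(scores.size * keep_ratio)))
    keep = np.argsort(scores, axis=None)[::-1][:n_keep]
    mask = np.zeros(scores.size, dtype=bool)
    mask[keep] = True
    return mask.reshape(n_out, n_in)


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 10, 3, 3))  # toy layer: 8 filters, 10 input channels

cmask = channel_mask(W, keep_ratio=0.7)  # drop 30% of the channels
kmask = kernel_mask(W, keep_ratio=0.7)   # drop 30% of the individual kernels

# Channel pruning removes whole slices, so the tensor actually shrinks.
W_channel_pruned = W[:, cmask]
# Kernel pruning is finer-grained: kernels are zeroed, the shape is unchanged.
W_kernel_pruned = W * kmask[:, :, None, None]
```

Channel pruning yields a smaller dense tensor, while kernel pruning leaves a sparse pattern inside each channel, so turning the latter into an actual speedup depends on how the sparse kernels are executed.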