Crossbar architecture has been widely adopted in neural network accelerators due to its efficient implementation of vector-matrix multiplication operations. However, in the case of convolutional neural networks (CNNs), the efficiency is compromised dramatically because of the large amount of data reuse. Although some mapping methods have been designed to balance execution throughput against resource overhead, the resource cost remains high when throughput is maintained. Network pruning is a promising and widely studied method to shrink the model size, but prior work on CNN compression rarely considers the crossbar architecture and the corresponding mapping method, and so cannot be directly utilized by crossbar-based neural network accelerators. This paper proposes a crossbar-aware pruning framework based on a formulated $L_{0}$ -norm constrained optimization problem. Specifically, we design an $L_{0}$ -norm constrained gradient descent with relaxant probabilistic projection to solve this problem. Two types of sparsity are successfully achieved: 1) intuitive crossbar-grain sparsity and 2) column-grain sparsity with output recombination, based on which we further propose an input feature map reorder method to improve the model accuracy. We evaluate our crossbar-aware pruning framework on the medium-scale CIFAR10 data set and the large-scale ImageNet data set with VGG and ResNet models. Our method is able to reduce the crossbar overhead by 44%–72% with insignificant accuracy degradation. Our approach significantly reduces the resource overhead and the related energy cost, and provides a new co-design solution for mapping CNNs onto various crossbar devices with much better efficiency.
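The $L_{0}$-norm constrained gradient descent mentioned above can be illustrated with a simplified sketch. This is not the paper's method: it uses a deterministic hard top-$k$ projection in place of the relaxant probabilistic projection, and all function names and the toy objective are illustrative.

```python
import numpy as np

def project_l0(w, k):
    """Hard L0 projection: keep the k largest-magnitude entries, zero the rest.
    (The paper uses a relaxant probabilistic projection instead.)"""
    out = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-k:]
    out[keep] = w[keep]
    return out

def l0_constrained_gd(grad_fn, w0, k, lr=0.1, steps=200):
    """Projected gradient descent under the constraint ||w||_0 <= k."""
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * grad_fn(w)   # ordinary gradient step
        w = project_l0(w, k)      # project back onto the sparsity constraint
    return w

# Toy usage: minimize ||w - t||^2 subject to at most 2 nonzero entries.
t = np.array([3.0, -0.5, 2.0, 0.1])
grad = lambda w: 2.0 * (w - t)
w = l0_constrained_gd(grad, np.zeros_like(t), k=2)
# w keeps only the two largest-magnitude targets: approx [3.0, 0.0, 2.0, 0.0]
```

In the paper the pruned groups correspond to crossbar-grain or column-grain weight blocks rather than individual scalars, so the projection would act on group norms instead of single entries.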