In this article, an energy-efficient deep learning processor is proposed for deep neural network (DNN) training on mobile platforms. Conventional mobile DNN training processors suffer from high bit-precision requirements and high ReLU dependency. The proposed processor addresses these fundamental issues with three new features. First, it combines a runtime automatic bit-precision search method with conventional dynamic fixed-point representation and stochastic rounding to realize low-precision training. Second, it adopts a bit-slice scalable core architecture with input skipping functionality to exploit bit-slice-level fine-grained sparsity. Third, an iterative channel reordering unit helps the processor maintain high core utilization by solving the workload imbalance problem that arises during zero-slice skipping. As a result, the processor achieves at least 4.4× higher energy efficiency than conventional DNN training processors.
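
Two of the ideas named in the abstract, stochastic rounding to a fixed-point format and zero-slice skipping, can be illustrated with a minimal Python sketch. This is not the processor's actual datapath: the function names, the 8-bit word, the 4-bit slice width, and the sign-magnitude slicing are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import random


def stochastic_round(x, frac_bits, word_bits=8, rng=random):
    """Quantize x to a signed fixed-point integer with `frac_bits` fractional bits.

    The value is rounded up with probability equal to its fractional
    remainder, so the rounding error is zero on average.
    (word_bits=8 is an illustrative assumption.)
    """
    scaled = x * (1 << frac_bits)
    floor = math.floor(scaled)
    q = floor + (1 if rng.random() < (scaled - floor) else 0)
    # Saturate to the representable range of the word.
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, q))


def bit_slices(q, word_bits=8, slice_bits=4):
    """Split the magnitude of q into slices, least significant slice first.

    Slicing the magnitude (sign-magnitude style) is an assumption; small
    values then have all-zero upper slices.
    """
    mag = abs(q)
    mask = (1 << slice_bits) - 1
    return [(mag >> shift) & mask for shift in range(0, word_bits, slice_bits)]


def nonzero_slice_work(values, word_bits=8, slice_bits=4):
    """Count the slices that are nonzero, i.e. the work that cannot be skipped."""
    return sum(
        1
        for v in values
        for s in bit_slices(v, word_bits, slice_bits)
        if s != 0
    )
```

In this sketch, a channel whose quantized inputs are mostly small or zero yields few nonzero slices, while a channel with large inputs yields many. That uneven slice count per channel is the kind of workload imbalance the abstract attributes to zero-slice skipping and that the channel reordering unit is said to address.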
               