This paper presents DeepTrain, an embedded platform for high-performance and energy-efficient training of deep neural networks (DNNs). The key architectural concept of DeepTrain is a spatially homogeneous computing (and memory) fabric with temporally heterogeneous programmable data flows, which optimizes memory mapping and data reuse across the different phases of the training operation. DeepTrain is demonstrated as an in-memory accelerator integrated into the logic layer of a 3-D memory module. A programming model and supporting architecture exploit the flexible data flow to efficiently accelerate training of various types of DNNs. Cycle-level simulation and a synthesized design in 15 nm FinFET show a power efficiency of 500 GFLOPS/W and nearly uniform throughput across a wide range of DNNs, including convolutional, recurrent, and mixed (CNN+RNN) networks.
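To make the central idea concrete, the following is a minimal, hypothetical sketch (in Python, not the authors' programming model or API) of what "temporally heterogeneous programmable data flows" could look like: the same homogeneous compute fabric is reconfigured to a different data-flow schedule for each training phase. All names, data-flow choices, and the phase-to-schedule mapping are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical illustration of per-phase data-flow reprogramming on a
# spatially homogeneous compute fabric. Names and mappings are assumptions.
from enum import Enum


class Dataflow(Enum):
    WEIGHT_STATIONARY = "weight_stationary"    # keep weights resident, stream activations
    OUTPUT_STATIONARY = "output_stationary"    # accumulate partial sums in place
    ROW_STATIONARY = "row_stationary"          # balance reuse of weights, inputs, and sums


# One plausible assignment of training phases to data flows; the paper's
# programming model selects its own schedules, which may differ.
PHASE_TO_DATAFLOW = {
    "forward": Dataflow.WEIGHT_STATIONARY,
    "backward": Dataflow.OUTPUT_STATIONARY,
    "weight_update": Dataflow.ROW_STATIONARY,
}


def configure_fabric(phase: str) -> Dataflow:
    """Return the data flow the fabric would be reprogrammed to for a phase."""
    return PHASE_TO_DATAFLOW[phase]


if __name__ == "__main__":
    for phase in ("forward", "backward", "weight_update"):
        print(f"{phase}: {configure_fabric(phase).value}")
```

The point of the sketch is only the temporal heterogeneity: the hardware stays uniform, while the schedule (and hence memory mapping and data reuse) changes from phase to phase.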