
Accelerating Deep Learning with a Parallel Mechanism Using CPU + MIC

Deep neural networks (DNNs) are among the most popular machine learning methods and are widely used in many modern applications. Training DNNs, however, is time-consuming, and accelerating it has been the focus of much research. In this paper, we speed up the training of DNNs for automatic speech recognition on a heterogeneous architecture (CPU + MIC). We apply asynchronous methods to I/O and communication operations and propose an adaptive load-balancing method. We also employ momentum to speed up the convergence of the gradient descent algorithm. Experimental results show that our optimized algorithm achieves a 20-fold speedup on a CPU + MIC platform compared with the original sequential algorithm on a single-core CPU.
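The abstract does not spell out the momentum formulation it uses. Below is a minimal sketch, assuming the classical momentum variant of gradient descent, in which a velocity term accumulates a decaying history of gradients (v = mu*v - lr*grad, then w = w + v); all identifiers are hypothetical, not taken from the paper.

    #include <cstddef>
    #include <vector>

    // Classical momentum update (assumed formulation, not necessarily the
    // paper's exact variant):
    //   v <- mu * v - lr * grad   (velocity accumulates gradient history)
    //   w <- w + v                (weights move along the smoothed direction)
    void momentum_step(std::vector<double>& weights,
                       const std::vector<double>& grad,
                       std::vector<double>& velocity,
                       double lr, double mu) {
        for (std::size_t i = 0; i < weights.size(); ++i) {
            velocity[i] = mu * velocity[i] - lr * grad[i];
            weights[i] += velocity[i];
        }
    }

    int main() {
        // Tiny usage example with made-up values.
        std::vector<double> w{0.5, -0.3}, g{0.1, -0.2}, v{0.0, 0.0};
        momentum_step(w, g, v, /*lr=*/0.01, /*mu=*/0.9);
    }

Because the velocity term damps oscillations between successive gradient steps, momentum typically reduces the number of iterations needed to converge, which is the effect the authors exploit.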

Keywords: deep learning; parallel mechanism; CPU + MIC; accelerated training

Journal Title: International Journal of Parallel Programming
Year Published: 2017
