Distributed Encoding and Updating for SAZD Coded Distributed Training

Linear combination (LC) based coded distributed computing (CDC) suffers from poor numerical stability. As a result, LC-CDC based model parallel (MP) training of a deep neural network (DNN) may have poor accuracy. To improve accuracy, we propose replacing LC with shift-and-addition (SA) in the encoding process of each layer and replacing matrix inversion with zigzag decoding (ZD) in the decoding process; we call the scheme Naive SAZD-CDC based MP training (N-SAZD-CDC-MP-T). However, N-SAZD-CDC-MP-T suffers from a bottleneck at the master node, caused by frequent encoding and decoding at the master node and frequent delivery of large volumes of data between the master and worker nodes. This bottleneck may reduce the training speed significantly. To alleviate it, we further design an enhanced version that offloads certain processing from the master node to the worker nodes through distributed encoding and updating (DEU), and call it DEU-SAZD-CDC-MP-T. We provide a proof that DEU-SAZD-CDC-MP-T automatically maintains the code structure during each iteration. Extensive numerical studies show that the prediction accuracy of SAZD-CDC-MP-T improves significantly over that of the Poly based scheme (which is representative of LC). In addition, DEU-SAZD-CDC-MP-T trains significantly faster than N-SAZD-CDC-MP-T.
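
The abstract does not spell out the code construction, so the following is only a toy sketch of the general shift-and-add and zigzag idea, not the paper's scheme. It assumes two equal-length data blocks and two coded blocks, one of which delays the second block by a single position; the function names `sa_encode` and `zigzag_decode` are hypothetical. The point it illustrates is that decoding needs only subtractions of already-recovered symbols, with no matrix inversion.

```python
def sa_encode(x0, x1):
    """Toy shift-and-add encoder for two equal-length blocks (illustrative assumption)."""
    n = len(x0)
    assert len(x1) == n
    # Coded block 0: plain element-wise sum, no shift.
    c0 = [a + b for a, b in zip(x0, x1)]
    # Coded block 1: x0 plus x1 delayed by one position, so its length is n + 1.
    c1 = [
        (x0[i] if i < n else 0) + (x1[i - 1] if i >= 1 else 0)
        for i in range(n + 1)
    ]
    return c0, c1


def zigzag_decode(c0, c1):
    """Recover both blocks by alternating between the two coded blocks."""
    n = len(c0)
    x0, x1 = [0] * n, [0] * n
    for i in range(n):
        # c1[i] = x0[i] + x1[i-1]; x1[i-1] is already known (or absent when i == 0).
        x0[i] = c1[i] - (x1[i - 1] if i >= 1 else 0)
        # c0[i] = x0[i] + x1[i]; x0[i] was just recovered.
        x1[i] = c0[i] - x0[i]
    return x0, x1


if __name__ == "__main__":
    c0, c1 = sa_encode([1, 2, 3], [4, 5, 6])
    print(zigzag_decode(c0, c1))
```

The shift exposes one symbol of `x0` on its own at the start of `c1`, and each recovered symbol then unlocks the next one in the other coded block, which is where the zigzag name comes from.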

Keywords: cdc; sazd cdc; encoding updating; coded distributed; distributed encoding

Journal Title: IEEE Transactions on Parallel and Distributed Systems
Year Published: 2023
