Published in 2022 at "IEEE/ACM Transactions on Networking"
DOI: 10.1109/tnet.2021.3117042
Abstract: To handle increasingly large training data and models, researchers and engineers resort to multiple servers in a data center for distributed machine learning (DML). On one hand, DML enables us to leverage the computation…
Keywords:
topology;
dml;
parameter synchronization;
synchronization topology ...
Published in 2022 at "IEEE Transactions on Network Science and Engineering"
DOI: 10.1109/tnse.2021.3068155
Abstract: It's common practice to speed up machine learning (ML) training by distributing it across a cluster of computing nodes. Data-parallel distributed ML (DML) training relieves the pressure on each computing node; however, the communication traffic introduced…
Keywords:
dml training;
parameter synchronization;
parameter;
communication ...