Published in 2022 at "IEEE Computer Architecture Letters"
DOI: 10.48550/arxiv.2204.10943
Abstract: Training state-of-the-art artificial intelligence (AI) models requires scaling to many compute nodes and relies heavily on collective communication operations, such as all-reduce, to exchange the weight gradients between nodes. The overhead of these operations can…
Keywords:
nics scalable;
fpga based;
distributed training;
smart nics ...