Published in 2018 at "International Journal of Parallel Programming"
DOI: 10.1007/s10766-018-00623-w
Abstract: As one of the most influential deep learning frameworks, MXNet has achieved excellent performance and many breakthroughs in academic and industrial fields for various machine learning situations. The initial implementation of MXNet uses a proxy-socket interface,…
Keywords:
distributed mxnet;
improving performance;
rdma;
performance ...
Published in 2020 at "IEEE Systems Journal"
DOI: 10.1109/jsyst.2019.2936519
Abstract: Data centers, the infrastructure of cloud computing, have been widely deployed around the world to accommodate the increasing cloud computing demands. A data center network (DCN) connects tens or hundreds of thousands of servers in…
Keywords:
rdma;
data center;
control;
traffic control ...
Published in 2022 at "IEEE Transactions on Parallel and Distributed Systems"
DOI: 10.1109/tpds.2022.3175666
Abstract: Remote Direct Memory Access (RDMA) is widely used in High Performance Computing (HPC) while making inroads in datacenters and accelerators. State-of-the-art RDMA Engines typically do not endure page faults, therefore users are forced to pin…
Keywords:
rdma;
fault handling;
page fault;
page ...