Articles with "distributed training" as a keyword

Privacy preserving distributed training of neural networks

Published in "Neural Computing and Applications" (2020)

DOI: 10.1007/s00521-020-04880-0

Abstract: Learnae is a system aiming to achieve a fully distributed way of neural network training. It follows a “Vires in Numeris” approach, combining the resources of commodity personal computers. It has a full peer-to-peer…
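
For readers unfamiliar with the setting, below is a minimal sketch of gossip-style peer-to-peer model averaging, the general pattern behind fully distributed training on commodity machines. It is illustrative only: Learnae's actual protocol and privacy mechanisms are not reproduced here, and every name in the snippet is hypothetical.

import numpy as np

def local_step(weights, gradient, lr=0.01):
    # Each peer trains on its own private data; raw data never leaves it.
    return weights - lr * gradient

def gossip_average(weights_a, weights_b):
    # Peers exchange parameters (not data) and average them, so model
    # knowledge diffuses through the network without a central server.
    return 0.5 * (weights_a + weights_b)

# Toy run: two peers drift apart during local training, then re-sync.
rng = np.random.default_rng(0)
w_a = rng.standard_normal(4)
w_b = rng.standard_normal(4)
for _ in range(10):
    w_a = local_step(w_a, rng.standard_normal(4))
    w_b = local_step(w_b, rng.standard_normal(4))
w_a = w_b = gossip_average(w_a, w_b)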

Keywords: network; privacy preserving; training data; distributed training ...

Large-Scale Distributed Training of Transformers for Chemical Fingerprinting

Published in "Journal of Chemical Information and Modeling" (2022)

DOI: 10.1021/acs.jcim.2c00715

Abstract: Transformer models have become a popular choice for various machine learning tasks due to their often outstanding performance. Recently, transformers have been used in chemistry for classifying reactions, reaction prediction, physicochemical property prediction, and more.…
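
As a rough illustration of what "large-scale distributed training" typically involves, here is a minimal data-parallel loop using PyTorch DistributedDataParallel. The model, data, and hyperparameters are placeholders standing in for the paper's transformer pipeline, which is not shown here.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(2048, 1).cuda(local_rank)  # stand-in for a transformer
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):
        x = torch.randn(32, 2048, device=local_rank)  # stand-in for a real batch
        y = torch.randn(32, 1, device=local_rank)
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()  # DDP all-reduces (averages) gradients across processes here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched with, e.g., torchrun --nproc_per_node=8 train.py, each process trains on its own shard of the data while the gradients stay synchronized.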

Keywords: distributed training; molecular fingerprints; large scale; chemistry ...

Heter-Train: A Distributed Training Framework Based on Semi-Asynchronous Parallel Mechanism for Heterogeneous Intelligent Transportation Systems

Published in "IEEE Transactions on Intelligent Transportation Systems" (2024)

DOI: 10.1109/tits.2023.3286400

Abstract: Transportation big data (TBD) are increasingly combined with artificial intelligence to mine novel patterns and information due to the powerful representational capabilities of deep neural networks (DNNs), especially for anti-COVID-19 applications. The distributed cloud-edge-vehicle training…
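
The "semi-asynchronous" idea in the title usually refers to bounded-staleness updates, a middle ground between fully synchronous and fully asynchronous training. The sketch below illustrates that general notion only, under assumed names and a made-up staleness bound; it is not Heter-Train's actual mechanism.

import numpy as np

STALENESS_BOUND = 3  # assumed bound; the paper's actual policy is not shown here

class SemiAsyncServer:
    """Toy parameter server applying bounded-staleness updates."""

    def __init__(self, dim, n_workers):
        self.weights = np.zeros(dim)
        self.clocks = [0] * n_workers  # logical clock per worker

    def push(self, worker_id, gradient, lr=0.1):
        # Fully synchronous training waits for every worker each step;
        # fully asynchronous waits for none. Semi-asynchronous training
        # lets fast workers run ahead, but only up to a fixed bound.
        if self.clocks[worker_id] - min(self.clocks) >= STALENESS_BOUND:
            return False  # too far ahead: retry once stragglers catch up
        self.weights -= lr * gradient
        self.clocks[worker_id] += 1
        return True

# Example: worker 0 races ahead until the bound forces it to wait.
server = SemiAsyncServer(dim=4, n_workers=2)
for _ in range(5):
    print(server.push(0, np.ones(4)))  # True, True, True, False, False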

Keywords: parallel mechanism; transportation systems; training; intelligent transportation ...

ALEPH: Accelerating Distributed Training With eBPF-Based Hierarchical Gradient Aggregation

Published in "IEEE/ACM Transactions on Networking" (2024)

DOI: 10.1109/tnet.2024.3404999

Abstract: Distributed training includes two important operations: gradient transmission and gradient aggregation, which consume massive bandwidth and computing resources. To achieve efficient distributed training, one must overcome two critical challenges: heterogeneity of bandwidth resources and…
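
As background for the hierarchical part of the title, here is a minimal two-level aggregation sketch: gradients are averaged inside each node first, so only one partial sum per node crosses the network. ALEPH's eBPF in-kernel processing is not modeled; the snippet only shows the arithmetic.

import numpy as np

def hierarchical_aggregate(grads_per_node):
    """grads_per_node: list of nodes, each a list of per-GPU gradients."""
    # Level 1: intra-node reduction (e.g., over NVLink or shared memory).
    node_sums = [np.sum(g, axis=0) for g in grads_per_node]
    # Level 2: inter-node reduction (the only traffic crossing the network).
    total = np.sum(node_sums, axis=0)
    n = sum(len(g) for g in grads_per_node)
    return total / n  # global average gradient

# Example: 2 nodes x 2 GPUs, 4-dimensional gradients.
grads = [[np.ones(4), 3 * np.ones(4)], [np.zeros(4), 4 * np.ones(4)]]
print(hierarchical_aggregate(grads))  # -> [2. 2. 2. 2.]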

Keywords: gradient aggregation; ebpf; distributed training

FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems

Published in "IEEE Computer Architecture Letters" (2022)

DOI: 10.48550/arxiv.2204.10943

Abstract: Training state-of-the-art artificial intelligence (AI) models requires scaling to many compute nodes and relies heavily on collective communication operations, such as all-reduce, to exchange the weight gradients between nodes. The overhead of these operations can…
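
The all-reduce operation named in the abstract can be illustrated with the classic ring algorithm, simulated below in a single process. Real implementations (e.g., NCCL, or the smart NICs proposed in this paper) overlap these steps with communication hardware; the snippet only shows the data movement.

import numpy as np

def ring_allreduce(chunks):
    """chunks[r][c] = chunk c held by rank r; n ranks, n chunks each."""
    n = len(chunks)
    # Phase 1: reduce-scatter. At step s, rank r passes chunk (r - s)
    # one hop around the ring; the receiver adds its own partial sum.
    for s in range(n - 1):
        for r in range(n):
            c = (r - s) % n
            chunks[(r + 1) % n][c] = chunks[(r + 1) % n][c] + chunks[r][c]
    # Now rank r holds the fully reduced chunk (r + 1) % n.
    # Phase 2: all-gather. Circulate the finished chunks the same way,
    # overwriting instead of adding, until every rank has all of them.
    for s in range(n - 1):
        for r in range(n):
            c = (r + 1 - s) % n
            chunks[(r + 1) % n][c] = chunks[r][c].copy()
    return chunks

# Example: 3 ranks, each contributing gradient chunks filled with its rank id.
n = 3
grads = [[np.full(2, float(r)) for _ in range(n)] for r in range(n)]
for rank_chunks in ring_allreduce(grads):
    print(rank_chunks)  # every rank ends with [3., 3.] in every chunk (0+1+2)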

Keywords: nics scalable; fpga based; distributed training; smart nics ...