Federated Edge Learning (FEL) enables a massive number of edge devices (e.g., smartphones) to train machine learning models collaboratively. Due to the inherent unreliability of participating edge devices and unpredictable network conditions, the final model accuracy is largely determined by the communication time between the edge devices and the parameter server (PS). However, very little existing work quantifies the influence of communication time in FEL or shows, from a theoretical perspective, how to minimize its negative impact. In this paper, we are among the first to develop a formal model of the communication time in FEL and its influence on the final model accuracy. We define the set of edge devices involved in each global iteration as the ECP (Engaged Client Pool), and we model the communication time cost as a function of the response time distribution of individual devices and the ECP size. By incorporating the communication time cost into the convergence rate analysis, we propose the ECPA and H-ECPA algorithms, which automatically adjust the size of the ECP to maximize the model accuracy in homogeneous and heterogeneous networks, respectively. We also analyze how the tail shape of the response time distribution affects the convergence rate, and prove that a heavy-tailed response time distribution can significantly lower the model accuracy. Finally, we conduct extensive experiments with real datasets; the results confirm the correctness of our analysis and demonstrate the superiority of the proposed algorithms.
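To make the dependence of communication time on the ECP size and the response time tail concrete, the following is a minimal simulation sketch, not the model or the algorithms from the paper. It assumes a synchronous round in which the PS waits for every device in the ECP, so the round time is the maximum of n independent response times. The function names, the exponential and Pareto distributions, and the shape parameter 1.5 are all illustrative assumptions.

```python
import random


def round_time(n, sample, rng):
    """Time of one synchronous round (assumption): the slowest of n devices."""
    return max(sample(rng) for _ in range(n))


def mean_round_time(n, sample, rounds=2000, seed=0):
    """Monte Carlo estimate of the expected round time for an ECP of size n."""
    rng = random.Random(seed)
    return sum(round_time(n, sample, rng) for _ in range(rounds)) / rounds


def light_tail(rng):
    """Hypothetical light-tailed response time: exponential with mean 1."""
    return rng.expovariate(1.0)


def heavy_tail(rng):
    """Hypothetical heavy-tailed response time: Pareto with shape 1.5.

    A Pareto variable with shape 1.5 has mean 3, so dividing by 3 gives
    mean 1, matching the light-tailed case.
    """
    return rng.paretovariate(1.5) / 3.0


if __name__ == "__main__":
    for n in (1, 10, 100):
        light = mean_round_time(n, light_tail)
        heavy = mean_round_time(n, heavy_tail)
        print(f"ECP size {n:>3}: light tail {light:.2f}, heavy tail {heavy:.2f}")
```

Under these assumptions, both distributions have the same mean response time for a single device, yet the expected round time grows much faster with the ECP size in the heavy-tailed case. This illustrates why a larger ECP is not automatically better and why the ECP size is worth tuning.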
               