Federated learning (FL) has been proposed as a machine learning approach to collaboratively learn a shared prediction model. During FL training, only a subset of workers participates in each round; existing approaches that simply average the local model parameters of heterogeneous workers introduce model bias, which degrades the accuracy of the learned global model. In this paper, we introduce NIFL, a new worker-selection strategy that handles the statistical challenges of FL when local data is Non-Independent and Identically Distributed (N-IID). In NIFL, the server first sends a signal to the workers, which respond with the number of samples they hold. The server then selects a percentage of the workers with the highest number of samples and requests data statistics such as the mean and standard deviation. After that, the server calculates our proposed N-IID index from the statistical information collected from the workers, without ever accessing their data, and uses this index as the criterion for worker selection. Finally, the server broadcasts the global model to the selected workers. NIFL takes the disparity in the distribution of workers' data into account in order to improve model performance in heterogeneous data environments. We have performed several experiments with N-IID data. The results show that our method improves both convergence and test accuracy considerably compared to other techniques, while keeping computation and communication costs reasonable.
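To make the selection protocol concrete, below is a minimal sketch of one NIFL-style round. The abstract does not give the formula for the N-IID index, so the distance-based index here, the rule of picking the lowest-index workers, and the names nifl_select, niid_index, stat_frac, and select_frac are all illustrative assumptions, not the paper's actual implementation. Only summary statistics (sample count, mean, standard deviation) leave the workers, matching the privacy constraint described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_worker(worker_id, n_features=4):
    # Simulated worker: holds local data and exposes only its summary
    # statistics (sample count, feature-wise mean and std) to the server.
    n = int(rng.integers(50, 500))
    data = rng.normal(loc=rng.normal(0, 2, n_features), scale=1.0,
                      size=(n, n_features))
    return {"id": worker_id, "n": n,
            "mean": data.mean(axis=0), "std": data.std(axis=0)}

def niid_index(stats, ref_mean, ref_std):
    # Hypothetical N-IID index: distance of a worker's reported statistics
    # from the sample-weighted population statistics. The paper's exact
    # formula is not given in the abstract; this is an assumed stand-in.
    return (np.linalg.norm(stats["mean"] - ref_mean)
            + np.linalg.norm(stats["std"] - ref_std))

def nifl_select(workers, stat_frac=0.5, select_frac=0.3):
    # Steps 1-2: workers report sample counts; keep the top stat_frac
    # fraction by count and request their mean/std statistics.
    by_count = sorted(workers, key=lambda w: w["n"], reverse=True)
    candidates = by_count[: max(1, int(len(workers) * stat_frac))]
    # Step 3: build reference statistics as a sample-weighted average of
    # the reported statistics (no raw data is ever collected).
    weights = np.array([w["n"] for w in candidates], dtype=float)
    weights /= weights.sum()
    ref_mean = sum(wt * w["mean"] for wt, w in zip(weights, candidates))
    ref_std = sum(wt * w["std"] for wt, w in zip(weights, candidates))
    # Steps 4-5: score candidates with the N-IID index and select those
    # closest to the population statistics (assumed selection rule); the
    # server would then broadcast the global model to these workers.
    scored = sorted(candidates,
                    key=lambda w: niid_index(w, ref_mean, ref_std))
    k = max(1, int(len(workers) * select_frac))
    return [w["id"] for w in scored[:k]]

workers = [make_worker(i) for i in range(20)]
print("selected workers:", nifl_select(workers))
```

In this sketch the two-stage filter mirrors the abstract's description: sample count prunes the pool cheaply before statistics are requested, so the index is computed only for data-rich candidates.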