Federated learning is a form of machine learning in which clients collaborate to train a shared model without exchanging raw data, which helps protect data privacy. One widely used algorithm is federated averaging (FedAvg), which aggregates locally trained models by weighted averaging. However, the clients’ data are often not independent and identically distributed (non-i.i.d.), so fixed averaging weights can be sub-optimal. A recent paper on arXiv.org therefore proposes a better way to aggregate models.
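To make the weighted-averaging step concrete, here is a minimal sketch of FedAvg-style aggregation with fixed weights proportional to each client's dataset size. The function name `fedavg` and the flat-list representation of model parameters are assumptions made for illustration; real systems average tensors layer by layer.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighting each by its dataset size.

    client_params: one flat parameter vector per client (hypothetical layout).
    client_sizes:  number of training samples held by each client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total dataset size must be positive")

    # Fixed aggregation weights: each client's share of the total data.
    weights = [n / total for n in client_sizes]

    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two clients; the second holds three times as much data as the first.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

The weights depend only on dataset sizes, so they stay the same in every round regardless of how different the clients' data distributions are.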
The proposed method learns the aggregation weights in a data-driven fashion, so they adapt to the underlying data and to the training progress. A communication-efficient algorithm is also proposed to learn these weights without violating the data privacy constraint. The results show that the suggested approach outperforms state-of-the-art approaches on two multi-institutional medical imaging studies with real-world datasets.
Federated learning (FL) enables collaborative model training while preserving each participant’s privacy, which is particularly beneficial to the medical field. FedAvg is a standard algorithm that uses fixed weights, often originating from the dataset sizes at each client, to aggregate the distributed learned models on a server during the FL process. However, non-identical data distribution across clients, known as the non-i.i.d. problem in FL, could make this assumption for setting fixed aggregation weights sub-optimal. In this work, we design a new data-driven approach, namely Auto-FedAvg, where aggregation weights are dynamically adjusted, depending on data distributions across data silos and the current training progress of the models. We disentangle the parameter set into two parts, local model parameters and global aggregation parameters, and update them iteratively with a communication-efficient algorithm. We first show the validity of our approach by outperforming state-of-the-art FL methods for image recognition on a heterogeneous data split of CIFAR-10. Furthermore, we demonstrate our algorithm’s effectiveness on two multi-institutional medical image analysis tasks, i.e., COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
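The idea of treating aggregation weights as learnable parameters can be illustrated with a toy sketch. This is not the paper's algorithm: the softmax parameterization, the quadratic loss against a hypothetical `target` vector, and all function names are assumptions chosen to keep the example small, and the communication-efficient, privacy-preserving update procedure described in the abstract is not reproduced here.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Map unconstrained logits to positive weights that sum to one."""
    m = max(logits)
    exps = [math.exp(a - m) for a in logits]
    z = sum(exps)
    return [e / z for e in exps]


def aggregate(client_params: Sequence[Sequence[float]],
              logits: Sequence[float]) -> List[float]:
    """Global model = sum_k softmax(logits)_k * client_params[k]."""
    weights = softmax(logits)
    dim = len(client_params[0])
    return [
        sum(w * p[i] for w, p in zip(weights, client_params))
        for i in range(dim)
    ]


def learn_aggregation_logits(client_params: Sequence[Sequence[float]],
                             target: Sequence[float],
                             steps: int = 500,
                             lr: float = 0.1) -> List[float]:
    """Gradient descent on the logits for a toy loss 0.5 * ||theta - target||^2.

    `target` is a stand-in (assumption) for whatever signal scores the
    aggregated model; with local model parameters held fixed, only the
    aggregation parameters are updated here.
    """
    logits = [0.0] * len(client_params)  # start from uniform weights
    for _ in range(steps):
        s = softmax(logits)
        theta = aggregate(client_params, logits)
        residual = [t - y for t, y in zip(theta, target)]
        # g[k] = dLoss/ds_k = residual . client_params[k]
        g = [sum(r * x for r, x in zip(residual, p)) for p in client_params]
        g_bar = sum(sk * gk for sk, gk in zip(s, g))
        # Chain rule through softmax: dLoss/dlogit_j = s_j * (g_j - g_bar)
        logits = [a - lr * sj * (gj - g_bar)
                  for a, sj, gj in zip(logits, s, g)]
    return logits


if __name__ == "__main__":
    clients = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    learned = learn_aggregation_logits(clients, target=[1.0, 1.0])
    print("weights:", softmax(learned))
    print("aggregated model:", aggregate(clients, learned))
```

Keeping the weights behind a softmax is one simple way to ensure they remain a valid convex combination while being updated by gradient descent; other constraints are possible.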
Research paper: Xia, Y., et al., “Auto-FedAvg: Learnable Federated Averaging for Multi-Institutional Medical Image Segmentation”, 2021. Link: https://arxiv.org/abs/2104.10195v1