Abstract
Cross-silo federated learning (FL) enables the development of machine learning models on datasets distributed across data centers such as hospitals and clinical research laboratories. However, recent research has found that current FL algorithms face a trade-off between local and global performance when confronted with distribution shifts. Specifically, personalized FL methods tend to overfit to local data, which leaves the local model in a sharp valley of the loss landscape and inhibits its ability to generalize to out-of-distribution data. In this paper, we propose a novel federated model soup method (i.e., selective interpolation of model parameters) to optimize the trade-off between local and global performance. During the federated training phase, each client maintains its own global model pool by monitoring the performance of the model interpolated between the local and global models. This allows us to alleviate overfitting and seek flat minima, which can significantly improve the model's generalization performance. We evaluate our method on retinal and pathological image classification tasks, where it achieves significant improvements in out-of-distribution generalization. Our code is available at https://github.com/ubc-tea/FedSoup.
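The selective interpolation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes parameters are stored as plain dictionaries of floats, that a received global model is kept in the client's pool only if the interpolated model scores at least as well on held-out data, and that interpolation uses a fixed coefficient `alpha`. The names `Params`, `average`, `interpolate`, `update_pool`, and `toy_score` are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical container: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def average(models: List[Params]) -> Params:
    """Uniformly average a non-empty list of models, parameter by parameter."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def interpolate(local: Params, pooled: Params, alpha: float = 0.5) -> Params:
    """Linear interpolation: alpha * local + (1 - alpha) * pooled."""
    return {
        name: [alpha * l + (1.0 - alpha) * g
               for l, g in zip(local[name], pooled[name])]
        for name in local
    }


def update_pool(
    pool: List[Params],
    candidate: Params,
    local: Params,
    evaluate: Callable[[Params], float],
    alpha: float = 0.5,
) -> List[Params]:
    """Keep `candidate` only if the interpolated model does not get worse.

    `evaluate` is an assumed held-out scoring function (higher is better).
    """
    trial_pool = pool + [candidate]
    trial_score = evaluate(interpolate(local, average(trial_pool), alpha))
    if pool:
        current_score = evaluate(interpolate(local, average(pool), alpha))
    else:
        current_score = float("-inf")  # an empty pool accepts the first model
    return trial_pool if trial_score >= current_score else pool


# Toy usage: one scalar weight, with the best held-out score at w = 1.0.
local = {"w": [0.0]}


def toy_score(params: Params) -> float:
    return -(params["w"][0] - 1.0) ** 2


pool: List[Params] = []
for received in ({"w": [2.0]}, {"w": [10.0]}, {"w": [2.0]}):
    pool = update_pool(pool, received, local, toy_score)

soup = interpolate(local, average(pool))
```

In this toy run the second received model (`w = 10.0`) is rejected because adding it to the pool lowers the held-out score of the interpolated model, while the first and third are kept.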
This work is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), Public Safety Canada, Compute Canada, and the National Natural Science Foundation of China (Project No. 62201485).
Notes
1. In the heterogeneous setting (\(\mathcal{D}_i \ne \mathcal{D}_j\)), \(\mathcal{D}_j\) is viewed as the OOD data for client \(i\).
2. We take a random subset from the original Camelyon17 dataset to match the small data settings in FL [18].
References
Bándi, P., et al.: From detection of individual metastases to classification of lymph node status at the patient level: the CAMELYON17 challenge. IEEE Trans. Med. Imaging 38(2), 550–560 (2019)
Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., Vaughan, J.W.: A theory of learning from different domains. Mach. Learn. 79(1–2), 151–175 (2010)
Cha, J., et al.: SWAD: domain generalization by seeking flat minima. In: NeurIPS, pp. 22405–22418 (2021)
Chaudhari, P., et al.: Entropy-SGD: biasing gradient descent into wide valleys. In: ICLR (Poster). OpenReview.net (2017)
Collins, L., Hassani, H., Mokhtari, A., Shakkottai, S.: Exploiting shared representations for personalized federated learning. In: ICML. Proceedings of Machine Learning Research, vol. 139, pp. 2089–2099. PMLR (2021)
Dayan, I., et al.: Federated learning for predicting clinical outcomes in patients with COVID-19. Nat. Med. 27(10), 1735–1743 (2021)
Dou, Q., So, T.Y., Jiang, M., Liu, Q., Vardhanabhuti, V., Kaissis, G., et al.: Federated deep learning for detecting COVID-19 lung abnormalities in CT: a privacy-preserving multinational validation study. NPJ Digit. Med. 4(1), 1–11 (2021)
Foret, P., Kleiner, A., Mobahi, H., Neyshabur, B.: Sharpness-aware minimization for efficiently improving generalization. In: ICLR. OpenReview.net (2021)
Fumero, F., Alayón, S., Sánchez, J.L., Sigut, J.F., González-Hernández, M.: RIM-ONE: an open retinal image database for optic nerve evaluation. In: CBMS, pp. 1–6. IEEE Computer Society (2011)
Ilharco, G., et al.: Patching open-vocabulary models by interpolating weights. CoRR abs/2208.05592 (2022)
Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D.P., Wilson, A.G.: Averaging weights leads to wider optima and better generalization. In: UAI, pp. 876–885. AUAI Press (2018)
Jiang, M., Yang, H., Li, X., Liu, Q., Heng, P.A., Dou, Q.: Dynamic bank learning for semi-supervised federated image diagnosis with class imbalance. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) Medical Image Computing and Computer Assisted Intervention - MICCAI 2022. MICCAI 2022. LNCS, vol. 13433, pp. 196–206. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16437-8_19
Kaddour, J., Liu, L., Silva, R., Kusner, M.J.: Questions for flat-minima optimization of modern neural networks. CoRR abs/2202.00661 (2022)
Li, Q., He, B., Song, D.: Model-contrastive federated learning. In: CVPR, pp. 10713–10722. Computer Vision Foundation/IEEE (2021)
Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., Smith, V.: Federated optimization in heterogeneous networks. In: MLSys. mlsys.org (2020)
Li, X., Gu, Y., Dvornek, N., Staib, L.H., Ventola, P., Duncan, J.S.: Multi-site fMRI analysis using privacy-preserving federated learning and domain adaptation: ABIDE results. Med. Image Anal. 65, 101765 (2020)
Li, X., Jiang, M., Zhang, X., Kamp, M., Dou, Q.: FedBN: federated learning on non-IID features via local batch normalization. In: ICLR. OpenReview.net (2021)
McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: AISTATS. Proceedings of Machine Learning Research, vol. 54, pp. 1273–1282. PMLR (2017)
Mirzadeh, S., Farajtabar, M., Görür, D., Pascanu, R., Ghasemzadeh, H.: Linear mode connectivity in multitask and continual learning. In: ICLR. OpenReview.net (2021)
Oh, J., Kim, S., Yun, S.: FedBABU: towards enhanced representation for federated image classification. CoRR abs/2106.06042 (2021)
Orlando, J.I., et al.: REFUGE challenge: a unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Med. Image Anal. 59, 101570 (2020)
Pati, S., et al.: Federated learning enables big data for rare cancer boundary detection. Nat. Commun. 13(1), 7346 (2022)
Qu, Z., Li, X., Duan, R., Liu, Y., Tang, B., Lu, Z.: Generalized federated learning via sharpness aware minimization. In: ICML. Proceedings of Machine Learning Research, vol. 162, pp. 18250–18280. PMLR (2022)
Ramé, A., Ahuja, K., Zhang, J., Cord, M., Bottou, L., Lopez-Paz, D.: Recycling diverse models for out-of-distribution generalization. CoRR abs/2212.10445 (2022)
Rieke, N., et al.: The future of digital health with federated learning. NPJ Digit. Med. 3(1), 1–7 (2020)
Sivaswamy, J., Krishnadas, S., Chakravarty, A., Joshi, G., Tabish, A.S., et al.: A comprehensive retinal image dataset for the assessment of glaucoma from the optic nerve head analysis. JSM Biomed. Imaging Data Papers 2(1), 1004 (2015)
Wortsman, M., et al.: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. In: ICML. Proceedings of Machine Learning Research, vol. 162, pp. 23965–23998. PMLR (2022)
Wu, S., et al.: Motley: benchmarking heterogeneity and personalization in federated learning. CoRR abs/2206.09262 (2022)
Yao, Z., Gholami, A., Keutzer, K., Mahoney, M.W.: PyHessian: neural networks through the lens of the Hessian. In: IEEE BigData, pp. 581–590. IEEE (2020)
Zhang, M., Sapra, K., Fidler, S., Yeung, S., Alvarez, J.M.: Personalized federated learning with first order model optimization. In: ICLR. OpenReview.net (2021)
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, M., Jiang, M., Dou, Q., Wang, Z., Li, X. (2023). FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14221. Springer, Cham. https://doi.org/10.1007/978-3-031-43895-0_30
DOI: https://doi.org/10.1007/978-3-031-43895-0_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43894-3
Online ISBN: 978-3-031-43895-0
eBook Packages: Computer Science, Computer Science (R0)