DOI: 10.1145/3658664.3659642
research-article
Open access

Making Federated Learning Accessible to Scientists: The AI4EOSC Approach

Published: 24 June 2024

Abstract

    Access to computing resources is a critical requirement for researchers in a wide range of fields, and it has become even more important with the rise of artificial intelligence and the training of machine learning and deep learning models. The AI4EOSC project responds to this need by delivering an enhanced set of advanced services and tools for developing artificial intelligence, machine learning and deep learning models, including federated learning, in the European Open Science Cloud (EOSC). Federated learning is a privacy-preserving machine learning technique that has changed the state of the art: instead of the classical centralized approach, models are trained in a decentralized way, without sharing raw data. In this work, we present the production implementation of a federated learning system based on the Flower framework that allows users without a technical background to run federated learning training within the AI4EOSC platform. The objective is to make training this type of architecture intuitive; to that end, a user-friendly dashboard has been implemented, and its development is reviewed. The frameworks and technologies used in the implementation are presented together with a usage example built from scratch, in order to demonstrate this functionality of the platform. Finally, two scenarios concerning client availability are analyzed.
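To make the idea of training "in a decentralized way, without sharing raw data" concrete, the following is a minimal sketch of federated averaging in plain Python. It is not the AI4EOSC or Flower implementation: the function names (`local_update`, `fed_avg`, `run_rounds`), the toy linear model, the learning rate and the client data are all assumptions chosen for illustration. The abstract does not say which two client-availability scenarios are analyzed; the "always available" and "one client missing per round" cases below are likewise assumed.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Weights = List[float]              # [slope, intercept] of a 1-D linear model
Sample = Tuple[float, float]       # (x, y)


def local_update(weights: Weights, data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> Tuple[Weights, int]:
    """Client side: a few gradient-descent steps on local data only.

    Returns the updated weights and the number of local samples.
    The raw samples never leave this function.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b], n


def fed_avg(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Server side: average client weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(wts[i] * n for wts, n in updates) / total for i in range(dim)]


def run_rounds(client_data: Sequence[Sequence[Sample]], rounds: int = 50,
               available: Optional[Callable[[int], Iterable[int]]] = None
               ) -> Weights:
    """Run federated rounds; `available(r)` lists the clients present in round r."""
    global_w: Weights = [0.0, 0.0]
    for r in range(rounds):
        ids = available(r) if available else range(len(client_data))
        updates = [local_update(global_w, client_data[i]) for i in ids]
        if updates:  # skip the round if no client showed up
            global_w = fed_avg(updates)
    return global_w


# Hypothetical toy data: three clients observe y = 2x + 1 on different x ranges.
clients = [
    [(x, 2 * x + 1) for x in (-1.0, 0.0, 1.0)],
    [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0)],
    [(x, 2 * x + 1) for x in (-2.0, -1.0, 0.0)],
]

# Scenario A (assumed): every client takes part in every round.
full_w = run_rounds(clients)

# Scenario B (assumed): one client is unavailable in each round, in rotation.
partial_w = run_rounds(clients, available=lambda r: [i for i in range(3) if i != r % 3])

print("all clients:", full_w)
print("intermittent:", partial_w)
```

Only model weights and sample counts cross the client/server boundary here, which is the property the abstract highlights. In a real deployment a framework such as Flower handles communication, client sampling and aggregation strategies; this sketch only shows the shape of the computation.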

    Published In

    IH&MMSec '24: Proceedings of the 2024 ACM Workshop on Information Hiding and Multimedia Security
    June 2024
    305 pages
    ISBN: 9798400706370
    DOI: 10.1145/3658664
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. artificial intelligence
    2. federated learning
    3. open science
    4. privacy-preserving
    5. software development

    Qualifiers

    • Research-article

    Funding Sources

    • European Union's Horizon Europe research and innovation programme

    Conference

    IH&MMSEC '24
    Acceptance Rates

    Overall Acceptance Rate 128 of 318 submissions, 40%
