-
Comparing modern techniques for querying data starting from top-k and skyline queries
Authors:
Fabio Patella
Abstract:
Making intelligent decisions over complex data by discovering a set of interesting options has become very important for users of modern applications. Consequently, researchers are studying new techniques to overcome the limitations of traditional ways of querying databases, such as top-k queries and skyline queries. Over the past few years, new methods have been developed, such as Flexible Skylines, Regret Minimization, and Skyline ordering/ranking. The aim of this survey is to describe these techniques and some of their possible variants, comparing them and explaining how they improve on traditional methods.
Submitted 6 June, 2022;
originally announced June 2022.
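The two traditional query types named in the abstract above can be illustrated with a minimal sketch. It is not taken from the survey: the `hotels` data, the "lower is better" convention, and the linear scoring function are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every attribute and strictly
    better on at least one (assumption: lower values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def top_k(points: Sequence[Point], weights: Sequence[float], k: int) -> List[Point]:
    """Top-k under a linear scoring function (lowest weighted sum first)."""
    def score(p: Point) -> float:
        return sum(w * x for w, x in zip(weights, p))
    return sorted(points, key=score)[:k]


# Hypothetical data: (price, distance to the beach in km).
hotels = [(50.0, 3.0), (80.0, 1.0), (60.0, 2.5), (90.0, 2.0), (55.0, 3.5)]

print(skyline(hotels))                 # all non-dominated hotels
print(top_k(hotels, (1.0, 30.0), 2))   # best two for one choice of weights
```

The sketch shows the trade-off the abstract alludes to: a skyline needs no weights but gives no control over the size of the result, while top-k returns exactly k items but depends on the user supplying a scoring function.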
-
Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation
Authors:
Pengfei Guo,
Dong Yang,
Ali Hatamizadeh,
An Xu,
Ziyue Xu,
Wenqi Li,
Can Zhao,
Daguang Xu,
Stephanie Harmon,
Evrim Turkbey,
Baris Turkbey,
Bradford Wood,
Francesca Patella,
Elvira Stellato,
Gianpaolo Carrafiello,
Vishal M. Patel,
Holger R. Roth
Abstract:
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, when client data distributions are heterogeneous, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications because they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust the hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
Submitted 31 August, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Auto-FedAvg: Learnable Federated Averaging for Multi-Institutional Medical Image Segmentation
Authors:
Yingda Xia,
Dong Yang,
Wenqi Li,
Andriy Myronenko,
Daguang Xu,
Hirofumi Obinata,
Hitoshi Mori,
Peng An,
Stephanie Harmon,
Evrim Turkbey,
Baris Turkbey,
Bradford Wood,
Francesca Patella,
Elvira Stellato,
Gianpaolo Carrafiello,
Anna Ierardi,
Alan Yuille,
Holger Roth
Abstract:
Federated learning (FL) enables collaborative model training while preserving each participant's privacy, which is particularly beneficial to the medical field. FedAvg is a standard algorithm that uses fixed weights, often derived from the dataset sizes at each client, to aggregate the distributed learned models on a server during the FL process. However, non-identical data distribution across clients, known as the non-i.i.d. problem in FL, could make this assumption for setting fixed aggregation weights sub-optimal. In this work, we design a new data-driven approach, namely Auto-FedAvg, where aggregation weights are dynamically adjusted, depending on data distributions across data silos and the current training progress of the models. We disentangle the parameter set into two parts, local model parameters and global aggregation parameters, and update them iteratively with a communication-efficient algorithm. We first show the validity of our approach by outperforming state-of-the-art FL methods for image recognition on a heterogeneous data split of CIFAR-10. Furthermore, we demonstrate our algorithm's effectiveness on two multi-institutional medical image analysis tasks, i.e., COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
Submitted 20 April, 2021;
originally announced April 2021.
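The fixed-weight FedAvg baseline described in the abstract above can be sketched roughly as follows. This is only the standard size-weighted averaging step, not the learned aggregation of Auto-FedAvg; the function name `fedavg`, the dictionary-of-lists parameter layout, and the example numbers are assumptions for illustration.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Aggregate client models with fixed weights proportional to each
    client's dataset size (the baseline the abstract describes)."""
    total = sum(client_sizes)
    weights = [n / total for n in client_sizes]  # fixed, sums to 1

    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Weighted sum of the same parameter across all clients.
        aggregated[name] = [
            sum(w * params[name][i] for w, params in zip(weights, client_params))
            for i in range(length)
        ]
    return aggregated


# Hypothetical example: two clients holding 100 and 300 samples.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
global_model = fedavg(clients, [100, 300])
print(global_model)
```

Here the weights are 0.25 and 0.75 regardless of how the client data are distributed, which is the assumption the abstract argues can be sub-optimal under non-i.i.d. data.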
-
Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan
Authors:
Dong Yang,
Ziyue Xu,
Wenqi Li,
Andriy Myronenko,
Holger R. Roth,
Stephanie Harmon,
Sheng Xu,
Baris Turkbey,
Evrim Turkbey,
Xiaosong Wang,
Wentao Zhu,
Gianpaolo Carrafiello,
Francesca Patella,
Maurizio Cariati,
Hirofumi Obinata,
Hitoshi Mori,
Kaku Tamura,
Peng An,
Bradford J. Wood,
Daguang Xu
Abstract:
The recent outbreak of COVID-19 has led to an urgent need for reliable diagnosis and management of SARS-CoV-2 infection. As a complementary tool, chest CT has been shown to reveal visual patterns characteristic of COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, which have shown promising results. However, domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models. In this work, we attempt to find a solution for this challenge via federated and semi-supervised learning. A multi-national database consisting of 1704 scans from three countries is adopted to study the performance gap when training a model with one dataset and applying it to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. To handle the variability in both the data and annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need for sensitive data sharing, which makes it favorable for institutions and nations with strict regulatory policies on data privacy. Moreover, semi-supervision potentially reduces the annotation burden under a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.
Submitted 23 November, 2020;
originally announced November 2020.