Search | arXiv e-print repository

arXiv:2405.19458 [pdf, other]

MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection

Authors: Raman Dutt, Pedro Sanchez, Ondrej Bohdal, Sotirios A. Tsaftaris, Timothy Hospedales

Abstract: Diffusion models show a remarkable ability in generating images that closely mirror the training distribution. However, these models are prone to training data memorization, leading to significant privacy, ethical, and legal concerns, particularly in sensitive fields such as medical imaging. We hypothesize that memorization is driven by the overparameterization of deep models, suggesting that regu… ▽ More Diffusion models show a remarkable ability in generating images that closely mirror the training distribution. However, these models are prone to training data memorization, leading to significant privacy, ethical, and legal concerns, particularly in sensitive fields such as medical imaging. We hypothesize that memorization is driven by the overparameterization of deep models, suggesting that regularizing model capacity during fine-tuning could be an effective mitigation strategy. Parameter-efficient fine-tuning (PEFT) methods offer a promising approach to capacity control by selectively updating specific parameters. However, finding the optimal subset of learnable parameters that balances generation quality and memorization remains elusive. To address this challenge, we propose a bi-level optimization framework that guides automated parameter selection by utilizing memorization and generation quality metrics as rewards. Our framework successfully identifies the optimal parameter set to be updated to satisfy the generation-memorization tradeoff. We perform our experiments for the specific task of medical image generation and outperform existing state-of-the-art training-time mitigation strategies by fine-tuning as few as 0.019% of model parameters. Furthermore, we show that the strategies learned through our framework are transferable across different datasets and domains. Our proposed framework is scalable to large datasets and agnostic to the choice of reward functions. Finally, we show that our framework can be combined with existing approaches for further memorization mitigation. △ Less

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.09318 [pdf, other]

Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: In the current cybersecurity landscape, protecting military devices such as communication and battlefield management systems against sophisticated cyber attacks is crucial. Malware exploits vulnerabilities through stealth methods, often evading traditional detection mechanisms such as software signatures. The application of ML/DL in vulnerability detection has been extensively explored in the lite… ▽ More In the current cybersecurity landscape, protecting military devices such as communication and battlefield management systems against sophisticated cyber attacks is crucial. Malware exploits vulnerabilities through stealth methods, often evading traditional detection mechanisms such as software signatures. The application of ML/DL in vulnerability detection has been extensively explored in the literature. However, current ML/DL vulnerability detection methods struggle with understanding the context and intent behind complex attacks. Integrating large language models (LLMs) with system call analysis offers a promising approach to enhance malware detection. This work presents a novel framework leveraging LLMs to classify malware based on system call data. The framework uses transfer learning to adapt pre-trained LLMs for malware detection. By retraining LLMs on a dataset of benign and malicious system calls, the models are refined to detect signs of malware activity. Experiments with a dataset of over 1TB of system calls demonstrate that models with larger context sizes, such as BigBird and Longformer, achieve superior accuracy and F1-Score of approximately 0.86. The results highlight the importance of context size in improving detection rates and underscore the trade-offs between computational complexity and performance. This approach shows significant potential for real-time detection in high-stakes environments, offering a robust solution to evolving cyber threats. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: Submitted to IEEE MILCOM 2024

arXiv:2404.12920 [pdf, other]

Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models

Authors: Konstantinos Vilouras, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Localizing the exact pathological regions in a given medical scan is an important imaging problem that requires a large amount of bounding box ground truth annotations to be accurately solved. However, there exist alternative, potentially weaker, forms of supervision, such as accompanying free-text reports, which are readily available. The task of performing localization with textual guidance is c… ▽ More Localizing the exact pathological regions in a given medical scan is an important imaging problem that requires a large amount of bounding box ground truth annotations to be accurately solved. However, there exist alternative, potentially weaker, forms of supervision, such as accompanying free-text reports, which are readily available. The task of performing localization with textual guidance is commonly referred to as phrase grounding. In this work, we use a publicly available Foundation Model, namely the Latent Diffusion Model, to solve this challenging task. This choice is supported by the fact that the Latent Diffusion Model, despite being generative in nature, contains mechanisms (cross-attention) that implicitly align visual and textual features, thus leading to intermediate representations that are suitable for the task at hand. In addition, we aim to perform this task in a zero-shot manner, i.e., without any further training on target data, meaning that the model's weights remain frozen. To this end, we devise strategies to select features and also refine them via post-processing without extra learnable parameters. We compare our proposed method with state-of-the-art approaches which explicitly enforce image-text alignment in a joint embedding space via contrastive learning. Results on a popular chest X-ray benchmark indicate that our method is competitive wih SOTA on different types of pathology, and even outperforms them on average in terms of two metrics (mean IoU and AUC-ROC). Source code will be released upon acceptance. △ Less

Submitted 17 July, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

Comments: 10 pages, 3 figures, submitted to IEEE J-BHI Special Issue on Foundation Models in Medical Imaging

arXiv:2404.12356 [pdf, other]

Improving the interpretability of GNN predictions through conformal-based graph sparsification

Authors: Pablo Sanchez-Martin, Kinaan Aamir Khan, Isabel Valera

Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN \emph{training} approach that j… ▽ More Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN \emph{training} approach that jointly i) finds the most predictive subgraph by removing edges and/or nodes -- -\emph{without making assumptions about the subgraph structure} -- while ii) optimizing the performance of the graph classification task. To that end, we rely on reinforcement learning to solve the resulting bi-level optimization with a reward function based on conformal predictions to account for the current in-training uncertainty of the classifier. Our empirical results on nine different graph classification datasets show that our method competes in performance with baselines while relying on significantly sparser subgraphs, leading to more interpretable GNN-based predictions. △ Less

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2403.20287 [pdf, other]

Benchmarking Counterfactual Image Generation

Authors: Thomas Melistas, Nikos Spyrou, Nefeli Gkouti, Pedro Sanchez, Athanasios Vlontzos, Yannis Panagakis, Giorgos Papanastasiou, Sotirios A. Tsaftaris

Abstract: Generative AI has revolutionised visual content editing, empowering users to effortlessly modify images and videos. However, not all edits are equal. To perform realistic edits in domains such as natural image or medical imaging, modifications must respect causal relationships inherent to the data generation process. Such image editing falls into the counterfactual image generation regime. Evaluat… ▽ More Generative AI has revolutionised visual content editing, empowering users to effortlessly modify images and videos. However, not all edits are equal. To perform realistic edits in domains such as natural image or medical imaging, modifications must respect causal relationships inherent to the data generation process. Such image editing falls into the counterfactual image generation regime. Evaluating counterfactual image generation is substantially complex: not only it lacks observable ground truths, but also requires adherence to causal constraints. Although several counterfactual image generation methods and evaluation metrics exist, a comprehensive comparison within a unified setting is lacking. We present a comparison framework to thoroughly benchmark counterfactual image generation methods. We integrate all models that have been used for the task at hand and expand them to novel datasets and causal graphs, demonstrating the superiority of Hierarchical VAEs across most datasets and metrics. Our framework is implemented in a user-friendly Python package that can be extended to incorporate additional SCMs, causal methods, generative models, and datasets for the community to build on. △ Less

Submitted 10 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.17776 [pdf, other]

Exploring the Boundaries of Ambient Awareness in Twitter

Authors: Pablo Sanchez-Martin, Sonja Utz, Isabel Valera

Abstract: Ambient awareness refers to the ability of social media users to obtain knowledge about who knows what (i.e., users' expertise) in their network, by simply being exposed to other users' content (e.g, tweets on Twitter). Previous work, based on user surveys, reveals that individuals self-report ambient awareness only for parts of their networks. However, it is unclear whether it is their limited co… ▽ More Ambient awareness refers to the ability of social media users to obtain knowledge about who knows what (i.e., users' expertise) in their network, by simply being exposed to other users' content (e.g, tweets on Twitter). Previous work, based on user surveys, reveals that individuals self-report ambient awareness only for parts of their networks. However, it is unclear whether it is their limited cognitive capacity or the limited exposure to diagnostic tweets (i.e., online content) that prevents people from developing ambient awareness for their complete network. In this work, we focus on in-wall ambient awareness (IWAA) in Twitter and conduct a two-step data-driven analysis, that allows us to explore to which extent IWAA is likely, or even possible. First, we rely on reactions (e.g., likes), as strong evidence of users being aware of experts in Twitter. Unfortunately, such strong evidence can be only measured for active users, which represent the minority in the network. Thus to study the boundaries of IWAA to a larger extent, in the second part of our analysis, we instead focus on the passive exposure to content generated by other users -- which we refer to as in-wall visibility. This analysis shows that (in line with \citet{levordashka2016ambient}) only for a subset of users IWAA is plausible, while for the majority it is unlikely, if even possible, to develop IWAA. We hope that our methodology paves the way for the emergence of data-driven approaches for the study of ambient awareness. △ Less

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2312.12865 [pdf, other]

RadEdit: stress-testing biomedical vision models via diffusion image editing

Authors: Fernando Pérez-García, Sam Bond-Taylor, Pedro P. Sanchez, Boris van Breugel, Daniel C. Castro, Harshita Sharma, Valentina Salvatelli, Maria T. A. Wetscherek, Hannah Richardson, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse

Abstract: Biomedical imaging datasets are often small and biased, meaning that real-world performance of predictive models can be substantially lower than expected from internal testing. This work proposes using generative image editing to simulate dataset shifts and diagnose failure modes of biomedical vision models; this can be used in advance of deployment to assess readiness, potentially reducing cost a… ▽ More Biomedical imaging datasets are often small and biased, meaning that real-world performance of predictive models can be substantially lower than expected from internal testing. This work proposes using generative image editing to simulate dataset shifts and diagnose failure modes of biomedical vision models; this can be used in advance of deployment to assess readiness, potentially reducing cost and patient harm. Existing editing methods can produce undesirable changes, with spurious correlations learned due to the co-occurrence of disease and treatment interventions, limiting practical applicability. To address this, we train a text-to-image diffusion model on multiple chest X-ray datasets and introduce a new editing method RadEdit that uses multiple masks, if present, to constrain changes and ensure consistency in the edited images. We consider three types of dataset shifts: acquisition shift, manifestation shift, and population shift, and demonstrate that our approach can diagnose failures and quantify model robustness without additional data collection, complementing more qualitative tools for explainable AI. △ Less

Submitted 3 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

arXiv:2311.07268 [pdf, other]

doi 10.1109/ACCESS.2024.3397732

ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers

Authors: Edison P. Velasco Sánchez, Miguel Ángel Muñoz-Bañón, Francisco A. Candelas, Santiago T. Puente, Fernando Torres

Abstract: While Visual Servoing is deeply studied to perform simple maneuvers, the literature does not commonly address complex cases where the target is far out of the camera's field of view (FOV) during the maneuver. For this reason, in this paper, we present ViKi-HyCo (Visual Servoing and Kinematic Hybrid-Controller). This approach generates the necessary maneuvers for the complex positioning of a non-ho… ▽ More While Visual Servoing is deeply studied to perform simple maneuvers, the literature does not commonly address complex cases where the target is far out of the camera's field of view (FOV) during the maneuver. For this reason, in this paper, we present ViKi-HyCo (Visual Servoing and Kinematic Hybrid-Controller). This approach generates the necessary maneuvers for the complex positioning of a non-holonomic mobile robot in outdoor environments. In this method, we use \hbox{LiDAR-camera} fusion to estimate objects bounding boxes using image and metrics modalities. With the multi-modality nature of our representation, we can automatically obtain a target for a visual servoing controller. At the same time, we also have a metric target, which allows us to hybridize with a kinematic controller. Given this hybridization, we can perform complex maneuvers even when the target is far away from the camera's FOV. The proposed approach does not require an object-tracking algorithm and can be applied to any robotic positioning task where its kinematic model is known. ViKi-HyCo has an error of 0.0428 \pm 0.0467 m in the X-axis and 0.0515 \pm 0.0323 m in the Y-axis at the end of a complete positioning task. △ Less

Submitted 16 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: This paper is published at the journal "IEEE Access"

Journal ref: In IEEE Access, vol. 12, pp. 65428-65443, May. 2024

arXiv:2311.07229 [pdf, other]

Understanding the Influence of Data Characteristics on the Performance of Point-of-Interest Recommendation Algorithms

Authors: Linus W. Dietz, Pablo Sánchez, Alejandro Bellogín

Abstract: The performance of recommendation algorithms is closely tied to key characteristics of the data sets they use, such as sparsity, popularity bias, and preference distributions. In this paper, we conduct a comprehensive explanatory analysis to shed light on the impact of a broad range of data characteristics within the point-of-interest (POI) recommendation domain. To accomplish this, we extend prio… ▽ More The performance of recommendation algorithms is closely tied to key characteristics of the data sets they use, such as sparsity, popularity bias, and preference distributions. In this paper, we conduct a comprehensive explanatory analysis to shed light on the impact of a broad range of data characteristics within the point-of-interest (POI) recommendation domain. To accomplish this, we extend prior methodologies used to characterize traditional recommendation problems by introducing new explanatory variables specifically relevant to POI recommendation. We subdivide a POI recommendation data set on New York City into domain-driven subsamples to measure the effect of varying these characteristics on different state-of-the-art POI recommendation algorithms in terms of accuracy, novelty, and item exposure. Our findings, obtained through the application of an explanatory framework employing multiple-regression models, reveal that the relevant independent variables encompass all categories of data characteristics and account for as much as $R^2 = $ 85-90\% of the accuracy and item exposure achieved by the algorithms. Our study reaffirms the pivotal role of prominent data characteristics, such as density, popularity bias, and the distribution of check-ins in POI recommendation. Additionally, we unveil novel factors, such as the proximity of user activity to the city center and the duration of user activity. In summary, our work reveals why certain POI recommendation algorithms excel in specific recommendation problems and, conversely, offers practical insights into which data characteristics should be modified (or explicitly recognized) to achieve better performance. △ Less

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.00476 [pdf, other]

Group Distributionally Robust Knowledge Distillation

Authors: Konstantinos Vilouras, Xiao Liu, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Knowledge distillation enables fast and effective transfer of features learned from a bigger model to a smaller one. However, distillation objectives are susceptible to sub-population shifts, a common scenario in medical imaging analysis which refers to groups/domains of data that are underrepresented in the training set. For instance, training models on health data acquired from multiple scanners… ▽ More Knowledge distillation enables fast and effective transfer of features learned from a bigger model to a smaller one. However, distillation objectives are susceptible to sub-population shifts, a common scenario in medical imaging analysis which refers to groups/domains of data that are underrepresented in the training set. For instance, training models on health data acquired from multiple scanners or hospitals can yield subpar performance for minority groups. In this paper, inspired by distributionally robust optimization (DRO) techniques, we address this shortcoming by proposing a group-aware distillation loss. During optimization, a set of weights is updated based on the per-group losses at a given iteration. This way, our method can dynamically focus on groups that have low performance during training. We empirically validate our method, GroupDistil on two benchmark datasets (natural images and cardiac MRIs) and show consistent improvement in terms of worst-group accuracy. △ Less

Submitted 1 November, 2023; originally announced November 2023.

Comments: 9 pages, MLMI workshop

arXiv:2310.20435 [pdf, other]

Assessing the Sustainability and Trustworthiness of Federated Learning Models

Authors: Alberto Huertas Celdran, Chao Feng, Pedro Miguel Sanchez Sanchez, Lynn Zumtaugwald, Gerome Bovet, Burkhard Stiller

Abstract: Artificial intelligence (AI) plays a pivotal role in various sectors, influencing critical decision-making processes in our daily lives. Within the AI landscape, novel AI paradigms, such as Federated Learning (FL), focus on preserving data privacy while collaboratively training AI models. In such a context, a group of experts from the European Commission (AI-HLEG) has identified sustainable AI as… ▽ More Artificial intelligence (AI) plays a pivotal role in various sectors, influencing critical decision-making processes in our daily lives. Within the AI landscape, novel AI paradigms, such as Federated Learning (FL), focus on preserving data privacy while collaboratively training AI models. In such a context, a group of experts from the European Commission (AI-HLEG) has identified sustainable AI as one of the key elements that must be considered to provide trustworthy AI. While existing literature offers several taxonomies and solutions for assessing the trustworthiness of FL models, a significant gap exists in considering sustainability and the carbon footprint associated with FL. Thus, this work introduces the sustainability pillar to the most recent and comprehensive trustworthy FL taxonomy, making this work the first to address all AI-HLEG requirements. The sustainability pillar assesses the FL system environmental impact, incorporating notions and metrics for hardware efficiency, federation complexity, and energy grid carbon intensity. Then, this work designs and implements an algorithm for evaluating the trustworthiness of FL models by incorporating the sustainability pillar. Extensive evaluations with the FederatedScope framework and various scenarios varying federation participants, complexities, hardware, and energy grids demonstrate the usefulness of the proposed solution. △ Less

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.04800 [pdf, other]

Towards Long-Range 3D Object Detection for Autonomous Vehicles

Authors: Ajinkya Khoche, Laura Pereira Sánchez, Nazre Batool, Sina Sharif Mansouri, Patric Jensfelt

Abstract: 3D object detection at long range is crucial for ensuring the safety and efficiency of self driving vehicles, allowing them to accurately perceive and react to objects, obstacles, and potential hazards from a distance. But most current state of the art LiDAR based methods are range limited due to sparsity at long range, which generates a form of domain gap between points closer to and farther away… ▽ More 3D object detection at long range is crucial for ensuring the safety and efficiency of self driving vehicles, allowing them to accurately perceive and react to objects, obstacles, and potential hazards from a distance. But most current state of the art LiDAR based methods are range limited due to sparsity at long range, which generates a form of domain gap between points closer to and farther away from the ego vehicle. Another related problem is the label imbalance for faraway objects, which inhibits the performance of Deep Neural Networks at long range. To address the above limitations, we investigate two ways to improve long range performance of current LiDAR based 3D detectors. First, we combine two 3D detection networks, referred to as range experts, one specializing at near to mid range objects, and one at long range 3D detection. To train a detector at long range under a scarce label regime, we further weigh the loss according to the labelled point's distance from ego vehicle. Second, we augment LiDAR scans with virtual points generated using Multimodal Virtual Points (MVP), a readily available image-based depth completion algorithm. Our experiments on the long range Argoverse2 (AV2) dataset indicate that MVP is more effective in improving long range performance, while maintaining a straightforward implementation. On the other hand, the range experts offer a computationally efficient and simpler alternative, avoiding dependency on image-based segmentation networks and perfect camera-LiDAR calibration. △ Less

Submitted 20 May, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

Comments: Accepted to Intelligent Vehicle Symposium (IV) 2024

arXiv:2308.05978 [pdf, other]

CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation

Authors: Chao Feng, Alberto Huertas Celdran, Pedro Miguel Sanchez Sanchez, Jan Kreischer, Jan von der Assen, Gerome Bovet, Gregorio Martinez Perez, Burkhard Stiller

Abstract: Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and the unsatisfactory time needed to learn right MTD techniques that are effective agains… ▽ More Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and the unsatisfactory time needed to learn right MTD techniques that are effective against a rising number of heterogeneous zero-day attacks. Thus, this work presents CyberForce, a framework that combines Federated and Reinforcement Learning (FRL) to collaboratively and privately learn suitable MTD techniques for mitigating zero-day attacks. CyberForce integrates device fingerprinting and anomaly detection to reward or penalize MTD mechanisms chosen by an FRL-based agent. The framework has been deployed and evaluated in a scenario consisting of ten physical devices of a real IoT platform affected by heterogeneous malware samples. A pool of experiments has demonstrated that CyberForce learns the MTD technique mitigating each attack faster than existing RL-based centralized approaches. In addition, when various devices are exposed to different attacks, CyberForce benefits from knowledge transfer, leading to enhanced performance and reduced learning time in comparison to recent works. Finally, different aggregation algorithms used during the agent learning process provide CyberForce with notable robustness to malicious attacks. △ Less

Submitted 8 September, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

Comments: 11 pages, 8 figures

arXiv:2308.03554 [pdf, other]

TemporalFED: Detecting Cyberattacks in Industrial Time-Series Data Using Decentralized Federated Learning

Authors: Ángel Luis Perales Gómez, Enrique Tomás Martínez Beltrán, Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán

Abstract: Industry 4.0 has brought numerous advantages, such as increasing productivity through automation. However, it also presents major cybersecurity issues such as cyberattacks affecting industrial processes. Federated Learning (FL) combined with time-series analysis is a promising cyberattack detection mechanism proposed in the literature. However, the fact of having a single point of failure and netw… ▽ More Industry 4.0 has brought numerous advantages, such as increasing productivity through automation. However, it also presents major cybersecurity issues such as cyberattacks affecting industrial processes. Federated Learning (FL) combined with time-series analysis is a promising cyberattack detection mechanism proposed in the literature. However, the fact of having a single point of failure and network bottleneck are critical challenges that need to be tackled. Thus, this article explores the benefits of the Decentralized Federated Learning (DFL) in terms of cyberattack detection and resource consumption. The work presents TemporalFED, a software module for detecting anomalies in industrial environments using FL paradigms and time series. TemporalFED incorporates three components: Time Series Conversion, Feature Engineering, and Time Series Stationary Conversion. To evaluate TemporalFED, it was deployed on Fedstellar, a DFL framework. Then, a pool of experiments measured the detection performance and resource consumption in a chemical gas industrial environment with different time-series configurations, FL paradigms, and topologies. The results showcase the superiority of the configuration utilizing DFL and Semi-Decentralized Federated Learning (SDFL) paradigms, along with a fully connected topology, which achieved the best performance in anomaly detection. Regarding resource consumption, the configuration without feature engineering employed less bandwidth, CPU, and RAM than other configurations. △ Less

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2307.15208 [pdf, other]

Generative AI for Medical Imaging: extending the MONAI Framework

Authors: Walter H. L. Pinaya, Mark S. Graham, Eric Kerfoot, Petru-Daniel Tudosiu, Jessica Dafflon, Virginia Fernandez, Pedro Sanchez, Julia Wolleb, Pedro F. da Costa, Ashay Patel, Hyungjin Chung, Can Zhao, Wei Peng, Zelong Liu, Xueyan Mei, Oeslle Lucena, Jong Chul Ye, Sotirios A. Tsaftaris, Prerna Dogra, Andrew Feng, Marc Modat, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the comp… ▽ More Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the complexity of these models, their implementation and reproducibility can be difficult. This complexity can hinder progress, act as a use barrier, and dissuade the comparison of new methods with existing works. In this study, we present MONAI Generative Models, a freely available open-source platform that allows researchers and developers to easily train, evaluate, and deploy generative models and related applications. Our platform reproduces state-of-art studies in a standardised way involving different architectures (such as diffusion models, autoregressive transformers, and GANs), and provides pre-trained models for the community. We have implemented these models in a generalisable fashion, illustrating that their results can be extended to 2D or 3D scenarios, including medical images with different modalities (like CT, MRI, and X-Ray data) and from different anatomical areas. Finally, we adopt a modular and extensible approach, ensuring long-term maintainability and the extension of current applications for future features. △ Less

Submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.11730 [pdf, other]

Mitigating Communications Threats in Decentralized Federated Learning through Moving Target Defense

Authors: Enrique Tomás Martínez Beltrán, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: The rise of Decentralized Federated Learning (DFL) has enabled the training of machine learning models across federated participants, fostering decentralized model aggregation and reducing dependence on a server. However, this approach introduces unique communication security challenges that have yet to be thoroughly addressed in the literature. These challenges primarily originate from the decent… ▽ More The rise of Decentralized Federated Learning (DFL) has enabled the training of machine learning models across federated participants, fostering decentralized model aggregation and reducing dependence on a server. However, this approach introduces unique communication security challenges that have yet to be thoroughly addressed in the literature. These challenges primarily originate from the decentralized nature of the aggregation process, the varied roles and responsibilities of the participants, and the absence of a central authority to oversee and mitigate threats. Addressing these challenges, this paper first delineates a comprehensive threat model focused on DFL communications. In response to these identified risks, this work introduces a security module to counter communication-based attacks for DFL platforms. The module combines security techniques such as symmetric and asymmetric encryption with Moving Target Defense (MTD) techniques, including random neighbor selection and IP/port switching. The security module is implemented in a DFL platform, Fedstellar, allowing the deployment and monitoring of the federation. A DFL scenario with physical and virtual deployments have been executed, encompassing three security configurations: (i) a baseline without security, (ii) an encrypted configuration, and (iii) a configuration integrating both encryption and MTD techniques. The effectiveness of the security module is validated through experiments with the MNIST dataset and eclipse attacks. The results showed an average F1 score of 95%, with the most secure configuration resulting in CPU usage peaking at 68% (+-9%) in virtual deployments and network traffic reaching 480.8 MB (+-18 MB), effectively mitigating risks associated with eavesdropping or eclipse attacks. △ Less

Submitted 9 December, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

arXiv:2307.05704 [pdf, other]

A Causal Ordering Prior for Unsupervised Representation Learning

Authors: Avinash Kori, Pedro Sanchez, Konstantinos Vilouras, Ben Glocker, Sotirios A. Tsaftaris

Abstract: Unsupervised representation learning with variational inference relies heavily on independence assumptions over latent variables. Causal representation learning (CRL), however, argues that factors of variation in a dataset are, in fact, causally related. Allowing latent variables to be correlated, as a consequence of causal relationships, is more realistic and generalisable. So far, provably ident… ▽ More Unsupervised representation learning with variational inference relies heavily on independence assumptions over latent variables. Causal representation learning (CRL), however, argues that factors of variation in a dataset are, in fact, causally related. Allowing latent variables to be correlated, as a consequence of causal relationships, is more realistic and generalisable. So far, provably identifiable methods rely on: auxiliary information, weak labels, and interventional or even counterfactual data. Inspired by causal discovery with functional causal models, we propose a fully unsupervised representation learning method that considers a data generation process with a latent additive noise model (ANM). We encourage the latent space to follow a causal ordering via loss function based on the Hessian of the latent distribution. △ Less

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2306.15559 [pdf, other]

RansomAI: AI-powered Ransomware for Stealthy Encryption

Authors: Jan von der Assen, Alberto Huertas Celdrán, Janik Luechinger, Pedro Miguel Sánchez Sánchez, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Cybersecurity solutions have shown promising performance when detecting ransomware samples that use fixed algorithms and encryption rates. However, due to the current explosion of Artificial Intelligence (AI), sooner than later, ransomware (and malware in general) will incorporate AI techniques to intelligently and dynamically adapt its encryption behavior to be undetected. It might result in inef… ▽ More Cybersecurity solutions have shown promising performance when detecting ransomware samples that use fixed algorithms and encryption rates. However, due to the current explosion of Artificial Intelligence (AI), sooner than later, ransomware (and malware in general) will incorporate AI techniques to intelligently and dynamically adapt its encryption behavior to be undetected. It might result in ineffective and obsolete cybersecurity solutions, but the literature lacks AI-powered ransomware to verify it. Thus, this work proposes RansomAI, a Reinforcement Learning-based framework that can be integrated into existing ransomware samples to adapt their encryption behavior and stay stealthy while encrypting files. RansomAI presents an agent that learns the best encryption algorithm, rate, and duration that minimizes its detection (using a reward mechanism and a fingerprinting intelligent detection system) while maximizing its damage function. The proposed framework was validated in a ransomware, Ransomware-PoC, that infected a Raspberry Pi 4, acting as a crowdsensor. A pool of experiments with Deep Q-Learning and Isolation Forest (deployed on the agent and detection system, respectively) has demonstrated that RansomAI evades the detection of Ransomware-PoC affecting the Raspberry Pi 4 in a few minutes with >90% accuracy. △ Less

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2306.09750 [pdf, other]

doi 10.1016/j.eswa.2023.122861

Fedstellar: A Platform for Decentralized Federated Learning

Authors: Enrique Tomás Martínez Beltrán, Ángel Luis Perales Gómez, Chao Feng, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: In 2016, Google proposed Federated Learning (FL) as a novel paradigm to train Machine Learning (ML) models across the participants of a federation while preserving data privacy. Since its birth, Centralized FL (CFL) has been the most used approach, where a central entity aggregates participants' models to create a global one. However, CFL presents limitations such as communication bottlenecks, sin… ▽ More In 2016, Google proposed Federated Learning (FL) as a novel paradigm to train Machine Learning (ML) models across the participants of a federation while preserving data privacy. Since its birth, Centralized FL (CFL) has been the most used approach, where a central entity aggregates participants' models to create a global one. However, CFL presents limitations such as communication bottlenecks, single point of failure, and reliance on a central server. Decentralized Federated Learning (DFL) addresses these issues by enabling decentralized model aggregation and minimizing dependency on a central entity. Despite these advances, current platforms training DFL models struggle with key issues such as managing heterogeneous federation network topologies. To overcome these challenges, this paper presents Fedstellar, a platform extended from p2pfl library and designed to train FL models in a decentralized, semi-decentralized, and centralized fashion across diverse federations of physical or virtualized devices. The Fedstellar implementation encompasses a web application with an interactive graphical interface, a controller for deploying federations of nodes using physical or virtual devices, and a core deployed on each device which provides the logic needed to train, aggregate, and communicate in the network. The effectiveness of the platform has been demonstrated in two scenarios: a physical deployment involving single-board devices such as Raspberry Pis for detecting cyberattacks, and a virtualized deployment comparing various FL approaches in a controlled environment using MNIST and CIFAR-10 datasets. In both scenarios, Fedstellar demonstrated consistent performance and adaptability, achieving F1 scores of 91%, 98%, and 91.2% using DFL for detecting cyberattacks and classifying MNIST and CIFAR-10, respectively, reducing training time by 32% compared to centralized approaches. △ Less

Submitted 8 April, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.08495 [pdf, other]

Single-board Device Individual Authentication based on Hardware Performance and Autoencoder Transformer Models

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: The proliferation of the Internet of Things (IoT) has led to the emergence of crowdsensing applications, where a multitude of interconnected devices collaboratively collect and analyze data. Ensuring the authenticity and integrity of the data collected by these devices is crucial for reliable decision-making and maintaining trust in the system. Traditional authentication methods are often vulnerab… ▽ More The proliferation of the Internet of Things (IoT) has led to the emergence of crowdsensing applications, where a multitude of interconnected devices collaboratively collect and analyze data. Ensuring the authenticity and integrity of the data collected by these devices is crucial for reliable decision-making and maintaining trust in the system. Traditional authentication methods are often vulnerable to attacks or can be easily duplicated, posing challenges to securing crowdsensing applications. Besides, current solutions leveraging device behavior are mostly focused on device identification, which is a simpler task than authentication. To address these issues, an individual IoT device authentication framework based on hardware behavior fingerprinting and Transformer autoencoders is proposed in this work. This solution leverages the inherent imperfections and variations in IoT device hardware to differentiate between devices with identical specifications. By monitoring and analyzing the behavior of key hardware components, such as the CPU, GPU, RAM, and Storage on devices, unique fingerprints for each device are created. The performance samples are considered as time series data and used to train outlier detection transformer models, one per device and aiming to model its normal data distribution. Then, the framework is validated within a spectrum crowdsensing system leveraging Raspberry Pi devices. After a pool of experiments, the model from each device is able to individually authenticate it between the 45 devices employed for validation. An average True Positive Rate (TPR) of 0.74+-0.13 and an average maximum False Positive Rate (FPR) of 0.06+-0.09 demonstrate the effectiveness of this approach in enhancing authentication, security, and trust in crowdsensing applications. △ Less

Submitted 11 November, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

arXiv:2306.07783 [pdf, other]

Compositionally Equivariant Representation Learning

Authors: Xiao Liu, Pedro Sanchez, Spyridon Thermos, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Deep learning models often need sufficient supervision (i.e. labelled data) in order to be trained effectively. By contrast, humans can swiftly learn to identify important anatomy in medical images like MRI and CT scans, with minimal guidance. This recognition capability easily generalises to new images from different medical facilities and to new tasks in different settings. This rapid and genera… ▽ More Deep learning models often need sufficient supervision (i.e. labelled data) in order to be trained effectively. By contrast, humans can swiftly learn to identify important anatomy in medical images like MRI and CT scans, with minimal guidance. This recognition capability easily generalises to new images from different medical facilities and to new tasks in different settings. This rapid and generalisable learning ability is largely due to the compositional structure of image patterns in the human brain, which are not well represented in current medical models. In this paper, we study the utilisation of compositionality in learning more interpretable and generalisable representations for medical image segmentation. Overall, we propose that the underlying generative factors that are used to generate the medical images satisfy compositional equivariance property, where each factor is compositional (e.g. corresponds to the structures in human anatomy) and also equivariant to the task. Hence, a good representation that approximates well the ground truth factor has to be compositionally equivariant. By modelling the compositional representations with learnable von-Mises-Fisher (vMF) kernels, we explore how different design and learning biases can be used to enforce the representations to be more compositionally equivariant under un-, weakly-, and semi-supervised settings. Extensive results show that our methods achieve the best performance over several strong baselines on the task of semi-supervised domain-generalised medical image segmentation. Code will be made publicly available upon acceptance at https://github.com/vios-s. △ Less

Submitted 17 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: Submitted. 10 pages. arXiv admin note: text overlap with arXiv:2206.14538

arXiv:2306.01322 [pdf, other]

Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models

Authors: Virginia Fernandez, Pedro Sanchez, Walter Hugo Lopez Pinaya, Grzegorz Jacenków, Sotirios A. Tsaftaris, Jorge Cardoso

Abstract: Knowledge distillation in neural networks refers to compressing a large model or dataset into a smaller version of itself. We introduce Privacy Distillation, a framework that allows a text-to-image generative model to teach another model without exposing it to identifiable data. Here, we are interested in the privacy issue faced by a data provider who wishes to share their data via a multimodal ge… ▽ More Knowledge distillation in neural networks refers to compressing a large model or dataset into a smaller version of itself. We introduce Privacy Distillation, a framework that allows a text-to-image generative model to teach another model without exposing it to identifiable data. Here, we are interested in the privacy issue faced by a data provider who wishes to share their data via a multimodal generative model. A question that immediately arises is ``How can a data provider ensure that the generative model is not leaking identifiable information about a patient?''. Our solution consists of (1) training a first diffusion model on real data (2) generating a synthetic dataset using this model and filtering it to exclude images with a re-identifiability risk (3) training a second diffusion model on the filtered synthetic data only. We showcase that datasets sampled from models trained with privacy distillation can effectively reduce re-identification risk whilst maintaining downstream performance. △ Less

Submitted 2 June, 2023; originally announced June 2023.

arXiv:2305.08252 [pdf, other]

Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity

Authors: Raman Dutt, Linus Ericsson, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy Hospedales

Abstract: Foundation models have significantly advanced medical image analysis through the pre-train fine-tune paradigm. Among various fine-tuning algorithms, Parameter-Efficient Fine-Tuning (PEFT) is increasingly utilized for knowledge transfer across diverse tasks, including vision-language and text-to-image generation. However, its application in medical image analysis is relatively unexplored due to the… ▽ More Foundation models have significantly advanced medical image analysis through the pre-train fine-tune paradigm. Among various fine-tuning algorithms, Parameter-Efficient Fine-Tuning (PEFT) is increasingly utilized for knowledge transfer across diverse tasks, including vision-language and text-to-image generation. However, its application in medical image analysis is relatively unexplored due to the lack of a structured benchmark for evaluating PEFT methods. This study fills this gap by evaluating 17 distinct PEFT algorithms across convolutional and transformer-based networks on image classification and text-to-image generation tasks using six medical datasets of varying size, modality, and complexity. Through a battery of over 700 controlled experiments, our findings demonstrate PEFT's effectiveness, particularly in low data regimes common in medical imaging, with performance gains of up to 22% in discriminative and generative tasks. These recommendations can assist the community in incorporating PEFT into their workflows and facilitate fair comparisons of future PEFT methods, ensuring alignment with advancements in other areas of machine learning and AI. △ Less

Submitted 10 June, 2024; v1 submitted 14 May, 2023; originally announced May 2023.

Comments: Accepted as Oral Presentation at MIDL 2024

arXiv:2302.09844 [pdf, other]

FederatedTrust: A Solution for Trustworthy Federated Learning

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Ning Xie, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: The rapid expansion of the Internet of Things (IoT) and Edge Computing has presented challenges for centralized Machine and Deep Learning (ML/DL) methods due to the presence of distributed data silos that hold sensitive information. To address concerns regarding data privacy, collaborative and privacy-preserving ML/DL techniques like Federated Learning (FL) have emerged. However, ensuring data pri… ▽ More The rapid expansion of the Internet of Things (IoT) and Edge Computing has presented challenges for centralized Machine and Deep Learning (ML/DL) methods due to the presence of distributed data silos that hold sensitive information. To address concerns regarding data privacy, collaborative and privacy-preserving ML/DL techniques like Federated Learning (FL) have emerged. However, ensuring data privacy and performance alone is insufficient since there is a growing need to establish trust in model predictions. Existing literature has proposed various approaches on trustworthy ML/DL (excluding data privacy), identifying robustness, fairness, explainability, and accountability as important pillars. Nevertheless, further research is required to identify trustworthiness pillars and evaluation metrics specifically relevant to FL models, as well as to develop solutions that can compute the trustworthiness level of FL models. This work examines the existing requirements for evaluating trustworthiness in FL and introduces a comprehensive taxonomy consisting of six pillars (privacy, robustness, fairness, explainability, accountability, and federation), along with over 30 metrics for computing the trustworthiness of FL models. Subsequently, an algorithm named FederatedTrust is designed based on the pillars and metrics identified in the taxonomy to compute the trustworthiness score of FL models. A prototype of FederatedTrust is implemented and integrated into the learning process of FederatedScope, a well-established FL framework. Finally, five experiments are conducted using different configurations of FederatedScope to demonstrate the utility of FederatedTrust in computing the trustworthiness of FL models. Three experiments employ the FEMNIST dataset, and two utilize the N-BaIoT dataset considering a real-world IoT security use case. △ Less

Submitted 6 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

arXiv:2301.08330 [pdf, other]

The role of noise in denoising models for anomaly detection in medical images

Authors: Antanas Kascenas, Pedro Sanchez, Patrick Schrempf, Chaoyang Wang, William Clackett, Shadia S. Mikhael, Jeremy P. Voisey, Keith Goatman, Alexander Weir, Nicolas Pugeault, Sotirios A. Tsaftaris, Alison Q. O'Neil

Abstract: Pathological brain lesions exhibit diverse appearance in brain images, in terms of intensity, texture, shape, size, and location. Comprehensive sets of data and annotations are difficult to acquire. Therefore, unsupervised anomaly detection approaches have been proposed using only normal data for training, with the aim of detecting outlier anomalous voxels at test time. Denoising methods, for inst… ▽ More Pathological brain lesions exhibit diverse appearance in brain images, in terms of intensity, texture, shape, size, and location. Comprehensive sets of data and annotations are difficult to acquire. Therefore, unsupervised anomaly detection approaches have been proposed using only normal data for training, with the aim of detecting outlier anomalous voxels at test time. Denoising methods, for instance classical denoising autoencoders (DAEs) and more recently emerging diffusion models, are a promising approach, however naive application of pixelwise noise leads to poor anomaly detection performance. We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes, with similar noise parameter adjustments giving good performance for both DAEs and diffusion models. Visual inspection of the reconstructions suggests that the training noise influences the trade-off between the extent of the detail that is reconstructed and the extent of erasure of anomalies, both of which contribute to better anomaly detection performance. We validate our findings on two real-world datasets (tumor detection in brain MRI and hemorrhage/ischemia/tumor detection in brain CT), showing good detection on diverse anomaly appearances. Overall, we find that a DAE trained with coarse noise is a fast and simple method that gives state-of-the-art accuracy. Diffusion models applied to anomaly detection are as yet in their infancy and provide a promising avenue for further research. △ Less

Submitted 19 January, 2023; originally announced January 2023.

Comments: Submitted to Medical Image Analysis special issue for MIDL 2022

MSC Class: 68T99; 92C55; 68U10

arXiv:2212.14677 [pdf, other]

Adversarial attacks and defenses on ML- and hardware-based IoT device fingerprinting and identification

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: In the last years, the number of IoT devices deployed has suffered an undoubted explosion, reaching the scale of billions. However, some new cybersecurity issues have appeared together with this development. Some of these issues are the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the requirement for new… ▽ More In the last years, the number of IoT devices deployed has suffered an undoubted explosion, reaching the scale of billions. However, some new cybersecurity issues have appeared together with this development. Some of these issues are the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the requirement for new device identification mechanisms based on behavior monitoring. Besides, these solutions have recently leveraged Machine and Deep Learning techniques due to the advances in this field and the increase in processing capabilities. In contrast, attackers do not stay stalled and have developed adversarial attacks focused on context modification and ML/DL evaluation evasion applied to IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. In this sense, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. Then, previous techniques have been compared with the proposed architecture using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves previous solutions achieving a +0.96 average F1-Score and 0.8 minimum TPR for all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the previous model to test its robustness. A temperature-based context attack was not able to disrupt the identification. However, some ML/DL state-of-the-art evasion attacks were successful. Finally, adversarial training and model distillation defense techniques are selected to improve the model resilience to evasion attacks, without degrading its performance. △ Less

Submitted 30 December, 2022; originally announced December 2022.

arXiv:2212.14647 [pdf, other]

doi 10.1109/TIFS.2024.3402055

RL and Fingerprinting to Select Moving Target Defense Mechanisms for Zero-day Attacks in IoT

Authors: Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Jan von der Assen, Timo Schenk, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBC). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learnin… ▽ More Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBC). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize the MTD selection through trial and error, but the literature fails when i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBC's states, and iii) calculating the consumption of resources in SBC. To improve these limitations, the work at hand proposes an online RL-based framework to learn the correct MTD mechanisms mitigating heterogeneous zero-day attacks in SBC. The framework considers behavioral fingerprinting to represent SBCs' states and RL to learn MTD techniques that mitigate each malicious state. It has been deployed on a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. More in detail, the Raspberry Pi has been infected with different samples of command and control malware, rootkits, and ransomware to later select between four existing MTD techniques. A set of experiments demonstrated the suitability of the framework to learn proper MTD techniques mitigating all attacks (except a harmfulness rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM. △ Less

Submitted 30 December, 2022; originally announced December 2022.

arXiv:2211.08413 [pdf, other]

doi 10.1109/COMST.2023.3315746

Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and Challenges

Authors: Enrique Tomás Martínez Beltrán, Mario Quiles Pérez, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustwo… ▽ More In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustworthiness concerns affecting the entity responsible for the global model creation. Decentralized Federated Learning (DFL) emerged to address these concerns by promoting decentralized model aggregation and minimizing reliance on centralized architectures. However, despite the work done in DFL, the literature has not (i) studied the main aspects differentiating DFL and CFL; (ii) analyzed DFL frameworks to create and evaluate new solutions; and (iii) reviewed application scenarios using DFL. Thus, this article identifies and analyzes the main fundamentals of DFL in terms of federation architectures, topologies, communication mechanisms, security approaches, and key performance indicators. Additionally, the paper at hand explores existing mechanisms to optimize critical DFL fundamentals. Then, the most relevant features of the current DFL frameworks are reviewed and compared. After that, it analyzes the most used DFL application scenarios, identifying solutions based on the fundamentals and frameworks previously defined. Finally, the evolution of existing DFL solutions is studied to provide a list of trends, lessons learned, and open challenges. △ Less

Submitted 13 September, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

arXiv:2210.11501 [pdf, other]

Trust-as-a-Service: A reputation-enabled trust framework for 5G networks

Authors: José María Jorquera Valero, Pedro Miguel Sánchez Sánchez, Manuel Gil Pérez, Alberto Huertas Celdrán, Gregorio Martínez Pérez

Abstract: Trust, security, and privacy are three of the major pillars to assemble the fifth generation network and beyond. Despite such pillars are principally interconnected, they arise a multitude of challenges to be addressed separately. 5G ought to offer flexible and pervasive computing capabilities across multiple domains according to user demands and assuring trustworthy network providers. Distributed… ▽ More Trust, security, and privacy are three of the major pillars to assemble the fifth generation network and beyond. Despite such pillars are principally interconnected, they arise a multitude of challenges to be addressed separately. 5G ought to offer flexible and pervasive computing capabilities across multiple domains according to user demands and assuring trustworthy network providers. Distributed marketplaces expect to boost the trading of heterogeneous resources so as to enable the establishment of pervasive service chains between cross-domains. Nevertheless, the need for reliable parties as ``marketplace operators'' plays a pivotal role to achieving a trustworthy ecosystem. One of the principal blockages in managing foreseeable networks is the need of adapting previous trust models to accomplish the new network and business requirements. In this regard, this article is centered on trust management of 5G multi-party networks. The design of a reputation-based trust framework is proposed as a Trust-as-a-Service (TaaS) solution for any distributed multi-stakeholder environment where zero trust and zero-touch principles should be met. Besides, a literature review is also conducted to recognize the network and business requirements currently envisaged. Finally, the validation of the proposed trust framework is performed in a real research environment, the 5GBarcelona testbed, leveraging 12% of a 2.1GHz CPU with 20 cores and 2% of the 30GiB memory. In this regard, these outcomes reveal the feasibility of the TaaS solution in the context of determining reliable network operators. △ Less

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.11061 [pdf, other]

Analyzing the Robustness of Decentralized Horizontal and Vertical Federated Learning Architectures in a Non-IID Scenario

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Enrique Tomás Martínez Beltrán, Daniel Demeter, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Federated learning (FL) allows participants to collaboratively train machine and deep learning models while protecting data privacy. However, the FL paradigm still presents drawbacks affecting its trustworthiness since malicious participants could launch adversarial attacks against the training process. Related work has studied the robustness of horizontal FL scenarios under different attacks. How… ▽ More Federated learning (FL) allows participants to collaboratively train machine and deep learning models while protecting data privacy. However, the FL paradigm still presents drawbacks affecting its trustworthiness since malicious participants could launch adversarial attacks against the training process. Related work has studied the robustness of horizontal FL scenarios under different attacks. However, there is a lack of work evaluating the robustness of decentralized vertical FL and comparing it with horizontal FL architectures affected by adversarial attacks. Thus, this work proposes three decentralized FL architectures, one for horizontal and two for vertical scenarios, namely HoriChain, VertiChain, and VertiComb. These architectures present different neural networks and training protocols suitable for horizontal and vertical scenarios. Then, a decentralized, privacy-preserving, and federated use case with non-IID data to classify handwritten digits is deployed to evaluate the performance of the three architectures. Finally, a set of experiments computes and compares the robustness of the proposed architectures when they are affected by different data poisoning based on image watermarks and gradient poisoning adversarial attacks. The experiments show that even though particular configurations of both attacks can destroy the classification performance of the architectures, HoriChain is the most robust one. △ Less

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.07719 [pdf, other]

doi 10.1109/ICC45041.2023.10278951

A Lightweight Moving Target Defense Framework for Multi-purpose Malware Affecting IoT Devices

Authors: Jan von der Assen, Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Jordan Cedeño, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Malware affecting Internet of Things (IoT) devices is rapidly growing due to the relevance of this paradigm in real-world scenarios. Specialized literature has also detected a trend towards multi-purpose malware able to execute different malicious actions such as remote control, data leakage, encryption, or code hiding, among others. Protecting IoT devices against this kind of malware is challengi… ▽ More Malware affecting Internet of Things (IoT) devices is rapidly growing due to the relevance of this paradigm in real-world scenarios. Specialized literature has also detected a trend towards multi-purpose malware able to execute different malicious actions such as remote control, data leakage, encryption, or code hiding, among others. Protecting IoT devices against this kind of malware is challenging due to their well-known vulnerabilities and limitation in terms of CPU, memory, and storage. To improve it, the moving target defense (MTD) paradigm was proposed a decade ago and has shown promising results, but there is a lack of IoT MTD solutions dealing with multi-purpose malware. Thus, this work proposes four MTD mechanisms changing IoT devices' network, data, and runtime environment to mitigate multi-purpose malware. Furthermore, it presents a lightweight and IoT-oriented MTD framework to decide what, when, and how the MTD mechanisms are deployed. Finally, the efficiency and effectiveness of the framework and MTD mechanisms are evaluated in a real-world scenario with one IoT spectrum sensor affected by multi-purpose malware. △ Less

Submitted 14 October, 2022; originally announced October 2022.

arXiv:2210.06201 [pdf, other]

Diffusion Models for Causal Discovery via Topological Ordering

Authors: Pedro Sanchez, Xiao Liu, Alison Q O'Neil, Sotirios A. Tsaftaris

Abstract: Discovering causal relations from observational data becomes possible with additional assumptions such as considering the functional relations to be constrained as nonlinear with additive noise (ANM). Even with strong assumptions, causal discovery involves an expensive search problem over the space of directed acyclic graphs (DAGs). \emph{Topological ordering} approaches reduce the optimisation sp… ▽ More Discovering causal relations from observational data becomes possible with additional assumptions such as considering the functional relations to be constrained as nonlinear with additive noise (ANM). Even with strong assumptions, causal discovery involves an expensive search problem over the space of directed acyclic graphs (DAGs). \emph{Topological ordering} approaches reduce the optimisation space of causal discovery by searching over a permutation rather than graph space. For ANMs, the \emph{Hessian} of the data log-likelihood can be used for finding leaf nodes in a causal graph, allowing its topological ordering. However, existing computational methods for obtaining the Hessian still do not scale as the number of variables and the number of samples increase. Therefore, inspired by recent innovations in diffusion probabilistic models (DPMs), we propose \emph{DiffAN}\footnote{Implementation is available at \url{https://github.com/vios-s/DiffAN} .}, a topological ordering algorithm that leverages DPMs for learning a Hessian function. We introduce theory for updating the learned Hessian without re-training the neural network, and we show that computing with a subset of samples gives an accurate approximation of the ordering, which allows scaling to datasets with more samples and variables. We show empirically that our method scales exceptionally well to datasets with up to $500$ nodes and up to $10^5$ samples while still performing on par over small datasets with state-of-the-art causal discovery methods. Implementation is available at https://github.com/vios-s/DiffAN . △ Less

Submitted 26 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: Implementation is available at https://github.com/vios-s/DiffAN Published as a conference paper at ICLR 2023 https://openreview.net/forum?id=Idusfje4-Wq

Report number: ICLR 2023

Journal ref: The Eleventh International Conference on Learning Representations (2023)

arXiv:2208.03563 [pdf, other]

HSIC-InfoGAN: Learning Unsupervised Disentangled Representations by Maximising Approximated Mutual Information

Authors: Xiao Liu, Spyridon Thermos, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Learning disentangled representations requires either supervision or the introduction of specific model designs and learning constraints as biases. InfoGAN is a popular disentanglement framework that learns unsupervised disentangled representations by maximising the mutual information between latent representations and their corresponding generated images. Maximisation of mutual information is ach… ▽ More Learning disentangled representations requires either supervision or the introduction of specific model designs and learning constraints as biases. InfoGAN is a popular disentanglement framework that learns unsupervised disentangled representations by maximising the mutual information between latent representations and their corresponding generated images. Maximisation of mutual information is achieved by introducing an auxiliary network and training with a latent regression loss. In this short exploratory paper, we study the use of the Hilbert-Schmidt Independence Criterion (HSIC) to approximate mutual information between latent representation and image, termed HSIC-InfoGAN. Directly optimising the HSIC loss avoids the need for an additional auxiliary network. We qualitatively compare the level of disentanglement in each model, suggest a strategy to tune the hyperparameters of HSIC-InfoGAN, and discuss the potential of HSIC-InfoGAN for medical applications. △ Less

Submitted 6 August, 2022; originally announced August 2022.

Comments: MICCAI MAD Workshop 2022

arXiv:2207.12268 [pdf, other]

What is Healthy? Generative Counterfactual Diffusion for Lesion Localization

Authors: Pedro Sanchez, Antanas Kascenas, Xiao Liu, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Reducing the requirement for densely annotated masks in medical image segmentation is important due to cost constraints. In this paper, we consider the problem of inferring pixel-level predictions of brain lesions by only using image-level labels for training. By leveraging recent advances in generative diffusion probabilistic models (DPM), we synthesize counterfactuals of "How would a patient app… ▽ More Reducing the requirement for densely annotated masks in medical image segmentation is important due to cost constraints. In this paper, we consider the problem of inferring pixel-level predictions of brain lesions by only using image-level labels for training. By leveraging recent advances in generative diffusion probabilistic models (DPM), we synthesize counterfactuals of "How would a patient appear if X pathology was not present?". The difference image between the observed patient state and the healthy counterfactual can be used for inferring the location of pathology. We generate counterfactuals that correspond to the minimal change of the input such that it is transformed to healthy domain. This requires training with healthy and unhealthy data in DPMs. We improve on previous counterfactual DPMs by manipulating the generation process with implicit guidance along with attention conditioning instead of using classifiers. Code is available at https://github.com/vios-s/Diff-SCM. △ Less

Submitted 25 July, 2022; originally announced July 2022.

Comments: Accepted at the Deep Generative Models workshop at MICCAI 2022

arXiv:2207.11765 [pdf, other]

Neurosymbolic Repair for Low-Code Formula Languages

Authors: Rohan Bavishi, Harshit Joshi, José Pablo Cambronero Sánchez, Anna Fariha, Sumit Gulwani, Vu Le, Ivan Radicek, Ashish Tiwari

Abstract: Most users of low-code platforms, such as Excel and PowerApps, write programs in domain-specific formula languages to carry out nontrivial tasks. Often users can write most of the program they want, but introduce small mistakes that yield broken formulas. These mistakes, which can be both syntactic and semantic, are hard for low-code users to identify and fix, even though they can be resolved with… ▽ More Most users of low-code platforms, such as Excel and PowerApps, write programs in domain-specific formula languages to carry out nontrivial tasks. Often users can write most of the program they want, but introduce small mistakes that yield broken formulas. These mistakes, which can be both syntactic and semantic, are hard for low-code users to identify and fix, even though they can be resolved with just a few edits. We formalize the problem of producing such edits as the last-mile repair problem. To address this problem, we developed LaMirage, a LAst-MIle RepAir-engine GEnerator that combines symbolic and neural techniques to perform last-mile repair in low-code formula languages. LaMirage takes a grammar and a set of domain-specific constraints/rules, which jointly approximate the target language, and uses these to generate a repair engine that can fix formulas in that language. To tackle the challenges of localizing the errors and ranking the candidate repairs, LaMirage leverages neural techniques, whereas it relies on symbolic methods to generate candidate repairs. This combination allows LaMirage to find repairs that satisfy the provided grammar and constraints, and then pick the most natural repair. We compare LaMirage to state-of-the-art neural and symbolic approaches on 400 real Excel and PowerFx formulas, where LaMirage outperforms all baselines. We release these benchmarks to encourage subsequent work in low-code domains. △ Less

Submitted 24 July, 2022; originally announced July 2022.

arXiv:2206.14538 [pdf, other]

vMFNet: Compositionality Meets Domain-generalised Segmentation

Authors: Xiao Liu, Spyridon Thermos, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Training medical image segmentation models usually requires a large amount of labeled data. By contrast, humans can quickly learn to accurately recognise anatomy of interest from medical (e.g. MRI and CT) images with some limited guidance. Such recognition ability can easily generalise to new images from different clinical centres. This rapid and generalisable learning ability is mostly due to the… ▽ More Training medical image segmentation models usually requires a large amount of labeled data. By contrast, humans can quickly learn to accurately recognise anatomy of interest from medical (e.g. MRI and CT) images with some limited guidance. Such recognition ability can easily generalise to new images from different clinical centres. This rapid and generalisable learning ability is mostly due to the compositional structure of image patterns in the human brain, which is less incorporated in medical image segmentation. In this paper, we model the compositional components (i.e. patterns) of human anatomy as learnable von-Mises-Fisher (vMF) kernels, which are robust to images collected from different domains (e.g. clinical centres). The image features can be decomposed to (or composed by) the components with the composing operations, i.e. the vMF likelihoods. The vMF likelihoods tell how likely each anatomical part is at each position of the image. Hence, the segmentation mask can be predicted based on the vMF likelihoods. Moreover, with a reconstruction module, unlabeled data can also be used to learn the vMF kernels and likelihoods by recombining them to reconstruct the input image. Extensive experiments show that the proposed vMFNet achieves improved generalisation performance on two benchmarks, especially when annotations are limited. Code is publicly available at: https://github.com/vios-s/vMFNet. △ Less

Submitted 29 June, 2022; originally announced June 2022.

Comments: Accepted by MICCAI 2022

arXiv:2205.11402 [pdf, other]

Causal Machine Learning for Healthcare and Precision Medicine

Authors: Pedro Sanchez, Jeremy P. Voisey, Tian Xia, Hannah I. Watson, Alison Q. ONeil, Sotirios A. Tsaftaris

Abstract: Causal machine learning (CML) has experienced increasing popularity in healthcare. Beyond the inherent capabilities of adding domain knowledge into learning systems, CML provides a complete toolset for investigating how a system would react to an intervention (e.g.\ outcome given a treatment). Quantifying effects of interventions allows actionable decisions to be made whilst maintaining robustness… ▽ More Causal machine learning (CML) has experienced increasing popularity in healthcare. Beyond the inherent capabilities of adding domain knowledge into learning systems, CML provides a complete toolset for investigating how a system would react to an intervention (e.g.\ outcome given a treatment). Quantifying effects of interventions allows actionable decisions to be made whilst maintaining robustness in the presence of confounders. Here, we explore how causal inference can be incorporated into different aspects of clinical decision support (CDS) systems by using recent advances in machine learning. Throughout this paper, we use Alzheimer's disease (AD) to create examples for illustrating how CML can be advantageous in clinical scenarios. Furthermore, we discuss important challenges present in healthcare applications such as processing high-dimensional and unstructured data, generalisation to out-of-distribution samples, and temporal relationships, that despite the great effort from the research community remain to be solved. Finally, we review lines of research within causal representation learning, causal discovery and causal reasoning which offer the potential towards addressing the aforementioned challenges. △ Less

Submitted 31 May, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: 19 pages, 4 figures, 1 table

arXiv:2204.08516 [pdf, other]

doi 10.1016/j.iot.2023.100764

LwHBench: A low-level hardware component benchmark and dataset for Single Board Computers

Authors: Pedro Miguel Sánchez Sánchez, José María Jorquera Valero, Alberto Huertas Celdrán, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: In today's computing environment, where Artificial Intelligence (AI) and data processing are moving toward the Internet of Things (IoT) and Edge computing paradigms, benchmarking resource-constrained devices is a critical task to evaluate their suitability and performance. Between the employed devices, Single-Board Computers arise as multi-purpose and affordable systems. The literature has explore… ▽ More In today's computing environment, where Artificial Intelligence (AI) and data processing are moving toward the Internet of Things (IoT) and Edge computing paradigms, benchmarking resource-constrained devices is a critical task to evaluate their suitability and performance. Between the employed devices, Single-Board Computers arise as multi-purpose and affordable systems. The literature has explored Single-Board Computers performance when running high-level benchmarks specialized in particular application scenarios, such as AI or medical applications. However, lower-level benchmarking applications and datasets are needed to enable new Edge-based AI solutions for network, system and service management based on device and component performance, such as individual device identification. Thus, this paper presents LwHBench, a low-level hardware benchmarking application for Single-Board Computers that measures the performance of CPU, GPU, Memory and Storage taking into account the component constraints in these types of devices. LwHBench has been implemented for Raspberry Pi devices and run for 100 days on a set of 45 devices to generate an extensive dataset that allows the usage of AI techniques in scenarios where performance data can help in the device management process. Besides, to demonstrate the inter-scenario capability of the dataset, a series of AI-enabled use cases about device identification and context impact on performance are presented as exploration of the published data. Finally, the benchmark application has been adapted and applied to an agriculture-focused scenario where three RockPro64 devices are present. △ Less

Submitted 24 October, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

arXiv:2203.07815 [pdf, other]

Adversarial Counterfactual Augmentation: Application in Alzheimer's Disease Classification

Authors: Tian Xia, Pedro Sanchez, Chen Qin, Sotirios A. Tsaftaris

Abstract: Due to the limited availability of medical data, deep learning approaches for medical image analysis tend to generalise poorly to unseen data. Augmenting data during training with random transformations has been shown to help and became a ubiquitous technique for training neural networks. Here, we propose a novel adversarial counterfactual augmentation scheme that aims at finding the most \textit{… ▽ More Due to the limited availability of medical data, deep learning approaches for medical image analysis tend to generalise poorly to unseen data. Augmenting data during training with random transformations has been shown to help and became a ubiquitous technique for training neural networks. Here, we propose a novel adversarial counterfactual augmentation scheme that aims at finding the most \textit{effective} synthesised images to improve downstream tasks, given a pre-trained generative model. Specifically, we construct an adversarial game where we update the input \textit{conditional factor} of the generator and the downstream \textit{classifier} with gradient backpropagation alternatively and iteratively. This can be viewed as finding the `\textit{weakness}' of the classifier and purposely forcing it to \textit{overcome} its weakness via the generative model. To demonstrate the effectiveness of the proposed approach, we validate the method with the classification of Alzheimer's Disease (AD) as a downstream task. The pre-trained generative model synthesises brain images using age as conditional factor. Extensive experiments and ablation studies have been performed to show that the proposed approach improves classification performance and has potential to alleviate spurious correlations and catastrophic forgetting. Code will be released upon acceptance. △ Less

Submitted 1 October, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

arXiv:2202.10166 [pdf, other]

Diffusion Causal Models for Counterfactual Estimation

Authors: Pedro Sanchez, Sotirios A. Tsaftaris

Abstract: We consider the task of counterfactual estimation from observational imaging data given a known causal structure. In particular, quantifying the causal effect of interventions for high-dimensional data with neural networks remains an open challenge. Herein we propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models. In our setting, inference… ▽ More We consider the task of counterfactual estimation from observational imaging data given a known causal structure. In particular, quantifying the causal effect of interventions for high-dimensional data with neural networks remains an open challenge. Herein we propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models. In our setting, inference is performed by iteratively sampling gradients of the marginal and conditional distributions entailed by the causal model. Counterfactual estimation is achieved by firstly inferring latent variables with deterministic forward diffusion, then intervening on a reverse diffusion process using the gradients of an anti-causal predictor w.r.t the input. Furthermore, we propose a metric for evaluating the generated counterfactuals. We find that Diff-SCM produces more realistic and minimal counterfactuals than baselines on MNIST data and can also be applied to ImageNet data. Code is available https://github.com/vios-s/Diff-SCM. △ Less

Submitted 21 February, 2022; originally announced February 2022.

Comments: Accepted at CLeaR (Causal Learning and Reasoning) 2022

arXiv:2202.00137 [pdf, other]

doi 10.1109/TDSC.2022.3204535

Studying the Robustness of Anti-adversarial Federated Learning Models Detecting Cyberattacks in IoT Spectrum Sensors

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Timo Schenk, Adrian Lars Benjamin Iten, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Device fingerprinting combined with Machine and Deep Learning (ML/DL) report promising performance when detecting cyberattacks targeting data managed by resource-constrained spectrum sensors. However, the amount of data needed to train models and the privacy concerns of such scenarios limit the applicability of centralized ML/DL-based approaches. Federated learning (FL) addresses these limitations… ▽ More Device fingerprinting combined with Machine and Deep Learning (ML/DL) report promising performance when detecting cyberattacks targeting data managed by resource-constrained spectrum sensors. However, the amount of data needed to train models and the privacy concerns of such scenarios limit the applicability of centralized ML/DL-based approaches. Federated learning (FL) addresses these limitations by creating federated and privacy-preserving models. However, FL is vulnerable to malicious participants, and the impact of adversarial attacks on federated models detecting spectrum sensing data falsification (SSDF) attacks on spectrum sensors has not been studied. To address this challenge, the first contribution of this work is the creation of a novel dataset suitable for FL and modeling the behavior (usage of CPU, memory, or file system, among others) of resource-constrained spectrum sensors affected by different SSDF attacks. The second contribution is a pool of experiments analyzing and comparing the robustness of federated models according to i) three families of spectrum sensors, ii) eight SSDF attacks, iii) four scenarios dealing with unsupervised (anomaly detection) and supervised (binary classification) federated models, iv) up to 33% of malicious participants implementing data and model poisoning attacks, and v) four aggregation functions acting as anti-adversarial mechanisms to increase the models robustness. △ Less

Submitted 31 January, 2022; originally announced February 2022.

arXiv:2201.05410 [pdf, other]

doi 10.1109/TDSC.2023.3252918

CyberSpec: Intelligent Behavioral Fingerprinting to Detect Attacks on Crowdsensing Spectrum Sensors

Authors: Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Integrated sensing and communication (ISAC) is a novel paradigm using crowdsensing spectrum sensors to help with the management of spectrum scarcity. However, well-known vulnerabilities of resource-constrained spectrum sensors and the possibility of being manipulated by users with physical access complicate their protection against spectrum sensing data falsification (SSDF) attacks. Most recent li… ▽ More Integrated sensing and communication (ISAC) is a novel paradigm using crowdsensing spectrum sensors to help with the management of spectrum scarcity. However, well-known vulnerabilities of resource-constrained spectrum sensors and the possibility of being manipulated by users with physical access complicate their protection against spectrum sensing data falsification (SSDF) attacks. Most recent literature suggests using behavioral fingerprinting and Machine/Deep Learning (ML/DL) for improving similar cybersecurity issues. Nevertheless, the applicability of these techniques in resource-constrained devices, the impact of attacks affecting spectrum data integrity, and the performance and scalability of models suitable for heterogeneous sensors types are still open challenges. To improve limitations, this work presents seven SSDF attacks affecting spectrum sensors and introduces CyberSpec, an ML/DL-oriented framework using device behavioral fingerprinting to detect anomalies produced by SSDF attacks affecting resource-constrained spectrum sensors. CyberSpec has been implemented and validated in ElectroSense, a real crowdsensing RF monitoring platform where several configurations of the proposed SSDF attacks have been executed in different sensors. A pool of experiments with different unsupervised ML/DL-based models has demonstrated the suitability of CyberSpec detecting the previous attacks within an acceptable timeframe. △ Less

Submitted 14 January, 2022; originally announced January 2022.

arXiv:2111.14434 [pdf, other]

doi 10.1007/s10586-022-03949-w

Robust Federated Learning for execution time-based device model identification under label-flipping attack

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, José Rafael Buendía Rubio, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, seve… ▽ More The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, several solutions have emerged to identify device models and types based on the combination of behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriated for scenarios where data privacy and protection is a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during the federated model training is evaluated, using several aggregation mechanisms as countermeasure. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades when the percentage of fully malicious clients (all training samples poisoned) grows over 50%. △ Less

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2110.14690 [pdf, other]

VACA: Design of Variational Graph Autoencoders for Interventional and Counterfactual Queries

Authors: Pablo Sanchez-Martin, Miriam Rateike, Isabel Valera

Abstract: In this paper, we introduce VACA, a novel class of variational graph autoencoders for causal inference in the absence of hidden confounders, when only observational data and the causal graph are available. Without making any parametric assumptions, VACA mimics the necessary properties of a Structural Causal Model (SCM) to provide a flexible and practical framework for approximating interventions (… ▽ More In this paper, we introduce VACA, a novel class of variational graph autoencoders for causal inference in the absence of hidden confounders, when only observational data and the causal graph are available. Without making any parametric assumptions, VACA mimics the necessary properties of a Structural Causal Model (SCM) to provide a flexible and practical framework for approximating interventions (do-operator) and abduction-action-prediction steps. As a result, and as shown by our empirical results, VACA accurately approximates the interventional and counterfactual distributions on diverse SCMs. Finally, we apply VACA to evaluate counterfactual fairness in fair classification problems, as well as to learn fair classifiers without compromising performance. △ Less

Submitted 27 October, 2021; originally announced October 2021.

arXiv:2108.12043 [pdf, other]

doi 10.1016/j.media.2022.102516

Learning Disentangled Representations in the Imaging Domain

Authors: Xiao Liu, Pedro Sanchez, Spyridon Thermos, Alison Q. O'Neil, Sotirios A. Tsaftaris

Abstract: Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation re… ▽ More Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities. △ Less

Submitted 29 July, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: Accepted by Medical Image Analysis. This paper follows a tutorial style but also surveys a considerable (more than 260 citations) number of works

arXiv:2106.10069 [pdf, other]

doi 10.1145/3510409

Point-of-Interest Recommender Systems based on Location-Based Social Networks: A Survey from an Experimental Perspective

Authors: Pablo Sánchez, Alejandro Bellogín

Abstract: Point-of-Interest recommendation is an increasing research and developing area within the widely adopted technologies known as Recommender Systems. Among them, those that exploit information coming from Location-Based Social Networks (LBSNs) are very popular nowadays and could work with different information sources, which pose several challenges and research questions to the community as a whole.… ▽ More Point-of-Interest recommendation is an increasing research and developing area within the widely adopted technologies known as Recommender Systems. Among them, those that exploit information coming from Location-Based Social Networks (LBSNs) are very popular nowadays and could work with different information sources, which pose several challenges and research questions to the community as a whole. We present a systematic review focused on the research done in the last 10 years about this topic. We discuss and categorize the algorithms and evaluation methodologies used in these works and point out the opportunities and challenges that remain open in the field. More specifically, we report the leading recommendation techniques and information sources that have been exploited more often (such as the geographical signal and deep learning approaches) while we also alert about the lack of reproducibility in the field that may hinder real performance improvements. △ Less

Submitted 27 January, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: Submitted in Jul 2020 (revised in September 2021, accepted in January 2022, online January 2022)

arXiv:2106.08209 [pdf, other]

doi 10.1016/j.jnca.2022.103579

A methodology to identify identical single-board computers based on hardware behavior fingerprinting

Authors: Pedro Miguel Sánchez Sánchez, José María Jorquera Valero, Alberto Huertas Celdrán, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: The connectivity and resource-constrained nature of single-board devices open the door to cybersecurity concerns affecting Internet of Things (IoT) scenarios. One of the most important issues is the presence of unauthorized IoT devices that want to impersonate legitimate ones by using identical hardware and software specifications. This situation can provoke sensitive information leakages, data po… ▽ More The connectivity and resource-constrained nature of single-board devices open the door to cybersecurity concerns affecting Internet of Things (IoT) scenarios. One of the most important issues is the presence of unauthorized IoT devices that want to impersonate legitimate ones by using identical hardware and software specifications. This situation can provoke sensitive information leakages, data poisoning, or privilege escalation in IoT scenarios. Combining behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques is a promising approach to identify these malicious spoofing devices by detecting minor performance differences generated by imperfections in manufacturing. However, existing solutions are not suitable for single-board devices since they do not consider their hardware and software limitations, underestimate critical aspects such as fingerprint stability or context changes, and do not explore the potential of ML/DL techniques. To improve it, this work first identifies the essential properties for single-board device identification: uniqueness, stability, diversity, scalability, efficiency, robustness, and security. Then, a novel methodology relies on behavioral fingerprinting to identify identical single-board devices and meet the previous properties. The methodology leverages the different built-in components of the system and ML/DL techniques, comparing the device internal behavior with each other to detect manufacturing variations. The methodology validation has been performed in a real environment composed of 15 identical Raspberry Pi 4 B and 10 Raspberry Pi 3 B+ devices, obtaining a 91.9% average TPR and identifying all devices by setting a 50% threshold in the evaluation process. Finally, a discussion compares the proposed solution with related work, highlighting the fingerprint properties not met, and provides important lessons learned and limitations. △ Less

Submitted 22 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

arXiv:2106.01722 [pdf, other]

GMAIR: Unsupervised Object Detection Based on Spatial Attention and Gaussian Mixture

Authors: Weijin Zhu, Yao Shen, Linfeng Yu, Lizeth Patricia Aguirre Sanchez

Abstract: Recent studies on unsupervised object detection based on spatial attention have achieved promising results. Models, such as AIR and SPAIR, output "what" and "where" latent variables that represent the attributes and locations of objects in a scene, respectively. Most of the previous studies concentrate on the "where" localization performance; however, we claim that acquiring "what" object attribut… ▽ More Recent studies on unsupervised object detection based on spatial attention have achieved promising results. Models, such as AIR and SPAIR, output "what" and "where" latent variables that represent the attributes and locations of objects in a scene, respectively. Most of the previous studies concentrate on the "where" localization performance; however, we claim that acquiring "what" object attributes is also essential for representation learning. This paper presents a framework, GMAIR, for unsupervised object detection. It incorporates spatial attention and a Gaussian mixture in a unified deep generative model. GMAIR can locate objects in a scene and simultaneously cluster them without supervision. Furthermore, we analyze the "what" latent variables and clustering process. Finally, we evaluate our model on MultiMNIST and Fruit2D datasets and show that GMAIR achieves competitive results on localization and clustering compared to state-of-the-art methods. △ Less

Submitted 3 June, 2021; originally announced June 2021.

Comments: 15 pages, 5 figures

arXiv:2104.09994 [pdf, other]

doi 10.1016/j.comnet.2021.108693

Federated Learning for Malware Detection in IoT Devices

Authors: Valerian Rey, Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Martin Jaggi

Abstract: This work investigates the possibilities enabled by federated learning concerning IoT malware detection and studies security issues inherent to this new learning paradigm. In this context, a framework that uses federated learning to detect malware affecting IoT devices is presented. N-BaIoT, a dataset modeling network traffic of several real IoT devices while affected by malware, has been used to… ▽ More This work investigates the possibilities enabled by federated learning concerning IoT malware detection and studies security issues inherent to this new learning paradigm. In this context, a framework that uses federated learning to detect malware affecting IoT devices is presented. N-BaIoT, a dataset modeling network traffic of several real IoT devices while affected by malware, has been used to evaluate the proposed framework. Both supervised and unsupervised federated models (multi-layer perceptron and autoencoder) able to detect malware affecting seen and unseen IoT devices of N-BaIoT have been trained and evaluated. Furthermore, their performance has been compared to two traditional approaches. The first one lets each participant locally train a model using only its own data, while the second consists of making the participants share their data with a central entity in charge of training a global model. This comparison has shown that the use of more diverse and large data, as done in the federated and centralized methods, has a considerable positive impact on the model performance. Besides, the federated models, while preserving the participant's privacy, show similar results as the centralized ones. As an additional contribution and to measure the robustness of the federated approach, an adversarial setup with several malicious participants poisoning the federated model has been considered. The baseline model aggregation averaging step used in most federated learning algorithms appears highly vulnerable to different attacks, even with a single adversary. The performance of other model aggregation functions acting as countermeasures is thus evaluated under the same attack scenarios. These functions provide a significant improvement against malicious participants, but more efforts are still needed to make federated approaches robust. △ Less

Submitted 19 November, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

Journal ref: Rey, V., Sánchez, P. M. S., Celdrán, A. H., & Bovet, G. (2022). Federated learning for malware detection in iot devices. Computer Networks, 108693

arXiv:2008.03343 [pdf, other]

doi 10.1109/COMST.2021.3064259

A Survey on Device Behavior Fingerprinting: Data Sources, Techniques, Application Scenarios, and Datasets

Authors: Pedro Miguel Sánchez Sánchez, Jose María Jorquera Valero, Alberto Huertas Celdrán, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., senso… ▽ More In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., sensors, actuators) and detect potential misbehavior that may arise due to cyberattacks, system faults, or misconfigurations. With this goal in mind, a promising research field emerged focusing on creating and managing fingerprints that model the behavior of both the device actions and its components. The article at hand studies the recent growth of the device behavior fingerprinting field in terms of application scenarios, behavioral sources, and processing and evaluation techniques. First, it performs a comprehensive review of the device types, behavioral data, and processing and evaluation techniques used by the most recent and representative research works dealing with two major scenarios: device identification and device misbehavior detection. After that, each work is deeply analyzed and compared, emphasizing its characteristics, advantages, and limitations. This article also provides researchers with a review of the most relevant characteristics of existing datasets as most of the novel processing techniques are based on machine learning and deep learning. Finally, it studies the evolution of these two scenarios in recent years, providing lessons learned, current trends, and future research challenges to guide new solutions in the area. △ Less

Submitted 3 March, 2021; v1 submitted 7 August, 2020; originally announced August 2020.

Showing 1–50 of 59 results for author: Sanchez, P