-
Geodesic Optimization for Predictive Shift Adaptation on EEG data
Authors:
Apolline Mellot,
Antoine Collas,
Sylvain Chevallier,
Alexandre Gramfort,
Denis A. Engemann
Abstract:
Electroencephalography (EEG) data is often collected from diverse contexts involving different populations and EEG devices. This variability can induce distribution shifts in the data $X$ and in the biomedical variables of interest $y$, thus limiting the application of supervised machine learning (ML) algorithms. While domain adaptation (DA) methods have been developed to mitigate the impact of these shifts, such methods struggle when distribution shifts occur simultaneously in $X$ and $y$. As state-of-the-art ML models for EEG represent the data by spatial covariance matrices, which lie on the Riemannian manifold of Symmetric Positive Definite (SPD) matrices, it is appealing to study DA techniques operating on the SPD manifold. This paper proposes a novel method termed Geodesic Optimization for Predictive Shift Adaptation (GOPSA) to address test-time multi-source DA for situations in which source domains have distinct $y$ distributions. GOPSA exploits the geodesic structure of the Riemannian manifold to jointly learn a domain-specific re-centering operator representing site-specific intercepts and the regression model. We performed empirical benchmarks on the cross-site generalization of age-prediction models with resting-state EEG data from a large multi-national dataset (HarMNqEEG), which included $14$ recording sites and more than $1500$ human participants. Compared to state-of-the-art methods, our results showed that GOPSA achieved significantly higher performance on three regression metrics ($R^2$, MAE, and Spearman's $ρ$) for several source-target site combinations, highlighting its effectiveness in tackling multi-source DA with predictive shifts in EEG data analysis. Our method has the potential to combine the advantages of mixed-effects modeling with machine learning for biomedical applications of EEG, such as multicenter clinical trials.
Submitted 4 July, 2024;
originally announced July 2024.
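The re-centering idea in the abstract above can be illustrated with a minimal sketch. The abstract does not spell out the operator, so the parameterization used here, $M^{-\alpha/2} C M^{-\alpha/2}$ with a domain mean $M$ and a step size $\alpha \in [0, 1]$, is an assumption, and the function names are hypothetical. In GOPSA the re-centering is learned jointly with the regression model. This sketch only applies a fixed $\alpha$.

```python
import numpy as np


def spd_power(mat, exponent):
    """Raise a symmetric positive definite matrix to a real power."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals**exponent) @ eigvecs.T


def recenter(covs, domain_mean, alpha):
    """Move SPD matrices along the geodesic from domain_mean toward identity.

    Assumed parameterization (not taken from the abstract):
    alpha = 0 leaves the matrices untouched, alpha = 1 maps the
    domain mean exactly onto the identity matrix.
    """
    whitener = spd_power(domain_mean, -alpha / 2.0)
    return np.array([whitener @ cov @ whitener for cov in covs])
```

With `alpha` treated as one free parameter per site, each site gets its own shift on the manifold, which is one way to read the "site-specific intercepts" mentioned in the abstract.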
-
Variable Importance in High-Dimensional Settings Requires Grouping
Authors:
Ahmad Chamma,
Bertrand Thirion,
Denis A. Engemann
Abstract:
Explaining the decision process of machine learning algorithms is crucial both for improving model performance and for human comprehension. This can be achieved by assessing the importance of single variables, even for high-capacity non-linear methods, e.g. Deep Neural Networks (DNNs). While only removal-based approaches, such as Permutation Importance (PI), can bring statistical validity, they return misleading results when variables are correlated. Conditional Permutation Importance (CPI) bypasses PI's limitations in such cases. However, in high-dimensional settings, where high correlations between the variables cancel their conditional importance, the use of CPI as well as other methods leads to unreliable results, in addition to prohibitive computation costs. Grouping variables statistically via clustering or some prior knowledge gains some power back and leads to better interpretations. In this work, we introduce BCPI (Block-Based Conditional Permutation Importance), a new generic framework for variable importance computation with statistical guarantees handling both single and group cases. Furthermore, as handling groups with high cardinality (such as a set of observations of a given modality) is both time-consuming and resource-intensive, we also introduce a new stacking approach extending the DNN architecture with sub-linear layers adapted to the group structure. We show that the ensuing approach extended with stacking controls the type-I error even with highly correlated groups and shows top accuracy across benchmarks. Furthermore, we perform a real-world data analysis in a large-scale medical dataset where we aim to show the consistency between our results and the literature for a biomarker prediction.
Submitted 17 December, 2023;
originally announced December 2023.
-
Statistically Valid Variable Importance Assessment through Conditional Permutations
Authors:
Ahmad Chamma,
Denis A. Engemann,
Bertrand Thirion
Abstract:
Variable importance assessment has become a crucial step in machine-learning applications when using complex learners, such as deep neural networks, on large-scale data. Removal-based importance assessment is currently the reference approach, particularly when statistical guarantees are sought to justify variable inclusion. It is often implemented with variable permutation schemes. On the flip side, these approaches risk misidentifying unimportant variables as important in the presence of correlations among covariates. Here we develop a systematic approach for studying Conditional Permutation Importance (CPI) that is model agnostic and computationally lean, as well as reusable benchmarks of state-of-the-art variable importance estimators. We show theoretically and empirically that $\textit{CPI}$ overcomes the limitations of standard permutation importance by providing accurate type-I error control. When used with a deep neural network, $\textit{CPI}$ consistently showed top accuracy across benchmarks. An experiment on real-world data analysis in a large-scale medical dataset showed that $\textit{CPI}$ provides a more parsimonious selection of statistically significant variables. Our results suggest that $\textit{CPI}$ can be readily used as a drop-in replacement for permutation-based methods.
Submitted 25 October, 2023; v1 submitted 14 September, 2023;
originally announced September 2023.
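The conditional permutation scheme described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: both the predictive model and the conditional model of variable `j` given the other variables are plain least-squares fits (the abstract pairs CPI with a deep neural network), the importance is the mean increase in squared error, and all function names are hypothetical. No significance test is included.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept column appended to X."""
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict with coefficients returned by fit_linear."""
    return np.column_stack([X, np.ones(len(X))]) @ coef


def conditional_permutation_importance(
    X_train, y_train, X_test, y_test, j, n_perm=50, seed=0
):
    """Mean increase in test MSE when only the part of column j that is
    not explained by the other columns is shuffled."""
    rng = np.random.default_rng(seed)
    model = fit_linear(X_train, y_train)
    base_loss = np.mean((y_test - predict_linear(model, X_test)) ** 2)

    # Conditional model: predict column j from all remaining columns.
    others = np.delete(np.arange(X_train.shape[1]), j)
    cond = fit_linear(X_train[:, others], X_train[:, j])
    x_j_hat = predict_linear(cond, X_test[:, others])
    residuals = X_test[:, j] - x_j_hat

    increases = []
    for _ in range(n_perm):
        X_perm = X_test.copy()
        # Keep the explained part, permute only the residual part.
        X_perm[:, j] = x_j_hat + rng.permutation(residuals)
        loss = np.mean((y_test - predict_linear(model, X_perm)) ** 2)
        increases.append(loss - base_loss)
    return float(np.mean(increases))
```

Shuffling residuals instead of the raw column preserves the correlation between variable `j` and the other covariates, which is what distinguishes this sketch from standard permutation importance.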
-
Manifold-regression to predict from MEG/EEG brain signals without source modeling
Authors:
David Sabbagh,
Pierre Ablin,
Gael Varoquaux,
Alexandre Gramfort,
Denis A. Engemann
Abstract:
Magnetoencephalography and electroencephalography (M/EEG) can reveal neuronal dynamics non-invasively in real-time and are therefore appreciated methods in medicine and neuroscience. Recent advances in modeling brain-behavior relationships have highlighted the effectiveness of Riemannian geometry for summarizing the spatially correlated time-series from M/EEG in terms of their covariance. However, after artefact-suppression, M/EEG data is often rank deficient, which limits the application of Riemannian concepts. In this article, we focus on the task of regression with rank-reduced covariance matrices. We study two Riemannian approaches that vectorize the between-sensor M/EEG covariance through projection into a tangent space. The Wasserstein distance readily applies to rank-reduced data but lacks affine-invariance. This can be overcome by finding a common subspace in which the covariance matrices are full rank, enabling the affine-invariant geometric distance. We investigated the implications of these two approaches in synthetic generative models, which allowed us to control estimation bias of a linear model for prediction. We show that Wasserstein and geometric distances allow perfect out-of-sample prediction on the generative models. We then evaluated the methods on real data with regard to their effectiveness in predicting age from M/EEG covariance matrices. The findings suggest that the data-driven Riemannian methods outperform different sensor-space estimators and that they get close to the performance of a biophysics-driven source-localization model that requires MRI acquisitions and tedious data processing. Our study suggests that the proposed Riemannian methods can serve as fundamental building blocks for automated large-scale analysis of M/EEG.
Submitted 22 November, 2019; v1 submitted 4 June, 2019;
originally announced June 2019.
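The common-subspace projection and tangent-space vectorization mentioned in the abstract above can be sketched in a few lines. The details are assumptions made for illustration: the subspace is taken from the leading eigenvectors of the average covariance, the tangent space uses the affine-invariant logarithm at a reference matrix, and the function names are hypothetical. The Wasserstein variant from the abstract is not shown.

```python
import numpy as np


def project_to_common_subspace(covs, rank):
    """Project covariances onto the top-`rank` eigenvectors of their mean.

    Assumes all matrices share the same `rank`-dimensional range, so the
    projected matrices are full rank.
    """
    mean_cov = covs.mean(axis=0)
    _, eigvecs = np.linalg.eigh(mean_cov)  # eigenvalues sorted ascending
    filters = eigvecs[:, -rank:]
    return np.array([filters.T @ cov @ filters for cov in covs])


def _apply_to_eigvals(mat, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * fun(eigvals)) @ eigvecs.T


def tangent_vectors(covs, reference):
    """Vectorize SPD matrices in the tangent space at `reference`."""
    whitener = _apply_to_eigvals(reference, lambda v: v**-0.5)
    rows, cols = np.triu_indices(reference.shape[0])
    # Off-diagonal terms are scaled by sqrt(2) so the Euclidean norm of
    # the vector equals the Frobenius norm of the matrix logarithm.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    vectors = []
    for cov in covs:
        log_mat = _apply_to_eigvals(whitener @ cov @ whitener, np.log)
        vectors.append(weights * log_mat[rows, cols])
    return np.array(vectors)
```

The resulting vectors can be passed to any linear regression model, which matches the role of the tangent-space projection described in the abstract.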
-
Clustered marginalization of minorities during social transitions induced by co-evolution of behaviour and network structure
Authors:
Carl-Friedrich Schleussner,
Jonathan F. Donges,
Denis A. Engemann,
Anders Levermann
Abstract:
Large-scale transitions in societies are associated with both individual behavioural change and restructuring of the social network. These two factors have often been considered independently, yet recent advances in social network research challenge this view. Here we show that common features of societal marginalization and clustering emerge naturally during transitions in a co-evolutionary adaptive network model. This is achieved by explicitly considering the interplay between individual interaction and a dynamic network structure in behavioural selection. We exemplify this mechanism by simulating how smoking behaviour and the network structure get reconfigured by changing social norms. Our results are consistent with empirical findings: the prevalence of smoking was reduced, and the remaining smokers were preferentially connected to each other and formed increasingly marginalised clusters. We propose that self-amplifying feedbacks between individual behaviour and dynamic restructuring of the network are the main drivers of the transition. This generative mechanism for co-evolution of individual behaviour and social network structure may apply to a wide range of examples beyond smoking.
Submitted 5 July, 2016; v1 submitted 15 December, 2015;
originally announced December 2015.