-
Reliability and predictability of phenotype information from functional connectivity in large imaging datasets
Authors:
Jessica Dafflon,
Dustin Moraczewski,
Eric Earl,
Dylan M. Nielson,
Gabriel Loewinger,
Patrick McClure,
Adam G. Thomas,
Francisco Pereira
Abstract:
One of the central objectives of contemporary neuroimaging research is to create predictive models that can disentangle the connection between patterns of functional connectivity across the entire brain and various behavioral traits. Previous studies have shown that models trained to predict behavioral features from the individual's functional connectivity have modest to poor performance. In this…
▽ More
One of the central objectives of contemporary neuroimaging research is to create predictive models that can disentangle the connection between patterns of functional connectivity across the entire brain and various behavioral traits. Previous studies have shown that models trained to predict behavioral features from the individual's functional connectivity have modest to poor performance. In this study, we trained models that predict observable individual traits (phenotypes) and their corresponding singular value decomposition (SVD) representations - herein referred to as latent phenotypes from resting state functional connectivity. For this task, we predicted phenotypes in two large neuroimaging datasets: the Human Connectome Project (HCP) and the Philadelphia Neurodevelopmental Cohort (PNC). We illustrate the importance of regressing out confounds, which could significantly influence phenotype prediction. Our findings reveal that both phenotypes and their corresponding latent phenotypes yield similar predictive performance. Interestingly, only the first five latent phenotypes were reliably identified, and using just these reliable phenotypes for predicting phenotypes yielded a similar performance to using all latent phenotypes. This suggests that the predictable information is present in the first latent phenotypes, allowing the remainder to be filtered out without any harm in performance. This study sheds light on the intricate relationship between functional connectivity and the predictability and reliability of phenotypic information, with potential implications for enhancing predictive modeling in the realm of neuroimaging research.
△ Less
Submitted 30 April, 2024;
originally announced May 2024.
-
Federated Bayesian Deep Learning: The Application of Statistical Aggregation Methods to Bayesian Models
Authors:
John Fischer,
Marko Orescanin,
Justin Loomis,
Patrick McClure
Abstract:
Federated learning (FL) is an approach to training machine learning models that takes advantage of multiple distributed datasets while maintaining data privacy and reducing communication costs associated with sharing local datasets. Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models; however, modern deterministic deep learning (DL)…
▽ More
Federated learning (FL) is an approach to training machine learning models that takes advantage of multiple distributed datasets while maintaining data privacy and reducing communication costs associated with sharing local datasets. Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models; however, modern deterministic deep learning (DL) models are often poorly calibrated and lack the ability to communicate a measure of epistemic uncertainty in prediction, which is desirable for remote sensing platforms and safety-critical applications. Conversely, Bayesian DL models are often well calibrated and capable of quantifying and communicating a measure of epistemic uncertainty along with a competitive prediction accuracy. Unfortunately, because the weights and biases in Bayesian DL models are defined by a probability distribution, simple application of the aggregation methods associated with FL schemes for deterministic models is either impossible or results in sub-optimal performance. In this work, we use independent and identically distributed (IID) and non-IID partitions of the CIFAR-10 dataset and a fully variational ResNet-20 architecture to analyze six different aggregation strategies for Bayesian DL models. Additionally, we analyze the traditional federated averaging approach applied to an approximate Bayesian Monte Carlo dropout model as a lightweight alternative to more complex variational inference methods in FL. We show that aggregation strategy is a key hyperparameter in the design of a Bayesian FL system with downstream effects on accuracy, calibration, uncertainty quantification, training stability, and client compute requirements.
△ Less
Submitted 4 April, 2024; v1 submitted 22 March, 2024;
originally announced March 2024.
-
Concrete Safety for ML Problems: System Safety for ML Development and Assessment
Authors:
Edgar W. Jatho,
Logan O. Mailloux,
Eugene D. Williams,
Patrick McClure,
Joshua A. Kroll
Abstract:
Many stakeholders struggle to make reliances on ML-driven systems due to the risk of harm these systems may cause. Concerns of trustworthiness, unintended social harms, and unacceptable social and ethical violations undermine the promise of ML advancements. Moreover, such risks in complex ML-driven systems present a special challenge as they are often difficult to foresee, arising over periods of…
▽ More
Many stakeholders struggle to make reliances on ML-driven systems due to the risk of harm these systems may cause. Concerns of trustworthiness, unintended social harms, and unacceptable social and ethical violations undermine the promise of ML advancements. Moreover, such risks in complex ML-driven systems present a special challenge as they are often difficult to foresee, arising over periods of time, across populations, and at scale. These risks often arise not from poor ML development decisions or low performance directly but rather emerge through the interactions amongst ML development choices, the context of model use, environmental factors, and the effects of a model on its target. Systems safety engineering is an established discipline with a proven track record of identifying and managing risks even in high-complexity sociotechnical systems. In this work, we apply a state-of-the-art systems safety approach to concrete applications of ML with notable social and ethical risks to demonstrate a systematic means for meeting the assurance requirements needed to argue for safe and trustworthy ML in sociotechnical systems.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
VICE: Variational Interpretable Concept Embeddings
Authors:
Lukas Muttenthaler,
Charles Y. Zheng,
Patrick McClure,
Robert A. Vandermeulen,
Martin N. Hebart,
Francisco Pereira
Abstract:
A central goal in the cognitive sciences is the development of numerical models for mental representations of object concepts. This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. VICE uses variational inference to obtain sparse, non-n…
▽ More
A central goal in the cognitive sciences is the development of numerical models for mental representations of object concepts. This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. VICE uses variational inference to obtain sparse, non-negative representations of object concepts with uncertainty estimates for the embedding values. These estimates are used to automatically select the dimensions that best explain the data. We derive a PAC learning bound for VICE that can be used to estimate generalization performance or determine a sufficient sample size for experimental design. VICE rivals or outperforms its predecessor, SPoSE, at predicting human behavior in the triplet odd-one-out task. Furthermore, VICE's object representations are more reproducible and consistent across random initializations, highlighting the unique advantage of using VICE for deriving interpretable embeddings from human behavior.
△ Less
Submitted 6 October, 2022; v1 submitted 2 May, 2022;
originally announced May 2022.
-
A Deep Neural Network Tool for Automatic Segmentation of Human Body Parts in Natural Scenes
Authors:
Patrick McClure,
Gabrielle Reimann,
Michal Ramot,
Francisco Pereira
Abstract:
This short article describes a deep neural network trained to perform automatic segmentation of human body parts in natural scenes. More specifically, we trained a Bayesian SegNet with concrete dropout on the Pascal-Parts dataset to predict whether each pixel in a given frame was part of a person's hair, head, ear, eyebrows, legs, arms, mouth, neck, nose, or torso.
This short article describes a deep neural network trained to perform automatic segmentation of human body parts in natural scenes. More specifically, we trained a Bayesian SegNet with concrete dropout on the Pascal-Parts dataset to predict whether each pixel in a given frame was part of a person's hair, head, ear, eyebrows, legs, arms, mouth, neck, nose, or torso.
△ Less
Submitted 7 September, 2020;
originally announced September 2020.
-
Improving the Interpretability of fMRI Decoding using Deep Neural Networks and Adversarial Robustness
Authors:
Patrick McClure,
Dustin Moraczewski,
Ka Chun Lam,
Adam Thomas,
Francisco Pereira
Abstract:
Deep neural networks (DNNs) are being increasingly used to make predictions from functional magnetic resonance imaging (fMRI) data. However, they are widely seen as uninterpretable "black boxes", as it can be difficult to discover what input information is used by the DNN in the process, something important in both cognitive neuroscience and clinical applications. A saliency map is a common approa…
▽ More
Deep neural networks (DNNs) are being increasingly used to make predictions from functional magnetic resonance imaging (fMRI) data. However, they are widely seen as uninterpretable "black boxes", as it can be difficult to discover what input information is used by the DNN in the process, something important in both cognitive neuroscience and clinical applications. A saliency map is a common approach for producing interpretable visualizations of the relative importance of input features for a prediction. However, methods for creating maps often fail due to DNNs being sensitive to input noise, or by focusing too much on the input and too little on the model. It is also challenging to evaluate how well saliency maps correspond to the truly relevant input information, as ground truth is not always available. In this paper, we review a variety of methods for producing gradient-based saliency maps, and present a new adversarial training method we developed to make DNNs robust to input noise, with the goal of improving interpretability. We introduce two quantitative evaluation procedures for saliency map methods in fMRI, applicable whenever a DNN or linear model is being trained to decode some information from imaging data. We evaluate the procedures using a synthetic dataset where the complex activation structure is known, and on saliency maps produced for DNN and linear models for task decoding in the Human Connectome Project (HCP) dataset. Our key finding is that saliency maps produced with different methods vary widely in interpretability, in both in synthetic and HCP fMRI data. Strikingly, even when DNN and linear models decode at comparable levels of performance, DNN saliency maps score higher on interpretability than linear model saliency maps (derived via weights or gradient). Finally, saliency maps produced with our adversarial training method outperform those from other methods.
△ Less
Submitted 17 December, 2020; v1 submitted 23 April, 2020;
originally announced April 2020.
-
Quantum Computing at the Frontiers of Biological Sciences
Authors:
Prashant S. Emani,
Jonathan Warrell,
Alan Anticevic,
Stefan Bekiranov,
Michael Gandal,
Michael J. McConnell,
Guillermo Sapiro,
Alán Aspuru-Guzik,
Justin Baker,
Matteo Bastiani,
Patrick McClure,
John Murray,
Stamatios N Sotiropoulos,
Jacob Taylor,
Geetha Senthil,
Thomas Lehner,
Mark B. Gerstein,
Aram W. Harrow
Abstract:
The search for meaningful structure in biological data has relied on cutting-edge advances in computational technology and data science methods. However, challenges arise as we push the limits of scale and complexity in biological problems. Innovation in massively parallel, classical computing hardware and algorithms continues to address many of these challenges, but there is a need to simultaneou…
▽ More
The search for meaningful structure in biological data has relied on cutting-edge advances in computational technology and data science methods. However, challenges arise as we push the limits of scale and complexity in biological problems. Innovation in massively parallel, classical computing hardware and algorithms continues to address many of these challenges, but there is a need to simultaneously consider new paradigms to circumvent current barriers to processing speed. Accordingly, we articulate a view towards quantum computation and quantum information science, where algorithms have demonstrated potential polynomial and exponential computational speedups in certain applications, such as machine learning. The maturation of the field of quantum computing, in hardware and algorithm development, also coincides with the growth of several collaborative efforts to address questions across length and time scales, and scientific disciplines. We use this coincidence to explore the potential for quantum computing to aid in one such endeavor: the merging of insights from genetics, genomics, neuroimaging and behavioral phenotyping. By examining joint opportunities for computational innovation across fields, we highlight the need for a common language between biological data analysis and quantum computing. Ultimately, we consider current and future prospects for the employment of quantum computing algorithms in the biological sciences.
△ Less
Submitted 16 November, 2019;
originally announced November 2019.
-
Knowing what you know in brain segmentation using Bayesian deep neural networks
Authors:
Patrick McClure,
Nao Rho,
John A. Lee,
Jakub R. Kaczmarzyk,
Charles Zheng,
Satrajit S. Ghosh,
Dylan Nielson,
Adam G. Thomas,
Peter Bandettini,
Francisco Pereira
Abstract:
In this paper, we describe a Bayesian deep neural network (DNN) for predicting FreeSurfer segmentations of structural MRI volumes, in minutes rather than hours. The network was trained and evaluated on a large dataset (n = 11,480), obtained by combining data from more than a hundred different sites, and also evaluated on another completely held-out dataset (n = 418). The network was trained using…
▽ More
In this paper, we describe a Bayesian deep neural network (DNN) for predicting FreeSurfer segmentations of structural MRI volumes, in minutes rather than hours. The network was trained and evaluated on a large dataset (n = 11,480), obtained by combining data from more than a hundred different sites, and also evaluated on another completely held-out dataset (n = 418). The network was trained using a novel spike-and-slab dropout-based variational inference approach. We show that, on these datasets, the proposed Bayesian DNN outperforms previously proposed methods, in terms of the similarity between the segmentation predictions and the FreeSurfer labels, and the usefulness of the estimate uncertainty of these predictions. In particular, we demonstrated that the prediction uncertainty of this network at each voxel is a good indicator of whether the network has made an error and that the uncertainty across the whole brain can predict the manual quality control ratings of a scan. The proposed Bayesian DNN method should be applicable to any new network architecture for addressing the segmentation problem.
△ Less
Submitted 18 September, 2019; v1 submitted 3 December, 2018;
originally announced December 2018.
-
Distributed Weight Consolidation: A Brain Segmentation Case Study
Authors:
Patrick McClure,
Charles Y. Zheng,
Jakub R. Kaczmarzyk,
John A. Lee,
Satrajit S. Ghosh,
Dylan Nielson,
Peter Bandettini,
Francisco Pereira
Abstract:
Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distrib…
▽ More
Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that DWC led to increased performance on test sets from the different sites, while maintaining generalization performance for a very large and completely independent multi-site dataset, compared to an ensemble baseline.
△ Less
Submitted 16 January, 2019; v1 submitted 28 May, 2018;
originally announced May 2018.
-
Robustly representing uncertainty in deep neural networks through sampling
Authors:
Patrick McClure,
Nikolaus Kriegeskorte
Abstract:
As deep neural networks (DNNs) are applied to increasingly challenging problems, they will need to be able to represent their own uncertainty. Modeling uncertainty is one of the key features of Bayesian methods. Using Bernoulli dropout with sampling at prediction time has recently been proposed as an efficient and well performing variational inference method for DNNs. However, sampling from other…
▽ More
As deep neural networks (DNNs) are applied to increasingly challenging problems, they will need to be able to represent their own uncertainty. Modeling uncertainty is one of the key features of Bayesian methods. Using Bernoulli dropout with sampling at prediction time has recently been proposed as an efficient and well performing variational inference method for DNNs. However, sampling from other multiplicative noise based variational distributions has not been investigated in depth. We evaluated Bayesian DNNs trained with Bernoulli or Gaussian multiplicative masking of either the units (dropout) or the weights (dropconnect). We tested the calibration of the probabilistic predictions of Bayesian convolutional neural networks (CNNs) on MNIST and CIFAR-10. Sampling at prediction time increased the calibration of the DNNs' probabalistic predictions. Sampling weights, whether Gaussian or Bernoulli, led to more robust representation of uncertainty compared to sampling of units. However, using either Gaussian or Bernoulli dropout led to increased test set classification accuracy. Based on these findings we used both Bernoulli dropout and Gaussian dropconnect concurrently, which we show approximates the use of a spike-and-slab variational distribution without increasing the number of learned parameters. We found that spike-and-slab sampling had higher test set performance than Gaussian dropconnect and more robustly represented its uncertainty compared to Bernoulli dropout.
△ Less
Submitted 20 January, 2018; v1 submitted 5 November, 2016;
originally announced November 2016.
-
Representational Distance Learning for Deep Neural Networks
Authors:
Patrick McClure,
Nikolaus Kriegeskorte
Abstract:
Deep neural networks (DNNs) provide useful models of visual representational transformations. We present a method that enables a DNN (student) to learn from the internal representational spaces of a reference model (teacher), which could be another DNN or, in the future, a biological brain. Representational spaces of the student and the teacher are characterized by representational distance matric…
▽ More
Deep neural networks (DNNs) provide useful models of visual representational transformations. We present a method that enables a DNN (student) to learn from the internal representational spaces of a reference model (teacher), which could be another DNN or, in the future, a biological brain. Representational spaces of the student and the teacher are characterized by representational distance matrices (RDMs). We propose representational distance learning (RDL), a stochastic gradient descent method that drives the RDMs of the student to approximate the RDMs of the teacher. We demonstrate that RDL is competitive with other transfer learning techniques for two publicly available benchmark computer vision datasets (MNIST and CIFAR-100), while allowing for architectural differences between student and teacher. By pulling the student's RDMs towards those of the teacher, RDL significantly improved visual classification performance when compared to baseline networks that did not use transfer learning. In the future, RDL may enable combined supervised training of deep neural networks using task constraints (e.g. images and category labels) and constraints from brain-activity measurements, so as to build models that replicate the internal representational spaces of biological brains.
△ Less
Submitted 7 November, 2016; v1 submitted 12 November, 2015;
originally announced November 2015.