-
CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI
Authors:
Zi Wang,
Fanwen Wang,
Chen Qin,
Jun Lyu,
Ouyang Cheng,
Shuo Wang,
Yan Li,
Mengyao Yu,
Haoyu Zhang,
Kunyuan Guo,
Zhang Shi,
Qirong Li,
Ziqiang Xu,
Yajing Zhang,
Hao Li,
Sha Hua,
Binghua Chen,
Longyu Sun,
Mengting Sun,
Qin Li,
Ying-Hua Chu,
Wenjia Bai,
Jing Qin,
Xiahai Zhuang,
Claudia Prieto
, et al. (7 additional authors not shown)
Abstract:
Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h…
▽ More
Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space dataset in terms of both quantity and diversity has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, towards promoting the universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. Besides, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
On the maximal L1 influence of real-valued boolean functions
Authors:
Andrew J. Young,
Henry D. Pfister
Abstract:
We show that any sequence of well-behaved (e.g. bounded and non-constant) real-valued functions of $n$ boolean variables $\{f_n\}$ admits a sequence of coordinates whose $L^1$ influence under the $p$-biased distribution, for any $p\in(0,1)$, is $Ω(\text{var}(f_n) \frac{\ln n}{n})$.
We show that any sequence of well-behaved (e.g. bounded and non-constant) real-valued functions of $n$ boolean variables $\{f_n\}$ admits a sequence of coordinates whose $L^1$ influence under the $p$-biased distribution, for any $p\in(0,1)$, is $Ω(\text{var}(f_n) \frac{\ln n}{n})$.
△ Less
Submitted 15 June, 2024;
originally announced June 2024.
-
NeST: Neural Stress Tensor Tomography by leveraging 3D Photoelasticity
Authors:
Akshat Dave,
Tianyi Zhang,
Aaron Young,
Ramesh Raskar,
Wolfgang Heidrich,
Ashok Veeraraghavan
Abstract:
Photoelasticity enables full-field stress analysis in transparent objects through stress-induced birefringence. Existing techniques are limited to 2D slices and require destructively slicing the object. Recovering the internal 3D stress distribution of the entire object is challenging as it involves solving a tensor tomography problem and handling phase wrapping ambiguities. We introduce NeST, an…
▽ More
Photoelasticity enables full-field stress analysis in transparent objects through stress-induced birefringence. Existing techniques are limited to 2D slices and require destructively slicing the object. Recovering the internal 3D stress distribution of the entire object is challenging as it involves solving a tensor tomography problem and handling phase wrapping ambiguities. We introduce NeST, an analysis-by-synthesis approach for reconstructing 3D stress tensor fields as neural implicit representations from polarization measurements. Our key insight is to jointly handle phase unwrapping and tensor tomography using a differentiable forward model based on Jones calculus. Our non-linear model faithfully matches real captures, unlike prior linear approximations. We develop an experimental multi-axis polariscope setup to capture 3D photoelasticity and experimentally demonstrate that NeST reconstructs the internal stress distribution for objects with varying shape and force conditions. Additionally, we showcase novel applications in stress analysis, such as visualizing photoelastic fringes by virtually slicing the object and viewing photoelastic fringes from unseen viewpoints. NeST paves the way for scalable non-destructive 3D photoelastic analysis.
△ Less
Submitted 24 June, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Analysis of Impulsive Interference in Digital Audio Broadcasting Systems in Electric Vehicles
Authors:
Chin-Hung Chen,
Wen-Hung Huang,
Boris Karanov,
Alex Young,
Yan Wu,
Wim van Houtum
Abstract:
Recently, new types of interference in electric vehicles (EVs), such as converters switching and/or battery chargers, have been found to degrade the performance of wireless digital transmission systems. Measurements show that such an interference is characterized by impulsive behavior and is widely varying in time. This paper uses recorded data from our EV testbed to analyze the impulsive interfer…
▽ More
Recently, new types of interference in electric vehicles (EVs), such as converters switching and/or battery chargers, have been found to degrade the performance of wireless digital transmission systems. Measurements show that such an interference is characterized by impulsive behavior and is widely varying in time. This paper uses recorded data from our EV testbed to analyze the impulsive interference in the digital audio broadcasting band. Moreover, we use our analysis to obtain a corresponding interference model. In particular, we studied the temporal characteristics of the interference and confirmed that its amplitude indeed exhibits an impulsive behavior. Our results show that impulsive events span successive received signal samples and thus indicate a bursty nature. To this end, we performed a data-driven modification of a well-established model for bursty impulsive interference, the Markov-Middleton model, to produce synthetic noise realization. We investigate the optimal symbol detector design based on the proposed model and show significant performance gains compared to the conventional detector based on the additive white Gaussian noise assumption.
△ Less
Submitted 17 May, 2024;
originally announced May 2024.
-
Data-Driven Symbol Detection for Intersymbol Interference Channels with Bursty Impulsive Noise
Authors:
Boris Karanov,
Chin-Hung Chen,
Yan Wu,
Alex Young,
Wim van Houtum
Abstract:
We developed machine learning approaches for data-driven trellis-based soft symbol detection in coded transmission over intersymbol interference (ISI) channels in presence of bursty impulsive noise (IN), for example encountered in wireless digital broadcasting systems and vehicular communications. This enabled us to obtain optimized detectors based on the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm…
▽ More
We developed machine learning approaches for data-driven trellis-based soft symbol detection in coded transmission over intersymbol interference (ISI) channels in presence of bursty impulsive noise (IN), for example encountered in wireless digital broadcasting systems and vehicular communications. This enabled us to obtain optimized detectors based on the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm while circumventing the use of full channel state information (CSI) for computing likelihoods and trellis state transition probabilities. First, we extended the application of the neural network (NN)-aided BCJR, recently proposed for ISI channels with additive white Gaussian noise (AWGN). Although suitable for estimating likelihoods via labeling of transmission sequences, the BCJR-NN method does not provide a framework for learning the trellis state transitions. In addition to detection over the joint ISI and IN states we also focused on another scenario where trellis transitions are not trivial: detection for the ISI channel with AWGN with inaccurate knowledge of the channel memory at the receiver. Without access to the accurate state transition matrix, the BCJR- NN performance significantly degrades in both settings. To this end, we devised an alternative approach for data-driven BCJR detection based on the unsupervised learning of a hidden Markov model (HMM). The BCJR-HMM allowed us to optimize both the likelihood function and the state transition matrix without labeling. Moreover, we demonstrated the viability of a hybrid NN and HMM BCJR detection where NN is used for learning the likelihoods, while the state transitions are optimized via HMM. While reducing the required prior channel knowledge, the examined data-driven detectors with learned trellis state transitions achieve bit error rates close to the optimal full CSI-based BCJR, significantly outperforming detection with inaccurate CSI.
△ Less
Submitted 17 May, 2024;
originally announced May 2024.
-
Goal-conditioned reinforcement learning for ultrasound navigation guidance
Authors:
Abdoul Aziz Amadou,
Vivek Singh,
Florin C. Ghesu,
Young-Ho Kim,
Laura Stanciulescu,
Harshitha P. Sai,
Puneet Sharma,
Alistair Young,
Ronak Rajani,
Kawal Rhode
Abstract:
Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance me…
▽ More
Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance method based on contrastive learning as goal-conditioned reinforcement learning (GCRL). We augment the previous framework using a novel contrastive patient batching method (CPB) and a data-augmented contrastive loss, both of which we demonstrate are essential to ensure generalization to anatomical variations across patients. The proposed framework enables navigation to both standard diagnostic as well as intricate interventional views with a single model. Our method was developed with a large dataset of 789 patients and obtained an average error of 6.56 mm in position and 9.36 degrees in angle on a testing dataset of 140 patients, which is competitive or superior to models trained on individual views. Furthermore, we quantitatively validate our method's ability to navigate to interventional views such as the Left Atrial Appendage (LAA) view used in LAA closure. Our approach holds promise in providing valuable guidance during transesophageal ultrasound examinations, contributing to the advancement of skill acquisition for cardiac ultrasound practitioners.
△ Less
Submitted 22 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
FraGNNet: A Deep Probabilistic Model for Mass Spectrum Prediction
Authors:
Adamo Young,
Fei Wang,
David Wishart,
Bo Wang,
Hannes Röst,
Russ Greiner
Abstract:
The process of identifying a compound from its mass spectrum is a critical step in the analysis of complex mixtures. Typical solutions for the mass spectrum to compound (MS2C) problem involve matching the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to mass spectrum (C2MS) models can improve retrieval rate…
▽ More
The process of identifying a compound from its mass spectrum is a critical step in the analysis of complex mixtures. Typical solutions for the mass spectrum to compound (MS2C) problem involve matching the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to mass spectrum (C2MS) models can improve retrieval rates by augmenting real libraries with predicted spectra. Unfortunately, many existing C2MS models suffer from problems with prediction resolution, scalability, or interpretability. We develop a new probabilistic method for C2MS prediction, FraGNNet, that can efficiently and accurately predict high-resolution spectra. FraGNNet uses a structured latent space to provide insight into the underlying processes that define the spectrum. Our model achieves state-of-the-art performance in terms of prediction error, and surpasses existing C2MS models as a tool for retrieval-based MS2C.
△ Less
Submitted 2 April, 2024;
originally announced April 2024.
-
A Study on the Use of Simulation in Synthesizing Path-Following Control Policies for Autonomous Ground Robots
Authors:
Harry Zhang,
Stefan Caldararu,
Aaron Young,
Alexis Ruiz,
Huzaifa Unjhawala,
Ishaan Mahajan,
Sriram Ashokkumar,
Nevindu Batagoda,
Zhenhao Zhou,
Luning Bakke,
Dan Negrut
Abstract:
We report results obtained and insights gained while answering the following question: how effective is it to use a simulator to establish path following control policies for an autonomous ground robot? While the quality of the simulator conditions the answer to this question, we found that for the simulation platform used herein, producing four control policies for path planning was straightforwa…
▽ More
We report results obtained and insights gained while answering the following question: how effective is it to use a simulator to establish path following control policies for an autonomous ground robot? While the quality of the simulator conditions the answer to this question, we found that for the simulation platform used herein, producing four control policies for path planning was straightforward once a digital twin of the controlled robot was available. The control policies established in simulation and subsequently demonstrated in the real world are PID control, MPC, and two neural network (NN) based controllers. Training the two NN controllers via imitation learning was accomplished expeditiously using seven simple maneuvers: follow three circles clockwise, follow the same circles counter-clockwise, and drive straight. A test randomization process that employs random micro-simulations is used to rank the ``goodness'' of the four control policies. The policy ranking noted in simulation correlates well with the ranking observed when the control policies were tested in the real world. The simulation platform used is publicly available and BSD3-released as open source; a public Docker image is available for reproducibility studies. It contains a dynamics engine, a sensor simulator, a ROS2 bridge, and a ROS2 autonomy stack the latter employed both in the simulator and the real world experiments.
△ Less
Submitted 26 March, 2024;
originally announced March 2024.
-
Quantifying the Sim2real Gap for GPS and IMU Sensors
Authors:
Ishaan Mahajan,
Huzaifa Unjhawala,
Harry Zhang,
Zhenhao Zhou,
Aaron Young,
Alexis Ruiz,
Stefan Caldararu,
Nevindu Batagoda,
Sriram Ashokkumar,
Dan Negrut
Abstract:
Simulation can and should play a critical role in the development and testing of algorithms for autonomous agents. What might reduce its impact is the ``sim2real'' gap -- the algorithm response differs between operation in simulated versus real-world environments. This paper introduces an approach to evaluate this gap, focusing on the accuracy of sensor simulation -- specifically IMU and GPS -- in…
▽ More
Simulation can and should play a critical role in the development and testing of algorithms for autonomous agents. What might reduce its impact is the ``sim2real'' gap -- the algorithm response differs between operation in simulated versus real-world environments. This paper introduces an approach to evaluate this gap, focusing on the accuracy of sensor simulation -- specifically IMU and GPS -- in velocity estimation tasks for autonomous agents. Using a scaled autonomous vehicle, we conduct 40 real-world experiments across diverse environments then replicate the experiments in simulation with five distinct sensor noise models. We note that direct comparison of raw simulation and real sensor data fails to quantify the sim2real gap for robotics applications. We demonstrate that by using a state of the art state-estimation package as a ``judge'', and by evaluating the performance of this state-estimator in both real and simulated scenarios, we can isolate the sim2real discrepancies stemming from sensor simulations alone. The dataset generated is open-source and publicly available for unfettered use.
△ Less
Submitted 16 March, 2024;
originally announced March 2024.
-
Yi: Open Foundation Models by 01.AI
Authors:
01. AI,
:,
Alex Young,
Bei Chen,
Chao Li,
Chengen Huang,
Ge Zhang,
Guanwei Zhang,
Heng Li,
Jiangcheng Zhu,
Jianqun Chen,
Jing Chang,
Kaidong Yu,
Peng Liu,
Qiang Liu,
Shawn Yue,
Senbin Yang,
Shiming Yang,
Tao Yu,
Wen Xie,
Wenhao Huang,
Xiaohui Hu,
Xiaoyi Ren,
Xinyao Niu,
Pengcheng Nie
, et al. (7 additional authors not shown)
Abstract:
We introduce the Yi model family, a series of language and multimodal models that demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models. Our base models achieve strong performance on a wide range of benchmarks like MMLU,…
▽ More
We introduce the Yi model family, a series of language and multimodal models that demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models. Our base models achieve strong performance on a wide range of benchmarks like MMLU, and our finetuned chat models deliver strong human preference rate on major evaluation platforms like AlpacaEval and Chatbot Arena. Building upon our scalable super-computing infrastructure and the classical transformer architecture, we attribute the performance of Yi models primarily to its data quality resulting from our data-engineering efforts. For pretraining, we construct 3.1 trillion tokens of English and Chinese corpora using a cascaded data deduplication and quality filtering pipeline. For finetuning, we polish a small scale (less than 10K) instruction dataset over multiple iterations such that every single instance has been verified directly by our machine learning engineers. For vision-language, we combine the chat language model with a vision transformer encoder and train the model to align visual representations to the semantic space of the language model. We further extend the context length to 200K through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. We show that extending the depth of the pretrained checkpoint through continual pretraining further improves performance. We believe that given our current results, continuing to scale up model parameters using thoroughly optimized data will lead to even stronger frontier models.
△ Less
Submitted 7 March, 2024;
originally announced March 2024.
-
Towards Reducing Diagnostic Errors with Interpretable Risk Prediction
Authors:
Denis Jered McInerney,
William Dickinson,
Lucy C. Flynn,
Andrea C. Young,
Geoffrey S. Young,
Jan-Willem van de Meent,
Byron C. Wallace
Abstract:
Many diagnostic errors occur because clinicians cannot easily access relevant information in patient Electronic Health Records (EHRs). In this work we propose a method to use LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses; our ultimate aim is to increase access to evidence and reduce diagnostic errors. In particular, we propo…
▽ More
Many diagnostic errors occur because clinicians cannot easily access relevant information in patient Electronic Health Records (EHRs). In this work we propose a method to use LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses; our ultimate aim is to increase access to evidence and reduce diagnostic errors. In particular, we propose a Neural Additive Model to make predictions backed by evidence with individualized risk estimates at time-points where clinicians are still uncertain, aiming to specifically mitigate delays in diagnosis and errors stemming from an incomplete differential. To train such a model, it is necessary to infer temporally fine-grained retrospective labels of eventual "true" diagnoses. We do so with LLMs, to ensure that the input text is from before a confident diagnosis can be made. We use an LLM to retrieve an initial pool of evidence, but then refine this set of evidence according to correlations learned by the model. We conduct an in-depth evaluation of the usefulness of our approach by simulating how it might be used by a clinician to decide between a pre-defined list of differential diagnoses.
△ Less
Submitted 19 March, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Cardiac ultrasound simulation for autonomous ultrasound navigation
Authors:
Abdoul Aziz Amadou,
Laura Peralta,
Paul Dryburgh,
Paul Klein,
Kaloian Petkov,
Richard James Housden,
Vivek Singh,
Rui Liao,
Young-Ho Kim,
Florin Christian Ghesu,
Tommaso Mansi,
Ronak Rajani,
Alistair Young,
Kawal Rhode
Abstract:
Ultrasound is well-established as an imaging modality for diagnostic and interventional purposes. However, the image quality varies with operator skills as acquiring and interpreting ultrasound images requires extensive training due to the imaging artefacts, the range of acquisition parameters and the variability of patient anatomies. Automating the image acquisition task could improve acquisition…
▽ More
Ultrasound is well-established as an imaging modality for diagnostic and interventional purposes. However, the image quality varies with operator skills as acquiring and interpreting ultrasound images requires extensive training due to the imaging artefacts, the range of acquisition parameters and the variability of patient anatomies. Automating the image acquisition task could improve acquisition reproducibility and quality but training such an algorithm requires large amounts of navigation data, not saved in routine examinations. Thus, we propose a method to generate large amounts of ultrasound images from other modalities and from arbitrary positions, such that this pipeline can later be used by learning algorithms for navigation. We present a novel simulation pipeline which uses segmentations from other modalities, an optimized volumetric data representation and GPU-accelerated Monte Carlo path tracing to generate view-dependent and patient-specific ultrasound images. We extensively validate the correctness of our pipeline with a phantom experiment, where structures' sizes, contrast and speckle noise properties are assessed. Furthermore, we demonstrate its usability to train neural networks for navigation in an echocardiography view classification experiment by generating synthetic images from more than 1000 patients. Networks pre-trained with our simulations achieve significantly superior performance in settings where large real datasets are not available, especially for under-represented classes. The proposed approach allows for fast and accurate patient-specific ultrasound image generation, and its usability for training networks for navigation-related tasks is demonstrated.
△ Less
Submitted 9 February, 2024;
originally announced February 2024.
-
On the Robustness of Deep Learning-aided Symbol Detectors to Varying Conditions and Imperfect Channel Knowledge
Authors:
Chin-Hung Chen,
Boris Karanov,
Wim van Houtum,
Wu Yan,
Alex Young,
Alex Alvarado
Abstract:
Recently, a data-driven Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm tailored to channels with intersymbol interference has been introduced. This so-called BCJRNet algorithm utilizes neural networks to calculate channel likelihoods. BCJRNet has demonstrated resilience against inaccurate channel tap estimations when applied to a time-invariant channel with ideal exponential decay profiles. However, it…
▽ More
Recently, a data-driven Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm tailored to channels with intersymbol interference has been introduced. This so-called BCJRNet algorithm utilizes neural networks to calculate channel likelihoods. BCJRNet has demonstrated resilience against inaccurate channel tap estimations when applied to a time-invariant channel with ideal exponential decay profiles. However, its generalization capabilities for practically-relevant time-varying channels, where the receiver can only access incorrect channel parameters, remain largely unexplored. The primary contribution of this paper is to expand upon the results from existing literature to encompass a variety of imperfect channel knowledge cases that appear in real-world transmissions. Our findings demonstrate that BCJRNet significantly outperforms the conventional BCJR algorithm for stationary transmission scenarios when learning from noisy channel data and with imperfect channel decay profiles. However, this advantage is shown to diminish when the operating channel is also rapidly time-varying. Our results also show the importance of memory assumptions for conventional BCJR and BCJRNet. An underestimation of the memory largely degrades the performance of both BCJR and BCJRNet, especially in a slow-decaying channel. To mimic a situation closer to a practical scenario, we also combined channel tap uncertainty with imperfect channel memory knowledge. Somewhat surprisingly, our results revealed improved performance when employing the conventional BCJR with an underestimated memory assumption. BCJRNet, on the other hand, showed a consistent performance improvement as the level of accurate memory knowledge increased.
△ Less
Submitted 23 January, 2024;
originally announced January 2024.
-
Proximal Algorithms for Accelerated Langevin Dynamics
Authors:
Duy H. Thai,
Alexander L. Young,
David B. Dunson
Abstract:
We develop a novel class of MCMC algorithms based on a stochastized Nesterov scheme. With an appropriate addition of noise, the result is a time-inhomogeneous underdamped Langevin equation, which we prove emits a specified target distribution as its invariant measure. Convergence rates to stationarity under Wasserstein-2 distance are established as well. Metropolis-adjusted and stochastic gradient…
▽ More
We develop a novel class of MCMC algorithms based on a stochastized Nesterov scheme. With an appropriate addition of noise, the result is a time-inhomogeneous underdamped Langevin equation, which we prove emits a specified target distribution as its invariant measure. Convergence rates to stationarity under Wasserstein-2 distance are established as well. Metropolis-adjusted and stochastic gradient versions of the proposed Langevin dynamics are also provided. Experimental illustrations show superior performance of the proposed method over typical Langevin samplers for different models in statistics and image processing including better mixing of the resulting Markov chains.
△ Less
Submitted 28 November, 2023; v1 submitted 24 November, 2023;
originally announced November 2023.
-
Generalized super-resolution 4D Flow MRI $\unicode{x2013}$ using ensemble learning to extend across the cardiovascular system
Authors:
Leon Ericsson,
Adam Hjalmarsson,
Muhammad Usman Akbar,
Edward Ferdian,
Mia Bonini,
Brandon Hardy,
Jonas Schollenberger,
Maria Aristova,
Patrick Winter,
Nicholas Burris,
Alexander Fyrdahl,
Andreas Sigfridsson,
Susanne Schnell,
C. Alberto Figueroa,
David Nordsletten,
Alistair A. Young,
David Marlevi
Abstract:
4D Flow Magnetic Resonance Imaging (4D Flow MRI) is a non-invasive measurement technique capable of quantifying blood flow across the cardiovascular system. While practical use is limited by spatial resolution and image noise, incorporation of trained super-resolution (SR) networks has potential to enhance image quality post-scan. However, these efforts have predominantly been restricted to narrow…
▽ More
4D Flow Magnetic Resonance Imaging (4D Flow MRI) is a non-invasive measurement technique capable of quantifying blood flow across the cardiovascular system. While practical use is limited by spatial resolution and image noise, incorporation of trained super-resolution (SR) networks has potential to enhance image quality post-scan. However, these efforts have predominantly been restricted to narrowly defined cardiovascular domains, with limited exploration of how SR performance extends across the cardiovascular system; a task aggravated by contrasting hemodynamic conditions apparent across the cardiovasculature. The aim of our study was to explore the generalizability of SR 4D Flow MRI using a combination of heterogeneous training sets and dedicated ensemble learning. With synthetic training data generated across three disparate domains (cardiac, aortic, cerebrovascular), varying convolutional base and ensemble learners were evaluated as a function of domain and architecture, quantifying performance on both in-silico and acquired in-vivo data from the same three domains. Results show that both bagging and stacking ensembling enhance SR performance across domains, accurately predicting high-resolution velocities from low-resolution input data in-silico. Likewise, optimized networks successfully recover native resolution velocities from downsampled in-vivo data, as well as show qualitative potential in generating denoised SR-images from clinical level input data. In conclusion, our work presents a viable approach for generalized SR 4D Flow MRI, with ensemble learning extending utility across various clinical areas of interest.
△ Less
Submitted 21 November, 2023; v1 submitted 20 November, 2023;
originally announced November 2023.
-
Wearable data from subjects playing Super Mario, sitting university exams, or performing physical exercise help detect acute mood episodes via self-supervised learning
Authors:
Filippo Corponi,
Bryan M. Li,
Gerard Anmella,
Clàudia Valenzuela-Pascual,
Ariadna Mas,
Isabella Pacchiarotti,
Marc Valentí,
Iria Grande,
Antonio Benabarre,
Marina Garriga,
Eduard Vieta,
Allan H Young,
Stephen M. Lawrie,
Heather C. Whalley,
Diego Hidalgo-Mazzei,
Antonio Vergari
Abstract:
Personal sensing, leveraging data passively and near-continuously collected with wearables from patients in their ecological environment, is a promising paradigm to monitor mood disorders (MDs), a major determinant of worldwide disease burden. However, collecting and annotating wearable data is very resource-intensive. Studies of this kind can thus typically afford to recruit only a couple dozens…
▽ More
Personal sensing, leveraging data passively and near-continuously collected with wearables from patients in their ecological environment, is a promising paradigm to monitor mood disorders (MDs), a major determinant of worldwide disease burden. However, collecting and annotating wearable data is very resource-intensive. Studies of this kind can thus typically afford to recruit only a couple dozens of patients. This constitutes one of the major obstacles to applying modern supervised machine learning techniques to MDs detection. In this paper, we overcome this data bottleneck and advance the detection of MDs acute episode vs stable state from wearables data on the back of recent advances in self-supervised learning (SSL). This leverages unlabelled data to learn representations during pre-training, subsequently exploited for a supervised task. First, we collected open-access datasets recording with an Empatica E4 spanning different, unrelated to MD monitoring, personal sensing tasks -- from emotion recognition in Super Mario players to stress detection in undergraduates -- and devised a pre-processing pipeline performing on-/off-body detection, sleep-wake detection, segmentation, and (optionally) feature extraction. With 161 E4-recorded subjects, we introduce E4SelfLearning, the largest to date open access collection, and its pre-processing pipeline. Second, we show that SSL confidently outperforms fully-supervised pipelines using either our novel E4-tailored Transformer architecture (E4mer) or classical baseline XGBoost: 81.23% against 75.35% (E4mer) and 72.02% (XGBoost) correctly classified recording segments from 64 (half acute, half stable) patients. Lastly, we illustrate that SSL performance is strongly associated with the specific surrogate task employed for pre-training as well as with unlabelled data availability.
△ Less
Submitted 7 November, 2023;
originally announced November 2023.
-
Zero-Shot Policy Transferability for the Control of a Scale Autonomous Vehicle
Authors:
Harry Zhang,
Stefan Caldararu,
Sriram Ashokkumar,
Ishaan Mahajan,
Aaron Young,
Alexis Ruiz,
Huzaifa Unjhawala,
Luning Bakke,
Dan Negrut
Abstract:
We report on a study that employs an in-house developed simulation infrastructure to accomplish zero shot policy transferability for a control policy associated with a scale autonomous vehicle. We focus on implementing policies that require no real world data to be trained (Zero-Shot Transfer), and are developed in-house as opposed to being validated by previous works. We do this by implementing a…
▽ More
We report on a study that employs an in-house developed simulation infrastructure to accomplish zero shot policy transferability for a control policy associated with a scale autonomous vehicle. We focus on implementing policies that require no real world data to be trained (Zero-Shot Transfer), and are developed in-house as opposed to being validated by previous works. We do this by implementing a Neural Network (NN) controller that is trained only on a family of circular reference trajectories. The sensors used are RTK-GPS and IMU, the latter for providing heading. The NN controller is trained using either a human driver (via human in the loop simulation), or a Model Predictive Control (MPC) strategy. We demonstrate these two approaches in conjunction with two operation scenarios: the vehicle follows a waypoint-defined trajectory at constant speed; and the vehicle follows a speed profile that changes along the vehicle's waypoint-defined trajectory. The primary contribution of this work is the demonstration of Zero-Shot Transfer in conjunction with a novel feed-forward NN controller trained using a general purpose, in-house developed simulation platform.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: the FLARE22 Challenge
Authors:
Jun Ma,
Yao Zhang,
Song Gu,
Cheng Ge,
Shihao Ma,
Adamo Young,
Cheng Zhu,
Kangkang Meng,
Xin Yang,
Ziyan Huang,
Fan Zhang,
Wentao Liu,
YuanKe Pan,
Shoujin Huang,
Jiacheng Wang,
Mingze Sun,
Weixin Xu,
Dengqiang Jia,
Jae Won Choi,
Natália Alves,
Bram de Wilde,
Gregor Koehler,
Yajun Wu,
Manuel Wiesenfarth,
Qiongjie Zhu
, et al. (4 additional authors not shown)
Abstract:
Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations,…
▽ More
Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations, we organized the FLARE 2022 Challenge, the largest abdominal organ analysis challenge to date, to benchmark fast, low-resource, accurate, annotation-efficient, and generalized AI algorithms. We constructed an intercontinental and multinational dataset from more than 50 medical groups, including Computed Tomography (CT) scans with different races, diseases, phases, and manufacturers. We independently validated that a set of AI algorithms achieved a median Dice Similarity Coefficient (DSC) of 90.0\% by using 50 labeled scans and 2000 unlabeled scans, which can significantly reduce annotation requirements. The best-performing algorithms successfully generalized to holdout external validation sets, achieving a median DSC of 89.5\%, 90.9\%, and 88.3\% on North American, European, and Asian cohorts, respectively. They also enabled automatic extraction of key organ biology features, which was labor-intensive with traditional manual measurements. This opens the potential to use unlabeled data to boost performance and alleviate annotation shortages for modern AI models.
△ Less
Submitted 10 August, 2023;
originally announced August 2023.
-
On-Sensor Data Filtering using Neuromorphic Computing for High Energy Physics Experiments
Authors:
Shruti R. Kulkarni,
Aaron Young,
Prasanna Date,
Narasinga Rao Miniskar,
Jeffrey S. Vetter,
Farah Fahim,
Benjamin Parpillon,
Jennet Dickinson,
Nhan Tran,
Jieun Yoo,
Corrinne Mills,
Morris Swartz,
Petar Maksimovic,
Catherine D. Schuman,
Alice Bean
Abstract:
This work describes the investigation of neuromorphic computing-based spiking neural network (SNN) models used to filter data from sensor electronics in high energy physics experiments conducted at the High Luminosity Large Hadron Collider. We present our approach for developing a compact neuromorphic model that filters out the sensor data based on the particle's transverse momentum with the goal…
▽ More
This work describes the investigation of neuromorphic computing-based spiking neural network (SNN) models used to filter data from sensor electronics in high energy physics experiments conducted at the High Luminosity Large Hadron Collider. We present our approach for developing a compact neuromorphic model that filters out the sensor data based on the particle's transverse momentum with the goal of reducing the amount of data being sent to the downstream electronics. The incoming charge waveforms are converted to streams of binary-valued events, which are then processed by the SNN. We present our insights on the various system design choices - from data encoding to optimal hyperparameters of the training algorithm - for an accurate and compact SNN optimized for hardware deployment. Our results show that an SNN trained with an evolutionary algorithm and an optimized set of hyperparameters obtains a signal efficiency of about 91% with nearly half as many parameters as a deep neural network.
△ Less
Submitted 20 July, 2023;
originally announced July 2023.
-
Auditing and Generating Synthetic Data with Controllable Trust Trade-offs
Authors:
Brian Belgodere,
Pierre Dognin,
Adam Ivankay,
Igor Melnyk,
Youssef Mroueh,
Aleksandra Mojsilovic,
Jiri Navratil,
Apoorva Nitsure,
Inkit Padhi,
Mattia Rigotti,
Jerret Ross,
Yair Schiff,
Radhika Vedpathak,
Richard A. Young
Abstract:
Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining fidelity to the original data. However, assessing the trustworthiness of synthetic datasets and models is a critical challenge. We introduce a holistic auditing framew…
▽ More
Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining fidelity to the original data. However, assessing the trustworthiness of synthetic datasets and models is a critical challenge. We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models. It focuses on preventing bias and discrimination, ensures fidelity to the source data, assesses utility, robustness, and privacy preservation. We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases like education, healthcare, banking, and human resources, spanning different data modalities such as tabular, time-series, vision, and natural language. This holistic assessment is essential for compliance with regulatory safeguards. We introduce a trustworthiness index to rank synthetic datasets based on their safeguards trade-offs. Furthermore, we present a trustworthiness-driven model selection and cross-validation process during training, exemplified with "TrustFormers" across various data types. This approach allows for controllable trustworthiness trade-offs in synthetic data creation. Our auditing framework fosters collaboration among stakeholders, including data scientists, governance experts, internal reviewers, external certifiers, and regulators. This transparent reporting should become a standard practice to prevent bias, discrimination, and privacy violations, ensuring compliance with policies and providing accountability, safety, and performance guarantees.
△ Less
Submitted 9 June, 2024; v1 submitted 21 April, 2023;
originally announced April 2023.
-
A Graph-Based Context-Aware Model to Understand Online Conversations
Authors:
Vibhor Agarwal,
Anthony P. Young,
Sagar Joglekar,
Nishanth Sastry
Abstract:
Online forums that allow for participatory engagement between users have been transformative for the public discussion of many important issues. However, such conversations can sometimes escalate into full-blown exchanges of hate and misinformation. Existing approaches in natural language processing (NLP), such as deep learning models for classification tasks, use as inputs only a single comment o…
▽ More
Online forums that allow for participatory engagement between users have been transformative for the public discussion of many important issues. However, such conversations can sometimes escalate into full-blown exchanges of hate and misinformation. Existing approaches in natural language processing (NLP), such as deep learning models for classification tasks, use as inputs only a single comment or a pair of comments depending upon whether the task concerns the inference of properties of the individual comments or the replies between pairs of comments, respectively. But in online conversations, comments and replies may be based on external context beyond the immediately relevant information that is input to the model. Therefore, being aware of the conversations' surrounding contexts should improve the model's performance for the inference task at hand.
We propose GraphNLI, a novel graph-based deep learning architecture that uses graph walks to incorporate the wider context of a conversation in a principled manner. Specifically, a graph walk starts from a given comment and samples "nearby" comments in the same or parallel conversation threads, which results in additional embeddings that are aggregated together with the initial comment's embedding. We then use these enriched embeddings for downstream NLP prediction tasks that are important for online conversations. We evaluate GraphNLI on two such tasks - polarity prediction and misogynistic hate speech detection - and found that our model consistently outperforms all relevant baselines for both tasks. Specifically, GraphNLI with a biased root-seeking random walk performs with a macro-F1 score of 3 and 6 percentage points better than the best-performing BERT-based baselines for the polarity prediction and hate speech detection tasks, respectively.
△ Less
Submitted 16 November, 2022;
originally announced November 2022.
-
ART/ATK: A research platform for assessing and mitigating the sim-to-real gap in robotics and autonomous vehicle engineering
Authors:
Asher Elmquist,
Aaron Young,
Thomas Hansen,
Sriram Ashokkumar,
Stefan Caldararu,
Abhiraj Dashora,
Ishaan Mahajan,
Harry Zhang,
Luning Fang,
He Shen,
Xiangru Xu,
Radu Serban,
Dan Negrut
Abstract:
We discuss a platform that has both software and hardware components, and whose purpose is to support research into characterizing and mitigating the sim-to-real gap in robotics and vehicle autonomy engineering. The software is operating-system independent and has three main components: a simulation engine called Chrono, which supports high-fidelity vehicle and sensor simulation; an autonomy stack…
▽ More
We discuss a platform that has both software and hardware components, and whose purpose is to support research into characterizing and mitigating the sim-to-real gap in robotics and vehicle autonomy engineering. The software is operating-system independent and has three main components: a simulation engine called Chrono, which supports high-fidelity vehicle and sensor simulation; an autonomy stack for algorithm design and testing; and a development environment that supports visualization and hardware-in-the-loop experimentation. The accompanying hardware platform is a 1/6th scale vehicle augmented with reconfigurable mountings for computing, sensing, and tracking. Since this vehicle platform has a digital twin within the simulation environment, one can test the same autonomy perception, state estimation, or controls algorithms, as well as the processors they run on, in both simulation and reality. A demonstration is provided to show the utilization of this platform for autonomy research. Future work will concentrate on augmenting ART/ATK with support for a full-sized Chevy Bolt EUV, which will be made available to this group in the immediate future.
△ Less
Submitted 9 November, 2022;
originally announced November 2022.
-
Deep learning extraction of band structure parameters from density of states: a case study on trilayer graphene
Authors:
Paul Henderson,
Areg Ghazaryan,
Alexander A. Zibrov,
Andrea F. Young,
Maksym Serbyn
Abstract:
The development of two-dimensional materials has resulted in a diverse range of novel, high-quality compounds with increasing complexity. A key requirement for a comprehensive quantitative theory is the accurate determination of these materials' band structure parameters. However, this task is challenging due to the intricate band structures and the indirect nature of experimental probes. In this…
▽ More
The development of two-dimensional materials has resulted in a diverse range of novel, high-quality compounds with increasing complexity. A key requirement for a comprehensive quantitative theory is the accurate determination of these materials' band structure parameters. However, this task is challenging due to the intricate band structures and the indirect nature of experimental probes. In this work, we introduce a general framework to derive band structure parameters from experimental data using deep neural networks. We applied our method to the penetration field capacitance measurement of trilayer graphene, an effective probe of its density of states. First, we demonstrate that a trained deep network gives accurate predictions for the penetration field capacitance as a function of tight-binding parameters. Next, we use the fast and accurate predictions from the trained network to automatically determine tight-binding parameters directly from experimental data, with extracted parameters being in a good agreement with values in the literature. We conclude by discussing potential applications of our method to other materials and experimental techniques beyond penetration field capacitance.
△ Less
Submitted 18 September, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Optimising Chest X-Rays for Image Analysis by Identifying and Removing Confounding Factors
Authors:
Shahab Aslani,
Watjana Lilaonitkul,
Vaishnavi Gnanananthan,
Divya Raj,
Bojidar Rangelov,
Alexandra L Young,
Yipeng Hu,
Paul Taylor,
Daniel C Alexander,
Joseph Jacob
Abstract:
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions. This variation is seen in the CXR projections used, image annotations added and in the inspiratory effort and degree of rotation of clinical images. The image analysis community has attempted to ease the burden on overst…
▽ More
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions. This variation is seen in the CXR projections used, image annotations added and in the inspiratory effort and degree of rotation of clinical images. The image analysis community has attempted to ease the burden on overstretched radiology departments during the pandemic by developing automated COVID-19 diagnostic algorithms, the input for which has been CXR imaging. Large publicly available CXR datasets have been leveraged to improve deep learning algorithms for COVID-19 diagnosis. Yet the variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance. COVID-19 diagnosis may be inferred by an algorithm from non-anatomical features on an image such as image labels. These imaging shortcuts may be dataset-specific and limit the generalisability of AI systems. Understanding and correcting key potential biases in CXR images is therefore an essential first step prior to CXR image analysis. In this study, we propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases. We perform ablation studies to show the impact of each individual step. The results suggest that using our proposed pipeline could increase accuracy of the baseline COVID-19 detection algorithm by up to 13%.
△ Less
Submitted 22 August, 2022;
originally announced August 2022.
-
Encoding Integers and Rationals on Neuromorphic Computers using Virtual Neuron
Authors:
Prasanna Date,
Shruti Kulkarni,
Aaron Young,
Catherine Schuman,
Thomas Potok,
Jeffrey Vetter
Abstract:
Neuromorphic computers perform computations by emulating the human brain, and use extremely low power. They are expected to be indispensable for energy-efficient computing in the future. While they are primarily used in spiking neural network-based machine learning applications, neuromorphic computers are known to be Turing-complete, and thus, capable of general-purpose computation. However, to fu…
▽ More
Neuromorphic computers perform computations by emulating the human brain, and use extremely low power. They are expected to be indispensable for energy-efficient computing in the future. While they are primarily used in spiking neural network-based machine learning applications, neuromorphic computers are known to be Turing-complete, and thus, capable of general-purpose computation. However, to fully realize their potential for general-purpose, energy-efficient computing, it is important to devise efficient mechanisms for encoding numbers. Current encoding approaches have limited applicability and may not be suitable for general-purpose computation. In this paper, we present the virtual neuron as an encoding mechanism for integers and rational numbers. We evaluate the performance of the virtual neuron on physical and simulated neuromorphic hardware and show that it can perform an addition operation using 23 nJ of energy on average using a mixed-signal memristor-based neuromorphic processor. We also demonstrate its utility by using it in some of the mu-recursive functions, which are the building blocks of general-purpose computation.
△ Less
Submitted 15 August, 2022;
originally announced August 2022.
-
Cloud-Based Real-Time Molecular Screening Platform with MolFormer
Authors:
Brian Belgodere,
Vijil Chenthamarakshan,
Payel Das,
Pierre Dognin,
Toby Kurien,
Igor Melnyk,
Youssef Mroueh,
Inkit Padhi,
Mattia Rigotti,
Jarret Ross,
Yair Schiff,
Richard A. Young
Abstract:
With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The pla…
▽ More
With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The platform currently supports three tasks: nearest neighbor retrieval, chemical space visualization, and property prediction. Based on the functionalities of this platform and results obtained, we believe that such a platform can play a pivotal role in automating chemistry and chemical engineering research, as well as assist in drug discovery and material design tasks. A demo of our platform is provided at \url{www.ibm.biz/molecular_demo}.
△ Less
Submitted 13 August, 2022;
originally announced August 2022.
-
A software toolkit and hardware platform for investigating and comparing robot autonomy algorithms in simulation and reality
Authors:
Asher Elmquist,
Aaron Young,
Ishaan Mahajan,
Kyle Fahey,
Abhiraj Dashora,
Sriram Ashokkumar,
Stefan Caldararu,
Victor Freire,
Xiangru Xu,
Radu Serban,
Dan Negrut
Abstract:
We describe a software framework and a hardware platform used in tandem for the design and analysis of robot autonomy algorithms in simulation and reality. The software, which is open source, containerized, and operating system (OS) independent, has three main components: a ROS 2 interface to a C++ vehicle simulation framework (Chrono), which provides high-fidelity wheeled/tracked vehicle and sens…
▽ More
We describe a software framework and a hardware platform used in tandem for the design and analysis of robot autonomy algorithms in simulation and reality. The software, which is open source, containerized, and operating system (OS) independent, has three main components: a ROS 2 interface to a C++ vehicle simulation framework (Chrono), which provides high-fidelity wheeled/tracked vehicle and sensor simulation; a basic ROS 2-based autonomy stack for algorithm design and testing; and, a development ecosystem which enables visualization, and hardware-in-the-loop experimentation in perception, state estimation, path planning, and controls. The accompanying hardware platform is a 1/6th scale vehicle augmented with reconfigurable mountings for computing, sensing, and tracking. Its purpose is to allow algorithms and sensor configurations to be physically tested and improved. Since this vehicle platform has a digital twin within the simulation environment, one can test and compare the same algorithms and autonomy stack in simulation and reality. This platform has been built with an eye towards characterizing and managing the simulation-to-reality gap. Herein, we describe how this platform is set up, deployed, and used to improve autonomy for mobility applications.
△ Less
Submitted 13 June, 2022;
originally announced June 2022.
-
A Deep Learning-based Integrated Framework for Quality-aware Undersampled Cine Cardiac MRI Reconstruction and Analysis
Authors:
Inês P. Machado,
Esther Puyol-Antón,
Kerstin Hammernik,
Gastão Cruz,
Devran Ugurlu,
Ihsane Olakorede,
Ilkay Oksuz,
Bram Ruijsink,
Miguel Castelo-Branco,
Alistair A. Young,
Claudia Prieto,
Julia A. Schnabel,
Andrew P. King
Abstract:
Cine cardiac magnetic resonance (CMR) imaging is considered the gold standard for cardiac function evaluation. However, cine CMR acquisition is inherently slow and in recent decades considerable effort has been put into accelerating scan times without compromising image quality or the accuracy of derived results. In this paper, we present a fully-automated, quality-controlled integrated framework…
▽ More
Cine cardiac magnetic resonance (CMR) imaging is considered the gold standard for cardiac function evaluation. However, cine CMR acquisition is inherently slow and in recent decades considerable effort has been put into accelerating scan times without compromising image quality or the accuracy of derived results. In this paper, we present a fully-automated, quality-controlled integrated framework for reconstruction, segmentation and downstream analysis of undersampled cine CMR data. The framework enables active acquisition of radial k-space data, in which acquisition can be stopped as soon as acquired data are sufficient to produce high quality reconstructions and segmentations. This results in reduced scan times and automated analysis, enabling robust and accurate estimation of functional biomarkers. To demonstrate the feasibility of the proposed approach, we perform realistic simulations of radial k-space acquisitions on a dataset of subjects from the UK Biobank and present results on in-vivo cine CMR k-space data collected from healthy subjects. The results demonstrate that our method can produce quality-controlled images in a mean scan time reduced from 12 to 4 seconds per slice, and that image quality is sufficient to allow clinically relevant parameters to be automatically estimated to within 5% mean absolute difference.
△ Less
Submitted 2 May, 2022;
originally announced May 2022.
-
Surface Vision Transformers: Flexible Attention-Based Modelling of Biomedical Surfaces
Authors:
Simon Dahan,
Hao Xu,
Logan Z. J. Williams,
Abdulah Fawaz,
Chunhui Yang,
Timothy S. Coalson,
Michelle C. Williams,
David E. Newby,
A. David Edwards,
Matthew F. Glasser,
Alistair A. Young,
Daniel Rueckert,
Emma C. Robinson
Abstract:
Recent state-of-the-art performances of Vision Transformers (ViT) in computer vision tasks demonstrate that a general-purpose architecture, which implements long-range self-attention, could replace the local feature learning operations of convolutional neural networks. In this paper, we extend ViTs to surfaces by reformulating the task of surface learning as a sequence-to-sequence learning problem…
▽ More
Recent state-of-the-art performances of Vision Transformers (ViT) in computer vision tasks demonstrate that a general-purpose architecture, which implements long-range self-attention, could replace the local feature learning operations of convolutional neural networks. In this paper, we extend ViTs to surfaces by reformulating the task of surface learning as a sequence-to-sequence learning problem, by proposing patching mechanisms for general surface meshes. Sequences of patches are then processed by a transformer encoder and used for classification or regression. We validate our method on a range of different biomedical surface domains and tasks: brain age prediction in the developing Human Connectome Project (dHCP), fluid intelligence prediction in the Human Connectome Project (HCP), and coronary artery calcium score classification using surfaces from the Scottish Computed Tomography of the Heart (SCOT-HEART) dataset, and investigate the impact of pretraining and data augmentation on model performance. Results suggest that Surface Vision Transformers (SiT) demonstrate consistent improvement over geometric deep learning methods for brain age and fluid intelligence prediction and achieve comparable performance on calcium score classification to standard metrics used in clinical practice. Furthermore, analysis of transformer attention maps offers clear and individualised predictions of the features driving each task. Code is available on Github: https://github.com/metrics-lab/surface-vision-transformers
△ Less
Submitted 7 April, 2022;
originally announced April 2022.
-
SELFIES and the future of molecular string representations
Authors:
Mario Krenn,
Qianxiang Ai,
Senja Barthel,
Nessa Carson,
Angelo Frei,
Nathan C. Frey,
Pascal Friederich,
Théophile Gaudin,
Alberto Alexander Gayle,
Kevin Maik Jablonka,
Rafael F. Lameiro,
Dominik Lemm,
Alston Lo,
Seyed Mohamad Moosavi,
José Manuel Nápoles-Duarte,
AkshatKumar Nigam,
Robert Pollice,
Kohulan Rajan,
Ulrich Schatzschneider,
Philippe Schwaller,
Marta Skreta,
Berend Smit,
Felix Strieth-Kalthoff,
Chong Sun,
Gary Tom
, et al. (6 additional authors not shown)
Abstract:
Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool…
▽ More
Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings -- most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100\% robustness: SELFIES (SELF-referencIng Embedded Strings). SELFIES has since simplified and enabled numerous new applications in chemistry. In this manuscript, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete Future Projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.
△ Less
Submitted 31 March, 2022;
originally announced April 2022.
-
Deep neural networks for fine-grained surveillance of overdose mortality
Authors:
Patrick J. Ward,
April M. Young,
Svetla Slavova,
Madison Liford,
Lara Daniels,
Ripley Lucas,
Ramakanth Kavuluru
Abstract:
Surveillance of drug overdose deaths relies on death certificates for identification of the substances that caused death. Drugs and drug classes can be identified through the International Classification of Diseases, 10th Revision (ICD-10) codes present on death certificates. However, ICD-10 codes do not always provide high levels of specificity in drug identification. To achieve more fine-grained…
▽ More
Surveillance of drug overdose deaths relies on death certificates for identification of the substances that caused death. Drugs and drug classes can be identified through the International Classification of Diseases, 10th Revision (ICD-10) codes present on death certificates. However, ICD-10 codes do not always provide high levels of specificity in drug identification. To achieve more fine-grained identification of substances on a death certificate, the free-text cause of death section, completed by the medical certifier, must be analyzed. Current methods for analyzing free-text death certificates rely solely on look-up tables for identifying specific substances, which must be frequently updated and maintained. To improve identification of drugs on death certificates, a deep learning named-entity recognition model was developed, which achieved an F1-score of 99.13%. This model can identify new drug misspellings and novel substances that are not present on current surveillance look-up tables, enhancing the surveillance of drug overdose deaths.
△ Less
Submitted 6 June, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
GraphNLI: A Graph-based Natural Language Inference Model for Polarity Prediction in Online Debates
Authors:
Vibhor Agarwal,
Sagar Joglekar,
Anthony P. Young,
Nishanth Sastry
Abstract:
Online forums that allow participatory engagement between users have been transformative for public discussion of important issues. However, debates on such forums can sometimes escalate into full blown exchanges of hate or misinformation. An important tool in understanding and tackling such problems is to be able to infer the argumentative relation of whether a reply is supporting or attacking th…
▽ More
Online forums that allow participatory engagement between users have been transformative for public discussion of important issues. However, debates on such forums can sometimes escalate into full blown exchanges of hate or misinformation. An important tool in understanding and tackling such problems is to be able to infer the argumentative relation of whether a reply is supporting or attacking the post it is replying to. This so called polarity prediction task is difficult because replies may be based on external context beyond a post and the reply whose polarity is being predicted. We propose GraphNLI, a novel graph-based deep learning architecture that uses graph walk techniques to capture the wider context of a discussion thread in a principled fashion. Specifically, we propose methods to perform root-seeking graph walks that start from a post and captures its surrounding context to generate additional embeddings for the post. We then use these embeddings to predict the polarity relation between a reply and the post it is replying to. We evaluate the performance of our models on a curated debate dataset from Kialo, an online debating platform. Our model outperforms relevant baselines, including S-BERT, with an overall accuracy of 83%.
△ Less
Submitted 16 February, 2022;
originally announced February 2022.
-
Investigating Cargo Loss in Logistics Systems using Low-Cost Impact Sensors
Authors:
Prasang Gupta,
Antoinette Young,
Anand Rao
Abstract:
Cargo loss/damage is a very common problem faced by almost any business with a supply chain arm, leading to major problems like revenue loss and reputation tarnishing. This problem can be solved by employing an asset and impact tracking solution. This would be more practical and effective for high-cost cargo in comparison to low-cost cargo due to the high costs associated with the sensors and over…
▽ More
Cargo loss/damage is a very common problem faced by almost any business with a supply chain arm, leading to major problems like revenue loss and reputation tarnishing. This problem can be solved by employing an asset and impact tracking solution. This would be more practical and effective for high-cost cargo in comparison to low-cost cargo due to the high costs associated with the sensors and overall solution. In this study, we propose a low-cost solution architecture that is scalable, user-friendly, easy to adopt and is viable for a large range of cargo and logistics systems. Taking inspiration from a real-life use case we solved for a client, we also provide insights into the architecture as well as the design decisions that make this a reality.
△ Less
Submitted 19 April, 2022; v1 submitted 2 January, 2022;
originally announced January 2022.
-
Non-invasive hemodynamic analysis for aortic regurgitation using computational fluid dynamics and deep learning
Authors:
Derek Long,
Cameron McMurdo,
Edward Ferdian,
Charlene A. Mauger,
David Marlevi,
Alistair A. Young,
Martyn P. Nash
Abstract:
Changes in cardiovascular hemodynamics are closely related to the development of aortic regurgitation, a type of valvular heart disease. Metrics derived from blood flows are used to indicate aortic regurgitation onset and evaluate its severity. These metrics can be non-invasively obtained using four-dimensional (4D) flow magnetic resonance imaging (MRI), where accuracy is primarily dependent on sp…
▽ More
Changes in cardiovascular hemodynamics are closely related to the development of aortic regurgitation, a type of valvular heart disease. Metrics derived from blood flows are used to indicate aortic regurgitation onset and evaluate its severity. These metrics can be non-invasively obtained using four-dimensional (4D) flow magnetic resonance imaging (MRI), where accuracy is primarily dependent on spatial resolution. However, insufficient resolution often results from limitations in 4D flow MRI and complex aortic regurgitation hemodynamics. To address this, computational fluid dynamics simulations were transformed into synthetic 4D flow MRI data and used to train a variety of neural networks. These networks generated super resolution, full-field phase images with an upsample factor of 4. Results showed decreased velocity error, high structural similarity scores, and improved learning capabilities from previous work. Further validation was performed on two sets of in-vivo 4D flow MRI data and demonstrated success in de-noising flow images. This approach presents an opportunity to comprehensively analyse aortic regurgitation hemodynamics in a non-invasive manner.
△ Less
Submitted 5 April, 2022; v1 submitted 23 November, 2021;
originally announced November 2021.
-
MassFormer: Tandem Mass Spectrum Prediction for Small Molecules using Graph Transformers
Authors:
Adamo Young,
Bo Wang,
Hannes Röst
Abstract:
Tandem mass spectra capture fragmentation patterns that provide key structural information about a molecule. Although mass spectrometry is applied in many areas, the vast majority of small molecules lack experimental reference spectra. For over seventy years, spectrum prediction has remained a key challenge in the field. Existing deep learning methods do not leverage global structure in the molecu…
▽ More
Tandem mass spectra capture fragmentation patterns that provide key structural information about a molecule. Although mass spectrometry is applied in many areas, the vast majority of small molecules lack experimental reference spectra. For over seventy years, spectrum prediction has remained a key challenge in the field. Existing deep learning methods do not leverage global structure in the molecule, potentially resulting in difficulties when generalizing to new data. In this work we propose a new model, MassFormer, for accurately predicting tandem mass spectra. MassFormer uses a graph transformer architecture to model long-distance relationships between atoms in the molecule. The transformer module is initialized with parameters obtained through a chemical pre-training task, then fine-tuned on spectral data. MassFormer outperforms competing approaches for spectrum prediction on multiple datasets, and is able to recover prior knowledge about the effect of collision energy on the spectrum. By employing gradient-based attribution methods, we demonstrate that the model can identify relationships between fragment peaks. To further highlight MassFormer's utility, we show that it can match or exceed existing prediction-based methods on two spectrum identification tasks. We provide open-source implementations of our model and baseline approaches, with the goal of encouraging future research in this area.
△ Less
Submitted 1 May, 2023; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Deep Learning Analysis of Cardiac MRI in Legacy Datasets: Multi-Ethnic Study of Atherosclerosis
Authors:
Avan Suinesiaputra,
Charlene A Mauger,
Bharath Ambale-Venkatesh,
David A Bluemke,
Josefine Dam Gade,
Kathleen Gilbert,
Mark Janse,
Line Sofie Hald,
Conrad Werkhoven,
Colin Wu,
Joao A Lima,
Alistair A Young
Abstract:
The shape and motion of the heart provide essential clues to understanding the mechanisms of cardiovascular disease. With the advent of large-scale cardiac imaging data, statistical atlases become a powerful tool to provide automated and precise quantification of the status of patient-specific heart geometry with respect to reference populations. The Multi-Ethnic Study of Atherosclerosis (MESA), b…
▽ More
The shape and motion of the heart provide essential clues to understanding the mechanisms of cardiovascular disease. With the advent of large-scale cardiac imaging data, statistical atlases become a powerful tool to provide automated and precise quantification of the status of patient-specific heart geometry with respect to reference populations. The Multi-Ethnic Study of Atherosclerosis (MESA), begun in 2000, was the first large cohort study to incorporate cardiovascular MRI in over 5000 participants, and there is now a wealth of follow-up data over 20 years. Building a machine learning based automated analysis is necessary to extract the additional imaging information necessary for expanding original manual analyses. However, machine learning tools trained on MRI datasets with different pulse sequences fail on such legacy datasets. Here, we describe an automated atlas construction pipeline using deep learning methods applied to the legacy cardiac MRI data in MESA. For detection of anatomical cardiac landmark points, a modified VGGNet convolutional neural network architecture was used in conjunction with a transfer learning sequence between two-chamber, four-chamber, and short-axis MRI views. A U-Net architecture was used for detection of the endocardial and epicardial boundaries in short axis images. Both network architectures resulted in good segmentation and landmark detection accuracies compared with inter-observer variations. Statistical relationships with common risk factors were similar between atlases derived from automated vs manual annotations. The automated atlas can be employed in future studies to examine the relationships between cardiac morphology and future events.
△ Less
Submitted 28 October, 2021;
originally announced October 2021.
-
Physics-Based Models for Magneto-Electric Spin-Orbit Logic Circuits
Authors:
Hai Li,
Dmitri E. Nikonov,
Chia-Ching Lin,
Kerem Camsari,
Yu-Ching Liao,
Chia-Sheng Hsu,
Azad Naeemi,
Ian A. Young
Abstract:
Spintronic devices are a promising beyond-CMOS device option thanks to their energy efficiency and compatibility with CMOS. To accurately capture their multi-physics dynamics, a rigorous treatment of both spin and charge and their inter-conversion is required. Here we present physics-based device models based on 4x4 matrices for the spin-orbit coupling part of the magneto-electric spin-orbit (MESO…
▽ More
Spintronic devices are a promising beyond-CMOS device option thanks to their energy efficiency and compatibility with CMOS. To accurately capture their multi-physics dynamics, a rigorous treatment of both spin and charge and their inter-conversion is required. Here we present physics-based device models based on 4x4 matrices for the spin-orbit coupling part of the magneto-electric spin-orbit (MESO) device. Also, a more rigorous physics model of ferroelectric and magnetoelectric switching of ferromagnets, based on Landau-Lifshitz-Gilbert (LLG) and Landau-Khalatnikov (LK) equations, is presented. With the combined model implemented in a SPICE circuit simulator environment, simulation results were obtained which show feasibility of MESO implementation and functional operation of buffers, oscillators, and majority gates.
△ Less
Submitted 21 October, 2021;
originally announced October 2021.
-
The Impact of Domain Shift on Left and Right Ventricle Segmentation in Short Axis Cardiac MR Images
Authors:
Devran Ugurlu,
Esther Puyol-Anton,
Bram Ruijsink,
Alistair Young,
Ines Machado,
Kerstin Hammernik,
Andrew P. King,
Julia A. Schnabel
Abstract:
Domain shift refers to the difference in the data distribution of two datasets, normally between the training set and the test set for machine learning algorithms. Domain shift is a serious problem for generalization of machine learning models and it is well-established that a domain shift between the training and test sets may cause a drastic drop in the model's performance. In medical imaging, t…
▽ More
Domain shift refers to the difference in the data distribution of two datasets, normally between the training set and the test set for machine learning algorithms. Domain shift is a serious problem for generalization of machine learning models and it is well-established that a domain shift between the training and test sets may cause a drastic drop in the model's performance. In medical imaging, there can be many sources of domain shift such as different scanners or scan protocols, different pathologies in the patient population, anatomical differences in the patient population (e.g. men vs women) etc. Therefore, in order to train models that have good generalization performance, it is important to be aware of the domain shift problem, its potential causes and to devise ways to address it. In this paper, we study the effect of domain shift on left and right ventricle blood pool segmentation in short axis cardiac MR images. Our dataset contains short axis images from 4 different MR scanners and 3 different pathology groups. The training is performed with nnUNet. The results show that scanner differences cause a greater drop in performance compared to changing the pathology group, and that the impact of domain shift is greater on right ventricle segmentation compared to left ventricle segmentation. Increasing the number of training subjects increased cross-scanner performance more than in-scanner performance at small training set sizes, but this difference in improvement decreased with larger training set sizes. Training models using data from multiple scanners improved cross-domain performance.
△ Less
Submitted 22 September, 2021;
originally announced September 2021.
-
CardiSort: a convolutional neural network for cross vendor automated sorting of cardiac MR images
Authors:
Ruth P Lim,
Stefan Kachel,
Adriana DM Villa,
Leighton Kearney,
Nuno Bettencourt,
Alistair A Young,
Amedeo Chiribiri,
Cian M Scannell
Abstract:
Objectives: To develop an image-based automatic deep learning method to classify cardiac MR images by sequence type and imaging plane for improved clinical post-processing efficiency. Methods: Multi-vendor cardiac MRI studies were retrospectively collected from 4 centres and 3 vendors. A two-head convolutional neural network ('CardiSort') was trained to classify 35 sequences by imaging sequence (n…
▽ More
Objectives: To develop an image-based automatic deep learning method to classify cardiac MR images by sequence type and imaging plane for improved clinical post-processing efficiency. Methods: Multi-vendor cardiac MRI studies were retrospectively collected from 4 centres and 3 vendors. A two-head convolutional neural network ('CardiSort') was trained to classify 35 sequences by imaging sequence (n=17) and plane (n=10). Single vendor training (SVT) on single centre images (n=234 patients) and multi-vendor training (MVT) with multicentre images (n = 479 patients, 3 centres) was performed. Model accuracy was compared to manual ground truth labels by an expert radiologist on a hold-out test set for both SVT and MVT. External validation of MVT (MVTexternal) was performed on data from 3 previously unseen magnet systems from 2 vendors (n=80 patients). Results: High sequence and plane accuracies were observed for SVT (85.2% and 93.2% respectively), and MVT (96.5% and 98.1% respectively) on the hold-out test set. MVTexternal yielded sequence accuracy of 92.7% and plane accuracy of 93.0%. There was high accuracy for common sequences and conventional cardiac planes. Poor accuracy was observed for underrepresented classes and sequences where there was greater variability in acquisition parameters across centres, such as perfusion imaging. Conclusions: A deep learning network was developed on multivendor data to classify MRI studies into component sequences and planes, with external validation. With refinement, it has potential to improve workflow by enabling automated sequence selection, an important first step in completely automated post-processing pipelines.
△ Less
Submitted 8 April, 2022; v1 submitted 17 September, 2021;
originally announced September 2021.
-
Quality-aware Cine Cardiac MRI Reconstruction and Analysis from Undersampled k-space Data
Authors:
Ines Machado,
Esther Puyol-Anton,
Kerstin Hammernik,
Gastao Cruz,
Devran Ugurlu,
Bram Ruijsink,
Miguel Castelo-Branco,
Alistair Young,
Claudia Prieto,
Julia A. Schnabel,
Andrew P. King
Abstract:
Cine cardiac MRI is routinely acquired for the assessment of cardiac health, but the imaging process is slow and typically requires several breath-holds to acquire sufficient k-space profiles to ensure good image quality. Several undersampling-based reconstruction techniques have been proposed during the last decades to speed up cine cardiac MRI acquisition. However, the undersampling factor is co…
▽ More
Cine cardiac MRI is routinely acquired for the assessment of cardiac health, but the imaging process is slow and typically requires several breath-holds to acquire sufficient k-space profiles to ensure good image quality. Several undersampling-based reconstruction techniques have been proposed during the last decades to speed up cine cardiac MRI acquisition. However, the undersampling factor is commonly fixed to conservative values before acquisition to ensure diagnostic image quality, potentially leading to unnecessarily long scan times. In this paper, we propose an end-to-end quality-aware cine short-axis cardiac MRI framework that combines image acquisition and reconstruction with downstream tasks such as segmentation, volume curve analysis and estimation of cardiac functional parameters. The goal is to reduce scan time by acquiring only a fraction of k-space data to enable the reconstruction of images that can pass quality control checks and produce reliable estimates of cardiac functional parameters. The framework consists of a deep learning model for the reconstruction of 2D+t cardiac cine MRI images from undersampled data, an image quality-control step to detect good quality reconstructions, followed by a deep learning model for bi-ventricular segmentation, a quality-control step to detect good quality segmentations and automated calculation of cardiac functional parameters. To demonstrate the feasibility of the proposed approach, we perform simulations using a cohort of selected participants from the UK Biobank (n=270), 200 healthy subjects and 70 patients with cardiomyopathies. Our results show that we can produce quality-controlled images in a scan time reduced from 12 to 4 seconds per slice, enabling reliable estimates of cardiac functional parameters such as ejection fraction within 5% mean absolute error.
△ Less
Submitted 16 September, 2021;
originally announced September 2021.
-
Investigating External Interaction Modality and Design Between Automated Vehicles and Pedestrians at Crossings
Authors:
Sue Bai,
Dakota Drake Legge,
Ashley Young,
Shan Bao,
Feng Zhou
Abstract:
In this study, we investigated the effectiveness and user acceptance of three external interaction modalities (i.e., visual, auditory, and visual+auditory) in promoting communications between automated vehicle systems (AVS) and pedestrians at a crosswalk through a large number of combined designs. For this purpose, an online survey was designed and distributed to 68 participants. All participants…
▽ More
In this study, we investigated the effectiveness and user acceptance of three external interaction modalities (i.e., visual, auditory, and visual+auditory) in promoting communications between automated vehicle systems (AVS) and pedestrians at a crosswalk through a large number of combined designs. For this purpose, an online survey was designed and distributed to 68 participants. All participants reported their overall preferences for safety, comfort, trust, ease of understanding, usability, and acceptance towards the systems. Results showed that the visual+auditory interaction modality was the mostly preferred, followed by the visual interaction modality and then the auditory one. We also tested different visual and auditory interaction methods, and found that "Pedestrian silhouette on the front of the vehicle" was the best preferred option while middle-aged participants liked "Chime" much better than young participants though it was overall better preferred than others. Finally, communication between the AVS and pedestrians' phones was not well received due to privacy concerns. These results provided important interface design recommendations in identifying better combination of visual and auditory designs and therefore improving AVS communicating their intention with pedestrians.
△ Less
Submitted 21 July, 2021;
originally announced July 2021.
-
Robust Semantic Interpretability: Revisiting Concept Activation Vectors
Authors:
Jacob Pfau,
Albert T. Young,
Jerome Wei,
Maria L. Wei,
Michael J. Keiser
Abstract:
Interpretability methods for image classification assess model trustworthiness by attempting to expose whether the model is systematically biased or attending to the same cues as a human would. Saliency methods for feature attribution dominate the interpretability literature, but these methods do not address semantic concepts such as the textures, colors, or genders of objects within an image. Our…
▽ More
Interpretability methods for image classification assess model trustworthiness by attempting to expose whether the model is systematically biased or attending to the same cues as a human would. Saliency methods for feature attribution dominate the interpretability literature, but these methods do not address semantic concepts such as the textures, colors, or genders of objects within an image. Our proposed Robust Concept Activation Vectors (RCAV) quantifies the effects of semantic concepts on individual model predictions and on model behavior as a whole. RCAV calculates a concept gradient and takes a gradient ascent step to assess model sensitivity to the given concept. By generalizing previous work on concept activation vectors to account for model non-linearity, and by introducing stricter hypothesis testing, we show that RCAV yields interpretations which are both more accurate at the image level and robust at the dataset level. RCAV, like saliency methods, supports the interpretation of individual predictions. To evaluate the practical use of interpretability methods as debugging tools, and the scientific use of interpretability methods for identifying inductive biases (e.g. texture over shape), we construct two datasets and accompanying metrics for realistic benchmarking of semantic interpretability methods. Our benchmarks expose the importance of counterfactual augmentation and negative controls for quantifying the practical usability of interpretability methods.
△ Less
Submitted 6 April, 2021;
originally announced April 2021.
-
VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels
Authors:
Saahil Jain,
Akshay Smit,
Steven QH Truong,
Chanh DT Nguyen,
Minh-Thanh Huynh,
Mudit Jain,
Victoria A. Young,
Andrew Y. Ng,
Matthew P. Lungren,
Pranav Rajpurkar
Abstract:
Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods…
▽ More
Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods to produce labels from radiology reports that have better agreement with radiologists labeling images. Our best performing method, called VisualCheXbert, uses a biomedically-pretrained BERT model to directly map from a radiology report to the image labels, with a supervisory signal determined by a computer vision model trained to detect medical conditions from chest X-ray images. We find that VisualCheXbert outperforms an approach using an existing radiology report labeler by an average F1 score of 0.14 (95% CI 0.12, 0.17). We also find that VisualCheXbert better agrees with radiologists labeling chest X-ray images than do radiologists labeling the corresponding radiology reports by an average F1 score across several medical conditions of between 0.12 (95% CI 0.09, 0.15) and 0.21 (95% CI 0.18, 0.24).
△ Less
Submitted 15 March, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
How to Stay Curious while Avoiding Noisy TVs using Aleatoric Uncertainty Estimation
Authors:
Augustine N. Mavor-Parker,
Kimberly A. Young,
Caswell Barry,
Lewis D. Griffin
Abstract:
Exploration in environments with sparse rewards is difficult for artificial agents. Curiosity driven learning -- using feed-forward prediction errors as intrinsic rewards -- has achieved some success in these scenarios, but fails when faced with action-dependent noise sources. We present aleatoric mapping agents (AMAs), a neuroscience inspired solution modeled on the cholinergic system of the mamm…
▽ More
Exploration in environments with sparse rewards is difficult for artificial agents. Curiosity driven learning -- using feed-forward prediction errors as intrinsic rewards -- has achieved some success in these scenarios, but fails when faced with action-dependent noise sources. We present aleatoric mapping agents (AMAs), a neuroscience inspired solution modeled on the cholinergic system of the mammalian brain. AMAs aim to explicitly ascertain which dynamics of the environment are unpredictable, regardless of whether those dynamics are induced by the actions of the agent. This is achieved by generating separate forward predictions for the mean and variance of future states and reducing intrinsic rewards for those transitions with high aleatoric variance. We show AMAs are able to effectively circumvent action-dependent stochastic traps that immobilise conventional curiosity driven agents. The code for all experiments presented in this paper is open sourced: http://github.com/self-supervisor/Escaping-Stochastic-Traps-With-Aleatoric-Mapping-Agents.
△ Less
Submitted 5 July, 2024; v1 submitted 8 February, 2021;
originally announced February 2021.
-
Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge
Authors:
Pierre Dognin,
Igor Melnyk,
Youssef Mroueh,
Inkit Padhi,
Mattia Rigotti,
Jarret Ross,
Yair Schiff,
Richard A. Young,
Brian Belgodere
Abstract:
Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated dataset like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on the…
▽ More
Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated dataset like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on these datasets limited as an assistive technology in real-world settings, such as helping visually impaired people navigate and accomplish everyday tasks. This gap motivated the introduction of the novel VizWiz dataset, which consists of images taken by the visually impaired and captions that have useful, task-oriented information. In an attempt to help the machine learning computer vision field realize its promise of producing technologies that have positive social impact, the curators of the VizWiz dataset host several competitions, including one for image captioning. This work details the theory and engineering from our winning submission to the 2020 captioning competition. Our work provides a step towards improved assistive image captioning systems.
△ Less
Submitted 18 June, 2021; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Motivations and Preliminary Design for Mid-Air Deployment of a Science Rotorcraft on Mars
Authors:
Jeff Delaune,
Jacob Izraelevitz,
Larry A. Young,
William Rapin,
Evgeniy Sklyanskiy,
Wayne Johnson,
Aaron Schutte,
Abigail Fraeman,
Valerie Scott,
Carl Leake,
Erik Ballesteros,
Shannah Withrow,
Raghav Bhagwat,
Haley Cummings,
Kim Aaron,
Marcel Veismann,
Skylar Wei,
Regina Lee,
Luis Pabon Madrid,
Morteza Gharib,
Joel Burdick
Abstract:
Mid-Air Deployment (MAD) of a rotorcraft during Entry, Descent and Landing (EDL) on Mars eliminates the need to carry a propulsion or airbag landing system. This reduces the total mass inside the aeroshell by more than 100 kg and simplifies the aeroshell architecture. MAD's lighter and simpler design is likely to bring the risk and cost associated with the mission down. Moreover, the lighter entry…
▽ More
Mid-Air Deployment (MAD) of a rotorcraft during Entry, Descent and Landing (EDL) on Mars eliminates the need to carry a propulsion or airbag landing system. This reduces the total mass inside the aeroshell by more than 100 kg and simplifies the aeroshell architecture. MAD's lighter and simpler design is likely to bring the risk and cost associated with the mission down. Moreover, the lighter entry mass enables landing in the Martian highlands, at elevations inaccessible to current EDL technologies. This paper proposes a novel MAD concept for a Mars helicopter. We suggest a minimum science payload package to perform relevant science in the highlands. A variant of the Ingenuity helicopter is proposed to provide increased deceleration during MAD, and enough lift to fly the science payload in the highlands. We show in simulation that the lighter aeroshell results in a lower terminal velocity (30 m/s) at the end of the parachute phase of the EDL, and at higher altitudes than other approaches. After discussing the aerodynamics, controls, guidance, and mechanical challenges associated with deploying at such speed, we propose a backshell architecture that addresses them to release the helicopter in the safest conditions. Finally, we implemented the helicopter model and aerodynamic descent perturbations in the JPL Dynamics and Real-Time Simulation (DARTS)framework. Preliminary performance evaluation indicates landing and helicopter operation scan be achieved up to 5 km MOLA (Mars Orbiter Laser Altimeter reference).
△ Less
Submitted 13 October, 2020;
originally announced October 2020.
-
Anxiety Detection Leveraging Mobile Passive Sensing
Authors:
Lionel Levine,
Migyeong Gwak,
Kimmo Karkkainen,
Shayan Fazeli,
Bita Zadeh,
Tara Peris,
Alexander Young,
Majid Sarrafzadeh
Abstract:
Anxiety disorders are the most common class of psychiatric problems affecting both children and adults. However, tools to effectively monitor and manage anxiety are lacking, and comparatively limited research has been applied to addressing the unique challenges around anxiety. Leveraging passive and unobtrusive data collection from smartphones could be a viable alternative to classical methods, al…
▽ More
Anxiety disorders are the most common class of psychiatric problems affecting both children and adults. However, tools to effectively monitor and manage anxiety are lacking, and comparatively limited research has been applied to addressing the unique challenges around anxiety. Leveraging passive and unobtrusive data collection from smartphones could be a viable alternative to classical methods, allowing for real-time mental health surveillance and disease management. This paper presents eWellness, an experimental mobile application designed to track a full-suite of sensor and user-log data off an individual's device in a continuous and passive manner. We report on an initial pilot study tracking ten people over the course of a month that showed a nearly 76% success rate at predicting daily anxiety and depression levels based solely on the passively monitored features.
△ Less
Submitted 9 August, 2020;
originally announced August 2020.
-
Fully Automated Myocardial Strain Estimation from CMR Tagged Images using a Deep Learning Framework in the UK Biobank
Authors:
Edward Ferdian,
Avan Suinesiaputra,
Kenneth Fung,
Nay Aung,
Elena Lukaschuk,
Ahmet Barutcu,
Edd Maclean,
Jose Paiva,
Stefan K. Piechnik,
Stefan Neubauer,
Steffen E Petersen,
Alistair A. Young
Abstract:
Purpose: To demonstrate the feasibility and performance of a fully automated deep learning framework to estimate myocardial strain from short-axis cardiac magnetic resonance tagged images. Methods and Materials: In this retrospective cross-sectional study, 4508 cases from the UK Biobank were split randomly into 3244 training and 812 validation cases, and 452 test cases. Ground truth myocardial lan…
▽ More
Purpose: To demonstrate the feasibility and performance of a fully automated deep learning framework to estimate myocardial strain from short-axis cardiac magnetic resonance tagged images. Methods and Materials: In this retrospective cross-sectional study, 4508 cases from the UK Biobank were split randomly into 3244 training and 812 validation cases, and 452 test cases. Ground truth myocardial landmarks were defined and tracked by manual initialization and correction of deformable image registration using previously validated software with five readers. The fully automatic framework consisted of 1) a convolutional neural network (CNN) for localization, and 2) a combination of a recurrent neural network (RNN) and a CNN to detect and track the myocardial landmarks through the image sequence for each slice. Radial and circumferential strain were then calculated from the motion of the landmarks and averaged on a slice basis. Results: Within the test set, myocardial end-systolic circumferential Green strain errors were -0.001 +/- 0.025, -0.001 +/- 0.021, and 0.004 +/- 0.035 in basal, mid, and apical slices respectively (mean +/- std. dev. of differences between predicted and manual strain). The framework reproduced significant reductions in circumferential strain in diabetics, hypertensives, and participants with previous heart attack. Typical processing time was ~260 frames (~13 slices) per second on an NVIDIA Tesla K40 with 12GB RAM, compared with 6-8 minutes per slice for the manual analysis. Conclusions: The fully automated RNNCNN framework for analysis of myocardial strain enabled unbiased strain evaluation in a high-throughput workflow, with similar ability to distinguish impairment due to diabetes, hypertension, and previous heart attack.
△ Less
Submitted 15 April, 2020;
originally announced April 2020.
-
4DFlowNet: Super-Resolution 4D Flow MRI using Deep Learning and Computational Fluid Dynamics
Authors:
Edward Ferdian,
Avan Suinesiaputra,
David Dubowitz,
Debbie Zhao,
Alan Wang,
Brett Cowan,
Alistair Young
Abstract:
4D-flow magnetic resonance imaging (MRI) is an emerging imaging technique where spatiotemporal 3D blood velocity can be captured with full volumetric coverage in a single non-invasive examination. This enables qualitative and quantitative analysis of hemodynamic flow parameters of the heart and great vessels. An increase in the image resolution would provide more accuracy and allow better assessme…
▽ More
4D-flow magnetic resonance imaging (MRI) is an emerging imaging technique where spatiotemporal 3D blood velocity can be captured with full volumetric coverage in a single non-invasive examination. This enables qualitative and quantitative analysis of hemodynamic flow parameters of the heart and great vessels. An increase in the image resolution would provide more accuracy and allow better assessment of the blood flow, especially for patients with abnormal flows. However, this must be balanced with increasing imaging time. The recent success of deep learning in generating super resolution images shows promise for implementation in medical images. We utilized computational fluid dynamics simulations to generate fluid flow simulations and represent them as synthetic 4D flow MRI data. We built our training dataset to mimic actual 4D flow MRI data with its corresponding noise distribution. Our novel 4DFlowNet network was trained on this synthetic 4D flow data and was capable in producing noise-free super resolution 4D flow phase images with upsample factor of 2. We also tested the 4DFlowNet in actual 4D flow MR images of a phantom and normal volunteer data, and demonstrated comparable results with the actual flow rate measurements giving an absolute relative error of 0.6 to 5.8% and 1.1 to 3.8% in the phantom data and normal volunteer data, respectively.
△ Less
Submitted 15 April, 2020;
originally announced April 2020.
-
A Coupled CMOS Oscillator Array for 8ns and 55pJ Inference in Convolutional Neural Networks
Authors:
D. E. Nikonov,
P. Kurahashi,
J. S. Ayers,
H. -J. Lee,
Y. Fan,
I. A. Young
Abstract:
Oscillator neural networks (ONN) based on arrays of 26 CMOS ring oscillators designed and fabricated. ONN are used for inference of dot products with image fragments and kernels necessary for convolutional neural networks. The inputs are encoded as frequency shifts of oscillators using current DACs. Degree of match (DOM) is determined from oscillators synchronization. Measurements demonstrate high…
▽ More
Oscillator neural networks (ONN) based on arrays of 26 CMOS ring oscillators designed and fabricated. ONN are used for inference of dot products with image fragments and kernels necessary for convolutional neural networks. The inputs are encoded as frequency shifts of oscillators using current DACs. Degree of match (DOM) is determined from oscillators synchronization. Measurements demonstrate high correlation of DOM and dot products. Inference requires the time of 8ns and energy of 55pJ.
△ Less
Submitted 25 October, 2019;
originally announced October 2019.