-
FIESTA: Fourier-Based Semantic Augmentation with Uncertainty Guidance for Enhanced Domain Generalizability in Medical Image Segmentation
Authors:
Kwanseok Oh,
Eunjin Jeon,
Da-Woon Heo,
Yooseung Shin,
Heung-Il Suk
Abstract:
Single-source domain generalization (SDG) in medical image segmentation (MIS) aims to generalize a model using data from only one source domain to segment data from an unseen target domain. Despite substantial advances in SDG with data augmentation, existing methods often fail to fully consider the details and uncertain areas prevalent in MIS, leading to mis-segmentation. This paper proposes a Fourier-based semantic augmentation method called FIESTA using uncertainty guidance to enhance the fundamental goals of MIS in an SDG context by manipulating the amplitude and phase components in the frequency domain. The proposed Fourier augmentative transformer addresses semantic amplitude modulation based on meaningful angular points to induce pertinent variations and harnesses the phase spectrum to ensure structural coherence. Moreover, FIESTA employs epistemic uncertainty to fine-tune the augmentation process, improving the ability of the model to adapt to diverse augmented data and concentrate on areas with higher ambiguity. Extensive experiments across three cross-domain scenarios demonstrate that FIESTA surpasses recent state-of-the-art SDG approaches in segmentation performance and significantly contributes to boosting the applicability of the model in medical imaging modalities.
Submitted 20 June, 2024;
originally announced June 2024.
-
Domain Generalization for Medical Image Analysis: A Survey
Authors:
Jee Seok Yoon,
Kwanseok Oh,
Yooseung Shin,
Maciej A. Mazurowski,
Heung-Il Suk
Abstract:
Medical image analysis (MedIA) has become an essential tool in medicine and healthcare, aiding in disease diagnosis, prognosis, and treatment planning, and recent successes in deep learning (DL) have made significant contributions to its advances. However, deploying DL models for MedIA in real-world situations remains challenging due to their failure to generalize across the distributional gap between training and testing samples - a problem known as domain shift. Researchers have dedicated their efforts to developing various DL methods to adapt and perform robustly on unknown and out-of-distribution data distributions. This paper comprehensively reviews domain generalization studies specifically tailored for MedIA. We provide a holistic view of how domain generalization techniques interact within the broader MedIA system, going beyond methodologies to consider the operational implications on the entire MedIA workflow. Specifically, we categorize domain generalization methods into data-level, feature-level, model-level, and analysis-level methods. We show how those methods can be used in various stages of the MedIA workflow with DL equipped from data acquisition to model prediction and analysis. Furthermore, we critically analyze the strengths and weaknesses of various methods, unveiling future research opportunities.
Submitted 15 February, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
A Quantitatively Interpretable Model for Alzheimer's Disease Prediction Using Deep Counterfactuals
Authors:
Kwanseok Oh,
Da-Woon Heo,
Ahmad Wisnu Mulyadi,
Wonsik Jung,
Eunsong Kang,
Kun Ho Lee,
Heung-Il Suk
Abstract:
Deep learning (DL) for predicting Alzheimer's disease (AD) has provided timely intervention in disease progression yet still demands attentive interpretability to explain how DL models make definitive decisions. Recently, counterfactual reasoning has gained increasing attention in medical research because of its ability to provide a refined visual explanatory map. However, such visual explanatory maps based on visual inspection alone are insufficient unless we intuitively demonstrate their medical or neuroscientific validity via quantitative features. In this study, we synthesize counterfactual-labeled structural MRIs using our proposed framework and transform them into gray matter density maps to measure their volumetric changes over the parcellated regions of interest (ROIs). We also devised a lightweight linear classifier to boost the effectiveness of constructed ROIs, promoted quantitative interpretation, and achieved comparable predictive performance to DL methods. Through this, our framework produces an "AD-relatedness index" for each ROI and offers an intuitive understanding of brain status for an individual patient and across patient groups with respect to AD progression.
Submitted 5 October, 2023;
originally announced October 2023.
-
Hi,KIA: A Speech Emotion Recognition Dataset for Wake-Up Words
Authors:
Taesu Kim,
SeungHeon Doh,
Gyunpyo Lee,
Hyungseok Jeon,
Juhan Nam,
Hyeon-Jeong Suk
Abstract:
A wake-up word (WUW) is a short sentence used to activate a speech recognition system to receive the user's speech input. WUW utterances include not only the lexical information for waking up the system but also non-lexical information such as speaker identity or emotion. In particular, recognizing the user's emotional state may enrich voice communication. However, there are few datasets in which the emotional state of the WUW utterances is labeled. In this paper, we introduce Hi, KIA, a new WUW dataset that consists of 488 Korean-accented emotional utterances collected from four male and four female speakers, where each utterance is labeled with one of four emotional states: anger, happy, sad, or neutral. We present the step-by-step procedure to build the dataset, covering scenario selection, post-processing, and human validation for label agreement. We also provide two classification models for WUW speech emotion recognition using the dataset. One is based on traditional hand-crafted features and the other is a transfer-learning approach using a pre-trained neural network. These classification models could be used as benchmarks in further research.
Submitted 7 November, 2022;
originally announced November 2022.
-
XADLiME: eXplainable Alzheimer's Disease Likelihood Map Estimation via Clinically-guided Prototype Learning
Authors:
Ahmad Wisnu Mulyadi,
Wonsik Jung,
Kwanseok Oh,
Jee Seok Yoon,
Heung-Il Suk
Abstract:
Diagnosing Alzheimer's disease (AD) involves a deliberate diagnostic process owing to its innate traits of irreversibility with subtle and gradual progression. These characteristics make AD biomarker identification from structural brain imaging (e.g., structural MRI) scans quite challenging. Furthermore, there is a high possibility of getting entangled with normal aging. We propose a novel deep-learning approach through eXplainable AD Likelihood Map Estimation (XADLiME) for AD progression modeling over 3D sMRIs using clinically-guided prototype learning. Specifically, we establish a set of topologically-aware prototypes onto the clusters of latent clinical features, uncovering an AD spectrum manifold. We then measure the similarities between latent clinical features and well-established prototypes, estimating a "pseudo" likelihood map. By considering this pseudo map as an enriched reference, we employ an estimating network to estimate the AD likelihood map over a 3D sMRI scan. Additionally, we promote the explainability of such a likelihood map by revealing a comprehensible overview from two perspectives: clinical and morphological. During the inference, this estimated likelihood map served as a substitute over unseen sMRI scans for effectively conducting the downstream task while providing thorough explainable states.
Submitted 26 July, 2022;
originally announced July 2022.
-
TransSleep: Transitioning-aware Attention-based Deep Neural Network for Sleep Staging
Authors:
Jauen Phyo,
Wonjun Ko,
Eunjin Jeon,
Heung-Il Suk
Abstract:
Sleep staging is essential for sleep assessment and plays a vital role as a health indicator. Many recent studies have devised various machine learning as well as deep learning architectures for sleep staging. However, two key challenges hinder the practical use of these architectures: effectively capturing salient waveforms in sleep signals and correctly classifying confusing stages in transitioning epochs. In this study, we propose a novel deep neural network structure, TransSleep, that captures distinctive local temporal patterns and distinguishes confusing stages using two auxiliary tasks. In particular, TransSleep adopts an attention-based multi-scale feature extractor module to capture salient waveforms; a stage-confusion estimator module with a novel auxiliary task, epoch-level stage classification, to estimate confidence scores for identifying confusing stages; and a context encoder module with the other novel auxiliary task, stage-transition detection, to represent contextual relationships across neighboring epochs. Results show that TransSleep achieves promising performance in automatic sleep staging. The validity of TransSleep is demonstrated by its state-of-the-art performance on two publicly available datasets, Sleep-EDF and MASS. Furthermore, we performed ablations to analyze our results from different perspectives. Based on our overall results, we believe that TransSleep has immense potential to provide new insights into deep learning-based sleep staging.
Submitted 22 March, 2022;
originally announced March 2022.
-
A Novel RL-assisted Deep Learning Framework for Task-informative Signals Selection and Classification for Spontaneous BCIs
Authors:
Wonjun Ko,
Eunjin Jeon,
Heung-Il Suk
Abstract:
In this work, we formulate the problem of estimating and selecting task-relevant temporal signal segments from a single EEG trial in the form of a Markov decision process and propose a novel reinforcement-learning mechanism that can be combined with existing deep-learning-based BCI methods. To be specific, we devise an actor-critic network such that an agent can determine which timepoints need to be used (informative) or discarded (uninformative) in composing the intention-related features in a given trial, thus enhancing the intention identification performance. To validate the effectiveness of our proposed method, we conducted experiments with a publicly available big MI dataset and applied our novel mechanism to various recent deep-learning architectures designed for MI classification. Based on the exhaustive experiments, we observed that our proposed method helped achieve statistically significant improvements in performance.
Submitted 30 June, 2020;
originally announced July 2020.
-
Deep Recurrent Model for Individualized Prediction of Alzheimer's Disease Progression
Authors:
Wonsik Jung,
Eunji Jun,
Heung-Il Suk
Abstract:
Alzheimer's disease (AD) is known as one of the major causes of dementia and is characterized by slow progression over several years, with no treatments or available medicines. In this regard, there have been efforts to identify the risk of developing AD at its earliest stage. While many of the previous works considered cross-sectional analysis, more recent studies have focused on the diagnosis and prognosis of AD with longitudinal or time series data by way of disease progression modeling (DPM). Under the same problem settings, in this work, we propose a novel computational framework that can predict the phenotypic measurements of MRI biomarkers and trajectories of clinical status along with cognitive scores at multiple future time points. However, in handling time series data, one generally faces many unexpected missing observations. In regard to such an unfavorable situation, we define a secondary problem of estimating those missing values and tackle it in a systematic way by taking account of temporal and multivariate relations inherent in time series data. Concretely, we propose a deep recurrent network that jointly tackles the four problems of (i) missing value imputation, (ii) phenotypic measurements forecasting, (iii) trajectory estimation of the cognitive score, and (iv) clinical status prediction of a subject based on his/her longitudinal imaging biomarkers. Notably, the learnable model parameters of our network are trained in an end-to-end manner with our circumspectly defined loss function. In our experiments on the TADPOLE challenge cohort, we measured performance for various metrics and compared our method to competing methods in the literature. Exhaustive analyses and ablation studies were also conducted to better confirm the effectiveness of our method.
Submitted 27 August, 2020; v1 submitted 6 May, 2020;
originally announced May 2020.
-
Multi-Scale Neural network for EEG Representation Learning in BCI
Authors:
Wonjun Ko,
Eunjin Jeon,
Seungwoo Jeong,
Heung-Il Suk
Abstract:
Recent advances in deep learning have had a methodological and practical impact on brain-computer interface research. Among the various deep network architectures, convolutional neural networks have been well suited for spatio-spectral-temporal electroencephalogram signal representation learning. Most of the existing CNN-based methods described in the literature extract features at a sequential level of abstraction with repetitive nonlinear operations and involve densely connected layers for classification. However, studies in neurophysiology have revealed that EEG signals carry information in different ranges of frequency components. To better reflect these multi-frequency properties in EEGs, we propose a novel deep multi-scale neural network that discovers feature representations in multiple frequency/time ranges and extracts relationships among electrodes, i.e., spatial representations, for subject intention/condition identification. Furthermore, by completely representing EEG signals with spatio-spectral-temporal information, the proposed method can be utilized for diverse paradigms in both active and passive BCIs, contrary to existing methods that are primarily focused on single-paradigm BCIs. To demonstrate the validity of our proposed method, we conducted experiments on various paradigms of active/passive BCI datasets. Our experimental results demonstrated that the proposed method achieved performance improvements when judged against comparable state-of-the-art methods. Additionally, we analyzed the proposed method using different techniques, such as PSD curves and relevance score inspection to validate the multi-scale EEG signal information capturing ability, activation pattern maps for investigating the learned spatial filters, and t-SNE plotting for visualizing represented features. Finally, we also demonstrated our method's application to real-world problems.
Submitted 1 March, 2020;
originally announced March 2020.
-
Self-Driving like a Human driver instead of a Robocar: Personalized comfortable driving experience for autonomous vehicles
Authors:
Il Bae,
Jaeyoung Moon,
Junekyo Jhung,
Ho Suk,
Taewoo Kim,
Hyungbin Park,
Jaekwang Cha,
Jinhyuk Kim,
Dohyun Kim,
Shiho Kim
Abstract:
This paper presents an integrated control system for self-driving autonomous vehicles based on personal driving preference, providing a personalized, comfortable driving experience to autonomous vehicle users. We propose an Occupant's Preference Metric (OPM) that defines a preferred lateral and longitudinal acceleration region with maximum allowable jerk for users. Moreover, we propose a vehicle controller based on control parameters enabling integrated lateral and longitudinal control via preference-aware maneuvering of autonomous vehicles. The proposed system not only provides the criteria for the occupant's driving preference, but also provides a personalized autonomous self-driving style like a human driver instead of a Robocar. The simulation and experimental results demonstrated that the proposed system can maneuver the self-driving vehicle like a human driver by tracking the specified criterion of admissible acceleration and jerk.
Submitted 18 November, 2022; v1 submitted 12 January, 2020;
originally announced January 2020.