Slicing Through Bias: Explaining Performance Gaps in Medical Image Analysis using Slice Discovery Methods

Authors: Vincent Olesen, Nina Weng, Aasa Feragen, Eike Petersen

Abstract: Machine learning models have achieved high overall accuracy in medical image analysis. However, performance disparities on specific patient groups pose challenges to their clinical utility, safety, and fairness. This can affect known patient groups - such as those based on sex, age, or disease subtype - as well as previously unknown and unlabeled groups. Furthermore, the root cause of such observed performance disparities is often challenging to uncover, hindering mitigation efforts. In this paper, to address these issues, we leverage Slice Discovery Methods (SDMs) to identify interpretable underperforming subsets of data and formulate hypotheses regarding the cause of observed performance disparities. We introduce a novel SDM and apply it in a case study on the classification of pneumothorax and atelectasis from chest X-rays. Our study demonstrates the effectiveness of SDMs in hypothesis formulation and yields an explanation of previously observed but unexplained performance disparities between male and female patients in widely used chest X-ray datasets and models. Our findings indicate shortcut learning in both classification tasks, through the presence of chest drains and ECG wires, respectively. Sex-based differences in the prevalence of these shortcut features appear to cause the observed classification performance gap, representing a previously underappreciated interaction between shortcut learning and model fairness analyses.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2405.08786 [pdf, other]

Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring

Authors: Tiantian Zhang, Manxi Lin, Hongda Guo, Xiaofan Zhang, Ka Fung Peter Chiu, Aasa Feragen, Qi Dou

Abstract: The Prostate Imaging Reporting and Data System (PI-RADS) is pivotal in the diagnosis of clinically significant prostate cancer through MRI imaging. Current deep learning-based PI-RADS scoring methods often lack the incorporation of essential PI-RADS clinical guidelines (PICG) utilized by radiologists, potentially compromising scoring accuracy. This paper introduces a novel approach that adapts a multi-modal large language model (MLLM) to incorporate PICG into PI-RADS scoring without additional annotations and network parameters. We present a two-stage fine-tuning process aimed at adapting MLLMs originally trained on natural images to the MRI data domain while effectively integrating the PICG. In the first stage, we develop a domain adapter layer specifically tailored for processing 3D MRI image inputs and design the MLLM instructions to differentiate MRI modalities effectively. In the second stage, we translate PICG into guiding instructions for the model to generate PICG-guided image features. Through feature distillation, we align scoring network features with the PICG-guided image feature, enabling the scoring network to effectively incorporate the PICG information. We develop our model on a public dataset and evaluate it in a real-world challenging in-house dataset. Experimental results demonstrate that our approach improves the performance of current scoring networks.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2404.00032 [pdf, other]

Deployment of Deep Learning Model in Real World Clinical Setting: A Case Study in Obstetric Ultrasound

Authors: Chun Kit Wong, Mary Ngo, Manxi Lin, Zahra Bashir, Amihai Heen, Morten Bo Søndergaard Svendsen, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: Despite the rapid development of AI models in medical image analysis, their validation in real-world clinical settings remains limited. To address this, we introduce a generic framework designed for deploying image-based AI models in such settings. Using this framework, we deployed a trained model for fetal ultrasound standard plane detection, and evaluated it in real-time sessions with both novice and expert users. Feedback from these sessions revealed that while the model offers potential benefits to medical practitioners, the need for navigational guidance was identified as a key area for improvement. These findings underscore the importance of early deployment of AI models in real-world settings, leading to insights that can guide the refinement of the model and system based on actual user feedback.

Submitted 22 March, 2024; originally announced April 2024.

Comments: 10 pages

arXiv:2403.08700 [pdf, other]

Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment

Authors: Paraskevas Pegios, Manxi Lin, Nina Weng, Morten Bo Søndergaard Svendsen, Zahra Bashir, Siavash Bigdeli, Anders Nymark Christensen, Martin Tolsgaard, Aasa Feragen

Abstract: Obstetric ultrasound image quality is crucial for accurate diagnosis and monitoring of fetal health. However, producing high-quality standard planes is difficult, influenced by the sonographer's expertise and factors like the maternal BMI or the fetus dynamics. In this work, we propose using diffusion-based counterfactual explainable AI to generate realistic high-quality standard planes from low-quality non-standard ones. Through quantitative and qualitative evaluation, we demonstrate the effectiveness of our method in producing plausible counterfactuals of increased quality. This shows future promise both for enhancing training of clinicians by providing visual feedback, as well as for improving image quality and, consequently, downstream diagnosis and monitoring.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.08564 [pdf, other]

Non-discrimination Criteria for Generative Language Models

Authors: Sara Sterlie, Nina Weng, Aasa Feragen

Abstract: Within recent years, generative AI, such as large language models, has undergone rapid development. As these models become increasingly available to the public, concerns arise about perpetuating and amplifying harmful biases in applications. Gender stereotypes can be harmful and limiting for the individuals they target, whether they consist of misrepresentation or discrimination. Recognizing gender bias as a pervasive societal construct, this paper studies how to uncover and quantify the presence of gender biases in generative language models. In particular, we derive generative AI analogues of three well-known non-discrimination criteria from classification, namely independence, separation and sufficiency. To demonstrate these criteria in action, we design prompts for each of the criteria with a focus on occupational gender stereotype, specifically utilizing the medical test to introduce the ground truth in the generative AI context. Our results address the presence of occupational gender bias within such conversational language models.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 14 pages, 5 figures. Submitted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)
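
For readers unfamiliar with the three criteria named in the abstract above, the following is a minimal sketch of their standard classification forms, not of the generative analogues the paper derives. The record layout and all function names are assumptions made for illustration. Independence compares P(prediction = 1 | group), separation compares P(prediction = 1 | true label, group), and sufficiency compares P(true label = 1 | prediction, group).

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Sequence, Tuple

# Hypothetical record layout: (group, true label y, prediction y_hat).
Record = Tuple[Hashable, int, int]


def _rate(
    records: Sequence[Record],
    condition: Callable[[int, int], bool],
    event: Callable[[int, int], bool],
) -> Dict[Hashable, float]:
    """Per-group empirical estimate of P(event | condition)."""
    hits: Dict[Hashable, int] = defaultdict(int)
    totals: Dict[Hashable, int] = defaultdict(int)
    for group, y, y_hat in records:
        if condition(y, y_hat):
            totals[group] += 1
            hits[group] += int(event(y, y_hat))
    return {g: hits[g] / totals[g] for g in totals}


def independence(records: Sequence[Record]) -> Dict[Hashable, float]:
    """P(y_hat = 1 | group); equal values across groups satisfy independence."""
    return _rate(records, lambda y, p: True, lambda y, p: p == 1)


def separation(records: Sequence[Record]) -> Dict[int, Dict[Hashable, float]]:
    """P(y_hat = 1 | y, group) for each true label y."""
    return {
        y0: _rate(records, lambda y, p, y0=y0: y == y0, lambda y, p: p == 1)
        for y0 in (0, 1)
    }


def sufficiency(records: Sequence[Record]) -> Dict[int, Dict[Hashable, float]]:
    """P(y = 1 | y_hat, group) for each prediction y_hat."""
    return {
        p0: _rate(records, lambda y, p, p0=p0: p == p0, lambda y, p: y == 1)
        for p0 in (0, 1)
    }


# Toy data: both groups receive positive predictions at the same rate,
# but the predictions are only accurate for group "b".
toy = [
    ("a", 1, 1), ("a", 1, 0), ("a", 0, 0), ("a", 0, 1),
    ("b", 1, 1), ("b", 1, 1), ("b", 0, 0), ("b", 0, 0),
]
print(independence(toy))
print(separation(toy))
print(sufficiency(toy))
```

On the toy data, independence holds while separation and sufficiency do not, which illustrates why the three criteria are treated as distinct.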

arXiv:2403.06748 [pdf, other]

Shortcut Learning in Medical Image Segmentation

Authors: Manxi Lin, Nina Weng, Kamil Mikolaj, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: Shortcut learning is a phenomenon where machine learning models prioritize learning simple, potentially misleading cues from data that do not generalize well beyond the training set. While existing research primarily investigates this in the realm of image classification, this study extends the exploration of shortcut learning into medical image segmentation. We demonstrate that clinical annotations such as calipers, and the combination of zero-padded convolutions and center-cropped training sets in the dataset can inadvertently serve as shortcuts, impacting segmentation accuracy. We identify and evaluate the shortcut learning on two different but common medical image segmentation tasks. In addition, we suggest strategies to mitigate the influence of shortcut learning and improve the generalizability of the segmentation models. By uncovering the presence and implications of shortcuts in medical image segmentation, we provide insights and methodologies for evaluating and overcoming this pervasive challenge and call for attention in the community for shortcuts in segmentation. Our code is public at https://github.com/nina-weng/shortcut_skinseg.

Submitted 27 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 11 pages, 6 figures, accepted at MICCAI 2024

arXiv:2402.08294 [pdf, other]

Learning semantic image quality for fetal ultrasound from noisy ranking annotation

Authors: Manxi Lin, Jakob Ambsdorf, Emilie Pi Fogtmann Sejer, Zahra Bashir, Chun Kit Wong, Paraskevas Pegios, Alberto Raheli, Morten Bo Søndergaard Svendsen, Mads Nielsen, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: We introduce the notion of semantic image quality for applications where image quality relies on semantic requirements. Working in fetal ultrasound, where ranking is challenging and annotations are noisy, we design a robust coarse-to-fine model that ranks images based on their semantic image quality and endow our predicted rankings with an uncertainty estimate. To annotate rankings on training data, we design an efficient ranking annotation scheme based on the merge sort algorithm. Finally, we compare our ranking algorithm to a number of state-of-the-art ranking algorithms on a challenging fetal ultrasound quality assessment task, showing the superior performance of our method on the majority of rank correlation metrics.

Submitted 13 February, 2024; originally announced February 2024.

Comments: Extended version of the accepted paper at ISBI 2024
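
The abstract above mentions a ranking annotation scheme based on merge sort. A minimal sketch of the general idea follows, assuming the annotator is modelled as a pairwise comparison callback; the names `merge_sort_rank` and `CountingAnnotator` are hypothetical and this is not the authors' implementation. The point is that merge sort asks for far fewer pairwise judgements than comparing every pair of images.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def merge_sort_rank(items: Sequence[T], prefer: Callable[[T, T], bool]) -> List[T]:
    """Rank items from lowest to highest using pairwise judgements only.

    `prefer(a, b)` is a hypothetical annotator callback that returns True
    when `a` should be ranked at or below `b`.
    """
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_rank(items[:mid], prefer)
    right = merge_sort_rank(items[mid:], prefer)

    merged: List[T] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Each loop iteration costs exactly one annotator query.
        if prefer(left[i], right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


class CountingAnnotator:
    """Stand-in for a human annotator: compares hidden quality scores
    and counts how many pairwise questions were asked."""

    def __init__(self) -> None:
        self.queries = 0

    def __call__(self, a: float, b: float) -> bool:
        self.queries += 1
        return a <= b


scores = [0.7, 0.1, 0.9, 0.4, 0.3, 0.8, 0.2, 0.6]  # illustrative values
annotator = CountingAnnotator()
ranking = merge_sort_rank(scores, annotator)
all_pairs = len(scores) * (len(scores) - 1) // 2
print(ranking, annotator.queries, "queries vs", all_pairs, "exhaustive pairs")
```

A real annotator is noisy, so a single pass like this can propagate wrong judgements; how the paper handles that noise is not covered by this sketch.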

arXiv:2401.12588 [pdf, other]

Interpreting Equivariant Representations

Authors: Andreas Abildtrup Hansen, Anna Calissano, Aasa Feragen

Abstract: Latent representations are used extensively for downstream tasks, such as visualization, interpolation or feature extraction of deep learning models. Invariant and equivariant neural networks are powerful and well-established models for enforcing inductive biases. In this paper, we demonstrate that the inductive bias imposed by an equivariant model must also be taken into account when using latent representations. We show how not accounting for the inductive biases leads to decreased performance on downstream tasks, and vice versa, how accounting for inductive biases can be done effectively by using an invariant projection of the latent representations. We propose principles for how to choose such a projection, and show the impact of using these principles in two common examples: First, we study a permutation equivariant variational auto-encoder trained for molecule graph generation; here we show that invariant projections can be designed that incur no loss of information in the resulting invariant representation. Next, we study a rotation-equivariant representation used for image classification. Here, we illustrate how random invariant projections can be used to obtain an invariant representation with a high degree of retained information. In both cases, the analysis of invariant latent representations proves superior to their equivariant counterparts. Finally, we illustrate that the phenomena documented here for equivariant neural networks have counterparts in standard neural networks where invariance is encouraged via augmentation. Thus, while these ambiguities may be known by experienced developers of equivariant models, we make both the knowledge as well as effective tools to handle the ambiguities available to the broader community.

Submitted 23 January, 2024; originally announced January 2024.
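
To make the idea of an invariant projection concrete, here is a small sketch for the permutation case, assuming the latent representation is a list of per-node vectors whose row order is arbitrary. Sorting the rows is one textbook choice that is invariant to row permutations and keeps everything except the order; it is offered as an illustration only and is not claimed to be the projection the paper proposes. Sum pooling is shown for contrast as an invariant map that discards information. Both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def canonical_rows(latent: Sequence[Sequence[float]]) -> List[Tuple[float, ...]]:
    """Permutation-invariant projection: sort the rows lexicographically.

    Two inputs that differ only in row order map to the same output, and
    the multiset of rows can be read back from the result.
    """
    return sorted(tuple(row) for row in latent)


def sum_pool(latent: Sequence[Sequence[float]]) -> Tuple[float, ...]:
    """Also permutation-invariant, but lossy: only column sums survive."""
    return tuple(sum(col) for col in zip(*latent))


z = [[0.5, 1.0], [0.25, 2.0], [0.5, 0.5]]  # illustrative node embeddings
z_permuted = [z[2], z[0], z[1]]             # same nodes, different order

print(canonical_rows(z) == canonical_rows(z_permuted))  # same canonical form
print(z == z_permuted)                                  # raw latents differ
print(sum_pool(z))
```

Comparing the raw latents of `z` and `z_permuted` directly would treat the same object as two different points, which is the kind of ambiguity the abstract warns about.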

arXiv:2312.14223 [pdf, other]

Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation

Authors: Nina Weng, Paraskevas Pegios, Aasa Feragen, Eike Petersen, Siavash Bigdeli

Abstract: Shortcut learning is when a model (e.g. a cardiac disease classifier) exploits correlations between the target label and a spurious shortcut feature, e.g. a pacemaker, to predict the target label based on the shortcut rather than real discriminative features. This is common in medical imaging, where treatment and clinical annotations correlate with disease labels, making them easy shortcuts to predict disease. We propose a novel detection and quantification of the impact of potential shortcut features via a fast diffusion-based counterfactual image generation that can synthetically remove or add shortcuts. Via a novel inpainting-based modification we spatially limit the changes made with no extra inference step, encouraging the removal of spatially constrained shortcut features while ensuring that the shortcut-free counterfactuals preserve their remaining image features to a high degree. Using these, we assess how shortcut features influence model predictions. This is enabled by our second contribution: An efficient diffusion-based counterfactual explanation method with significant inference speed-up at image quality comparable to the state of the art. We confirm this on two large chest X-ray datasets, a skin lesion dataset, and CelebA.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2309.12325 [pdf, other]

FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare

Authors: Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, Anais Emelie, Andrea Lara, Antonio R Porras, An-Wen Chan, Arcadi Navarro, Ben Glocker, Benard O Botwe, Bishesh Khanal, Brigit Beger, Carol C Wu, Celia Cintas, Curtis P Langlotz, Daniel Rueckert, Deogratias Mzurikwao, Dimitrios I Fotiadis, Doszhan Zhussupov, Enzo Ferrante, Erik Meijering, Eva Weicken, Fabio A González, et al. (93 additional authors not shown)

Abstract: Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI.

Submitted 11 August, 2023; originally announced September 2023.

ACM Class: I.2.0; I.4.0; I.5.0

arXiv:2308.05129 [pdf, other]

Are Sex-based Physiological Differences the Cause of Gender Bias for Chest X-ray Diagnosis?

Authors: Nina Weng, Siavash Bigdeli, Eike Petersen, Aasa Feragen

Abstract: While many studies have assessed the fairness of AI algorithms in the medical field, the causes of differences in prediction performance are often unknown. This lack of knowledge about the causes of bias hampers the efficacy of bias mitigation, as evidenced by the fact that simple dataset balancing still often performs best in reducing performance gaps but is unable to resolve all performance differences. In this work, we investigate the causes of gender bias in machine learning-based chest X-ray diagnosis. In particular, we explore the hypothesis that breast tissue leads to underexposure of the lungs and causes lower model performance. Methodologically, we propose a new sampling method which addresses the highly skewed distribution of recordings per patient in two widely used public datasets, while at the same time reducing the impact of label errors. Our comprehensive analysis of gender differences across diseases, datasets, and gender representations in the training set shows that dataset imbalance is not the sole cause of performance differences. Moreover, relative group performance differs strongly between datasets, indicating important dataset-specific factors influencing male/female group performance. Finally, we investigate the effect of breast tissue more specifically, by cropping out the breasts from recordings, finding that this does not resolve the observed performance gaps. In conclusion, our results indicate that dataset-specific factors, not fundamental physiological differences, are the main drivers of male-female performance gaps in chest X-ray analyses on the widely used NIH and CheXpert datasets.

Submitted 9 August, 2023; originally announced August 2023.

arXiv:2305.01397 [pdf, other]

Are demographically invariant models and representations in medical imaging fair?

Authors: Eike Petersen, Enzo Ferrante, Melanie Ganz, Aasa Feragen

Abstract: Medical imaging models have been shown to encode information about patient demographics such as age, race, and sex in their latent representation, raising concerns about their potential for discrimination. Here, we ask whether requiring models not to encode demographic attributes is desirable. We point out that marginal and class-conditional representation invariance imply the standard group fairness notions of demographic parity and equalized odds, respectively. In addition, however, they require matching the risk distributions, thus potentially equalizing away important group differences. Enforcing the traditional fairness notions directly instead does not entail these strong constraints. Moreover, representationally invariant models may still take demographic attributes into account for deriving predictions, implying unequal treatment - in fact, achieving representation invariance may require doing so. In theory, this can be prevented using counterfactual notions of (individual) fairness or invariance. We caution, however, that properly defining medical image counterfactuals with respect to demographic attributes is fraught with challenges. Finally, we posit that encoding demographic attributes may even be advantageous if it enables learning a task-specific encoding of demographic features that does not rely on social constructs such as 'race' and 'gender.' We conclude that demographically invariant representations are neither necessary nor sufficient for fairness in medical imaging. Models may need to encode demographic attributes, lending further urgency to calls for comprehensive model fairness assessments in terms of predictive performance across diverse patient groups.

Submitted 3 July, 2024; v1 submitted 2 May, 2023; originally announced May 2023.

arXiv:2304.05463 [pdf, other]

doi 10.1007/978-3-031-44521-7_2

An Automatic Guidance and Quality Assessment System for Doppler Imaging of Umbilical Artery

Authors: Chun Kit Wong, Manxi Lin, Alberto Raheli, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Grønnebæk Tolsgaard, Aasa Feragen, Anders Nymark Christensen

Abstract: Examination of the umbilical artery with Doppler ultrasonography is performed to investigate blood supply to the fetus through the umbilical cord, which is vital for the monitoring of fetal health. Such examination involves several steps that must be performed correctly: identifying suitable sites on the umbilical artery for the measurement, acquiring the blood flow curve in the form of a Doppler spectrum, and ensuring compliance with a set of quality standards. These steps rely heavily on the operator's skill, and the shortage of experienced sonographers has thus created a demand for machine assistance. In this work, we propose an automatic system to fill the gap. By using a modified Faster R-CNN network, we obtain an algorithm that can suggest locations suitable for Doppler measurement. Meanwhile, we have also developed a method for assessment of the Doppler spectrum's quality. The proposed system is validated on 657 images from a national ultrasound screening database, with results demonstrating its potential as a guidance system.

Submitted 6 July, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Fetal Ultrasound, Umbilical Artery, Doppler Ultrasound

Journal ref: ASMUS 2023. Simplifying Medical Ultrasound pp 13-22. Lecture Notes in Computer Science, vol 14337

arXiv:2303.15850 [pdf, other]

That Label's Got Style: Handling Label Style Bias for Uncertain Image Segmentation

Authors: Kilian Zepf, Eike Petersen, Jes Frellsen, Aasa Feragen

Abstract: Segmentation uncertainty models predict a distribution over plausible segmentations for a given input, which they learn from the annotator variation in the training set. However, in practice these annotations can differ systematically in the way they are generated, for example through the use of different labeling tools. This results in datasets that contain both data variability and differing label styles. In this paper, we demonstrate that applying state-of-the-art segmentation uncertainty models on such datasets can lead to model bias caused by the different label styles. We present an updated modelling objective conditioning on labeling style for aleatoric uncertainty estimation, and modify two state-of-the-art architectures for segmentation uncertainty accordingly. We show with extensive experiments that this method reduces label style bias, while improving segmentation performance, increasing the applicability of segmentation uncertainty models in the wild. We curate two datasets, with annotations in different label styles, which we will make publicly available along with our code upon publication.

Submitted 28 March, 2023; originally announced March 2023.

arXiv:2303.13918 [pdf, other]

Removing confounding information from fetal ultrasound images

Authors: Kamil Mikolaj, Manxi Lin, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Tolsgaard, Anders Nymark, Aasa Feragen

Abstract: Confounding information in the form of text or markings embedded in medical images can severely affect the training of diagnostic deep learning algorithms. However, data collected for clinical purposes often have such markings embedded in them. In dermatology, known examples include drawings or rulers that are overrepresented in images of malignant lesions. In this paper, we encounter text and calipers placed on the images found in national databases containing fetal screening ultrasound scans, which correlate with standard planes to be predicted. In order to utilize the vast amounts of data available in these databases, we develop and validate a series of methods for minimizing the confounding effects of embedded text and calipers on deep learning algorithms designed for ultrasound, using standard plane classification as a test case.

Submitted 24 March, 2023; originally announced March 2023.

Comments: Fetal ultrasound, confounders, shortcut learning

arXiv:2303.13123 [pdf, other]

Laplacian Segmentation Networks: Improved Epistemic Uncertainty from Spatial Aleatoric Uncertainty

Authors: Kilian Zepf, Selma Wanna, Marco Miani, Juston Moore, Jes Frellsen, Søren Hauberg, Aasa Feragen, Frederik Warburg

Abstract: Out of distribution (OOD) medical images are frequently encountered, e.g. because of site- or scanner differences, or image corruption. OOD images come with a risk of incorrect image segmentation, potentially negatively affecting downstream diagnoses or treatment. To ensure robustness to such incorrect segmentations, we propose Laplacian Segmentation Networks (LSN) that jointly model epistemic (model) and aleatoric (data) uncertainty in image segmentation. We capture data uncertainty with a spatially correlated logit distribution. For model uncertainty, we propose the first Laplace approximation of the weight posterior that scales to large neural networks with skip connections that have high-dimensional outputs. Empirically, we demonstrate that modelling spatial pixel correlation allows the Laplacian Segmentation Network to successfully assign high epistemic uncertainty to out-of-distribution objects appearing within images.

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2302.08851 [pdf, other]

On (assessing) the fairness of risk score models

Authors: Eike Petersen, Melanie Ganz, Sune Hannibal Holm, Aasa Feragen

Abstract: Recent work on algorithmic fairness has largely focused on the fairness of discrete decisions, or classifications. While such decisions are often based on risk score models, the fairness of the risk models themselves has received considerably less attention. Risk models are of interest for a number of reasons, including the fact that they communicate uncertainty about the potential outcomes to users, thus representing a way to enable meaningful human oversight. Here, we address fairness desiderata for risk score models. We identify the provision of similar epistemic value to different groups as a key desideratum for risk score fairness. Further, we address how to assess the fairness of risk score models quantitatively, including a discussion of metric choices and meaningful statistical comparisons between groups. In this context, we also introduce a novel calibration error metric that is less sample size-biased than previously proposed metrics, enabling meaningful comparisons between groups of different sizes. We illustrate our methodology - which is widely applicable in many other settings - in two case studies, one in recidivism risk prediction, and one in risk of major depressive disorder (MDD) prediction.

Submitted 22 February, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

MSC Class: 91B32; 62-XX; 91Gxx; 68T07 ACM Class: K.4.1; K.4.2; J.3; I.5; G.3

arXiv:2211.10630 [pdf, other]

I saw, I conceived, I concluded: Progressive Concepts as Bottlenecks

Authors: Manxi Lin, Aasa Feragen, Zahra Bashir, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen

Abstract: Concept bottleneck models (CBMs) include a bottleneck of human-interpretable concepts providing explainability and intervention during inference by correcting the predicted, intermediate concepts. This makes CBMs attractive for high-stakes decision-making. In this paper, we take the quality assessment of fetal ultrasound scans as a real-life use case for CBM decision support in healthcare. For this case, simple binary concepts are not sufficiently reliable, as they are mapped directly from images of highly variable quality, for which variable model calibration might lead to unstable binarized concepts. Moreover, scalar concepts do not provide the intuitive spatial feedback requested by users. To address this, we design a hierarchical CBM imitating the sequential expert decision-making process of "seeing", "conceiving" and "concluding". Our model first passes through a layer of visual, segmentation-based concepts, and next a second layer of property concepts directly associated with the decision-making task. We note that experts can intervene on both the visual and property concepts during inference. Additionally, we increase the bottleneck capacity by considering task-relevant concept interaction. Our application of ultrasound scan quality assessment is challenging, as it relies on balancing the (often poor) image quality against an assessment of the visibility and geometric properties of standardized image content.
Our validation shows that -- in contrast with previous CBM models -- our CBM models actually outperform equivalent concept-free models in terms of predictive performance. Moreover, we illustrate how interventions can further improve our performance over the state-of-the-art.

Submitted 19 November, 2022; originally announced November 2022.

arXiv:2205.11115 [pdf, other]

DTU-Net: Learning Topological Similarity for Curvilinear Structure Segmentation

Authors: Manxi Lin, Zahra Bashir, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: Curvilinear structure segmentation is important in medical imaging, quantifying structures such as vessels, airways, neurons, or organ boundaries in 2D slices. Segmentation via pixel-wise classification often fails to capture the small and low-contrast curvilinear structures. Prior topological information is typically used to address this problem, often at an expensive computational cost, and sometimes requiring prior knowledge of the expected topology. We present DTU-Net, a data-driven approach to topology-preserving curvilinear structure segmentation. DTU-Net consists of two sequential, lightweight U-Nets, dedicated to texture and topology, respectively. While the texture net makes a coarse prediction using image texture information, the topology net learns topological information from the coarse prediction by employing a triplet loss trained to recognize false and missed splits in the structure. We conduct experiments on a challenging multi-class ultrasound scan segmentation dataset as well as a well-known retinal imaging dataset. Results show that our model outperforms existing approaches in both pixel-wise segmentation accuracy and topological continuity, with no need for prior topological knowledge.

Submitted 4 March, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: 12 pages, 4 figures

arXiv:2204.01737 [pdf, other]

Feature robustness and sex differences in medical imaging: a case study in MRI-based Alzheimer's disease detection

Authors: Eike Petersen, Aasa Feragen, Maria Luise da Costa Zemsch, Anders Henriksen, Oskar Eiler Wiese Christensen, Melanie Ganz

Abstract: Convolutional neural networks have enabled significant improvements in medical image-based diagnosis. It is, however, increasingly clear that these models are susceptible to performance degradation when facing spurious correlations and dataset shift, leading, e.g., to underperformance on underrepresented patient groups. In this paper, we compare two classification schemes on the ADNI MRI dataset: a simple logistic regression model using manually selected volumetric features, and a convolutional neural network trained on 3D MRI data. We assess the robustness of the trained models in the face of varying dataset splits, training set sex composition, and stage of disease. In contrast to earlier work in other imaging modalities, we do not observe a clear pattern of improved model performance for the majority group in the training dataset. Instead, while logistic regression is fully robust to dataset composition, we find that CNN performance is generally improved for both male and female subjects when including more female subjects in the training dataset. We hypothesize that this might be due to inherent differences in the pathology of the two sexes. Moreover, in our analysis, the logistic regression model outperforms the 3D CNN, emphasizing the utility of manual feature specification based on prior knowledge, and the need for more robust automatic feature selection.

Submitted 14 July, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: Accepted for presentation at MICCAI 2022

arXiv:2111.14658 [pdf, other]

diffConv: Analyzing Irregular Point Clouds with an Irregular View

Authors: Manxi Lin, Aasa Feragen

Abstract: Standard spatial convolutions assume input data with a regular neighborhood structure. Existing methods typically generalize convolution to the irregular point cloud domain by fixing a regular "view" through e.g. a fixed neighborhood size, where the convolution kernel size remains the same for each point. However, since point clouds are not as structured as images, the fixed neighbor number gives an unfortunate inductive bias. We present a novel graph convolution named Difference Graph Convolution (diffConv), which does not rely on a regular view. diffConv operates on spatially-varying and density-dilated neighborhoods, which are further adapted by a learned masked attention mechanism. Experiments show that our model is very robust to noise, obtaining state-of-the-art performance in 3D shape classification and scene understanding tasks, along with a faster inference speed.

Submitted 12 July, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: Accepted by ECCV 2022

arXiv:2106.08233 [pdf, other]

Spot the Difference: Detection of Topological Changes via Geometric Alignment

Authors: Steffen Czolbe, Aasa Feragen, Oswin Krause

Abstract: Geometric alignment appears in a variety of applications, ranging from domain adaptation, optimal transport, and normalizing flows in machine learning; optical flow and learned augmentation in computer vision and deformable registration within biomedical imaging. A recurring challenge is the alignment of domains whose topology is not the same; a problem that is routinely ignored, potentially introducing bias in downstream analysis. As a first step towards solving such alignment problems, we propose an unsupervised algorithm for the detection of changes in image topology. The model is based on a conditional variational auto-encoder and detects topological changes between two images during the registration step. We account for both topological changes in the image under spatial variation and unexpected transformations. Our approach is validated on two tasks and datasets: detection of topological changes in microscopy images of cells, and unsupervised anomaly detection in brain imaging.

Submitted 26 October, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: Accepted to 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Camera-ready version. code repository: https://github.com/SteffenCzolbe/TopologicalChangeDetection

arXiv:2106.03236 [pdf, other]

Graph2Graph Learning with Conditional Autoregressive Models

Authors: Guan Wang, Francois Bernard Lauze, Aasa Feragen

Abstract: We present a graph neural network model for solving graph-to-graph learning problems. Most deep learning on graphs considers "simple" problems such as graph classification or regressing real-valued graph properties. For such tasks, the main requirement for intermediate representations of the data is to maintain the structure needed for output, i.e., keeping classes separated or maintaining the order indicated by the regressor. However, a number of learning tasks, such as regressing graph-valued output, generative models, or graph autoencoders, aim to predict a graph-structured output. In order to successfully do this, the learned representations need to preserve far more structure. We present a conditional auto-regressive model for graph-to-graph learning and illustrate its representational capabilities via experiments on challenging subgraph predictions from graph algorithmics; as a graph autoencoder for reconstruction and visualization; and on pretraining representations that allow graph classification with limited labeled data.

Submitted 6 June, 2021; originally announced June 2021.

arXiv:2105.09737 [pdf, other]

doi 10.59275/j.melba.2022-4bf2

Quantifying Topology In Pancreatic Tubular Networks From Live Imaging 3D Microscopy

Authors: Kasra Arnavaz, Oswin Krause, Kilian Zepf, Jelena M. Krivokapic, Silja Heilmann, Jakob Andreas Bærentzen, Pia Nyeng, Aasa Feragen

Abstract: Motivated by the challenging segmentation task of pancreatic tubular networks, this paper tackles two commonly encountered problems in biomedical imaging: Topological consistency of the segmentation, and expensive or difficult annotation. Our contributions are the following: a) We propose a topological score which measures both topological and geometric consistency between the predicted and ground truth segmentations, applied to model selection and validation. b) We provide a full deep-learning methodology for this difficult noisy task on time-series image data. In our method, we first use a semi-supervised U-net architecture, applicable to generic segmentation tasks, which jointly trains an autoencoder and a segmentation network. We then use tracking of loops over time to further improve the predicted topology. This semi-supervised approach allows us to utilize unannotated data to learn feature representations that generalize to test data with high variability, in spite of our annotated training data having very limited variation. Our contributions are validated on a challenging segmentation task, locating tubular structures in the fetal pancreas from noisy live imaging confocal microscopy. We show that our semi-supervised model outperforms not only fully supervised and pre-trained models but also an approach which takes topological consistency into account during training. Further, our approach achieves a mean loop score of 0.808 for detecting loops in the fetal pancreas, compared to a U-net trained with clDice with mean loop score 0.762.

Submitted 4 July, 2022; v1 submitted 20 May, 2021; originally announced May 2021.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://www.melba-journal.org/papers/2022:015.html

ACM Class: I.4.6

arXiv:2104.10051 [pdf, other]

Semantic similarity metrics for learned image registration

Authors: Steffen Czolbe, Oswin Krause, Aasa Feragen

Abstract: We propose a semantic similarity metric for image registration. Existing metrics like Euclidean Distance or Normalized Cross-Correlation focus on aligning intensity values, giving difficulties with low intensity contrast or noise. Our approach learns dataset-specific features that drive the optimization of a learning-based registration model. We train both an unsupervised approach using an auto-encoder, and a semi-supervised approach using supplemental segmentation data to extract semantic features for image registration. Comparing to existing methods across multiple image modalities and applications, we achieve consistently high registration accuracy. A learned invariance to noise gives smoother transformations on low-quality images.

Submitted 20 April, 2021; originally announced April 2021.

Comments: Published at MIDL 2021 (Oral). Reviews and discussion on Open Review: https://openreview.net/forum?id=9M5cH--UdcC. arXiv admin note: text overlap with arXiv:2011.05735

arXiv:2103.16265 [pdf, other]

Is segmentation uncertainty useful?

Authors: Steffen Czolbe, Kasra Arnavaz, Oswin Krause, Aasa Feragen

Abstract: Probabilistic image segmentation encodes varying prediction confidence and inherent ambiguity in the segmentation problem. While different probabilistic segmentation models are designed to capture different aspects of segmentation uncertainty and ambiguity, these modelling differences are rarely discussed in the context of applications of uncertainty. We consider two common use cases of segmentation uncertainty, namely assessment of segmentation quality and active learning. We consider four established strategies for probabilistic segmentation, discuss their modelling capabilities, and investigate their performance in these two tasks. We find that for all models and both tasks, returned uncertainty correlates positively with segmentation error, but does not prove to be useful for active learning.

Submitted 30 March, 2021; originally announced March 2021.

Comments: Published at Information Processing in Medical Imaging (IPMI) 2021

arXiv:2011.05735 [pdf, other]

DeepSim: Semantic similarity metrics for learned image registration

Authors: Steffen Czolbe, Oswin Krause, Aasa Feragen

Abstract: We propose a semantic similarity metric for image registration. Existing metrics like Euclidean distance or normalized cross-correlation focus on aligning intensity values, giving difficulties with low intensity contrast or noise. Our semantic approach learns dataset-specific features that drive the optimization of a learning-based registration model. Comparing to existing unsupervised and supervised methods across multiple image modalities and applications, we achieve consistently high registration accuracy and faster convergence than state of the art, and the learned invariance to noise gives smoother transformations on low-quality images.

Submitted 11 November, 2020; originally announced November 2020.

Comments: Talk given at Medical Imaging Meets NeurIPS, NeurIPS 2020 workshop. Extended Abstract

arXiv:1907.08612

Medical Imaging with Deep Learning: MIDL 2019 -- Extended Abstract Track

Authors: M. Jorge Cardoso, Aasa Feragen, Ben Glocker, Ender Konukoglu, Ipek Oguz, Gozde Unal, Tom Vercauteren

Abstract: This compendium gathers all the accepted extended abstracts from the Second International Conference on Medical Imaging with Deep Learning (MIDL 2019), held in London, UK, 8-10 July 2019. Note that only accepted extended abstracts are listed here; the Proceedings of the MIDL 2019 Full Paper Track are published as Volume 102 of the Proceedings of Machine Learning Research (PMLR) http://proceedings.mlr.press/v102/.

Submitted 22 July, 2019; v1 submitted 21 May, 2019; originally announced July 2019.

Comments: Accepted extended abstracts can also be found at https://openreview.net/group?id=MIDL.io/2019/Conference#abstract-accept-papers

arXiv:1902.08959 [pdf, other]

A Formalization of The Natural Gradient Method for General Similarity Measures

Authors: Anton Mallasto, Tom Dela Haije, Aasa Feragen

Abstract: In optimization, the natural gradient method is well-known for likelihood maximization. The method uses the Kullback-Leibler divergence, corresponding infinitesimally to the Fisher-Rao metric, which is pulled back to the parameter space of a family of probability distributions. This way, gradients with respect to the parameters respect the Fisher-Rao geometry of the space of distributions, which might differ vastly from the standard Euclidean geometry of the parameter space, often leading to faster convergence. However, when minimizing an arbitrary similarity measure between distributions, it is generally unclear which metric to use. We provide a general framework that, given a similarity measure, derives a metric for the natural gradient. We then discuss connections between the natural gradient method and multiple other optimization techniques in the literature. Finally, we provide computations of the formal natural gradient to show overlap with well-known cases and to compute natural gradients in novel frameworks.

Submitted 24 February, 2019; originally announced February 2019.

arXiv:1902.03642 [pdf, other]

(q,p)-Wasserstein GANs: Comparing Ground Metrics for Wasserstein GANs

Authors: Anton Mallasto, Jes Frellsen, Wouter Boomsma, Aasa Feragen

Abstract: Generative Adversarial Networks (GANs) have made a major impact in computer vision and machine learning as generative models. Wasserstein GANs (WGANs) brought Optimal Transport (OT) theory into GANs, by minimizing the $1$-Wasserstein distance between model and data distributions as their objective function. Since then, WGANs have gained considerable interest due to their stability and theoretical framework. We contribute to the WGAN literature by introducing the family of $(q,p)$-Wasserstein GANs, which allow the use of more general $p$-Wasserstein metrics for $p\geq 1$ in the GAN learning procedure. While the method is able to incorporate any cost function as the ground metric, we focus on studying the $l^q$ metrics for $q\geq 1$. This is a notable generalization as in the WGAN literature the OT distances are commonly based on the $l^2$ ground metric. We demonstrate the effect of different $p$-Wasserstein distances in two toy examples. Furthermore, we show that the ground metric does make a difference, by comparing different $(q,p)$ pairs on the MNIST and CIFAR-10 datasets. Our experiments demonstrate that changing the ground metric and $p$ can notably improve on the common $(q,p) = (2,1)$ case.

Submitted 10 February, 2019; originally announced February 2019.

arXiv:1806.11377 [pdf, other]

Learning from graphs with structural variation

Authors: Rune Kok Nielsen, Andreas Nugaard Holm, Aasa Feragen

Abstract: We study the effect of structural variation in graph data on the predictive performance of graph kernels. To this end, we introduce a novel, noise-robust adaptation of the GraphHopper kernel and validate it on benchmark data, obtaining modestly improved predictive performance on a range of datasets. Next, we investigate the performance of the state-of-the-art Weisfeiler-Lehman graph kernel under increasing synthetic structural errors and find that the effect of introducing errors depends strongly on the dataset.

Submitted 29 June, 2018; originally announced June 2018.

Comments: Presented at the NIPS 2017 workshop "Learning on Distributions, Functions, Graphs and Groups"

arXiv:1805.09122 [pdf, other]

Probabilistic Riemannian submanifold learning with wrapped Gaussian process latent variable models

Authors: Anton Mallasto, Søren Hauberg, Aasa Feragen

Abstract: Latent variable models (LVMs) learn probabilistic models of data manifolds lying in an ambient Euclidean space. In a number of applications, a priori known spatial constraints can shrink the ambient space into a considerably smaller manifold. Additionally, in these applications the Euclidean geometry might induce a suboptimal similarity measure, which could be improved by choosing a different metric. Euclidean models ignore such information and assign probability mass to data points that can never appear as data, and vastly different likelihoods to points that are similar under the desired metric. We propose the wrapped Gaussian process latent variable model (WGPLVM), which extends Gaussian process latent variable models to take values strictly on a given ambient Riemannian manifold, making the model blind to impossible data points. This allows non-linear, probabilistic inference of low-dimensional Riemannian submanifolds from data. Our evaluation on diverse datasets shows that we improve performance on several tasks, including encoding, visualization and uncertainty quantification.

Submitted 24 February, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

arXiv:1411.0296 [pdf, other]

Geodesic Exponential Kernels: When Curvature and Linearity Conflict

Authors: Aasa Feragen, Francois Lauze, Søren Hauberg

Abstract: We consider kernel methods on general geodesic metric spaces and provide both negative and positive results. First we show that the common Gaussian kernel can only be generalized to a positive definite kernel on a geodesic metric space if the space is flat. As a result, for data on a Riemannian manifold, the geodesic Gaussian kernel is only positive definite if the Riemannian manifold is Euclidean. This implies that any attempt to design geodesic Gaussian kernels on curved Riemannian manifolds is futile. However, we show that for spaces with conditionally negative definite distances the geodesic Laplacian kernel can be generalized while retaining positive definiteness. This implies that geodesic Laplacian kernels can be generalized to some curved spaces, including spheres and hyperbolic spaces. Our theoretical results are verified empirically.

Submitted 17 November, 2014; v1 submitted 2 November, 2014; originally announced November 2014.

Comments: 13 pages

arXiv:1410.2466 [pdf, other]

Quantification and visualization of variation in anatomical trees

Authors: Nina Amenta, Manasi Datar, Asger Dirksen, Marleen de Bruijne, Aasa Feragen, Xiaoyin Ge, Jesper Holst Pedersen, Marylesa Howard, Megan Owen, Jens Petersen, Jie Shi, Qiuping Xu

Abstract: This paper presents two approaches to quantifying and visualizing variation in datasets of trees. The first approach localizes subtrees in which significant population differences are found through hypothesis testing and sparse classifiers on subtree features. The second approach visualizes the global metric structure of datasets through low-distortion embedding into hyperbolic planes in the style of multidimensional scaling. A case study is made on a dataset of airway trees in relation to Chronic Obstructive Pulmonary Disease.

Submitted 9 October, 2014; originally announced October 2014.

Comments: 22 pages

MSC Class: 62H25; 62H35

arXiv:1303.7390 [pdf, ps, other]

Geometric tree kernels: Classification of COPD from airway tree geometry

Authors: Aasa Feragen, Jens Petersen, Dominik Grimm, Asger Dirksen, Jesper Holst Pedersen, Karsten Borgwardt, Marleen de Bruijne

Abstract: Methodological contributions: This paper introduces a family of kernels for analyzing (anatomical) trees endowed with vector-valued measurements made along the tree. While state-of-the-art graph and tree kernels use combinatorial tree/graph structure with discrete node and edge labels, the kernels presented in this paper can include geometric information such as branch shape, branch radius or other vector-valued properties. In addition to being flexible in their ability to model different types of attributes, the presented kernels are computationally efficient and some of them can easily be computed for large datasets (N of the order 10,000) of trees with 30-600 branches. Combining the kernels with standard machine learning tools enables us to analyze the relation between disease and anatomical tree structure and geometry. Experimental results: The kernels are used to compare airway trees segmented from low-dose CT, endowed with branch shape descriptors and airway wall area percentage measurements made along the tree. Using kernelized hypothesis testing we show that the geometric airway trees are significantly differently distributed in patients with Chronic Obstructive Pulmonary Disease (COPD) than in healthy individuals. The geometric tree kernels also give a significant increase in the classification accuracy of COPD from geometric tree structure endowed with airway wall thickness measurements in comparison with state-of-the-art methods, giving further insight into the relationship between airway wall thickness and COPD.
Software: Software for computing kernels and statistical tests is available at http://image.diku.dk/aasa/software.php.

Submitted 8 April, 2013; v1 submitted 29 March, 2013; originally announced March 2013.

Comments: 12 pages

MSC Class: 68T10

arXiv:1207.5371 [pdf, other]

Towards a theory of statistical tree-shape analysis

Authors: Aasa Feragen, Pechin Lo, Marleen de Bruijne, Mads Nielsen, Francois Lauze

Abstract: In order to develop statistical methods for shapes with a tree-structure, we construct a shape space framework for tree-like shapes and study metrics on the shape space. This shape space has singularities, corresponding to topological transitions in the represented trees. We study two closely related metrics on the shape space, TED and QED. QED is a quotient Euclidean distance arising naturally from the shape space formulation, while TED is the classical tree edit distance. Using Gromov's metric geometry we gain new insight into the geometries defined by TED and QED. We show that the new metric QED has nice geometric properties which facilitate statistical analysis, such as existence and local uniqueness of geodesics and averages. TED, on the other hand, does not share the geometric advantages of QED, but has nice algorithmic properties. We provide a theoretical framework and experimental results on synthetic data trees as well as airway trees from pulmonary CT scans. This way, we effectively illustrate that our framework has both the theoretical and qualitative properties necessary to build a theory of statistical tree-shape analysis.

Submitted 23 July, 2012; originally announced July 2012.

Comments: 36 pages, 15 figures

MSC Class: 62H35; 05C05; 51Fxx; 58A35

arXiv:1110.1981 [pdf, other]

The structure of groups of multigerm equivalences

Authors: Aasa Feragen, Andrew du Plessis

Abstract: We study the structure of classical groups of equivalences for smooth multigerms $f \colon (N,S) \to (P,y)$, and extend several known results for monogerm equivalences to the case of multigerms. In particular, we study the group $\A$ of source- and target diffeomorphism germs, and its stabilizer $\A_f$. For monogerms $f$ it is well-known that if $f$ is finitely $\A$-determined, then $\A_f$ has a maximal compact subgroup $MC(\A_f)$, unique up to conjugacy, and $\A_f/MC(\A_f)$ is contractible. We prove the same result for finitely $\A$-determined multigerms $f$. Moreover, we show that for a ministable multigerm $f$, the maximal compact subgroup $MC(\A_f)$ decomposes as a product of maximal compact subgroups $MC(\A_{g_i})$ for suitable representatives $g_i$ of the monogerm components of $f$. We study a product decomposition of $MC(\A_f)$ in terms of $MC(\mathscr{R}_f)$ and a group of target diffeomorphisms, and conjecture a decomposition theorem. Finally, we show that for a large class of maps, maximal compact subgroups are small and easy to compute.

Submitted 10 October, 2011; originally announced October 2011.

Comments: 24 pages, 1 figure

MSC Class: 58K70; 22F50

arXiv:1011.4145 [pdf, ps, other]

A short and elementary proof of Hanner's theorem

Authors: Aasa Feragen

Abstract: Hanner's theorem is a classical theorem in the theory of retracts and extensors in topological spaces, which states that a local ANE is an ANE. While Hanner's original proof of the theorem is quite simple for separable spaces, it is rather involved for the general case. We provide a proof which is not only short, but also elementary, relying only on well-known classical point-set topology.

Submitted 18 November, 2010; originally announced November 2010.

Comments: 2 pages

MSC Class: 54C55; 55M15; 54C20

arXiv:math/0611239 [pdf, ps, other]

Equivariant embedding of metrizable $G$-spaces in linear $G$-spaces

Authors: Aasa Feragen

Abstract: Given a Lie group $G$ we study the class $\M$ of proper metrizable $G$-spaces with metrizable orbit spaces, and show that any $G$-space $X \in \M$ admits a closed $G$-embedding into a convex $G$-subset $C$ of some locally convex linear $G$-space, such that $X$ has some $G$-neighborhood in $C$ which belongs to the class $\M$. As corollaries we see that any $G$-ANE for $\M$ has the $G$-homotopy type of some $G$-CW complex and that any $G$-ANR for $\M$ is a $G$-ANE for $\M$.

Submitted 8 November, 2006; originally announced November 2006.

Comments: 10 pages

MSC Class: 57S20

Showing 1–39 of 39 results for author: Feragen, A