How good nnU-Net for Segmenting Cardiac MRI: A Comprehensive Evaluation

Malitha Gunawardhana, Fangqiang Xu, Jichao Zhao

Auckland Bioengineering Institute, University of Auckland, New Zealand

Abstract

Cardiac segmentation is a critical task in medical imaging, essential for detailed analysis of heart structures, which is crucial for diagnosing and treating various cardiovascular diseases. With the advent of deep learning, automated segmentation techniques have demonstrated remarkable progress, achieving high accuracy and efficiency compared to traditional manual methods. Among these techniques, the nnU-Net framework stands out as a robust and versatile tool for medical image segmentation. In this study, we evaluate the performance of nnU-Net in segmenting cardiac magnetic resonance images (MRIs). Utilizing five cardiac segmentation datasets, we employ various nnU-Net configurations, including 2D, 3D full resolution, 3D low resolution, 3D cascade, and ensemble models. Our study benchmarks the capabilities of these configurations and examines the necessity of developing new models for specific cardiac segmentation tasks.

Keywords: MRI, Segmentation, nnU-Net, Benchmark

1 Introduction

Cardiovascular diseases (CVDs) accounted for an estimated 19.05 million deaths globally in 2020, reflecting an 18.71% increase from 2010. Despite this rise, the age-standardized death rate decreased by 12.19%, reaching 239.80 per 100,000 population. Additionally, the total crude prevalence of CVD worldwide reached 607.64 million cases in 2020, marking a 29.01% increase compared to 2010 (Tsao et al., 2023). These statistics underscore the urgent need for advanced diagnostic and therapeutic approaches in cardiology.

Accurate segmentation of cardiac structures is essential for understanding heart function, planning interventions, and monitoring disease progression. For example, locating and quantifying fibrosis and scars has been shown to be valuable for treatment stratification of patients with atrial fibrillation (AF) (Allessie et al., 2002; Boldt et al., 2004) and ventricular tachycardia (Ukwatta et al., 2015). These techniques provide critical guidance for surgical or ablation procedures (Vergara and Marrouche, 2011), and imaging of post-ablation scars offers valuable insights into treatment outcomes (Peters et al., 2007).

Cardiac segmentation involves the precise delineation of key anatomical structures within the heart, including the myocardium, ventricles, atria, and major vessels. In particular, Late Gadolinium Enhancement Magnetic Resonance Imaging (LGE-MRI) has emerged as an invaluable technique in cardiac imaging. LGE-MRI excels in highlighting areas of myocardial scarring and fibrosis, which are critical indicators in the diagnosis and management of various cardiac conditions, including myocardial infarction, cardiomyopathies, and arrhythmias (Akkaya et al., 2013; Bisbal et al., 2014).

Historically, manual segmentation by expert radiologists and cardiologists has been considered the gold standard for cardiac image analysis. However, this method is hindered by significant limitations, particularly its time-consuming nature, often requiring hours for a single dataset, making it impractical for busy clinical settings (Tobon-Gomez et al., 2015). The advent of automated segmentation methods, especially those utilizing deep learning techniques, has transformed the field of cardiac imaging analysis. These methods offer substantial advantages over traditional manual approaches, including consistency and reproducibility by eliminating inter-observer variability, rapid analysis with deep learning models capable of segmenting cardiac structures within seconds, scalability for application to large datasets, and the potential for continuous improvement as models can be fine-tuned and updated with increasing data availability.

Over the past decade, numerous approaches have been developed for automated cardiac segmentation, each with its own strengths and limitations. To improve segmentation accuracy and robustness, these methods have explored strategies such as exploiting uncertainty (Yang et al., 2019; Arega et al., 2022), semi-supervised learning (Shi et al., 2024; Mazher et al., 2022), curriculum learning (Jiang et al., 2022), and multi-task learning (Chen et al., 2019).

Despite these advancements, there remains a notable gap in the literature regarding the comprehensive evaluation of one particular architecture that has shown remarkable success in medical image segmentation across various domains: the nnU-Net (no-new-Net) (Isensee et al., 2021). The nnU-Net is a self-configuring method based on the U-Net architecture that automatically adapts preprocessing, network architecture, training, and post-processing to the specifics of a given dataset. While nnU-Net has demonstrated state-of-the-art performance in numerous biomedical segmentation challenges (Isensee et al., 2024), its potential in the specific context of cardiac segmentation has not been thoroughly explored. This presents a significant research opportunity, as cardiac MRI poses unique challenges due to its high contrast between normal and scarred myocardium, potential artefacts, and variability in image quality across different scanners and institutions.

In this study, we aim to bridge this knowledge gap by conducting a comprehensive analysis of nnU-Net’s performance in segmenting cardiac MRI. We use five widely used datasets for this task: the LAScarQS 2022 dataset (Zhuang et al., 2023), the 2018 LASC dataset (Xiong et al., 2021), the ACDC dataset (Bernard et al., 2018), and the MnM (Campello et al., 2021) and MnM2 (Martín-Isla et al., 2023) datasets. To the best of our knowledge, this is the first study to focus exclusively on this combination of methodology and imaging modality. Through this analysis, we aim to provide the medical imaging community with valuable insights into the capabilities and limitations of nnU-Net for segmenting cardiac MRI. Our findings could influence future directions in algorithm development, clinical adoption of automated segmentation tools, and standardization efforts in cardiac imaging analysis.

The remainder of this paper is organized as follows: Section 2 provides a detailed background on the nnU-Net architecture. Section 3 describes our methodology, including dataset preparation, experimental setup, and evaluation metrics. Section 4 presents our results and analysis. Section 5 discusses the implications of our findings, the limitations of the study, and future research directions.

2 nnU-Net Architecture

The nnU-Net framework is specifically designed for semantic segmentation, capable of handling both 2D and 3D images with various input modalities or channels. It adeptly processes voxel spacings and anisotropies and exhibits robustness even in scenarios where class distributions are highly imbalanced. Utilizing supervised learning, nnU-Net necessitates the provision of annotated training cases tailored to the application at hand. The quantity of required training cases can vary significantly depending on the complexity of the segmentation task, though nnU-Net often requires fewer cases than other solutions due to its extensive data augmentation strategies.

nnU-Net expects to process entire images at once during both the preprocessing and postprocessing stages, which makes it unsuitable for exceedingly large images. Nevertheless, it has been successfully tested on 3D images ranging from 40x40x40 up to 1500x1500x1500 voxels, and on 2D images from 40x40 up to approximately 30000x30000 pixels. The capacity for handling larger images depends on the available RAM.

When presented with a new dataset, nnU-Net systematically analyzes the provided training cases to generate a ’dataset fingerprint’. Based on this analysis, it constructs several U-Net configurations tailored to the dataset:

• 2D U-Net: Applicable to both 2D and 3D datasets.
• 3D Full Resolution U-Net: Operates on high-resolution images and is intended for 3D datasets.
• 3D Low Resolution U-Net: Operates on low-resolution images.
• 3D Cascade Full Resolution U-Net: A 3D U-Net cascade in which an initial low-resolution 3D U-Net produces predictions that a subsequent high-resolution 3D U-Net refines. This configuration is applied to large 3D datasets.

For datasets with smaller image sizes, the U-Net cascade (and thus the 3D low-resolution configuration) is excluded, as the patch size of the full-resolution U-Net is sufficient to cover a significant portion of the input images. The configuration of nnU-Net’s segmentation pipelines is based on a three-step approach:

• Fixed Parameters: These parameters remain constant and are not adapted. Through the development of nnU-Net, a robust configuration was identified that includes the loss function, most data augmentation strategies, and the learning rate.
• Rule-Based Parameters: These parameters are adjusted based on the dataset fingerprint using heuristic rules. For instance, network topology, which includes pooling behaviour and network depth, is adapted to the patch size. The patch size, network topology, and batch size are optimized jointly, considering GPU memory constraints.
• Empirical Parameters: These parameters are determined through trial and error. This involves selecting the most suitable U-Net configuration for the dataset (2D, 3D full resolution, 3D low resolution, 3D cascade) and optimizing the postprocessing strategy.

nnU-Net’s systematic approach to configuring segmentation pipelines based on dataset-specific characteristics and robust default settings makes it a versatile and powerful tool for semantic segmentation tasks.

3 Experiment

3.1 Datasets

Figure 1: Visualization of the long-axis and short-axis views in both the end-diastole and end-systole phases for the MnM2 dataset. The right ventricle (RV) is highlighted in white, the myocardium (MYO) in yellow, and the left ventricle (LV) in red.

Table 1: Summary of Cardiac MRI Datasets

Dataset	Task	Labels	Training	Testing
LAScarQS	Task 1	LA cavity, scars	50	10
LAScarQS	Task 2	LA cavity	130	20
LASC	-	LA cavity	100	54
ACDC	End-Diastole	LV, MYO, RV	100	50
ACDC	End-Systole	LV, MYO, RV	100	50
MnM-1	End-Diastole	LV, MYO, RV	150	136
MnM-1	End-Systole	LV, MYO, RV	150	136
MnM-2	Short Axis, End-Diastole	LV, MYO, RV	200	160
MnM-2	Short Axis, End-Systole	LV, MYO, RV	200	160
MnM-2	Long Axis, End-Diastole	LV, MYO, RV	200	160
MnM-2	Long Axis, End-Systole	LV, MYO, RV	200	160

In this study, we utilized five datasets: the Left Atrial and Scar Quantification and Segmentation Challenge (LAScarQS) 2022 dataset (Zhuang et al., 2023), the 2018 Atria Segmentation Challenge (LASC) dataset (Xiong et al., 2021), the Automated Cardiac Diagnosis Challenge (ACDC) 2017 dataset (Bernard et al., 2018), and the Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Image Segmentation Challenge (MnM) (Campello et al., 2021) and MnM2 (Martín-Isla et al., 2023) datasets.

3.1.1 LAScarQS Challenge Dataset

The LAScarQS challenge comprises two tasks. The first involves segmenting the left atrium (LA) cavity and scars, while the second focuses solely on segmenting the LA cavity. For Task 1, the dataset includes 60 training images with corresponding labels and 10 validation images without labels. Task 2 provides 130 training images with labels and 20 validation images without labels. Because the validation images are unlabelled, only the training data can be used for both training and testing. For Task 1, we allocated 50 images for training and the remaining 10 for testing. For Task 2, we used 115 images for training and 15 for testing.

The LGE-MRIs in this challenge were sourced from the University of Utah, Beth Israel Deaconess Medical Center, and King’s College London. The scans were performed using Siemens Avanto 1.5 T, Siemens Verio 3 T, or Philips Achieva 1.5 T MRI machines. Scans were acquired either free-breathing with navigator-gating or using navigator-gating with fat suppression. The spatial resolution of the scans was 1.25 × 1.25 × 2.5 mm, 1.4 × 1.4 × 1.4 mm, or 1.3 × 1.3 × 4.0 mm. Patients underwent MRI scans either before ablation or between one and six months post-ablation.

3.1.2 2018 Left Atria Segmentation Challenge (LASC) Dataset

The 2018 Left Atria Segmentation Challenge (LASC) concentrated on the segmentation of the LA cavity. The dataset included 100 training images and 54 testing images, all provided with 3D binary masks of the LA cavity. Each LGE-MRI scan had a spatial resolution of 0.625 × 0.625 × 0.625 mm³, with spatial dimensions of either 576 × 576 × 88 or 640 × 640 × 88 voxels. These clinical images were obtained using either a 1.5 Tesla Avanto or a 3.0 Tesla Verio whole-body scanner (Siemens Medical Solutions, Erlangen, Germany). The LA cavity volumes were segmented in consensus by three trained observers, providing high-quality ground truth annotations for both training and evaluation.

3.1.3 Automated Cardiac Diagnosis Challenge (ACDC) 2017 Dataset

The Automated Cardiac Diagnosis Challenge (ACDC) 2017 dataset comprises 150 MRI scans categorized into five subgroups: normal, previous myocardial infarction, dilated cardiomyopathy, hypertrophic cardiomyopathy, and abnormal right ventricle. These scans were collected over six years using two MRI scanners with different magnetic field strengths: 1.5 Tesla (Siemens Aera, Siemens Medical Solutions, Germany) and 3.0 Tesla (Siemens Trio Tim, Siemens Medical Solutions, Germany). Cine MRI images were acquired under breath-hold conditions using either retrospective or prospective gating, with a steady-state free precession (SSFP) sequence in the short-axis orientation. The scans consist of a series of short-axis slices covering the left ventricle (LV) from base to apex, with a slice thickness of 5 mm (occasionally 8 mm) and sometimes an interslice gap of 5 mm, resulting in images spaced every 5 or 10 mm depending on the examination. The spatial resolution ranges from 1.37 to 1.68 mm²/pixel, and each series includes 28 to 40 images, covering the cardiac cycle completely or partially. The dataset is divided into 100 training images and 50 testing images for the segmentation of the left ventricle (LV), myocardium (MYO), and right ventricle (RV) during both end-systolic (ES) and end-diastolic (ED) phases.

3.1.4 Multi-Centre, Multi-Vendor, and Multi-Disease Cardiac Image Segmentation Challenge (MnM-1 and MnM-2)

The MnM challenge has been conducted twice, first in 2020 (MnM-1) and then in 2021 (MnM-2). MnM-1 included a total of 345 scans, with 209 images designated for training and 136 for testing. Participants were tasked with segmenting the left ventricle (LV), myocardium (MYO), and right ventricle (RV) in both end-systolic (ES) and end-diastolic (ED) phases. The scans were obtained from clinical centres located in three countries—Spain, Germany, and Canada—and utilized four different magnetic resonance scanner vendors: Siemens, General Electric, Philips, and Canon.

MnM-2 provided a training set of 200 images and a testing set of 160 images. Similar to MnM-1, segmentation was required for the LV, MYO, and RV in both ES and ED phases. However, MnM-2 included both short-axis (ShA) and long-axis (LoA) views. The LoA view shows the heart from base to apex, essentially cutting the heart vertically, while the ShA view cuts the heart horizontally, perpendicular to the long axis, and shows circular cross-sections of the ventricles. Figure 1 shows the LoA and ShA views for both ES and ED phases. The data for MnM-2 were acquired from clinical centres in Spain, using three different MRI scanner vendors: Siemens, General Electric, and Philips.

A summary of the datasets is shown in Table 1.

3.2 Implementation Details

In this study, we employed nnU-Net, which supports training under five main conditions: 2D, 3D full resolution, 3D low resolution, 3D cascade, and ensemble. However, it was not feasible to evaluate certain datasets using the 3D low resolution and cascade configurations. For datasets with small image sizes, the U-Net cascade (and consequently the 3D low-resolution configuration) was omitted because the patch size of the full-resolution U-Net already covered a substantial portion of the input images.

The models were trained using an NVIDIA A100 80GB PCIe GPU over 1000 epochs, beginning with an initial learning rate of 0.01. The Stochastic Gradient Descent (SGD) optimizer was employed for the training process. To ensure robust and reliable model performance, we implemented five-fold cross-validation. Additionally, for tasks with multiple labels, the models were trained as multi-class segmentation tasks.
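The five-fold cross-validation mentioned above can be illustrated with a minimal sketch. This is not nnU-Net's own split routine; the function name `five_fold_splits`, the case identifiers, and the random seed are hypothetical and chosen only for illustration. Each case appears in exactly one validation fold, and the remaining four folds form the corresponding training set.

```python
import numpy as np

def five_fold_splits(case_ids, n_folds=5, seed=0):
    """Partition case identifiers into n_folds train/validation splits.

    Illustrative only: the seed and shuffling strategy are assumptions,
    not the split procedure used in the study.
    """
    rng = np.random.default_rng(seed)
    ids = np.array(case_ids)
    rng.shuffle(ids)                      # shuffle once, in place
    folds = np.array_split(ids, n_folds)  # n_folds roughly equal parts
    splits = []
    for k in range(n_folds):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        splits.append({"train": sorted(train.tolist()),
                       "val": sorted(val.tolist())})
    return splits

# Example with 100 hypothetical training cases (e.g. the size of the LASC training set).
cases = [f"case_{i:03d}" for i in range(100)]
splits = five_fold_splits(cases)
for k, s in enumerate(splits):
    print(k, len(s["train"]), len(s["val"]))
```

With 100 cases, every fold trains on 80 cases and validates on the remaining 20.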

3.3 Evaluation Metrics

To assess the performance of our segmentation models, we employ a comprehensive set of evaluation metrics: Dice Similarity Coefficient (DSC), Jaccard Index, Hausdorff Distance (HD), Mean Surface Distance (MSD), and the 95th percentile Hausdorff Distance (HD95). Each of these metrics provides unique insights into different aspects of the segmentation quality, offering a holistic view of model performance.

3.3.1 Dice Score

The Dice Similarity Coefficient (DSC) is a measure of overlap between the predicted segmentation and the ground truth, calculated as twice the area of overlap divided by the total number of pixels in both the predicted and ground truth masks. A higher DSC indicates better performance, signifying a greater degree of similarity between the predicted and actual segmentations.

DSC=\frac{2\cdot|P\cap Q|}{|P|+|Q|}

(1)

where P and Q are the ground truth and predicted masks.
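As a minimal sketch of Equation (1), the following NumPy function computes the DSC for two binary masks. The function name `dice_score` and the handling of two empty masks (returning 1.0) are assumptions made for this illustration, not details taken from the evaluation code used in the study.

```python
import numpy as np

def dice_score(gt: np.ndarray, pred: np.ndarray) -> float:
    """DSC = 2|P ∩ Q| / (|P| + |Q|) for two binary masks of equal shape."""
    gt = gt.astype(bool)
    pred = pred.astype(bool)
    denom = gt.sum() + pred.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(gt, pred).sum() / denom)

# Toy 3x3 masks: 4 foreground pixels each, 3 of them shared.
gt = np.array([[0, 1, 1],
               [0, 1, 1],
               [0, 0, 0]])
pred = np.array([[0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 1]])
print(dice_score(gt, pred))  # 2*3 / (4+4) = 0.75
```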

3.3.2 Jaccard Index

The Jaccard Index, also known as the Intersection over Union (IoU), quantifies the similarity between the predicted and ground truth segmentations. It is defined as the area of overlap divided by the area of the union of the predicted and ground truth masks. Like the DSC, a higher Jaccard Index denotes better segmentation performance.

Jaccard=\frac{|P\cap Q|}{|P\cup Q|}

(2)

where P and Q are the ground truth and predicted masks.
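Equation (2) can be sketched in the same way. For a single pair of masks, the Jaccard Index and the DSC are linked by Jaccard = DSC / (2 − DSC), which the example below checks numerically. The function name `jaccard_index` and the empty-mask convention are again assumptions for illustration.

```python
import numpy as np

def jaccard_index(gt: np.ndarray, pred: np.ndarray) -> float:
    """Jaccard = |P ∩ Q| / |P ∪ Q| for two binary masks of equal shape."""
    gt = gt.astype(bool)
    pred = pred.astype(bool)
    union = np.logical_or(gt, pred).sum()
    if union == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return float(np.logical_and(gt, pred).sum() / union)

# Toy 3x3 masks: intersection of 3 pixels, union of 5 pixels.
gt = np.array([[0, 1, 1],
               [0, 1, 1],
               [0, 0, 0]])
pred = np.array([[0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 1]])
jac = jaccard_index(gt, pred)
dsc = 0.75  # DSC of the same pair of masks: 2*3 / (4+4)
print(jac)              # 3 / 5 = 0.6
print(dsc / (2 - dsc))  # 0.6, the same value via the DSC
```

Because the identity holds per case rather than for averages, Jaccard values computed from a mean DSC only approximate the reported means. For example, a mean DSC of 0.926 maps to roughly 0.862, close to the 0.863 listed for the 2D cavity model in Table 2.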

3.3.3 Hausdorff Distance

The Hausdorff Distance (HD) measures the maximum distance from a point in one segmentation to the nearest point in the other, taken over both directions (predicted to ground truth and ground truth to predicted), thus indicating the worst-case boundary discrepancy. Lower HD values indicate more accurate boundary delineation.

HD(P,Q)=\max(h(P,Q),h(Q,P))

(3)

where $h(P,Q)$ is the oriented Hausdorff distance from $P$ to $Q$ :

h(P,Q)=\max_{p_{i}\in P}\min_{q_{j}\in Q}\rho(p_{i},q_{j})

(4)

and $\rho(p_{i},q_{j})$ is the Euclidean distance between points $p_{i}$ and $q_{j}$ .
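Equations (3) and (4) can be sketched with a brute-force NumPy implementation. The sketch assumes that $P$ and $Q$ are given as arrays of boundary point coordinates already expressed in physical units. The function names are hypothetical. The pairwise distance matrix grows as $|P|\times|Q|$, so this form is only practical for small point sets.

```python
import numpy as np

def directed_hausdorff(P: np.ndarray, Q: np.ndarray) -> float:
    """h(P, Q): the largest nearest-neighbour distance from points of P to Q."""
    # Pairwise Euclidean distances, shape (len(P), len(Q)).
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    return float(d.min(axis=1).max())

def hausdorff(P: np.ndarray, Q: np.ndarray) -> float:
    """HD(P, Q) = max(h(P, Q), h(Q, P))."""
    return max(directed_hausdorff(P, Q), directed_hausdorff(Q, P))

# Two small 2D point sets showing that the directed distance is asymmetric.
P = np.array([[0.0, 0.0], [1.0, 0.0]])
Q = np.array([[0.0, 0.0], [4.0, 0.0]])
print(directed_hausdorff(P, Q))  # 1.0: (1,0) is 1 away from (0,0)
print(directed_hausdorff(Q, P))  # 3.0: (4,0) is 3 away from (1,0)
print(hausdorff(P, Q))           # 3.0
```

The example illustrates why the symmetric maximum in Equation (3) is needed: one direction alone can miss a far-away point that exists only in the other set.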

3.3.4 Mean Surface Distance

The Mean Surface Distance (MSD) calculates the average distance between points on the surface of the predicted segmentation and the nearest points on the surface of the ground truth segmentation. Lower MSD values suggest closer average alignment between the predicted and actual boundaries.

MSD(P,Q)=\frac{1}{|P|}\sum_{p_{i}\in P}\min_{q_{j}\in Q}\rho(p_{i},q_{j})

(5)
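A minimal sketch of Equation (5) follows, under the same assumptions as before: surface points are given as coordinate arrays in physical units, and the function name is hypothetical. As written, Equation (5) is directed (from $P$ to $Q$); some toolkits instead average both directions, and the sketch implements only the directed form stated in the equation.

```python
import numpy as np

def mean_surface_distance(P: np.ndarray, Q: np.ndarray) -> float:
    """MSD(P, Q): mean over points of P of the distance to the nearest point of Q."""
    # Pairwise Euclidean distances, shape (len(P), len(Q)).
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

P = np.array([[0.0, 0.0], [1.0, 0.0]])
Q = np.array([[0.0, 0.0], [4.0, 0.0]])
print(mean_surface_distance(P, Q))  # (0 + 1) / 2 = 0.5
print(mean_surface_distance(Q, P))  # (0 + 3) / 2 = 1.5
```

Unlike the HD, which reports only the single worst point, the MSD averages over all surface points and is therefore less sensitive to isolated errors.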

3.3.5 95th percentile Hausdorff Distance

The 95th percentile Hausdorff Distance (HD95) is similar to the HD but focuses on the 95th percentile of the distances between the predicted and ground truth surfaces, thereby mitigating the impact of outliers. A lower HD95 value indicates more consistent boundary accuracy, discounting extreme deviations.
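The effect of the percentile can be sketched as follows. Several HD95 conventions exist; this sketch assumes one common choice, namely the larger of the two directed 95th-percentile nearest-neighbour distances, which is not necessarily the exact variant used in this study. Function names are hypothetical.

```python
import numpy as np

def nearest_distances(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """For each point of A, the Euclidean distance to its nearest point in B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return d.min(axis=1)

def hd_and_hd95(P: np.ndarray, Q: np.ndarray):
    """Return (HD, HD95), using the max of the two directed values for each."""
    d_pq = nearest_distances(P, Q)
    d_qp = nearest_distances(Q, P)
    hd = max(d_pq.max(), d_qp.max())
    hd95 = max(np.percentile(d_pq, 95), np.percentile(d_qp, 95))
    return float(hd), float(hd95)

# Two parallel contours one unit apart, with a single outlier point in Q.
xs = np.arange(100, dtype=float)
P = np.stack([xs, np.zeros(100)], axis=1)
Q = np.stack([xs, np.ones(100)], axis=1)
Q[-1] = [99.0, 50.0]  # one spurious far-away point

hd, hd95 = hd_and_hd95(P, Q)
print(hd)    # 50.0: dominated by the single outlier
print(hd95)  # 1.0: the outlier falls above the 95th percentile
```

A single stray point raises the HD from 1 to 50, while the HD95 stays at the typical contour separation.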

Together, these metrics provide a robust framework for evaluating segmentation performance, with higher DSC and Jaccard Index values and lower HD, MSD, and HD95 values indicating superior model performance.

4 Results

Table 2: Performance of LAScarQS (Task 1). The best cavity segmentation values are in red, and the best scar segmentation values are in blue. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	Cavity	0.926	0.863	12.952	0.805	3.402
2D	Scar	0.438	0.283	37.166	2.539	13.036
3D full	Cavity	0.939	0.884	12.622	0.666	3.088
3D full	Scar	0.443	0.288	37.060	2.512	12.620
3D low	Cavity	0.937	0.882	13.942	0.711	3.254
3D low	Scar	0.411	0.262	37.294	2.789	13.425
3D cas	Cavity	0.939	0.885	12.601	0.674	3.138
3D cas	Scar	0.449	0.293	38.125	2.530	12.554
Ensem	Cavity	0.939	0.886	12.486	0.663	3.041
Ensem	Scar	0.439	0.285	37.078	2.590	12.850

Table 3: Performance comparison of dice scores in nnU-Net variations and other models in LAScarQS-Task1 for scar and cavity segmentation.

Paper	Scars	Cavity
Punithakumar and Noga (2022)	0.660	0.907
Jiang et al. (2022)	0.641	0.902
Arega et al. (2022)	0.634	0.898
Mazher et al. (2022)	0.602	0.875
Zhang et al. (2022b)	0.598	0.880
Lefebvre et al. (2022)	0.553	0.938
nnU (2D)	0.439	0.926
nnU (3D full res)	0.443	0.939
nnU (3D low res)	0.411	0.937
nnU (3D cascade)	0.449	0.939
nnU (Ensemble)	0.439	0.939

4.1 LAScarQS

Table 4: Performance of LAScarQS (Task 2). The best cavity segmentation values are in blue. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD

Model	DSC	Jaccard	HD	MSD	HD95
2D	0.930	0.869	13.971	0.733	3.018
3D full res.	0.937	0.882	12.971	0.672	2.880
3D low res	0.935	0.879	12.741	0.692	3.069
3D cascade	0.937	0.882	12.807	0.667	2.746
Ensemble	0.938	0.883	12.767	0.652	2.737

When comparing nnU-Net to other methods, LAScarQS Task 1 scar segmentation exhibits the largest difference: the best competing method (Dice 0.660) surpasses the best nnU-Net configuration (Dice 0.449) by 21.1 percentage points. Additionally, the HD values for scar segmentation are notably higher than those for cavity segmentation (Tables 2 and 3). Despite their lower results in scar segmentation, the nnU-Net models achieve superior performance in cavity segmentation. This trend is also observed in LAScarQS Task 2 cavity segmentation (Table 5), where even the nnU-Net (2D) model outperforms the other methods in Dice score. In Task 2, the nnU-Net ensemble model achieves the best Dice score and MSD, while the nnU-Net (3D low res) model achieves the best HD. nnU-Net performs competitively even with less training data than the other methods in the challenge, and it improves not only the Dice score but also the HD and MSD metrics. Figure 2 provides a qualitative comparison of LAScarQS Task 1 performance, visualized using Amira 3D software (Stalling et al., 2005).

Table 5: Performance comparison of nnU-Net variations and other models in LAScarQS-Task2. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance

Paper	DSC	HD	MSD
Lefebvre et al. (2022)	0.889	26.270	2.179
Tu et al. (2022)	0.890	17.124	1.706
Liu et al. (2022a)	0.866	–	–
Zhang et al. (2022b)	0.890	16.450	1.715
Zhang et al. (2022a)	0.878	–	0.710
Khan et al. (2022)	0.846	105.700	3.390
Xie et al. (2022)	0.872	22.394	–
Zhou et al. (2022)	0.875	24.731	2.233
Jiang et al. (2022)	0.881	18.755	1.782
Li and Li (2022)	0.883	20.883	1.794
Arega et al. (2022)	0.890	16.907	1.720
Punithakumar and Noga (2022)	0.893	15.860	1.613
Mazher et al. (2022)	0.886	18.389	1.813
Singh et al. (2023a)	0.929	12.960	0.890
Singh et al. (2023b)	0.919	15.430	–
nnU (2D)	0.930	13.971	0.733
nnU (3D full res)	0.937	12.971	0.672
nnU (3D low res)	0.935	12.741	0.692
nnU (3D cascade)	0.937	12.807	0.667
nnU (Ensemble)	0.938	12.767	0.652

4.2 LASC

For the LASC dataset, the ensemble model achieves the highest performance among the nnU-Net configurations (Table 6). According to Table 7, nnU-Net is competitive with other methods: only Singh et al. (2023a) surpasses it, by 0.1 percentage points in Dice score (0.935 versus 0.934 for the ensemble). Interestingly, even the nnU-Net (2D) model matches the Dice score of a recent model (Xu et al., 2024), and the 3D and ensemble configurations surpass it without any additional configuration. We assess the qualitative performance of the nnU-Net models using ITK-SNAP software (Yushkevich et al., 2016), as shown in Figure 3 for axial, sagittal and coronal views.

Table 6: Performance of LASC dataset. The best cavity segmentation values are in blue. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD

Model	DSC	Jaccard	HD	MSD	HD95
2D	0.926	0.863	17.583	1.052	3.930
3D full res.	0.933	0.875	17.485	0.972	3.681
3D low res	0.931	0.872	16.877	0.991	3.727
3D cascade	0.933	0.874	17.553	0.984	3.756
Ensemble	0.934	0.877	16.873	0.954	3.628

Table 7: Performance comparison of nnU-Net variations and other models in LASC dataset. DSC- Dice Score

Publication	DSC
Xia et al. (2019)	0.932
Bian et al. (2018)	0.926
Vesal et al. (2019)	0.925
Yang et al. (2019)	0.925
Li et al. (2019)	0.923
Chen et al. (2022a)	0.920
Chen et al. (2023)	0.932
Li et al. (2023)	0.919
Liu et al. (2019)	0.903
Borra et al. (2019)	0.898
Puybareau et al. (2018)	0.923
Uslu et al. (2021)	0.920
Chen et al. (2021)	0.913
Chen et al. (2022b)	0.923
Qi et al. (2023)	0.921
Zhao et al. (2023)	0.911
Singh et al. (2023a)	0.935
Singh et al. (2023b)	0.934
Milletari et al. (2016)	0.919
Lourenço et al. (2021)	0.910
Zhao et al. (2021)	0.918
Liu et al. (2022b)	0.920
Xu et al. (2024)	0.926
nnU (2D)	0.926
nnU (3D full res)	0.933
nnU (3D low res)	0.931
nnU (3D cascade)	0.933
nnU (Ensemble)	0.934

4.3 ACDC

Performance on the ACDC dataset is evaluated for two phases: End-Diastole (ED) (Table 8) and End-Systole (ES) (Table 9). In both phases, the ensemble achieves the highest Dice scores for LV and MYO among the nnU-Net variants. Surprisingly, the 2D nnU-Net achieves a higher Dice score than both the 3D and ensemble models in RV segmentation, in the ED phase (0.965) as well as in the ES phase (0.927). When compared to other approaches (Table 10), nnU-Net lags in LV segmentation in both ED and ES phases, by 2.4 and 4.6 percentage points in Dice score, respectively. This pattern is also observed in MYO segmentation, where other methods surpass the best nnU-Net Dice score by 0.8 percentage points in both ED and ES phases. However, in RV segmentation, nnU-Net shows superior performance in both ED and ES phases, in terms of both Dice score and HD. In Figure 4, we compare the ground truth with the predictions of the nnU-Net (2D), nnU-Net (3D full res) and ensemble models in both ED and ES phases.

Table 8: Performance of ACDC Dataset for End-Diastole (ED) phase. The best values for the Left Ventricle (LV), Myocardium (MYO), and Right Ventricle (RV) are highlighted in orange, blue, and red, respectively. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD, LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	LV	0.942	0.892	10.438	0.467	3.152
2D	MYO	0.897	0.814	10.050	0.331	1.583
2D	RV	0.965	0.933	6.739	0.347	2.350
3D full res.	LV	0.934	0.880	11.494	0.617	3.861
3D full res.	MYO	0.889	0.801	8.057	0.367	2.135
3D full res.	RV	0.959	0.922	8.486	0.443	2.720
Ensemble	LV	0.944	0.896	10.716	0.459	3.110
Ensemble	MYO	0.898	0.816	9.884	0.325	1.818
Ensemble	RV	0.963	0.930	9.584	0.404	2.474

Table 9: Performance of ACDC Dataset for End-Systole (ES) phase. The best values for the Left Ventricle (LV), Myocardium (MYO), and Right Ventricle (RV) are highlighted in orange, blue, and red, respectively. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD, LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	LV	0.885	0.799	12.678	0.827	4.318
2D	MYO	0.913	0.841	8.231	0.384	1.975
2D	RV	0.927	0.868	6.795	0.483	2.713
3D full res.	LV	0.882	0.793	12.743	0.911	5.245
3D full res.	MYO	0.906	0.829	8.785	0.456	2.551
3D full res.	RV	0.901	0.831	9.028	0.972	4.968
Ensemble	LV	0.892	0.809	12.200	0.751	4.208
Ensemble	MYO	0.915	0.844	8.460	0.384	2.193
Ensemble	RV	0.922	0.861	8.321	0.608	3.477

Table 10: Performance comparison of nnU-Net variations and other models on the ACDC dataset. LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle, ED- End-Diastole, ES- End-Systole, DSC- Dice Score, HD- Hausdorff Distance.

Method	LV				MYO				RV
	ED		ES		ED		ES		ED		ES
	DSC	HD	DSC	HD	DSC	HD	DSC	HD	DSC	HD	DSC	HD
Guo et al. (2021)	0.968	5.814	0.935	7.361	0.906	7.469	0.923	7.702	0.955	8.877	0.894	11.649
Isensee et al. (2018)	0.967	5.476	0.928	6.921	0.904	7.014	0.923	7.328	0.951	8.205	0.904	11.665
Simantiris and Tziritas (2020)	0.967	6.366	0.928	7.573	0.891	8.264	0.904	9.575	0.936	13.289	0.889	14.367
Berihu Girum et al. (2021)	0.968	6.422	0.916	9.305	0.894	8.998	0.906	9.922	0.939	11.326	0.893	13.306
Ammar et al. (2021)	0.968	7.993	0.911	10.528	0.891	10.575	0.901	13.891	0.929	14.189	0.886	16.042
Zotti et al. (2018b)	0.964	6.180	0.912	8.386	0.886	9.586	0.902	9.291	0.934	11.052	0.885	12.650
Khened et al. (2018)	0.964	8.129	0.917	8.968	0.889	9.841	0.898	12.582	0.935	13.994	0.879	13.930
Baumgartner et al. (2018)	0.963	6.526	0.911	9.170	0.892	8.703	0.901	10.637	0.932	12.670	0.883	14.691
Painchaud et al. (2020)	0.961	6.152	0.911	8.278	0.881	8.651	0.897	9.598	0.933	13.718	0.884	13.323
Wolterink et al. (2018)	0.961	7.515	0.918	6.603	0.875	11.121	0.894	10.687	0.928	11.879	0.872	13.399
Calisto and Lai-Yuen (2020)	0.958	5.592	0.903	8.644	0.873	8.197	0.895	8.318	0.936	10.183	0.884	12.234
Zotti et al. (2018a)	0.957	6.641	0.905	8.706	0.884	8.708	0.896	9.264	0.941	10.318	0.882	14.053
Singh et al. (2023c)	0.967	5.526	0.935	6.913	0.902	8.094	0.921	7.772	0.949	9.187	0.900	11.556
Singh et al. (2023a)	0.967	5.652	0.938	6.878	0.905	7.389	0.923	7.373	0.950	8.513	0.895	12.167
Singh et al. (2023b)	0.968	5.859	0.937	6.529	0.904	7.723	0.922	7.221	0.952	8.788	0.890	11.926
nnU (2D)	0.942	10.438	0.885	12.678	0.897	10.050	0.913	8.231	0.965	6.739	0.927	6.795
nnU (3D)	0.934	11.494	0.882	12.743	0.889	8.057	0.906	8.785	0.959	8.486	0.901	9.028
nnU (Ens)	0.944	10.716	0.892	12.200	0.898	9.884	0.915	8.460	0.963	9.584	0.922	8.321

4.4 MnM

As with the ACDC dataset, MnM performance is evaluated on both ES (Table 11) and ED (Table 12) phases. In the ES phase, the 2D nnU-Net outperforms both the 3D and ensemble models in RV segmentation, while the 3D full-resolution model achieves the highest Dice score for LV segmentation. In the ED phase, the ensemble model demonstrates superior performance. RV segmentation in both phases achieves higher Dice scores than other approaches (Table 14). In the remaining cases, other approaches surpass nnU-Net by slight margins, typically less than 1%. In Figure 5, we compare the performance of the nnU-Net models in both ES and ED phases.

Table 11: Performance of MnM Dataset for End-Systole (ES) phase. The best segmentation values for the Left Ventricle (LV), Myocardium (MYO), and Right Ventricle (RV) are highlighted in orange, blue, and red, respectively. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD, LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	LV	0.888	0.833	12.681	2.368	8.513
2D	MYO	0.800	0.689	15.428	1.905	7.304
2D	RV	0.893	0.821	14.595	1.450	6.411
3D full res.	LV	0.909	0.842	8.576	0.944	4.507
3D full res.	MYO	0.841	0.734	11.141	0.812	3.952
3D full res.	RV	0.871	0.784	13.130	1.258	5.673
Ensemble	LV	0.888	0.803	8.486	0.956	4.432
Ensemble	MYO	0.864	0.762	9.872	0.613	3.542
Ensemble	RV	0.852	0.751	12.658	1.083	5.366

Table 12: Performance of MnM Dataset for End-Diastole (ED) phase. The best segmentation values for the Left Ventricle (LV), Myocardium (MYO), and Right Ventricle (RV) are highlighted in orange, blue, and red, respectively. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD, LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	LV	0.936	0.882	7.517	0.728	3.871
2D	MYO	0.824	0.706	10.738	0.592	3.676
2D	RV	0.909	0.836	11.601	0.900	4.578
3D full res.	LV	0.933	0.877	8.199	0.819	4.261
3D full res.	MYO	0.819	0.699	10.776	0.580	3.484
3D full res.	RV	0.908	0.836	11.520	0.870	4.501
Ensemble	LV	0.937	0.883	7.393	0.725	3.761
Ensemble	MYO	0.826	0.709	9.944	0.527	3.138
Ensemble	RV	0.913	0.843	10.847	0.818	4.208

4.5 MnM2

Unlike the MnM challenge, we analyze performance on the MnM2 challenge under four conditions: Short Axis (ShA) ED phase (Table 13), ShA ES phase (Table 15), Long Axis (LoA) ED phase (Table 16), and LoA ES phase (Table 17). In both ShA phases, the ensemble method demonstrates superior performance, while the 2D method matches or surpasses the 3D method in Dice score. For LoA segmentation, images have the shape $H\times W\times 1$, indicating only one slice along the Z-axis, making the 3D full-resolution method particularly effective, and thus only 3D full-resolution results are reported. The challenge organizers report only RV segmentation values (Table 18). For ShA ES segmentation, nnU-Net outperforms the other models by 2.4%. In the other cases (ShA ED, LoA ES, and LoA ED), other models surpass nnU-Net, but by a margin of less than 1%.

In summary, ensemble models demonstrate strong performance across all datasets. Surprisingly, in some cases, the 2D models outperform the 3D models and even the ensemble models. The largest gap in favor of other models occurs in LAScarQS Task 1 scar segmentation. Table 19 summarizes, for each task, the highest Dice value obtained with nnU-Net, the highest Dice value from other methods, and the absolute difference (%) between them.

Table 13: Performance of MnM2 Dataset for Short Axis (ShA) End-Diastole (ED) phase. The best values for the Left Ventricle (LV), Myocardium (MYO), and Right Ventricle (RV) are highlighted in orange, blue, and red, respectively. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD, LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	LV	0.957	0.920	8.268	0.515	3.170
2D	MYO	0.867	0.769	12.238	0.442	2.842
2D	RV	0.934	0.879	10.050	0.766	4.084
3D full res.	LV	0.955	0.916	8.361	0.565	3.571
3D full res.	MYO	0.862	0.761	12.035	0.426	2.561
3D full res.	RV	0.934	0.878	10.394	0.779	4.200
Ensemble	LV	0.958	0.921	8.029	0.496	3.256
Ensemble	MYO	0.869	0.772	11.492	0.396	2.371
Ensemble	RV	0.937	0.884	11.079	0.742	4.021

Table 14: Performance comparison of nnU-Net variations and other models in MnM dataset. LV - Left Ventricle, MYO - Myocardium, RV - Right Ventricle, ED - End Diastole, ES - End Systole, DSC - Dice Score, HD - Hausdorff Distance.

Method	LV				MYO				RV
	ED		ES		ED		ES		ED		ES
	DSC	HD	DSC	HD	DSC	HD	DSC	HD	DSC	HD	DSC	HD
Full et al. (2021)	0.939	9.1	0.886	9.1	0.839	12.8	0.867	10.6	0.910	11.8	0.860	12.7
Parreño et al. (2021)	0.939	11.3	0.884	11.4	0.826	15.2	0.856	14.0	0.886	15.4	0.829	16.7
Zhang et al. (2021)	0.938	9.3	0.880	9.5	0.830	12.9	0.861	10.8	0.909	12.3	0.850	13.0
Ma (2021)	0.935	9.5	0.875	10.5	0.825	13.3	0.856	11.6	0.906	12.3	0.844	13.0
Saber et al. (2021)	0.933	13.4	0.867	14.0	0.812	17.1	0.839	18.2	0.876	15.7	0.815	18.1
Kong and Shadden (2021)	0.931	10.0	0.877	9.8	0.816	13.7	0.850	11.3	0.893	14.3	0.827	15.2
Singh et al. (2023c)	0.928	7.15	0.890	7.6	0.834	10.2	0.868	9.6	0.902	10.6	0.852	11.7
Corral Acero et al. (2021)	0.927	11.2	0.877	9.7	0.815	14.0	0.852	11.1	0.892	13.6	0.834	15.0
Li et al. (2021a)	0.922	15.5	0.857	17.5	0.809	18.0	0.836	17.2	0.867	16.6	0.802	19.1
Khader et al. (2021)	0.914	12.1	0.853	12.0	0.768	17.2	0.814	15.2	0.850	17.5	0.794	17.0
Carscadden et al. (2021)	0.913	14.5	0.851	13.0	0.776	17.8	0.809	14.5	0.791	30.7	0.732	32.9
Scannell et al. (2021)	0.905	13.6	0.848	15.5	0.772	17.2	0.820	17.5	0.876	16.2	0.809	19.6
Huang et al. (2021)	0.896	15.7	0.772	23.0	0.761	17.9	0.721	20.2	0.820	21.0	0.698	29.5
Liu et al. (2021b)	0.889	16.0	0.835	14.2	0.785	22.1	0.808	18.9	0.814	22.1	0.758	22.0
Li et al. (2021c)	0.797	21.9	0.716	25.8	0.668	31.6	0.673	33.0	0.552	49.1	0.517	52.0
Singh et al. (2023a)	0.940	7.5	0.890	7.7	0.839	10.3	0.870	9.9	0.909	10.2	0.856	11.4
nnU-Net (2D)	0.936	7.5	0.888	12.7	0.824	10.7	0.800	15.4	0.909	11.6	0.893	14.6
nnU-Net (3D)	0.933	8.2	0.909	8.6	0.819	10.8	0.841	11.1	0.908	11.5	0.871	13.1
nnU-Net (Ensemble)	0.937	7.4	0.888	8.5	0.826	9.9	0.864	9.9	0.913	10.8	0.852	12.7

Table 15: Performance of MnM2 Dataset for Short Axis (ShA) End-Systole (ES) phase. The best segmentation values for the Left Ventricle (LV), Myocardium (MYO), and Right Ventricle (RV) are highlighted in orange, blue, and red, respectively. DSC- Dice Score, HD - Hausdorff Distance, MSD- Mean Surface Distance, HD95-95th percentile of HD, LV- Left Ventricle, MYO- Myocardium, RV- Right Ventricle.

Model	Label	DSC	Jaccard	HD	MSD	HD95
2D	LV	0.958	0.920	8.350	0.513	3.170
2D	MYO	0.867	0.770	11.928	0.428	2.571
2D	RV	0.934	0.879	11.228	0.817	4.520
3D full res.	LV	0.956	0.916	8.233	0.561	3.481
3D full res.	MYO	0.862	0.761	12.004	0.426	2.555
3D full res.	RV	0.934	0.878	10.301	0.779	4.302
Ensemble	LV	0.958	0.920	8.225	0.503	3.264
Ensemble	MYO	0.868	0.771	11.684	0.398	2.341
Ensemble	RV	0.938	0.885	11.119	0.722	3.930

Table 16: Performance of MnM2 Dataset for Long Axis (LoA) End-Diastole (ED) phase.

Model	Label	DSC	Jaccard	HD	MSD	HD95
3D full res.	LV	0.968	0.938	4.082	0.871	2.977
3D full res.	MYO	0.878	0.786	6.504	0.662	2.151
3D full res.	RV	0.934	0.878	6.055	1.262	4.075

Table 17: Performance of MnM2 Dataset for Long Axis (LoA) End-Systole (ES) phase.

Model	Label	DSC	Jaccard	HD	MSD	HD95
3D full res.	LV	0.948	0.904	4.432	1.076	3.246
3D full res.	MYO	0.891	0.809	5.342	0.837	2.809
3D full res.	RV	0.899	0.822	6.108	1.457	4.254

Table 18: Performance comparison of nnU-Net variations and other models in MnM2 dataset for Right Ventricle only. ShA- Short Axis, LoA - Long Axis, ED - End Diastole, ES- End Systole, DSC - Dice Score, HD - Hausdorff Distance.

	ShA				LoA
Method	ED		ES		ED		ES
	DSC	HD	DSC	HD	DSC	HD	DSC	HD
Fulton et al. (2021)	0.934	9.610	0.910	10.032	0.935	6.227	0.904	5.935
Arega et al. (2021)	0.932	10.078	0.910	9.782	0.935	6.028	0.905	6.188
Punithakumar et al. (2021)	0.940	10.122	0.914	9.987	0.931	6.337	0.904	5.976
Li et al. (2021b)	0.933	10.563	0.907	10.050	0.930	6.246	0.902	6.097
Sun et al. (2022)	0.937	10.879	0.913	9.874	0.935	6.056	0.904	6.031
Al Khalil et al. (2021)	0.927	9.941	0.897	10.307	0.907	8.444	0.883	7.265
Liu et al. (2021a)	0.932	10.517	0.903	10.101	0.934	7.721	0.896	6.019
Jabbar et al. (2021)	0.923	11.258	0.897	11.062	0.910	7.757	0.882	6.933
Queirós (2021)	0.924	11.327	0.898	11.447	0.922	7.173	0.900	6.391
Galati and Zuluaga (2021)	0.916	11.681	0.890	11.747	0.924	7.840	0.894	6.978
Mazher et al. (2021)	0.909	15.275	0.880	14.606	0.888	8.333	0.854	8.347
Gao and Zhuang (2022)	0.844	15.495	0.821	16.750	0.887	9.733	0.851	9.659
Beetz et al. (2021)	0.873	16.682	0.820	17.913	0.896	8.570	0.864	7.591
Tautz et al. (2021)	0.883	17.024	0.838	18.003	0.849	13.303	0.809	13.716
Galazis et al. (2021)	0.852	19.430	0.821	19.117	0.814	18.629	0.781	17.198
nnU-Net (2D)	0.934	10.050	0.934	11.228	-	-	-	-
nnU-Net (3D full res)	0.934	10.393	0.934	10.301	0.934	6.055	0.900	6.108
nnU-Net (Ensemble)	0.937	11.079	0.938	11.119	-	-	-	-

Table 19: Comparison of Dice scores of nnU-Net and other methods. ED - End Diastole, ES - End Systole, ShA - Short Axis, LoA - Long Axis, LV - Left Ventricle, MYO - Myocardium, RV - Right Ventricle, LA - Left Atrium.

Dataset	Sub Task	Anatomical Region	nnU-Net	Other methods	Abs. Difference (%)
ACDC	ED	LV	0.944	0.968	2.4
	ES	LV	0.892	0.938	4.6
	ED	MYO	0.898	0.906	0.8
	ES	MYO	0.915	0.923	0.8
	ED	RV	0.963	0.963	0.0
	ES	RV	0.922	0.904	1.8
LAScarQS	Task-1	LA Scar	0.449	0.660	21.1
	Task-1	LA Cavity	0.939	0.938	0.1
	Task-2	LA Cavity	0.938	0.929	0.9
MnM1	ED	LV	0.937	0.940	0.3
	ES	LV	0.909	0.890	1.9
	ED	MYO	0.826	0.834	0.8
	ES	MYO	0.864	0.870	0.6
	ED	RV	0.913	0.910	0.3
	ES	RV	0.893	0.860	3.3
MnM2	ShA ED	RV	0.937	0.940	0.3
	ShA ES	RV	0.938	0.914	2.4
	LoA ED	RV	0.934	0.935	0.1
	LoA ES	RV	0.900	0.905	0.5
LASC	-	LA Cavity	0.934	0.935	0.1
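
The last column of Table 19 can be reproduced from the two Dice columns. The sketch below assumes the reported difference is the absolute gap between the two Dice scores, expressed in percentage points and rounded to one decimal; the function name is ours and is used for illustration only.

```python
def abs_difference_pct(dice_nnunet: float, dice_other: float) -> float:
    """Absolute Dice gap in percentage points, rounded to one decimal."""
    return round(abs(dice_nnunet - dice_other) * 100, 1)


# A few (nnU-Net, other methods) pairs copied from Table 19.
rows = {
    "ACDC ED LV": (0.944, 0.968),
    "LAScarQS Task-1 LA Scar": (0.449, 0.660),
    "MnM2 ShA ES RV": (0.938, 0.914),
}
for name, (ours, other) in rows.items():
    print(f"{name}: {abs_difference_pct(ours, other)}")
```

Because the gap is absolute, the column does not indicate which side is ahead; that must be read from the two Dice columns.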

5 Discussion

In this section, we discuss and analyze our findings in detail.

5.1 Lower performance in LAScarQS scar segmentation

In analyzing the performance of nnU-Net for scar segmentation in the LAScarQS Task 1, it is evident that the model underperforms relative to other available models. Several factors contribute to this discrepancy.

Firstly, the primary challenge lies in the nature of the target region. Scar tissue occupies only a small fraction of the LA compared to the LA cavity (as shown in Figure 6, the cavity accounts for nearly 0.7% and the scar for less than 0.1%). This severe imbalance makes it difficult for the model to accurately distinguish and segment the scar regions. The nnU-Net architecture, while robust for larger and more continuous regions, struggles with the precision required for such minute and sparse areas.
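
The degree of imbalance can be quantified directly from a label map. The following is a minimal sketch on a synthetic volume whose proportions only loosely mimic those in Figure 6; the label convention (0 = background, 1 = LA cavity, 2 = scar) and the function name are assumptions made for illustration, not the dataset's actual encoding.

```python
import numpy as np


def label_fractions(mask: np.ndarray) -> dict:
    """Return the fraction of voxels assigned to each integer label present."""
    counts = np.bincount(mask.ravel())
    return {label: count / mask.size for label, count in enumerate(counts) if count > 0}


# Synthetic label map (assumed convention: 0 = background, 1 = cavity, 2 = scar).
mask = np.zeros((100, 100, 100), dtype=np.int64)
mask[40:59, 40:59, 40:59] = 1   # 19^3 = 6859 cavity voxels
mask[40:59, 40:59, 59] = 2      # 19^2 = 361 scar voxels on one wall

for label, fraction in label_fractions(mask).items():
    print(f"label {label}: {100 * fraction:.3f}% of voxels")
```

In this toy volume the scar class is roughly twenty times rarer than the cavity, which is itself under 1% of the voxels.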

Secondly, the characteristics of the data further complicate the task. Unlike the LA cavity, which presents as a more continuous and homogeneous region, scar tissue is often irregular and dispersed. This non-continuous nature of scar data poses a substantial challenge for segmentation models, particularly those like nnU-Net that rely heavily on spatial continuity and context provided by larger regions.

Additionally, most state-of-the-art methods for scar segmentation adopt a two-stage network approach. These approaches typically involve an initial stage that performs coarse segmentation, identifying potential regions of interest (ROIs), followed by a refinement stage that focuses on enhancing the segmentation accuracy within these regions. This two-step process allows for more focused learning and better handling of small and irregular regions, leading to superior performance in scar segmentation tasks.

In contrast, the nnU-Net framework primarily utilizes a single-stage approach. While this method is advantageous for its simplicity and reduced computational requirements, it may not provide the necessary granularity and focus required for effectively segmenting small and irregular structures like scar tissues. The lack of an initial coarse segmentation stage means that nnU-Net must rely solely on its inherent ability to capture and distinguish fine details within a single pass, which is inherently more challenging for such complex tasks.
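
To make the difference between the two designs concrete, the coarse-to-fine scheme described above can be sketched as follows. This is an illustrative outline only: `coarse_model` and `fine_model` are hypothetical callables standing in for trained networks, the threshold-based toy models are placeholders, and none of this is taken from a specific published method.

```python
from typing import Callable

import numpy as np

Model = Callable[[np.ndarray], np.ndarray]


def bounding_box(mask: np.ndarray, margin: int) -> tuple:
    """Slices of the smallest box around the foreground, padded by `margin` voxels."""
    coords = np.argwhere(mask > 0)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))


def two_stage_segment(image: np.ndarray, coarse_model: Model,
                      fine_model: Model, margin: int = 4) -> np.ndarray:
    """Stage 1 localizes a region of interest; stage 2 segments only inside it."""
    coarse = coarse_model(image)
    refined = np.zeros_like(coarse)
    if not coarse.any():          # nothing found, so there is nothing to refine
        return refined
    roi = bounding_box(coarse, margin)
    refined[roi] = fine_model(image[roi])
    return refined


# Toy stand-ins for trained networks (assumption: simple intensity thresholds).
coarse_model = lambda img: (img > 0.5).astype(np.uint8)
fine_model = lambda patch: (patch > 0.8).astype(np.uint8)

image = np.zeros((32, 32, 32))
image[10:20, 10:20, 10:20] = 0.6   # large, easy structure
image[14:16, 14:16, 14:16] = 0.9   # small target inside it
result = two_stage_segment(image, coarse_model, fine_model)
print(int(result.sum()))
```

The second stage sees a crop in which the small target occupies a much larger share of the voxels than in the full image, which is the main motivation for this design.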

Moreover, the non-continuous property of the scar tissue can contribute to higher HD values. The HD metric is particularly sensitive to outliers and disjoint regions, which are characteristic of scar tissue. As a result, even small segmentation errors can lead to disproportionately high HD values, further reflecting the difficulty in accurately segmenting these regions.
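
The sensitivity of HD to isolated errors can be illustrated with a small point-set example. The sketch below is a simplified 2D version that works on point coordinates rather than on segmentation surfaces, and the percentile-based variant shown is only one of several HD95 definitions in use; it is not the evaluation code used for the tables.

```python
import numpy as np


def directed_distances(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """For every point in `a`, the Euclidean distance to its nearest point in `b`."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def hausdorff(a: np.ndarray, b: np.ndarray, percentile: float = 100.0) -> float:
    """Symmetric Hausdorff distance; percentile=95 gives one common HD95 variant."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return float(max(np.percentile(d_ab, percentile), np.percentile(d_ba, percentile)))


# Reference: 100 points on a line. Prediction: the same points plus one stray point.
reference = np.array([[x, 0.0] for x in range(100)])
prediction = np.vstack([reference, [[50.0, 30.0]]])

hd = hausdorff(prediction, reference)
hd95 = hausdorff(prediction, reference, percentile=95)
print(f"HD = {hd:.1f}, HD95 = {hd95:.1f}")
```

A single stray point determines the full HD, whereas the 95th-percentile variant ignores it. This is why HD and HD95 can diverge sharply for fragmented targets such as scar.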

Lastly, the standard data augmentation and preprocessing techniques employed by nnU-Net, while effective for general segmentation tasks, might not be sufficiently tailored to the unique challenges presented by scar tissue segmentation. Employing more specialized augmentation techniques that better simulate the variability and appearance of scar tissues could potentially enhance the model’s performance.

5.2 Ensemble Results

When comparing nnU-Net ensemble models to individual 3D and 2D nnU-Net variants, it is essential to understand that while ensemble methods have the potential to enhance model performance, this improvement is not always guaranteed. For an ensemble to significantly outperform a single model, the base classifiers must exhibit diversity. This means they need to make different errors, thereby complementing each other’s weaknesses. However, when the signal in the data is dominated by a few strong predictors, most models, including those within an ensemble, will likely capture and model this dominant information similarly. This can result in highly correlated predictions across the ensemble members, thereby reducing the potential benefits of combining them. In other words, if the nnU-Net ensemble models demonstrate lower performance compared to individual 3D or 2D nnU-Net variants, a lack of diversity among the ensemble members could be a contributing factor. When ensemble models are not sufficiently diverse, they may fail to provide the expected performance boost, leading to a situation where the ensemble’s performance is merely on par with or even inferior to the best individual model.
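
The role of diversity can be seen in a toy example. The sketch below assumes that ensemble members are combined by averaging their per-class probabilities before taking the arg-max; the probability values are invented for illustration and do not come from our experiments.

```python
import numpy as np


def ensemble_predict(prob_maps: list) -> np.ndarray:
    """Average per-class probabilities over members, then take the arg-max label."""
    return np.mean(prob_maps, axis=0).argmax(axis=0)


def to_probs(p_foreground) -> np.ndarray:
    """Stack background and foreground probabilities into shape (2, N)."""
    p = np.asarray(p_foreground, dtype=float)
    return np.stack([1.0 - p, p])


truth = np.array([1, 1, 0, 0])
member_a = to_probs([0.4, 0.9, 0.1, 0.1])   # misses pixel 0
member_b = to_probs([0.9, 0.9, 0.1, 0.6])   # false positive at pixel 3

diverse = ensemble_predict([member_a, member_b])      # members err on different pixels
correlated = ensemble_predict([member_a, member_a])   # members err on the same pixel

print("diverse   :", diverse, "errors:", int((diverse != truth).sum()))
print("correlated:", correlated, "errors:", int((correlated != truth).sum()))
```

When the two members err on different pixels, averaging corrects both errors; when they are identical, the ensemble reproduces the single-model error unchanged.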

5.3 Higher performance in 2D model compared to 3D model

In our analysis of the ACDC, MnM, and MnM2 datasets, we observe a trend in which 2D nnU-Net models achieve higher Dice scores than their 3D counterparts. This can be attributed to several factors inherent to both the nature of MRI data and the architectural differences between 2D and 3D models.

Firstly, the inherent characteristics of MRI data play a crucial role. These images typically exhibit high in-plane resolution but relatively lower through-plane resolution (Upendra et al., 2021). This aligns well with the strengths of 2D models, which can effectively process and leverage the high-resolution in-plane information without being encumbered by the lower resolution in the z-axis.

Secondly, the increased complexity of 3D models presents both advantages and challenges. While 3D architectures have the potential to capture volumetric context, they also introduce a significantly larger number of parameters. This increased parameter count necessitates larger training datasets to achieve optimal performance. In scenarios where the available data is limited, 2D models may be better suited to generalize effectively from the available samples.

The computational demands of 3D models also impact their performance. These architectures require substantially more GPU memory, which can impose constraints on critical training hyperparameters such as patch size and batch size. Smaller patch sizes, often necessitated by memory limitations, may restrict the spatial context available to the model during training. This reduced context can be particularly detrimental in tasks where long-range spatial dependencies are crucial for accurate segmentation.

Furthermore, the nature of the segmentation task itself may favor 2D approaches. If the key features for accurate segmentation are predominantly visible within individual slices, the additional complexity introduced by 3D models in capturing inter-slice relationships may not provide significant benefits. In fact, this added complexity could potentially introduce noise or irrelevant information into the learning process, leading to suboptimal performance.

5.4 Effect of the configurations of nnU-Net

When utilizing nnU-Net, the selection of loss functions, optimizers, batch sizes, and patch sizes is tailored to the specific characteristics of the dataset. In our case, all the nnU-Nets employ a combination of Dice loss and cross-entropy loss (DiceCE loss) as their default loss function. However, in scenarios with class imbalance, alternative loss functions such as DiceHD loss (combining Dice loss with Hausdorff Distance loss) and DiceFocal loss (combining Dice loss with focal loss) have demonstrated superior performance (Ma et al., 2021). Therefore, incorporating these loss functions into nnU-Net could potentially enhance segmentation results.
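
For reference, a compound Dice and cross-entropy loss can be written in a few lines. The NumPy sketch below is a simplified, single-image formulation with equal weighting of the two terms; these choices, the smoothing constant, and the function name are assumptions for illustration, and the code is not nnU-Net's actual implementation.

```python
import numpy as np


def dice_ce_loss(probs: np.ndarray, target: np.ndarray, eps: float = 1e-5) -> float:
    """Soft Dice loss plus cross-entropy for one image.

    probs:  (C, N) softmax probabilities; each column sums to 1.
    target: (N,) integer class labels in [0, C).
    """
    num_classes = probs.shape[0]
    one_hot = np.eye(num_classes)[target].T                    # (C, N)
    intersection = (probs * one_hot).sum(axis=1)
    denominator = probs.sum(axis=1) + one_hot.sum(axis=1)
    dice_per_class = (2.0 * intersection + eps) / (denominator + eps)
    dice_loss = 1.0 - dice_per_class.mean()
    # Cross-entropy: negative log-probability of the true class at each pixel.
    true_class_probs = probs[target, np.arange(target.size)]
    ce_loss = -np.log(true_class_probs + 1e-12).mean()
    return float(dice_loss + ce_loss)


target = np.array([0, 0, 1, 1])
perfect = np.eye(2)[target].T          # one-hot prediction matching the target
uniform = np.full((2, 4), 0.5)         # maximally uncertain prediction

print(dice_ce_loss(perfect, target), dice_ce_loss(uniform, target))
```

The Dice term is computed per class over the whole image, so it is less dominated by the background than the pixel-wise cross-entropy term. The alternatives mentioned above replace the cross-entropy term with a boundary-aware or focal term.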

Furthermore, nnU-Net traditionally utilizes the SGD optimizer. Nonetheless, recent studies have shown that the Adam optimizer can achieve comparable, if not superior, outcomes in segmentation tasks (Rajinikanth et al., 2022). Consequently, integrating the Adam optimizer into nnU-Net’s framework may lead to improved performance in certain cases.

6 Conclusion

In this study, we evaluated five datasets related to cardiac MRI segmentation using various nnU-Net configurations. Through extensive experimentation over more than 130 training cycles, we conducted a comprehensive performance analysis of these models. Our comparative study against existing methods demonstrated that nnU-Net not only performs competitively but also frequently surpasses current state-of-the-art techniques, including the latest methods on some datasets.

Our findings underscore the robustness and adaptability of nnU-Net for cardiac MRI segmentation tasks. The model’s consistent performance across different datasets highlights its potential as a reliable tool for clinical applications. However, this study also raises an important question: when is it necessary to develop new models specifically tailored for particular cardiac segmentation tasks?

The answer lies in the intricacies and demands of specific scenarios. While nnU-Net provides a strong baseline, certain cases may present unique challenges that require bespoke solutions. For example, in the segmentation of complex anatomical structures such as scars, we observed that nnU-Net faces limitations. In such cases, developing specialized models proved to be beneficial. Additionally, some analyses require integrating information from multiple imaging modalities (e.g., combining MRI with CT scans). To effectively merge and interpret such data, specialized models might be necessary.

Our study focused exclusively on cardiac-related datasets and a single imaging modality. Future research should expand to other anatomical regions, such as the brain and abdomen, and incorporate additional imaging modalities such as CT, X-ray, and ultrasound. This would not only validate the generalizability of nnU-Net but also identify any potential limitations and areas for improvement.

In conclusion, while the nnU-Net framework provides a robust and versatile foundation for cardiac MRI segmentation, the development of new models tailored to specific clinical needs and challenges remains essential. Our work demonstrates that while general-purpose models like nnU-Net offer significant advantages, there is still a critical need for ongoing innovation and customization to address the unique complexities of different medical imaging tasks. Future research should continue to explore and develop these specialized approaches to fully harness the potential of deep learning in medical imaging.

References

Akkaya et al. (2013) Mehmet Akkaya, Koji Higuchi, Matthias Koopmann, Nathan Burgon, Ercan Erdogan, Kavitha Damal, Eugene Kholmovski, Chris McGann, and Nassir F Marrouche. Relationship between left atrial tissue structural remodelling detected using late gadolinium enhancement mri and left ventricular hypertrophy in patients with atrial fibrillation. Europace, 15(12):1725–1732, 2013.
Al Khalil et al. (2021) Yasmina Al Khalil, Sina Amirrajab, Josien Pluim, and Marcel Breeuwer. Late fusion u-net with gan-based augmentation for generalizable cardiac mri segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 360–373. Springer, 2021.
Allessie et al. (2002) Maurits Allessie, Jannie Ausma, and Ulrich Schotten. Electrical, contractile and structural remodeling during atrial fibrillation. Cardiovascular research, 54(2):230–246, 2002.
Ammar et al. (2021) Abderazzak Ammar, Omar Bouattane, and Mohamed Youssfi. Automatic cardiac cine mri segmentation and heart disease classification. Computerized Medical Imaging and Graphics, 88:101864, 2021.
Arega et al. (2021) Tewodros Weldebirhan Arega, François Legrand, Stéphanie Bricq, and Fabrice Meriaudeau. Using mri-specific data augmentation to enhance the segmentation of right ventricle in multi-disease, multi-center and multi-view cardiac mri. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 250–258. Springer, 2021.
Arega et al. (2022) Tewodros Weldebirhan Arega, Stéphanie Bricq, and Fabrice Meriaudeau. Using polynomial loss and uncertainty information for robust left atrial and scar quantification and segmentation. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 133–144. Springer, 2022.
Baumgartner et al. (2018) Christian F Baumgartner, Lisa M Koch, Marc Pollefeys, and Ender Konukoglu. An exploration of 2d and 3d deep learning techniques for cardiac mr image segmentation. In Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges: 8th International Workshop, STACOM 2017, Held in Conjunction with MICCAI 2017, Quebec City, Canada, September 10-14, 2017, Revised Selected Papers 8, pages 111–119. Springer, 2018.
Beetz et al. (2021) Marcel Beetz, Jorge Corral Acero, and Vicente Grau. A multi-view crossover attention u-net cascade with fourier domain adaptation for multi-domain cardiac mri segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 323–334. Springer, 2021.
Berihu Girum et al. (2021) Kibrom Berihu Girum, Gilles Créhange, and Alain Lalande. Learning with context feedback loop for robust medical image segmentation. arXiv e-prints, pages arXiv–2103, 2021.
Bernard et al. (2018) Olivier Bernard, Alain Lalande, Clement Zotti, Frederick Cervenansky, Xin Yang, Pheng-Ann Heng, Irem Cetin, Karim Lekadir, Oscar Camara, Miguel Angel Gonzalez Ballester, et al. Deep learning techniques for automatic mri cardiac multi-structures segmentation and diagnosis: is the problem solved? IEEE transactions on medical imaging, 37(11):2514–2525, 2018.
Bian et al. (2018) Cheng Bian, Xin Yang, Jianqiang Ma, Shen Zheng, Yu-An Liu, Reza Nezafat, Pheng-Ann Heng, and Yefeng Zheng. Pyramid network with online hard example mining for accurate left atrium segmentation. In international workshop on statistical atlases and computational models of the heart, pages 237–245. Springer, 2018.
Bisbal et al. (2014) Felipe Bisbal, Esther Guiu, Pilar Cabanas-Grandío, Antonio Berruezo, Susana Prat-Gonzalez, Bárbara Vidal, Cesar Garrido, David Andreu, Juan Fernandez-Armenta, Jose María Tolosana, et al. Cmr-guided approach to localize and ablate gaps in repeat af ablation procedure. JACC: Cardiovascular Imaging, 7(7):653–663, 2014.
Boldt et al. (2004) Andreas Boldt, Ulrike Wetzel, Joerg Lauschke, J Weigl, Jf Gummert, Gerd Hindricks, Hans Kottkamp, and Stefan Dhein. Fibrosis in left atrial tissue of patients with atrial fibrillation with and without underlying mitral valve disease. Heart, 90(4):400–405, 2004.
Borra et al. (2019) Davide Borra, Alessandro Masci, Lorena Esposito, Alice Andalò, Claudio Fabbri, and Cristiana Corsi. A semantic-wise convolutional neural network approach for 3-d left atrium segmentation from late gadolinium enhanced magnetic resonance imaging. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 329–338. Springer, 2019.
Calisto and Lai-Yuen (2020) Maria Baldeon Calisto and Susana K Lai-Yuen. Adaen-net: An ensemble of adaptive 2d–3d fully convolutional networks for medical image segmentation. Neural Networks, 126:76–94, 2020.
Campello et al. (2021) Victor M Campello, Polyxeni Gkontra, Cristian Izquierdo, Carlos Martin-Isla, Alireza Sojoudi, Peter M Full, Klaus Maier-Hein, Yao Zhang, Zhiqiang He, Jun Ma, et al. Multi-centre, multi-vendor and multi-disease cardiac segmentation: the m&ms challenge. IEEE Transactions on Medical Imaging, 40(12):3543–3554, 2021.
Carscadden et al. (2021) Adam Carscadden, Michelle Noga, and Kumaradevan Punithakumar. A deep convolutional neural network approach for the segmentation of cardiac structures from mri sequences. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 250–258. Springer, 2021.
Chen et al. (2019) Chen Chen, Wenjia Bai, and Daniel Rueckert. Multi-task learning for left atrial segmentation on ge-mri. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 292–301. Springer, 2019.
Chen et al. (2021) Jun Chen, Guang Yang, Habib Khan, Heye Zhang, Yanping Zhang, Shu Zhao, Raad Mohiaddin, Tom Wong, David Firmin, and Jennifer Keegan. Jas-gan: generative adversarial network based joint atrium and scar segmentations on unbalanced atrial targets. IEEE Journal of Biomedical and Health Informatics, 26(1):103–114, 2021.
Chen et al. (2022a) Shaolong Chen, Changzhen Qiu, Weiping Yang, and Zhiyong Zhang. Combining edge guidance and feature pyramid for medical image segmentation. Biomedical signal processing and control, 78:103960, 2022a.
Chen et al. (2022b) Shaolong Chen, Changzhen Qiu, Weiping Yang, and Zhiyong Zhang. Multiresolution aggregation transformer unet based on multiscale input and coordinate attention for medical image segmentation. Sensors, 22(10):3820, 2022b.
Chen et al. (2023) Shaolong Chen, Lijie Zhong, Changzhen Qiu, Zhiyong Zhang, and Xiaodong Zhang. Transformer-based multilevel region and edge aggregation network for magnetic resonance image segmentation. Computers in Biology and Medicine, 152:106427, 2023.
Corral Acero et al. (2021) Jorge Corral Acero, Vaanathi Sundaresan, Nicola Dinsdale, Vicente Grau, and Mark Jenkinson. A 2-step deep learning method with domain adaptation for multi-centre, multi-vendor and multi-disease cardiac magnetic resonance segmentation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 196–207. Springer, 2021.
Full et al. (2021) Peter M Full, Fabian Isensee, Paul F Jäger, and Klaus Maier-Hein. Studying robustness of semantic segmentation under domain shift in cardiac mri. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 238–249. Springer, 2021.
Fulton et al. (2021) Mitchell J Fulton, Christoffer R Heckman, and Mark E Rentschler. Deformable bayesian convolutional networks for disease-robust cardiac mri segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 296–305. Springer, 2021.
Galati and Zuluaga (2021) Francesco Galati and Maria A Zuluaga. Using out-of-distribution detection for model refinement in cardiac image segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 374–382. Springer, 2021.
Galazis et al. (2021) Christoforos Galazis, Huiyi Wu, Zhuoyu Li, Camille Petri, Anil A Bharath, and Marta Varela. Tempera: Spatial transformer feature pyramid network for cardiac mri segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 268–276. Springer, 2021.
Gao and Zhuang (2022) Zheyao Gao and Xiahai Zhuang. Consistency based co-segmentation for multi-view cardiac mri using vision transformer. In Statistical Atlases and Computational Models of the Heart. Multi-Disease, Multi-View, and Multi-Center Right Ventricular Segmentation in Cardiac MRI Challenge: 12th International Workshop, STACOM 2021, Held in Conjunction with MICCAI 2021, Strasbourg, France, September 27, 2021, Revised Selected Papers 12, pages 306–314. Springer, 2022.
Guo et al. (2021) Fumin Guo, Matthew Ng, Idan Roifman, and Graham Wright. Cardiac mri left ventricular segmentation and function quantification using pre-trained neural networks. In International Conference on Functional Imaging and Modeling of the Heart, pages 46–54. Springer, 2021.
Huang et al. (2021) Xiaoqiong Huang, Zejian Chen, Xin Yang, Zhendong Liu, Yuxin Zou, Mingyuan Luo, Wufeng Xue, and Dong Ni. Style-invariant cardiac image segmentation with test-time augmentation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 305–315. Springer, 2021.
Isensee et al. (2018) Fabian Isensee, Paul F Jaeger, Peter M Full, Ivo Wolf, Sandy Engelhardt, and Klaus H Maier-Hein. Automatic cardiac disease assessment on cine-mri via time-series segmentation and domain specific features. In Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges: 8th International Workshop, STACOM 2017, Held in Conjunction with MICCAI 2017, Quebec City, Canada, September 10-14, 2017, Revised Selected Papers 8, pages 120–129. Springer, 2018.
Isensee et al. (2021) Fabian Isensee, Paul F Jaeger, Simon AA Kohl, Jens Petersen, and Klaus H Maier-Hein. nnu-net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2):203–211, 2021.
Isensee et al. (2024) Fabian Isensee, Tassilo Wald, Constantin Ulrich, Michael Baumgartner, Saikat Roy, Klaus Maier-Hein, and Paul F Jaeger. nnu-net revisited: A call for rigorous validation in 3d medical image segmentation. arXiv preprint arXiv:2404.09556, 2024.
Jabbar et al. (2021) Sana Jabbar, Syed Talha Bukhari, and Hassan Mohy-ud Din. Multi-view sa-la net: A framework for simultaneous segmentation of rv on multi-view cardiac mr images. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 277–286. Springer, 2021.
Jiang et al. (2022) Lei Jiang, Yan Li, Yifan Wang, Hengfei Cui, Yong Xia, and Yanning Zhang. Deep u-net architecture with curriculum learning for left atrial segmentation. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 115–123. Springer, 2022.
Khader et al. (2021) Firas Khader, Justus Schock, Daniel Truhn, Fabian Morsbach, and Christoph Haarburger. Adaptive preprocessing for generalization in cardiac mr image segmentation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 269–276. Springer, 2021.
Khan et al. (2022) Abbas Khan, Omnia Alwazzan, Martin Benning, and Greg Slabaugh. Sequential segmentation of the left atrium and atrial scars using a multi-scale weight sharing network and boundary-based processing. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 69–82. Springer, 2022.
Khened et al. (2018) Mahendra Khened, Varghese Alex, and Ganapathy Krishnamurthi. Densely connected fully convolutional network for short-axis cardiac cine mr image segmentation and heart diagnosis using random forest. In Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges: 8th International Workshop, STACOM 2017, Held in Conjunction with MICCAI 2017, Quebec City, Canada, September 10-14, 2017, Revised Selected Papers 8, pages 140–151. Springer, 2018.
Kong and Shadden (2021) Fanwei Kong and Shawn C Shadden. A generalizable deep-learning approach for cardiac magnetic resonance image segmentation using image augmentation and attention u-net. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 287–296. Springer, 2021.
Lefebvre et al. (2022) Arthur L Lefebvre, Carolyna AP Yamamoto, Julie K Shade, Ryan P Bradley, Rebecca A Yu, Rheeda L Ali, Dan M Popescu, Adityo Prakosa, Eugene G Kholmovski, and Natalia A Trayanova. Lassnet: A four steps deep neural network for left atrial segmentation and scar quantification. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 1–15. Springer, 2022.
Li et al. (2019) Caizi Li, Qianqian Tong, Xiangyun Liao, Weixin Si, Yinzi Sun, Qiong Wang, and Pheng-Ann Heng. Attention based hierarchical aggregation network for 3d left atrial segmentation. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 255–264. Springer, 2019.
Li and Li (2022) Feiyan Li and Weisheng Li. Cross-domain segmentation of left atrium based on multi-scale decision level fusion. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 124–132. Springer, 2022.
Li et al. (2023) Feiyan Li, Weisheng Li, Xinbo Gao, Rui Liu, and Bin Xiao. Comprehensive information integration network for left atrium segmentation on lge cmr images. Biomedical Signal Processing and Control, 81:104537, 2023.
Li et al. (2021a) Hongwei Li, Jianguo Zhang, and Bjoern Menze. Generalisable cardiac structure segmentation via attentional and stacked image adaptation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 297–304. Springer, 2021a.
Li et al. (2021b) Lei Li, Wangbin Ding, Liqin Huang, and Xiahai Zhuang. Right ventricular segmentation from short-and long-axis mris via information transition. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 259–267. Springer, 2021b.
Li et al. (2021c) Lei Li, Veronika A Zimmer, Wangbin Ding, Fuping Wu, Liqin Huang, Julia A Schnabel, and Xiahai Zhuang. Random style transfer based domain generalization networks integrating shape and spatial information. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 208–218. Springer, 2021c.
Liu et al. (2021a) Di Liu, Zhennan Yan, Qi Chang, Leon Axel, and Dimitris N Metaxas. Refined deep layer aggregation for multi-disease, multi-view & multi-center cardiac mr segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 315–322. Springer, 2021a.
Liu et al. (2022a) Tianyi Liu, Size Hou, Jiayuan Zhu, Zilong Zhao, and Haochuan Jiang. Ugformer for robust left atrium and scar segmentation across scanners. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 36–48. Springer, 2022a.
Liu et al. (2021b) Xiao Liu, Spyridon Thermos, Agisilaos Chartsias, Alison O’Neil, and Sotirios A Tsaftaris. Disentangled representations for domain-generalized cardiac segmentation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 187–195. Springer, 2021b.
Liu et al. (2019) Yashu Liu, Yangyang Dai, Cong Yan, and Kuanquan Wang. Deep learning based method for left atrial segmentation in ge-mri. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 311–318. Springer, 2019.
Liu et al. (2022b) Yashu Liu, Wei Wang, Gongning Luo, Kuanquan Wang, Dong Liang, and Shuo Li. Uncertainty-guided symmetric multilevel supervision network for 3d left atrium segmentation in late gadolinium-enhanced mri. Medical Physics, 49(7):4554–4565, 2022b.
Lourenço et al. (2021) Ana Lourenço, Eric Kerfoot, Connor Dibblin, Ebraham Alskaf, Mustafa Anjari, Anil A Bharath, Andrew P King, Henry Chubb, Teresa M Correia, and Marta Varela. Left atrial ejection fraction estimation using seganet for fully automated segmentation of cine mri. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 137–145. Springer, 2021.
Ma (2021) Jun Ma. Histogram matching augmentation for domain adaptation with application to multi-centre, multi-vendor and multi-disease cardiac image segmentation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 177–186. Springer, 2021.
Ma et al. (2021) Jun Ma, Jianan Chen, Matthew Ng, Rui Huang, Yu Li, Chen Li, Xiaoping Yang, and Anne L Martel. Loss odyssey in medical image segmentation. Medical Image Analysis, 71:102035, 2021.
Martín-Isla et al. (2023) Carlos Martín-Isla, Víctor M Campello, Cristian Izquierdo, Kaisar Kushibar, Carla Sendra-Balcells, Polyxeni Gkontra, Alireza Sojoudi, Mitchell J Fulton, Tewodros Weldebirhan Arega, Kumaradevan Punithakumar, et al. Deep learning segmentation of the right ventricle in cardiac mri: The m&ms challenge. IEEE Journal of Biomedical and Health Informatics, 27(7):3302–3313, 2023.
Mazher et al. (2021) Moona Mazher, Abdul Qayyum, Abdesslam Benzinou, Mohamed Abdel-Nasser, and Domenec Puig. Multi-disease, multi-view and multi-center right ventricular segmentation in cardiac mri using efficient late-ensemble deep learning approach. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 335–343. Springer, 2021.
Mazher et al. (2022) Moona Mazher, Abdul Qayyum, Mohamed Abdel-Nasser, and Domenec Puig. Automatic semi-supervised left atrial segmentation using deep-supervision 3dresunet with pseudo labeling approach for lascarqs 2022 challenge. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 153–161. Springer, 2022.
Milletari et al. (2016) Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In 2016 fourth international conference on 3D vision (3DV), pages 565–571. IEEE, 2016.
Painchaud et al. (2020) Nathan Painchaud, Youssef Skandarani, Thierry Judge, Olivier Bernard, Alain Lalande, and Pierre-Marc Jodoin. Cardiac segmentation with strong anatomical guarantees. IEEE transactions on medical imaging, 39(11):3703–3713, 2020.
Parreño et al. (2021) Mario Parreño, Roberto Paredes, and Alberto Albiol. Deidentifying mri data domain by iterative backpropagation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 277–286. Springer, 2021.
Peters et al. (2007) Dana C Peters, John V Wylie, Thomas H Hauser, Kraig V Kissinger, René M Botnar, Vidal Essebag, Mark E Josephson, and Warren J Manning. Detection of pulmonary vein and left atrial scar after catheter ablation with three-dimensional navigator-gated delayed enhancement mr imaging: initial experience. Radiology, 243(3):690–695, 2007.
Punithakumar and Noga (2022) Kumaradevan Punithakumar and Michelle Noga. Automated segmentation of the left atrium and scar using deep convolutional neural networks. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 145–152. Springer, 2022.
Punithakumar et al. (2021) Kumaradevan Punithakumar, Adam Carscadden, and Michelle Noga. Automated segmentation of the right ventricle from magnetic resonance imaging using deep convolutional neural networks. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 344–351. Springer, 2021.
Puybareau et al. (2018) Élodie Puybareau, Zhou Zhao, Younes Khoudli, Edwin Carlinet, Yongchao Xu, Jérôme Lacotte, and Thierry Géraud. Left atrial segmentation in a few seconds using fully convolutional network and transfer learning. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 339–347. Springer, 2018.
Qi et al. (2023) Yushi Qi, Chunhu Hu, Liling Zuo, Bo Yang, and Youlong Lv. Cardiac magnetic resonance image segmentation method based on multi-scale feature fusion and sequence relationship learning. Sensors, 23(2):690, 2023.
Queirós (2021) Sandro Queirós. Right ventricular segmentation in multi-view cardiac mri using a unified u-net model. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 287–295. Springer, 2021.
Rajinikanth et al. (2022) Venkatesan Rajinikanth, Seifedine Kadry, Robertas Damaševičius, D Sankaran, Mazin Abed Mohammed, and Shrinithi Chander. Skin melanoma segmentation using vgg-unet with adam/sgd optimizer: a study. In 2022 Third International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT), pages 982–986. IEEE, 2022.
Saber et al. (2021) Mina Saber, Dina Abdelrauof, and Mustafa Elattar. Multi-center, multi-vendor, and multi-disease cardiac image segmentation using scale-independent multi-gate unet. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 259–268. Springer, 2021.
Scannell et al. (2021) Cian M Scannell, Amedeo Chiribiri, and Mitko Veta. Domain-adversarial learning for multi-centre, multi-vendor, and multi-disease cardiac mr image segmentation. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 228–237. Springer, 2021.
Shi et al. (2024) Zhebin Shi, Mingfeng Jiang, Yang Li, Bo Wei, Zefeng Wang, Yongquan Wu, Tao Tan, and Guang Yang. Mlc: Multi-level consistency learning for semi-supervised left atrium segmentation. Expert Systems with Applications, 244:122903, 2024.
Simantiris and Tziritas (2020) Georgios Simantiris and Georgios Tziritas. Cardiac mri segmentation with a dilated cnn incorporating domain-specific constraints. IEEE Journal of Selected Topics in Signal Processing, 14(6):1235–1243, 2020.
Singh et al. (2023a) Kamal Raj Singh, Ambalika Sharma, and Girish Kumar Singh. Attention-guided residual w-net for supervised cardiac magnetic resonance imaging segmentation. Biomedical Signal Processing and Control, 86:105177, 2023a.
Singh et al. (2023b) Kamal Raj Singh, Ambalika Sharma, and Girish Kumar Singh. Madru-net: Multi-scale attention-based cardiac mri segmentation using deep residual u-net. IEEE Transactions on Instrumentation and Measurement, 2023b.
Singh et al. (2023c) Kamal Raj Singh, Ambalika Sharma, and Girish Kumar Singh. W-net: Novel deep supervision for deep learning-based cardiac magnetic resonance imaging segmentation. IETE Journal of Research, 69(12):8960–8976, 2023c.
Stalling et al. (2005) Detlev Stalling, Malte Westerhoff, Hans-Christian Hege, et al. Amira: A highly interactive system for visual data analysis. The visualization handbook, 38:749–767, 2005.
Sun et al. (2022) Xiaowu Sun, Li-Hsin Cheng, and Rob J van der Geest. Right ventricle segmentation via registration and multi-input modalities in cardiac magnetic resonance imaging from multi-disease, multi-view and multi-center. In Statistical Atlases and Computational Models of the Heart. Multi-Disease, Multi-View, and Multi-Center Right Ventricular Segmentation in Cardiac MRI Challenge: 12th International Workshop, STACOM 2021, Held in Conjunction with MICCAI 2021, Strasbourg, France, September 27, 2021, Revised Selected Papers 12, pages 241–249. Springer, 2022.
Tautz et al. (2021) Lennart Tautz, Lars Walczak, Chiara Manini, Anja Hennemuth, and Markus Hüllebrand. 3d right ventricle reconstruction from 2d u-net segmentation of sparse short-axis and 4-chamber cardiac cine mri views. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 352–359. Springer, 2021.
Tobon-Gomez et al. (2015) Catalina Tobon-Gomez, Arjan J Geers, Jochen Peters, Jürgen Weese, Karen Pinto, Rashed Karim, Mohammed Ammar, Abdelaziz Daoudi, Jan Margeta, Zulma Sandoval, et al. Benchmark for algorithms segmenting the left atrium from 3d ct and mri datasets. IEEE transactions on medical imaging, 34(7):1460–1473, 2015.
Tsao et al. (2023) Connie W Tsao, Aaron W Aday, Zaid I Almarzooq, Cheryl AM Anderson, Pankaj Arora, Christy L Avery, Carissa M Baker-Smith, Andrea Z Beaton, Amelia K Boehme, Alfred E Buxton, et al. Heart disease and stroke statistics—2023 update: a report from the american heart association. Circulation, 147(8):e93–e621, 2023.
Tu et al. (2022) Can Tu, Ziyan Huang, Zhongying Deng, Yuncheng Yang, Chenglong Ma, Junjun He, Jin Ye, Haoyu Wang, and Xiaowei Ding. Self pre-training with single-scale adapter for left atrial segmentation. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 24–35. Springer, 2022.
Ukwatta et al. (2015) Eranga Ukwatta, Hermenegild Arevalo, Martin Rajchl, James White, Farhad Pashakhanloo, Adityo Prakosa, Daniel A Herzka, Elliot McVeigh, Albert C Lardo, Natalia A Trayanova, et al. Image-based reconstruction of three-dimensional myocardial infarct geometry for patient-specific modeling of cardiac electrophysiology. Medical physics, 42(8):4579–4590, 2015.
Upendra et al. (2021) Roshan Reddy Upendra, Richard Simon, and Cristian A Linte. A deep learning framework for image super-resolution for late gadolinium enhanced cardiac mri. In 2021 Computing in Cardiology (CinC), volume 48, pages 1–4. IEEE, 2021.
Uslu et al. (2021) Fatmatülzehra Uslu, Marta Varela, Georgia Boniface, Thakshayene Mahenthran, Henry Chubb, and Anil A Bharath. La-net: A multi-task deep network for the segmentation of the left atrium. IEEE transactions on medical imaging, 41(2):456–464, 2021.
Vergara and Marrouche (2011) Gaston R Vergara and Nassir F Marrouche. Tailored management of atrial fibrillation using a lge-mri based model: from the clinic to the electrophysiology laboratory. Journal of cardiovascular electrophysiology, 22(4):481–487, 2011.
Vesal et al. (2019) Sulaiman Vesal, Nishant Ravikumar, and Andreas Maier. Dilated convolutions in neural networks for left atrial segmentation in 3d gadolinium enhanced-mri. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 319–328. Springer, 2019.
Wolterink et al. (2018) Jelmer M Wolterink, Tim Leiner, Max A Viergever, and Ivana Išgum. Automatic segmentation and disease classification using cardiac cine mr images. In Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges: 8th International Workshop, STACOM 2017, Held in Conjunction with MICCAI 2017, Quebec City, Canada, September 10-14, 2017, Revised Selected Papers 8, pages 101–110. Springer, 2018.
Xia et al. (2019) Qing Xia, Yuxin Yao, Zhiqiang Hu, and Aimin Hao. Automatic 3d atrial segmentation from ge-mris using volumetric fully convolutional networks. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 211–220. Springer, 2019.
Xie et al. (2022) Tongtong Xie, Zhengeng Yang, and Hongshan Yu. La-hrnet: High-resolution network for automatic left atrial segmentation in multi-center lge mri. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 83–92. Springer, 2022.
Xiong et al. (2021) Zhaohan Xiong, Qing Xia, Zhiqiang Hu, Ning Huang, Cheng Bian, Yefeng Zheng, Sulaiman Vesal, Nishant Ravikumar, Andreas Maier, Xin Yang, et al. A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging. Medical image analysis, 67:101832, 2021.
Xu et al. (2024) Fangqiang Xu, Wenxuan Tu, Fan Feng, Malitha Gunawardhana, Jiayuan Yang, Yun Gu, and Jichao Zhao. Dynamic position transformation and boundary refinement network for left atrial segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2024.
Yang et al. (2019) Xin Yang, Na Wang, Yi Wang, Xu Wang, Reza Nezafat, Dong Ni, and Pheng-Ann Heng. Combating uncertainty with novel losses for automatic left atrium segmentation. In Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges: 9th International Workshop, STACOM 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers 9, pages 246–254. Springer, 2019.
Yushkevich et al. (2016) Paul A Yushkevich, Yang Gao, and Guido Gerig. Itk-snap: An interactive tool for semi-automatic segmentation of multi-modality biomedical images. In 2016 38th annual international conference of the IEEE engineering in medicine and biology society (EMBC), pages 3342–3345. IEEE, 2016.
Zhang et al. (2022a) Xuru Zhang, Xinye Yang, Lihua Huang, and Liqin Huang. Two stage of histogram matching augmentation for domain generalization: Application to left atrial segmentation. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 60–68. Springer, 2022a.
Zhang et al. (2021) Yao Zhang, Jiawei Yang, Feng Hou, Yang Liu, Yixin Wang, Jiang Tian, Cheng Zhong, Yang Zhang, and Zhiqiang He. Semi-supervised cardiac image segmentation via label propagation and style transfer. In Statistical Atlases and Computational Models of the Heart. M&Ms and EMIDEC Challenges: 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers 11, pages 219–227. Springer, 2021.
Zhang et al. (2022b) Yuchen Zhang, Yanda Meng, and Yalin Zheng. Automatically segment the left atrium and scars from lge-mris using a boundary-focused nnu-net. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 49–59. Springer, 2022b.
Zhao et al. (2023) Chenji Zhao, Shun Xiang, Yuanquan Wang, Zhaoxi Cai, Jun Shen, Shoujun Zhou, Di Zhao, Weihua Su, Shijie Guo, and Shuo Li. Context-aware network fusing transformer and v-net for semi-supervised segmentation of 3d left atrium. Expert Systems with Applications, 214:119105, 2023.
Zhao et al. (2021) Zhou Zhao, Élodie Puybareau, Nicolas Boutry, and Thierry Géraud. Do not treat boundaries and regions differently: An example on heart left atrial segmentation. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 7447–7453. IEEE, 2021.
Zhou et al. (2022) Siping Zhou, Kai-Ni Wang, and Guang-Quan Zhou. Edge-enhanced feature guided joint segmentation of left atrial and scars in lge mri images. In Challenge on Left Atrial and Scar Quantification and Segmentation, pages 93–105. Springer, 2022.
Zhuang et al. (2023) Xiahai Zhuang, Lei Li, Sihan Wang, and Fuping Wu. Left Atrial and Scar Quantification and Segmentation: First Challenge, LAScarQS 2022, Held in Conjunction with MICCAI 2022, Singapore, September 18, 2022, Proceedings, volume 13586. Springer Nature, 2023.
Zotti et al. (2018a) Clément Zotti, Zhiming Luo, Olivier Humbert, Alain Lalande, and Pierre-Marc Jodoin. Gridnet with automatic shape prior registration for automatic mri cardiac segmentation. In Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges: 8th International Workshop, STACOM 2017, Held in Conjunction with MICCAI 2017, Quebec City, Canada, September 10-14, 2017, Revised Selected Papers 8, pages 73–81. Springer, 2018a.
Zotti et al. (2018b) Clément Zotti, Zhiming Luo, Alain Lalande, and Pierre-Marc Jodoin. Convolutional neural network with shape prior applied to cardiac mri segmentation. IEEE journal of biomedical and health informatics, 23(3):1119–1128, 2018b.