Unraveling Radiomics Complexity: Strategies for Optimal Simplicity in Predictive Modeling

Mahdi Ait Lhaj Loutfi¹, Teodora Boblea Podasca², Alex Zwanenburg³,⁴,⁵,⁶, Taman Upadhaya⁷, Jorge Barrios Ginart⁸, David R Raleigh⁹, William C. Chen⁸, Dante P.I. Capaldi⁸, Hong Zheng¹⁰, Olivier Gevaert¹¹, Jing Wu¹², Alvin C. Silva¹³, Paul J. Zhang¹⁴, Harrison X. Bai¹⁵, Jan Seuntjens¹⁶, Steffen Löck³, Patrick O. Richard¹⁷, Olivier Morin⁸, Caroline Reinhold¹⁸, Martin Lepage¹⁹,²⁰, Martin Vallières¹,²¹,∗

¹Department of Computer Science, Université de Sherbrooke, Sherbrooke, QC, Canada
²Department of Surgery, Service of Urology, Université de Sherbrooke, Sherbrooke, QC, Canada
³OncoRay – National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Germany
⁴National Center for Tumor Diseases Dresden (NCT/UCC), Germany: German Cancer Research Center (DKFZ), Heidelberg, Germany
⁵Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
⁶Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
⁷Department of Radiation Oncology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
⁸Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
⁹Departments of Radiation Oncology, Neurological Surgery, and Pathology, University of California San Francisco, San Francisco, CA, USA
¹⁰Center for Biomedical Informatics Research, School of Medicine, Stanford University, CA 94305, USA
¹¹Department of Medicine, and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
¹²Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, 410011, Hunan, China
¹³Department of Radiology, Mayo Clinic Arizona, 13400 E Shea Blvd., Scottsdale, AZ 85259, USA
¹⁴Department of Pathology and Clinical Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
¹⁵Department of Radiology and Radiological Sciences, Johns Hopkins, Baltimore, MD, USA
¹⁶Princess Margaret Cancer Centre, University Health Network & Departments of Radiation Oncology & Medical Biophysics, University of Toronto, Toronto, ON, Canada.
¹⁷Division of Urology, Centre Hospitalier Universitaire de Sherbrooke; Université de Sherbrooke Cancer Research Institute, Sherbrooke, QC, Canada
¹⁸Department of Radiology, McGill University Health Center, Director and Co-founder, Augmented Intelligence Precision Health Laboratory (AIPHL) of the Research Institute of the McGill University Health Center, Montreal, QC, Canada
¹⁹Département de médecine nucléaire et radiobiologie, Université de Sherbrooke, Sherbrooke, QC, Canada
²⁰Centre d’imagerie moléculaire de Sherbrooke, Université de Sherbrooke, Sherbrooke, QC, Canada
²¹Centre de recherche du Centre hospitalier universitaire de Sherbrooke, Université de Sherbrooke, Sherbrooke, QC, Canada.
∗Corresponding author: martin.vallieres@usherbrooke.ca

Abstract

Background: The high dimensionality of radiomic feature sets, the variability in radiomic feature types and potentially high computational requirements all underscore the need for an effective method to identify the smallest set of predictive features for a given clinical problem.

Purpose: To establish a methodology and provide tools for identifying and explaining the smallest set of predictive radiomic features.

Materials and Methods: Radiomic features (a total of 89,714) were extracted from five distinct datasets with different cancer types: low-grade glioma, meningioma, non-small cell lung cancer (NSCLC), and two renal cell carcinoma cohorts (n=2104). These features were categorized into complexity levels, defined by the number of computational steps required for their computation, encompassing morphological, intensity, texture, linear filter-based, and nonlinear filter-based features. For every dataset, models were trained on each complexity level to classify clinical outcomes, and their performance was evaluated using the area under the curve (AUC). The optimal complexity level and the associated most informative features were identified using systematic statistical significance analyses and a false discovery avoidance procedure, respectively. Their predictive importance was explained using a novel tree-based method.

Results: MEDimage, a new open-source tool, was designed and implemented to streamline radiomic studies through both code-based and graphical-based approaches, and was applied using our proposed methodology to analyze the datasets. Morphological features were found to be optimal in two cases: for MRI-based meningioma (AUC: 0.65; sensitivity: 64%; specificity: 62%; 95% CI: 0.59, 0.72) and MRI-based low-grade glioma (AUC: 0.68; sensitivity: 68%; specificity: 69%; 95% CI: 0.60, 0.75). Additionally, intensity features were optimal in two instances: for contrast-enhanced CT (CECT)-based renal cell carcinoma (AUC: 0.82; sensitivity: 77%; specificity: 78%; 95% CI: 0.76, 0.88) and CT-based NSCLC (AUC: 0.76; sensitivity: 73%; specificity: 71%; 95% CI: 0.71, 0.80). Texture features were identified as optimal for MRI-based renal cell carcinoma (AUC: 0.72; sensitivity: 71%; specificity: 65%; 95% CI: 0.68, 0.77). Notably, in CECT-based renal cell carcinoma, the tuning of the Hounsfield unit range, which directly affects intensity-based features, led to improved results (AUC: 0.86).

Conclusion: Our proposed methodology and software can estimate the optimal radiomics complexity level for specific medical outcomes, potentially simplifying the use of radiomics in predictive modeling across various contexts.

1 Introduction

Medical imaging is a cornerstone of personalized medicine, providing a non-invasive window into the unique phenotypic characteristics of volumes of interest. Radiomics is defined as the high-throughput extraction of quantitative features from images to enable characterization of tissues [1]. Such features extend our ability to discern subtle nuances in medical imaging characteristics, providing data for predictive modeling approaches to enhance personalized treatment [2].

Radiomic analysis encompasses various feature categories, each potentially related to different aspects of a tissue phenotype. These categories include morphological features, which describe the shape and size of analyzed regions; intensity features, which capture characteristics related to pixel intensity distributions; and texture features, which quantify spatial intensity patterns. These features can be extracted from the original image intensities (e.g., Hounsfield Units in x-ray CT or arbitrary intensities in MRI), but also from image intensities previously modified by filters that highlight various structures and patterns [3].

Indeed, different image pre-processing steps such as interpolation and intensity range definition may be performed prior to feature extraction. Some radiomic features, such as the mean of gray level intensities, are simple quantities to calculate. Others, such as texture features, are more complex constructs and require more computational steps. Many features also rely on adjustable parameters, such as the intensity discretisation scheme applied prior to texture calculations. Recently, the Image Biomarker Standardization Initiative (IBSI) defined a standardized workflow for radiomic feature extraction, with and without filters [4, 3]. The IBSI also established reference values for 173 features and eight linear filters (LF), which can be used to calibrate radiomic software. Nonetheless, it is up to research teams to define the set of radiomic features and associated extraction parameters most relevant to address a given clinical question. With multiple options for image pre-processing steps and adjustable parameters, a given radiomic feature set can easily reach ~1,000 features when considering only the original image intensities [5]. If filtering is considered prior to feature extraction, the feature set size could rise to ~10,000 or more [6]. Overall, the variability in feature types requiring an increasing number of computational steps gives rise to the concept of complexity of radiomic feature extraction.

Furthermore, radiomics studies often integrate machine learning processes that aim to build predictive models from these extracted features with the goal, for example, of classifying tumors and predicting patient outcomes. The high dimensionality of radiomic feature sets introduces yet another level of complexity to the overall radiomic analysis pipeline, necessitating careful feature selection and model optimization to avoid overfitting and ensure generalizability [7]. Consequently, more emphasis should be placed on the most effective features for a given clinical question. For instance, in a study of overall survival prediction in lung cancer [8], it was found that the tumor volume alone was driving the prognostic accuracy and that intensity and texture values were not relevant for prognostication. On the other hand, more complex features such as gray-level histograms combined with texture features extracted from high-resolution CT images, following wavelet transforms, demonstrated a high accuracy in identifying lung tissue types [9]. This suggests that certain types of features are more suitable than others for predictive modeling in specific radiological clinical questions.

In this study, we propose a new methodology that identifies the most predictive features specific to a given clinical outcome and modality, and explains the model’s choices, further enhancing understanding and potentially improving generalizability in the future. By analyzing five datasets, each with a distinct medical indication across two imaging modalities, we demonstrate how our methodology simplifies radiomics analysis by focusing on the most relevant features. We also demonstrate the capabilities of MEDimage, an openly accessible tool designed to implement our methodology and to potentially promote synergy between clinical radiologists and computer scientists through both a code-based solution and a user-friendly interface. Finally, we highlight how focusing on a single complexity level can improve performance.

2 Materials and Methods

2.1 Cohorts

We collected five distinct imaging cohorts of cancer patients, including non-small cell lung cancer (NSCLC), low-grade glioma (LGG), meningioma and two imaging cohorts of renal cell carcinoma (RCC). The NSCLC dataset was collected from the study by Primakov et al. [10]. It includes data from three institutions: MAASTRO [11]; Stanford [12], available on The Cancer Imaging Archive (TCIA) [13]; and the University of California San Francisco (UCSF), which is not public. For LGG, part of the data was collected from The Cancer Genome Atlas (TCGA) [14], while the rest was provided by the studies of Yu Je et al. and Li Z et al. [15, 16], respectively. The meningioma cohort was provided by the studies of Wu et al. [17], Vasudevan et al. [18], Gennatas et al. [19] and Morin et al. [20]. The MRI-based RCC cohort was provided by the study of Lin Xi et al. [21], and TCGA data [22] were included. Finally, the contrast-enhanced computed tomography (CECT)-based RCC dataset was provided by CIUSSSE-Centre hospitalier universitaire de Sherbrooke (CHUS). Institutional review approval was given for the use of this dataset in this retrospective study. The cohorts are summarized in Table 1 (patient characteristics are provided in supplementary note 3).

Cohorts	NSCLC (n=506)	LGG (n=329)	Meningioma (n=344)	RCC (n=599)	RCC (n=326)
Institutions	MAASTRO (n=207) [10, 11]; UCSF (n=163) [10]; Stanford (n=136) [10, 12]	TCGA (n=103) [14]; Huashan (n=226) [15, 16]	UCSF (n=257) [17, 18, 19, 20]; PM (n=87) [20]	Penn (n=439) [21]; Mayo (n=53) [21]; TCGA (n=54) [22]; HPH (n=30) [21]; XYSH (n=23) [21]	CIUSSSE-CHUS (n=326)
Imaging modality	CT	MRI-T2F	MRI-T1CE	MRI-T2WI	CECT
Clinical endpoint	Histological subtype: Adenocarcinoma (n=240); Other (n=266)	IDH1 mutation: Yes (n=239); No (n=90)	Pathological tumor grade*: Grade 1 (n=197); Grade 2 & 3 (n=147)	Subtype discrimination: Papillary (n=158); Clear Cell (n=441)	Subtype discrimination: Non-Clear Cell (n=79); Clear Cell (n=247)

CT: Computed tomography. MRI: Magnetic resonance imaging. T2F: T2-weighted/FLAIR (Fluid attenuated inversion recovery). T1CE: T1-weighted contrast-enhanced. T2WI: T2-weighted image. CECT: Contrast-enhanced computed tomography. UCSF: University of California San Francisco. PM: Princess Margaret Cancer Centre. CIUSSSE-CHUS: Centre intégré universitaire de santé et de services sociaux de l’Estrie-Centre hospitalier universitaire de Sherbrooke. Penn: Hospital of the University of Pennsylvania. Mayo: Mayo Clinic. HPH: Hunan Provincial People’s Hospital. TCGA: The Cancer Genome Atlas. XYSH: Xiangya Second Hospital of Central South University.

*For binary prediction of pathological grade, Grade 1 is considered “Low” (0), and Grade 2 and 3 are considered “High” (1).

Table 1: Study cohorts summary

2.2 Radiomic features and levels

For each dataset, a total of 170 features were extracted from the original images, in accordance with IBSI definitions [4]: 27 shape-based; 49 intensity-based (1 local intensity, 18 statistical, 23 intensity histogram and 7 intensity-volume histogram); and 94 texture-based, each extracted using six different combinations of parameters, for images with physical units (CT and CECT) and three different combinations of parameters for the rest (MRI and filtered images). For filtered images, shape and intensity-based statistical features were excluded as they are not meaningful for arbitrary intensity scales, resulting in a total of 124 features. Six LF were used, including mean, Laplacian-of-Gaussian, Laws, Gabor, wavelets (high-pass and low-pass), all in accordance with IBSI definitions [3]. We also used nonlinear filtering by moving a predefined 3D cubic window over the voxels of the image and calculating texture feature values from the Gray Level Co-occurrence Matrix (GLCM) family at each position (hereby referred to as “textural filtering”). This resulted in 25 textural filters (TF) corresponding to the 25 features defined in the GLCM family. To our knowledge, no reference values currently exist for such filters, but similar definitions are found in the works of Mayerhoefer et al. [23], and of Deasy et al. [24]. Consequently, a total of 18,112 features were extracted for cohorts with physical units, and 17,830 features for cohorts with arbitrary ones. Image processing and extraction settings are provided in supplementary note 1.

The aforementioned categories of radiomic features are henceforth designated as complexity levels. Therefore, our investigation focuses on five levels of complexity. The first three levels include features extracted from the original image intensities: Morphological (“M”) features, Intensity-based (“I”) features and Texture-based (“T”) features. The fourth (“LF”) and fifth (“TF”) levels are features extracted after linear and textural filtering, respectively. To increase complexity, features were combined prior to predictive modeling, giving rise to the final sequence of complexity levels: “M”, “I”, “M+I” (MI), “T”, “M+I+T” (MIT), “LF”, “M+I+T+LF” (MITLF), “TF”, and “M+I+T+LF+TF” (MITLFTF).
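The sequence of complexity levels listed above can be generated programmatically. The following is a minimal sketch, not part of MEDimage; the function name `complexity_sequence` and the tuple layout are assumptions made for illustration only.

```python
# Base feature groups, ordered by increasing number of computational steps.
BASE_LEVELS = ["M", "I", "T", "LF", "TF"]


def complexity_sequence(base_levels):
    """Return (name, groups) pairs: each base level, followed (from the
    second level onward) by the combination of all levels seen so far."""
    sequence = []
    for i, level in enumerate(base_levels):
        sequence.append((level, [level]))
        if i > 0:
            combined = base_levels[: i + 1]
            sequence.append(("".join(combined), combined))
    return sequence


levels = complexity_sequence(BASE_LEVELS)
names = [name for name, _ in levels]
print(names)
```

Each entry pairs a level name with the feature groups it pools, so the same list can drive a loop that trains one model per complexity level.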

2.3 MEDimage

To facilitate synergy between clinical radiologists and computer scientists, our approach was built on two integral components. Firstly, we have developed a Python-based package with a modular architecture, ensuring flexibility of the code. Each module is dedicated to specific tasks in feature extraction and model training. Secondly, we have implemented a node-based user interface (UI) based on Electron and ReactFlow, offering clinical radiologists easy access to various modules for model training, testing, and results analysis, without requiring programming skills. Importantly, users can transition between the two components by automatically generating Python code for selected experiments on the interface. A comparison of existing free-to-access radiomics tools is available in the supplementary table 2. All information about the software is available here: https://medimage.app.

2.4 Experiment workflow

The experiment design was separated into three phases. The initial phase involved processing raw data and extracting radiomic features. Features were then organized by complexity levels, from “M” to “MITLFTF”. Following this, 10-fold cross-validation was used to partition the data. At each complexity level and for every fold, features were reduced using an adaptation of the false-discovery-avoidance method [25], retaining only a small subset (n ≈ 5 to 20) with the lowest inter-correlation and the highest correlation with the outcome. Models were then trained using the XGBoost algorithm, a gradient-boosted decision tree method, across four feature counts (the 5, 10, 15, and 20 most relevant features) to assess whether increasing the number of features enhances performance. Testing folds were used to evaluate model performance. Finally, the analysis of results was based on feature importance, which quantifies the improvement in performance brought by a feature during the model’s construction. The analysis involved two key steps: (i) Identification: a heatmap of metrics was used to pinpoint the optimal complexity level, characterized by the minimum number of features, minimum complexity, and the highest and statistically significant performance; and (ii) Explanation: for the selected level, a novel method, the feature importance tree, was used. It consists of a tree plot that breaks down the selected complexity level in a cascade architecture, where each branch is connected to the filter, feature family and individual features that contributed to the decision-making process. Branch thickness reflects the feature importance, and the path that leads to the most predictive individual feature is highlighted. Additionally, a feature importance histogram was used to display the importance scores of features and to assess their contribution to the model’s predictive performance. The workflow is illustrated in Figure 1, with a detailed version in supplementary note 2.
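To make the feature set reduction step more concrete, the sketch below shows a simplified greedy filter that keeps features strongly correlated with the outcome while skipping features that are highly correlated with those already kept. It is a stand-in for illustration, not the false-discovery-avoidance adaptation used in the study; the function names, the 0.8 inter-correlation threshold, and the toy data are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # an invariant column carries no information
    return sxy / math.sqrt(sxx * syy)


def reduce_features(features, outcome, n_keep=5, max_inter_corr=0.8):
    """Greedy selection: rank by |correlation with outcome|, then keep a
    feature only if it is not too correlated with any feature already kept."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], outcome)),
        reverse=True,
    )
    selected = []
    for name in ranked:
        if len(selected) == n_keep:
            break
        if all(
            abs(pearson(features[name], features[kept])) < max_inter_corr
            for kept in selected
        ):
            selected.append(name)
    return selected


# Toy data (hypothetical values): six patients, binary outcome.
outcome = [0, 0, 0, 1, 1, 1]
features = {
    "volume": [1.0, 1.2, 0.9, 2.1, 2.4, 2.0],
    "volume_copy": [2.0, 2.4, 1.8, 4.2, 4.8, 4.0],  # redundant with "volume"
    "mean_intensity": [5.0, 3.0, 4.0, 3.5, 6.0, 5.5],
    "constant": [1.0] * 6,  # invariant feature
}
selected = reduce_features(features, outcome, n_keep=2)
print(selected)
```

In a cross-validation setting, such a filter would be run on the training portion of each fold only, so that the test fold does not influence which features are retained.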

Figure 1: Overview of the study workflow. The workflow in a typical radiomics analysis starts with acquisition and reconstruction of medical images. Subsequently, images are segmented to define regions of interest (ROIs). Following this step, the proposed radiomics software processes the images and computes features characterizing the ROIs, which are then organized by complexity levels for model training. Machine learning begins with feature cleaning to remove or replace invariant features, followed by feature set reduction to retain features exhibiting a high and stable correlation with the clinical endpoint, while removing inter-correlated features. All models are constructed using XGBoost. The final step involves results analysis through two stages: identification of the optimal complexity level, characterized by the minimum number of features, minimum complexity, and the highest and statistically significant performance; and explanation based on feature importance. Experiments can be conducted via programming or through the interface, with the code generation option facilitating the shift between the two approaches.
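The identification stage can be read as a simple decision rule. The sketch below encodes one possible reading: walk through the levels in order of increasing complexity and keep the first one that no higher-AUC level beats with statistical significance. This rule, the function name `select_optimal_level`, and the lookup structure are assumptions for illustration and may differ from the procedure implemented in MEDimage. The AUC values and the two p-values are those reported for the meningioma cohort in Section 3.2.1; only a subset of levels is included.

```python
ALPHA = 0.05  # significance threshold used in the study


def select_optimal_level(ordered_levels, auc, p_value):
    """Return the least complex level that is not significantly
    outperformed by any level with a higher AUC."""
    for candidate in ordered_levels:
        beaten = any(
            auc[other] > auc[candidate] and p_value(candidate, other) < ALPHA
            for other in ordered_levels
            if other != candidate
        )
        if not beaten:
            return candidate
    return None  # only reached if ordered_levels is empty


# Levels in increasing complexity, with the AUCs reported for meningioma.
ordered_levels = ["M", "I", "MI", "T", "MIT", "LF", "TF"]
auc = {"M": 0.65, "I": 0.59, "MI": 0.63, "T": 0.62,
       "MIT": 0.66, "LF": 0.66, "TF": 0.64}

# Pairwise DeLong p-values reported against the morphological model.
reported_p = {frozenset(("M", "MIT")): 0.46, frozenset(("M", "LF")): 0.95}


def p_value(a, b):
    return reported_p[frozenset((a, b))]


optimal = select_optimal_level(ordered_levels, auc, p_value)
print(optimal)
```

With these inputs the rule stops at the morphological level, because the two models with a higher AUC (MIT and LF) are not significantly different from it.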

2.5 Statistical analysis

The area under the curve (AUC) metric was employed for evaluation. To address the issues associated with small datasets, including overfitting, noise, outliers and sampling bias, which can render the learned model ineffective [26], model predictions over all 10 test folds of the cross-validation were aggregated into a single receiver operating characteristic (ROC) curve, mimicking the behavior of leave-one-out cross-validation. While this approach sacrifices the model’s variance, it is known for minimizing bias and offering reliable estimates [27]. A DeLong test [28] was used to determine whether models were statistically different, and the difference was considered significant with $\text{p-value}<.05$.
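The aggregation of test-fold predictions into a single ROC curve can be sketched as follows. The AUC is computed here through its rank interpretation (the probability that a positive case receives a higher score than a negative one, with ties counted as one half). The function names and the toy folds are assumptions for illustration; this is not the MEDimage implementation.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def pooled_auc(folds):
    """Concatenate the test predictions of all folds, then compute one AUC."""
    labels, scores = [], []
    for fold_labels, fold_scores in folds:
        labels.extend(fold_labels)
        scores.extend(fold_scores)
    return auc_from_scores(labels, scores)


# Two toy test folds (hypothetical labels and predicted probabilities).
folds = [
    ([0, 1, 1], [0.2, 0.7, 0.4]),
    ([0, 0, 1], [0.3, 0.6, 0.9]),
]
per_fold = [auc_from_scores(labels, scores) for labels, scores in folds]
pooled = pooled_auc(folds)
print(per_fold, pooled)
```

In this toy case each fold is perfectly separated on its own, yet the pooled AUC is 8/9 because a positive case from the first fold scores lower than a negative case from the second. Pooling therefore also evaluates how consistently scores are calibrated across folds.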

3 Results

3.1 MEDimage

We have developed MEDimage, an open-source tool designed to streamline radiomics studies. Clinical radiologists can customize radiomics studies graphically by defining data processing and feature extraction sequences, through node moving and linking (showcased in supplementary media), facilitating multiple hypothesis testing and results comparison. For programmers, the modular implementation of the code ensures its easy manipulation. With the code generation option, users can transition from the graphical to the code-based approach.

3.2 Optimal complexity level identification and explanation

Model performance was assessed using the aggregated AUC value across cross-validation splits as a base metric, and the mean feature importance across splits was used to identify highly predictive features. All results are depicted in Figure 2.

3.2.1 Morphological features as the optimal complexity level:

In the meningioma cohort, the model solely utilizing morphological features displayed an AUC of 0.65 (95% CI: 0.59, 0.72), specificity of 0.62, and sensitivity of 0.64. Models based on other complexity levels displayed varying degrees of performance, such as intensity-based (AUC 0.59), MI-based (AUC 0.63), texture-based (AUC 0.62), and TF-based (AUC 0.64). MIT-based and MITLF-based models had the highest specificity (0.69) but a lower sensitivity (0.60 and 0.61, respectively). MIT-based and LF-based models exhibited the highest AUC (0.66), yet were not significantly different from the morphological features-based model (p = .46 and p = .95, respectively). Therefore, the morphological features-based model was selected as optimal. The maximum 3D diameter feature had the highest mean importance in the model. Similarly, for the prediction of IDH1 mutation in LGG, morphological features were sufficient to obtain the best performance, with an AUC of 0.68 (95% CI: 0.60, 0.75), specificity of 0.69, and sensitivity of 0.68. The surface to volume ratio feature had the highest mean importance.

3.2.2 Intensity features as the optimal complexity level:

For histological subtype classification of NSCLC, the intensity features model achieved an AUC of 0.76 (95% CI: 0.71, 0.80), specificity of 0.71, and sensitivity of 0.73. Models based on LF, MITLF and MITLFTF demonstrated higher AUC values (0.77), yet their p-values (.40, .42 and .24, respectively) did not indicate a statistically significant difference from the intensity features model. The coefficient of variation had the highest mean feature importance. Similarly, for the clear and non-clear RCC classification based on CECT, the model based on intensity features recorded the highest AUC of 0.82 (95% CI: 0.76, 0.88), with a specificity of 0.78 and a sensitivity of 0.77. Median intensity had the highest mean feature importance.

3.2.3 Texture features as the optimal complexity level:

For the subtype classification of RCC based on MRI-T2WI, the texture-based model showed a significant improvement in performance over the morphological-based (p=.01) and intensity-based (p=.03) models, achieving an AUC of 0.72 (95% CI: 0.68, 0.77), specificity of 0.65, and sensitivity of 0.71. Within the texture feature families, the GLCM feature family demonstrated the highest mean feature importance, with cluster shade having the highest importance.

3.2.4 Linear filters and textural filters:

Although the LF-based models were never selected as optimal in any case, they consistently demonstrated high AUC values across cohorts. For example, in the classification of NSCLC subtypes, the LF-based model achieved an AUC of 0.77 (95% CI: 0.72, 0.80), indicating their potential utility in radiomics analyses.

Similarly, TF-based models, though not selected as optimal in any cohort, demonstrated high AUC values across various cohorts. For instance, in NSCLC subtype classification, the TF-based model matched the AUC of the selected optimal level (0.76; 95% CI: 0.72, 0.80), but was not considered optimal due to its higher complexity (detailed results provided in supplementary note 5). Figure 3 illustrates the application of a textural filter on NSCLC images, selected from patients with the highest difference in the feature with the highest importance.

3.3 In-depth analysis of an optimal complexity level

For clear and non-clear RCC classification, the intensity-based level was selected as optimal, and the median intensity had the highest feature importance. To emphasize its impact, images were automatically selected from patients with the highest difference in the median intensity measure and were displayed to visually assess the distinctions between clear and non-clear cell carcinoma based on CECT (see Fig. 4.A; comparisons for other cohorts are available in supplementary note 4). The clear-cell subtype typically exhibited hypervascularity and greater heterogeneity due to necrotic areas compared to the non-clear cell subtype. Moreover, identifying the optimal level allowed us to refine feature extraction settings, particularly the re-segmentation range [29], which directly affects the intensities inside the ROI [30]. This refinement led to an improvement in AUC of 0.04, from 0.82 to 0.86 (see Fig. 4.B).
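The effect of the re-segmentation range on an intensity feature can be illustrated with a small sketch. Range re-segmentation removes from the ROI those voxels whose intensity falls outside a chosen interval, which changes every statistic computed afterwards. The function names, the flattened voxel list, and the range of -50 to 200 HU are assumptions chosen for illustration; they are not the settings tuned in the study.

```python
from statistics import median


def resegment_by_range(intensities, roi_mask, lower, upper):
    """Return a new mask keeping only ROI voxels with lower <= intensity <= upper."""
    return [
        bool(in_roi) and lower <= value <= upper
        for value, in_roi in zip(intensities, roi_mask)
    ]


def roi_median(intensities, roi_mask):
    """Median intensity over the voxels flagged in the mask."""
    return median(v for v, in_roi in zip(intensities, roi_mask) if in_roi)


# Flattened toy image in Hounsfield units; the last voxel is outside the ROI.
intensities = [-120, -90, 35, 60, 80, 55]
roi_mask = [True, True, True, True, True, False]

median_before = roi_median(intensities, roi_mask)
new_mask = resegment_by_range(intensities, roi_mask, lower=-50, upper=200)
median_after = roi_median(intensities, new_mask)
print(median_before, median_after)
```

Excluding the two low-intensity voxels shifts the median of this toy ROI from 35 to 60, which shows why tuning the range can alter the value, and potentially the predictive power, of intensity-based features.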

4 Discussion

Many studies have highlighted the potential of radiomics to enhance clinical decision making, but application requires further optimization and standardization [7]. In this work, the methodology we developed aims to simplify radiomics predictive modeling, paving the way for future clinical applications. We introduced the concept of radiomics complexity levels, defined by the number of computational steps needed to extract features, and proposed a methodology for estimating an optimal radiomics complexity level for a given clinical problem. This methodology takes into account computational steps, predictive performance, and statistical significance, in order to focus on predictive features and potentially pave the way for more generalizability. Additionally, we proposed MEDimage, an innovative software tool designed to streamline radiomics studies and facilitate synergy between computer scientists and clinical radiologists.

Selections of optimal levels aligned with findings from other available studies, indicating the robustness of our methodology. For example, shape features had the highest impact in meningioma cancer grade classification which corroborates the findings of Zhang et al. [31] who reported a correlation between shape features and brain invasion in meningioma cancer grade prediction. Similarly, texture features were found predictive and sufficient in MRI-based non-clear cell and clear cell RCC classification, corroborating the findings of Wang et al. [32], who found texture features effective in differentiating three RCC subtypes (clear cell, papillary, and chromophobe) from MRI images. Finally, Linning et al. [33] utilized radiomics for classifying histological subtypes of lung cancer and found intensity features to be the most predictive, indicating tumor heterogeneity. These findings are concordant with our results and suggest our methodology could pave the way for more generalizability across diverse clinical scenarios.

All experiments conducted as part of our study utilized MEDimage, offering enhanced flexibility in study design and analysis. It facilitates feature extraction, cleaning, selection, model training, and results analysis. Feature selection is particularly relevant for radiomics approaches to choose predictive biomarkers. Thus, our selection involved applying the standard false-discovery-avoidance method [25] independently to each feature type, followed by a final iteration on the combination of all types. We also accounted for variants of texture features extracted under different parameter sets, enhancing the robustness of our estimations against changes in the extraction settings of texture features. In results analysis, the feature importance tree drew the path that highlighted the filters, feature families and individual features that significantly contributed to model performance, offering insights into the model’s decision-making process and enabling further optimization.
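The idea behind the feature importance tree can be sketched as a nested aggregation: importance scores of individual features are summed upward into feature families and filters, so that each branch carries the total importance of what lies beneath it. The data layout, function names, and scores below are hypothetical and only illustrate the principle; the plotting itself and the MEDimage implementation are not reproduced here.

```python
from collections import defaultdict


def build_importance_tree(importances):
    """Nest {(filter, family, feature): score} into filter -> family -> feature."""
    tree = defaultdict(lambda: defaultdict(dict))
    for (filter_name, family, feature), score in importances.items():
        tree[filter_name][family][feature] = score
    return tree


def branch_weight(node):
    """Total importance under a branch (a leaf is its own score)."""
    if isinstance(node, dict):
        return sum(branch_weight(child) for child in node.values())
    return node


# Hypothetical mean importance scores for a texture-level model.
importances = {
    ("original", "glcm", "cluster_shade"): 0.40,
    ("original", "glcm", "contrast"): 0.10,
    ("original", "glrlm", "run_entropy"): 0.25,
    ("original", "glszm", "zone_percentage"): 0.25,
}

tree = build_importance_tree(importances)
glcm_weight = branch_weight(tree["original"]["glcm"])

# Path to highlight: the branch sequence leading to the top individual feature.
best_path = max(importances, key=importances.get)
print(glcm_weight, best_path)
```

In a plot, `branch_weight` would set the thickness of each branch and `best_path` would identify the path to highlight.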

Our aim in developing the workflow and software outlined here was to streamline the process by avoiding unnecessary complexity and emphasizing the efficacy of less complex features for optimal performance, potentially increasing generalizability. Through this approach, we believe that the focus on a single optimal level can potentially save time by preventing the investigation of irrelevant features, while also laying the groundwork for a more in-depth analysis. For example, in the case of clear cell versus non-clear cell RCC classification, 18,112 features were extracted and tested; however, we later identified that a set of 15 intensity-based features was sufficient to obtain the best performance. We then optimized the Hounsfield unit range used in the ROI re-segmentation step [29], which directly affected the intensities inside the ROI, and improved the AUC from 0.82 to 0.86.

Our study has some limitations. First, limited access to large datasets restricted our ability to assess generalizability and to study cases that could benefit from different optimal levels, such as nonlinear filters. Additionally, other combinations of complexity levels, such as combining morphological and texture features, were not explored, potentially missing out on improvements in predictive performance. Our estimation of optimal complexity levels is susceptible to variations in image processing and feature extraction parameters. We exclusively used the GLCM texture family for nonlinear filtering, and other textural feature families remain to be explored. Also, we did not assess the robustness of features against differences in positioning, acquisition and segmentation [34], potentially leaving the identified optimal levels susceptible to differences in these factors. Additionally, all classifications were limited to binary problems and exclusively analyzed using the XGBoost algorithm. Inclusion of features derived from deep learning represents an area for future investigation, adding a further layer of complexity to radiomic analyses. These limitations underscore the need for future exploration into more automated and robust techniques for selecting optimal levels to enhance the efficacy and reproducibility of radiomics approaches.

To conclude, our study unveiled context-specific optimal radiomics complexity levels, as demonstrated across five distinct datasets. Leveraging our proposed methodology and software, we successfully identified and explained the optimal level for each dataset, providing an optimal simplification of radiomics use in predictive modeling.

Acknowledgments

We thank Nicolas Longchamps, BA, Guillaume Blain, BA, Sarah Denis, MS and Andréanne Allaire, BA, for their valuable contribution to the development of the software’s user interface.

Author contributions

Guarantors of integrity of entire study, M.A.L.L., J.S., S.L., O.M., C.R., M.L., M.V.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; agrees to ensure any questions related to the work are appropriately resolved, all authors; literature research, M.A.L.L., C.R., M.L., M.V.; clinical studies, T.B.P., D.R.R., W.C.C., D.P.I.C., H.Z., O.G., J.W., A.C.S., P.J.Z., H.X.B., P.O.R., O.M.; experimental studies, M.A.L.L.; statistical analysis, M.A.L.L.; code contribution, A.Z., T.U., J.B.; and manuscript editing, M.A.L.L., O.M., C.R., M.L., M.V.

Funding information

Martin Vallières acknowledges the support of the Canada CIFAR AI Chairs Program and Unité de Soutien SSA QC. Harrison X. Bai acknowledges funding from the National Institutes of Health (NIH) under Project #1R03CA249554-01. David R Raleigh acknowledges funding from the NIH under Project R01 CA262311. Jan Seuntjens acknowledges funding from the Canadian Institutes of Health Research (FDN-143257).

Data sharing statement

At this time, the following datasets are not publicly shared by the hosting institutions: Non-Small Cell Lung Cancer (UCSF), Meningioma (UCSF, PMH), Renal cell carcinoma 1 (Penn, Mayo, HPH, XYSH) and Renal cell carcinoma 2 (CHUS). The Non-Small Cell Lung Cancer data from MAASTRO is available here: https://doi.org/10.7937/K9/TCIA.2015.PF0M9REI and data from Stanford is available here: https://doi.org/10.7937/K9/TCIA.2017.7hs46erv. The Low-grade glioma dataset from Huashan is available upon reasonable request here: https://doi.org/10.1038/s41598-017-05848-2; https://doi.org/10.1007/s00330-016-4653-3. The Low-grade glioma dataset from TCGA is available here: http://doi.org/10.7937/K9/TCIA.2016.L4LTD3TK. The Renal cell carcinoma dataset from TCGA is available here: https://doi.org/10.7937/K9/TCIA.2016.V6PBVTDR.

Supplementary Notes

Supplementary Note 1: Image processing configurations

The image processing configurations used prior to feature extraction for each cohort are listed in table 2.

		Image processing configurations
		Lung cancer (CT)	Low-grade glioma (MRI)	Meningioma (MRI)	Renal-cell carcinoma (MRI)	Renal-cell carcinoma (CECT)
Interpolation	Voxel dimension (mm)	2×2×2	1×1×1	1×1×1	4×4×4	2×2×2
	Interpolation method	Tricubic spline	Tricubic spline	Tricubic spline	Tricubic spline	Tricubic spline
	Intensity rounding	Nearest integer	-	-	-	Nearest integer
	ROI interpolation method	Trilinear	Trilinear	Trilinear	Trilinear	Trilinear
	ROI partial mask volume threshold	0.5	0.5	0.5	0.5	0.5
Discretization	Intensity histogram	FBN: 64 bins	FBN: 64 bins	FBN: 64 bins	FBN: 64 bins	FBN: 64 bins
	Intensity volume histogram	-	FBN: 1000 bins	FBN: 1000 bins	FBN: 1000 bins	-
	Texture	FBN: [8, 16, 32] bins; FBS: [31, 63, 125] HU	FBN: [8, 16, 32] bins	FBN: [8, 16, 32] bins	FBN: [8, 16, 32] bins	FBN: [8, 16, 32] bins; FBS: [16, 31, 63] HU
Re-seg	Intensity range	$\left[-700,300\right]$ HU	$\left[0,+\infty\right[$	$\left[0,+\infty\right[$	$\left[0,+\infty\right[$	$\left[-200,300\right]$ HU
Re-seg	Outlier filtering	-	Collewet*	Collewet	Collewet	-
Mean filter	Size	3	7	3	3	3
LoG	Sigma (mm)	1.75	4.2	2.16	1.5	2.23
Gabor	Sigma (mm)	1.75	4.2	2.16	1.5	2.23
	Lambda (mm)	3.5	8.4	8.4	3	4.45
	Theta	$\frac{\pi}{8}$	$\frac{\pi}{8}$	$\frac{\pi}{8}$	$\frac{\pi}{8}$	$\frac{\pi}{8}$
	Gamma	2	2	2	2	2
	Rotation invariance	yes	yes	yes	yes	yes
Laws	Kernel	[L3, L3, L3]	[L5, L5, L5]	[L3, L3, L3]	[L3, L3, L3]	[L3, L3, L3]
Wavelet	Basis function	Coiflet 1	Coiflet 3	Coiflet 3	Coiflet 1	Coiflet 2
	Subband	HHH, LLL	HHH, LLL	HHH, LLL	HHH, LLL	HHH, LLL
	Level	1	1	1	1	1
Textural filtering	Size	7	7	3	3	3
	Local discretization	FBN: 25 HU adapted**	FBN: 8 bins	FBN: 8 bins	FBN: 8 bins	FBN: 25 HU adapted**
	Global discretization	FBN: 25 HU	FBN: 8 bins	FBN: 8 bins	FBN: 8 bins	FBN: 25 HU
Boundary condition		mirror padding	mirror padding	mirror padding	mirror padding	mirror padding

Table 2: Image processing parameters used for each dataset. CECT: contrast-enhanced computed tomography; CT: computed tomography; ROI: region of interest; HU: Hounsfield Unit. FBN: Fixed bin number; FBS: Fixed bin size; H: High-pass; L: Low-pass; *ROI voxels outlier intensities were removed from the intensity mask using the method suggested by Collewet et al. [35]; **Adapted indicates that the bin number was computed using the specified bin width and the image intensity range.
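
The two discretization schemes named in Table 2 can be illustrated with a short sketch. It follows the general fixed bin number (FBN) and fixed bin size (FBS) definitions of the IBSI reference manual and is not the MEDimage implementation; the function names and sample intensities are hypothetical, and the rounding used for the "adapted" bin number is an assumption.

```python
import math

def discretize_fbn(intensities, n_bins):
    """Fixed bin number (FBN): split the observed range into n_bins levels."""
    x_min, x_max = min(intensities), max(intensities)
    if x_max == x_min:
        return [1] * len(intensities)
    levels = []
    for x in intensities:
        level = math.floor(n_bins * (x - x_min) / (x_max - x_min)) + 1
        # The maximum intensity would fall into bin n_bins + 1; clamp it.
        levels.append(min(level, n_bins))
    return levels

def discretize_fbs(intensities, bin_width, range_min):
    """Fixed bin size (FBS): bins of constant width starting at range_min."""
    return [math.floor((x - range_min) / bin_width) + 1 for x in intensities]

def adapted_bin_number(intensities, bin_width):
    """Assumed 'adapted' rule: bin number implied by a bin width and the range."""
    return max(1, math.ceil((max(intensities) - min(intensities)) / bin_width))

# Hypothetical ROI intensities (HU) inside the [-700, 300] HU range.
roi_hu = [-700.0, -450.0, -200.0, 0.0, 300.0]
fbn_levels = discretize_fbn(roi_hu, n_bins=8)
fbs_levels = discretize_fbs(roi_hu, bin_width=25.0, range_min=-700.0)
n_adapted = adapted_bin_number(roi_hu, bin_width=25.0)
print(fbn_levels, fbs_levels, n_adapted)
```

FBN normalizes the dynamic range per ROI, which suits arbitrary-unit modalities such as MRI, whereas FBS preserves the physical meaning of calibrated units such as HU.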

Supplementary Note 2: The MEDimage software

The MEDimage package (https://github.com/MEDomics-UdeS/MEDimage) extracts radiomic features from medical images using a modular implementation. In a typical workflow, the selected dataset first undergoes automatic preprocessing. Features are then extracted according to user-defined parameters, with available tools aiding in the selection of these parameters. Next, predictive features are identified using an adaptation of the false discovery avoidance (FDA) method [25], and models are trained and fine-tuned using the established machine learning library PyCaret (https://pycaret.org/). The platform facilitates the determination of optimal feature types for further analysis, and users can easily generate Python code for their experiments, promoting collaboration between clinical radiologists and computer scientists. MEDimage complies with international standards [3, 4] and provides comprehensive support through tutorials, videos, and detailed documentation. The following list highlights the major functionalities of the package and introduces its modules:

•

Image pre-processing: Through the DataManager class, MEDimage facilitates Digital Imaging and Communications in Medicine (DICOM) image management, including ROI management (reading, association with the imaging volume, etc.). MEDimage reads the images and serializes them into byte streams, a process referred to as “pickling” (https://docs.python.org/3/library/pickle.html). The resulting objects hold all imaging data needed for extraction, including, for example, the tumor mask. They are also used to organize extraction results, which simplifies and shortens the code.
•

Image processing: Consists of interpolation, re-segmentation and other processing methods, all implemented in the processing module. Image filtering is also implemented according to IBSI standards [3], offering a choice of several built-in filters such as Laws and Gabor. This includes nonlinear filters, defined as radiomic feature maps generated by moving a cubic window over the voxels of the image and calculating a feature value at each position; each feature map depicts a single radiomic feature. The textural features can be computed in two ways: in the first, the image intensities inside the ROI are discretized locally, that is, at each position of the cubic window; in the second, the discretization is done globally over the whole region.
•

Feature extraction: The biomarkers module of the package handles all feature extraction processes. It supports feature extraction from single scans and from batches of data. For batch extraction, the BatchExtractor class reads a parameter file that customizes the extraction by setting parameters such as filter sizes. This class also generates and organizes results automatically.
•

Model training: Model training within the MEDimage module encompasses various methods supporting both training and evaluation of machine learning models. Additionally, it offers users a range of useful techniques for preprocessing radiomics features, including cleaning, normalization, and feature selection. Currently, the package exclusively supports training using XGBoost and binary classification tasks. The sequential steps for model training are outlined as follows:
- –
  
  Cleaning: Involves the removal of features considered irrelevant for the analysis, eliminating those with low variance and those missing for a high number of patients. Additionally, patients with a high number of missing features are excluded from the dataset during this step.
- –
  
  Normalization: Performed on features using the ComBat method [36] as a preprocessing step to mitigate batch effects.
- –
  
  Feature set reduction (FSR): Implemented using the false discovery avoidance method (FDA) introduced by Chatterjee et al. [25]. As illustrated in figure 5, this method involves subdividing training sets into 100 internal training and validation splits using a 2:1 ratio, and stratified random subsampling. Variables with low stability, measured by Spearman’s rank correlation with the outcome of interest ( $RS_{f/o}$ ), are discarded using a minimal $RS_{f/o}$ cut-off of 0.5. Additionally, inter-correlated variable pairs are removed using a maximal $RS_{f/f}$ cut-off of 0.7, resulting in the retention of N features with the highest $RS_{f/o}$ . The number of features to retain is set by the user. In our case, we tested four different numbers of retained features and kept the best performing one. Our package incorporates a balanced version of this method, which consists of applying FDA separately to each feature set to ensure consistency in the number of features drawn from each set. Subsequently, these drawn features are combined to form a final feature set, upon which FDA is reapplied. This process ensures that each set participates equally in the selection process, preventing the number of features in each set from affecting the overall reduction process.
  
  Figure 5: False discovery avoidance method breakdown.
- –
  
  Model training: We relied exclusively on the XGBoost algorithm for binary prediction endpoints. The model was trained on the features retained after the feature set reduction step, with hyper-parameters automatically tuned using the PyCaret library. Subsequently, models were evaluated on the test set of every split, with metrics computed for each split and aggregated or averaged to assess overall performance.
- –
  
  Analysis method: Comprises two steps: identification and explanation of the optimal level. To identify the best-performing feature types, a heatmap is generated to compare the performance metric across complexity levels using p-values. The optimal level, characterized by the least number of features and the highest statistically significant performance metric, is then selected. In the explanation step, the histogram of feature importance is plotted, and for more complex cases (texture and filter-based features), the explanation tree highlights the feature family, filter, and features most impactful in the model’s decision-making process.
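
The feature set reduction (FSR) step described above can be sketched with the standard library alone. This is a simplified illustration, not the MEDimage or Chatterjee et al. implementation. It assumes that $RS_{f/o}$ is the mean absolute Spearman correlation between a feature and the outcome over stratified subsamples (only the two-thirds training part of each 2:1 split is used), and that inter-correlated features are pruned greedily in decreasing order of $RS_{f/o}$. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return 0.0 if var_a == 0 or var_b == 0 else cov / (var_a * var_b) ** 0.5

def stratified_subsample(labels, fraction, rng):
    """Indices of a random subsample that keeps the class proportions."""
    chosen = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        chosen += rng.sample(idx, max(1, round(fraction * len(idx))))
    return sorted(chosen)

def reduce_features(features, labels, n_keep, n_splits=100,
                    min_rs_fo=0.5, max_rs_ff=0.7, seed=0):
    rng = random.Random(seed)
    scores = {name: [] for name in features}
    for _ in range(n_splits):
        idx = stratified_subsample(labels, 2 / 3, rng)
        y = [labels[i] for i in idx]
        for name, values in features.items():
            scores[name].append(abs(spearman([values[i] for i in idx], y)))
    # Assumption: RS_f/o is the mean |rho| over the subsamples.
    rs_fo = {name: mean(vals) for name, vals in scores.items()}
    candidates = sorted((n for n in features if rs_fo[n] >= min_rs_fo),
                        key=lambda n: rs_fo[n], reverse=True)
    kept = []
    for name in candidates:
        # Skip a feature too correlated with a stronger one already kept.
        if all(abs(spearman(features[name], features[k])) <= max_rs_ff
               for k in kept):
            kept.append(name)
        if len(kept) == n_keep:
            break
    return kept, rs_fo

# Toy data: 6 patients per class, three hypothetical features.
labels = [0] * 6 + [1] * 6
features = {
    "informative": [1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16],
    "redundant": [1, 2, 3, 4, 5, 11, 6, 12, 13, 14, 15, 16],
    "noise": [5, 1, 4, 2, 6, 3, 3, 6, 2, 4, 1, 5],
}
kept, rs_fo = reduce_features(features, labels, n_keep=2)
print(kept)
```

In the balanced variant described above, a function of this kind would be applied once per feature set and then once more on the union of the retained features.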

MEDimage app

To facilitate the utilization of the package by clinical radiologists, the MEDimage application (https://github.com/MEDomics-UdeS/MEDimage-app), also referred to as the MEDimage interface, provides access to package methods without requiring coding. It employs a drag-and-drop architecture where nodes are connected to form a pipeline representing the experiment, as depicted in supplementary figure 6. Each node corresponds to a step in either feature extraction or model training, enabling customization of all processes and facilitating multiple hypothesis testing and results comparison.

The interface also supports results analysis for machine learning experiments, aligned with the methodology presented in this work. It facilitates easy comparison of metrics across training, testing, and holdout data, and provides convenient access to analysis plots, such as the feature importance histogram and the metrics heatmap, which includes statistical comparisons between models. This is illustrated in Supplementary Figure 7.

Finally, the MEDimage app supports code generation. Once the machine learning experiments are executed, users can select and generate code for multiple pipelines (experiments) through the interface, making the transition from a graphical interface to a code-based approach easy. This feature is illustrated in Supplementary Figure 8.

Comparison of existing tools for radiomics research

We incorporated our package into the comparison conducted by Abler et al. [37], making minor updates. Additionally, we introduced the feature of prediction modeling analysis to the comparison. The updated review results are provided in table 3.

Columns: Name; Type; DICOM; Radiomics features (Extraction, Visualization); Predictive modeling (Selection, Training, Evaluation, Analysis).

Tools compared: PyRadiomics**, moddicom⁺, RADIOMICS, PORTS, ROdiomiX*, SERA**⁺, QIFE**, MIRP**⁺, RaCaT, Precision-medicine-toolbox, LIFEx**⁺, S-IBEx**⁺, CERR**⁺, MRP, MITK Phenotyping**, SlicerRadiomics, CGITA, QuantImage (v1), ePAD, MaZda/b11, CaPTk**⁺, AutoRadiomics**, QuantImage v2**, MEDimage**⁺.

Type values, in source order: Library; cmd-exec; Library/; Library; cmd-exec; Library; GUI; GUI(plugin); GUI(web); GUI; GUI(web); Library/GUI.

Other cell values, in source order: PyRadiomics; QIFE; Cohort; Feature map; Cohort; Automated; Interactive; Automated.

Cross (x) or dash (-) indicate the presence or absence, respectively, of a specific characteristic.

Features not typically applicable to a specific tool category are left empty.

**Participation in the IBSI-1 benchmark [4].

⁺Participation in the IBSI-2 benchmark [3].

*Self-reported adherence to IBSI-1 recommendations [4].

cmd-exec: Command line executable.

GUI: graphical user interface.

IBSI: Image Biomarker Standardisation Initiative.

Table 3: Review summary of open-source radiomics tools.

Supplementary Note 3: Dataset

Five distinct datasets, each associated with a different medical context, were used in this study. The characteristics of all cohorts are summarized in the tables below:

•

Non-Small Cell Lung Cancer (NSCLC) dataset for histology classification: The data used for histological classification of NSCLC were sourced from the study by Primakov et al. [10] and originate from three institutions. Only the patient cohorts treated at the MAASTRO clinic, the Netherlands, between 2005 and 2010 and at the Stanford University Medical Center between 2008 and 2012 are publicly available. They are accessible via The Cancer Imaging Archive at https://doi.org/10.7937/K9/TCIA.2015.PF0M9REI and https://doi.org/10.7937/K9/TCIA.2017.7hs46erv (only patients with non-contrast-enhanced CT scans were retained). Patients from the University of California San Francisco (UCSF) are not publicly shared. The scans were accompanied by segmentation maps outlining tumor regions, hence no further preprocessing was performed. Patient information is listed in table 4.

Characteristics	Type	No of patients (%) / value
Gender	Male – MAASTRO	136 (66 %)
	Male – Stanford	102 (75 %)
	Male – UCSF	75 (46 %)
	Male – All	313 (62 %)
	Female – MAASTRO	71 (34 %)
	Female – Stanford	34 (25 %)
	Female – UCSF	87 (54 %)
	Female – All	192 (38 %)
	Missing (UCSF)	1
Age	MAASTRO	69 ± 10 [69; 43-92] years
	UCSF	69 ± 9 [69; 43-87] years
	Stanford	70 ± 9 [72; 46-92] years
	All	69 ± 9 [70; 43-92] years
Histology	Adenocarcinoma – MAASTRO	34 (16 %)
	Adenocarcinoma – Stanford	106 (78 %)
	Adenocarcinoma – UCSF	100 (61 %)
	Adenocarcinoma – All	240 (47 %)
	Other – MAASTRO	173 (84 %)
	Other – Stanford	30 (22 %)
	Other – UCSF	63 (39 %)
	Other – All	266 (53 %)
Treatment - MAASTRO	Radiotherapy only	46.5 %
	Chemo-radiotherapy	53.5 %
Treatment - Stanford	Surgery	100 %
	Radiotherapy	36 %
	Chemotherapy	12 %
	Adjuvant therapy	36 %
Treatment - UCSF	Surgery	63 %
	Radiotherapy	57 %
	Chemotherapy	52 %
	Immunotherapy	5 %

Note: mean ± std [median; min-max].

UCSF: University of California San Francisco.

Table 4: Patient information – Lung cancer cohort.

•

Low-grade glioma (LGG): The dataset employed for the prediction of IDH1 mutation comprises two cohorts. The first cohort (n=227) was treated at the Department of Neurosurgery of Huashan hospital between 2010 and 2016, and was created from the combined studies of Yu et al. [15] and Li et al. [16]. Outcome information and imaging data are available from the authors of these studies upon request. The second cohort (n=107) was part of The Cancer Genome Atlas Low Grade Glioma (TCGA-LGG) collection [14]. Outcome information as well as clinical, imaging and genomics data are available from TCIA at http://doi.org/10.7937/K9/TCIA.2016.L4LTD3TK. Patient information is listed in table 5.

Characteristics	Type	No of patients (%) / value
Gender	Male – TCGA	50 (47 %)
	Female – TCGA	57 (53 %)
Age	TCGA	46 ± 14 [47; 20-75] years
Laterality	Left – TCGA	48 (45 %)
	Midline – TCGA	3 (3 %)
	Right – TCGA	56 (52 %)
Tumour location	Cerebellum – TCGA	1 (1 %)
	Frontal – TCGA	58 (58 %)
	Parietal – TCGA	12 (12 %)
	Temporal – TCGA	28 (28 %)
	Not specified – TCGA	1 (1 %)
	Missing
IDH1 mutation	Yes – Huashan	164 (72 %)
	Yes – TCGA	76 (74 %)
	Yes – All	240 (73 %)
	No – Huashan	63 (28 %)
	No – TCGA	27 (26 %)
	No – All	90 (27 %)
	Missing (TCGA)
Treatment - TCGA	See http://doi.org/10.7937/K9/TCIA.2016.L4LTD3TK
Treatment - Huashan	Not available

Note: mean ± std [median; min-max].

UCSF: University of California San Francisco.

Table 5: Patient information – Low-grade-glioma cohort.

•

Meningioma (MRI): The dataset employed for pathological grade classification comprises two cohorts. The first cohort (n=257) includes patients treated at the Radiation Oncology Department of UCSF between 2001 and 2013, sourced from studies by Wu et al. [17], Vasudevan et al. [18], Gennatas et al. [19], and Morin et al. [20]. The second cohort (n=87) comprises patients treated at Princess Margaret Hospital (PMH) in Toronto between 2010 and 2017, as detailed in the study by Morin et al. [20]. The dataset is not publicly shared by the hosting institutions. Patient information is listed in table 6.

Characteristics	Type	No of patients (%) / value
Gender	Male – UCSF	96 (37 %)
	Male – PMH	31 (36 %)
	Male – All	127 (37 %)
	Female – UCSF	161 (63 %)
	Female – PMH	56 (64 %)
	Female – All	217 (63 %)
Age	UCSF	58 ± 13 [58; 14-89] years
	PMH	58 ± 15 [59; 19-88] years
	All	58 ± 14 [58; 14-89] years
Pathology – Grade	Grade 1 – UCSF	128 (50 %)
	Grade 1 – PMH	69 (79 %)
	Grade 1 – All	197 (57 %)
	Grade 2 – UCSF	104 (40 %)
	Grade 2 – PMH	17 (20 %)
	Grade 2 – All	121 (35 %)
	Grade 3 – UCSF	25 (10 %)
	Grade 3 – PMH	1 (1 %)
	Grade 3 – All	26 (8 %)
Treatment - UCSF	Extent of resection: Gross total resection	56 %
	Extent of resection: Subtotal resection	44 %
	Adjuvant radiotherapy	24 %
Treatment - PMH	Extent of resection: Gross total resection	56 %
	Extent of resection: Subtotal resection	12 %
	Unknown	32 %
	Adjuvant radiotherapy	5 %

Note: mean ± std [median; min-max].

For binary prediction of pathological grade:

- Grade 1 is considered as “Low” (0)

- Grade 2 and 3 are considered as “High” (1).

UCSF: University of California San Francisco.

PMH: Princess Margaret Hospital.

Table 6: Patient information – Meningioma cohort.

•

Renal Cell Carcinoma (RCC) dataset (MRI): Part of the dataset used for classifying renal lesions was obtained from the work of Xi et al. [21], while the rest was generated by the TCGA Research Network (https://doi.org/10.7937/K9/TCIA.2016.V6PBVTDR). It comprises a collection of 1197 MR images, including two enhanced sequences: T1-contrast (T1C, n=598) and T2-weighted (T2WI, n=599). Manual segmentation of the images was performed by three radiologists to delineate regions of interest. Subsequently, only T2WI images were retained for analysis, as the initial investigation revealed similar results to those obtained from the NSCLC and CECT-based RCC datasets. Patient information is accessible in the dataset sources.

•

Renal Cell Carcinoma (RCC) dataset (CECT): This dataset comprises 326 patients diagnosed with RCC and treated at the Centre hospitalier universitaire de Sherbrooke (CHUS), QC, Canada. Manual segmentation was performed by three residents under the supervision of a urologist. The images were acquired using contrast-enhanced computed tomography (CECT) and were used to classify clear cell versus non-clear cell subtypes. This dataset is not publicly shared by the institution. Patient information is listed in table 7.

Characteristics	Type	No of patients (%) / value
Gender	Male – CHUS	217 (67 %)
	Female – CHUS	109 (33 %)
Age	CHUS	63.3 ± 10.53 [65; 33-91] years
Subtype	Clear cell	79 (24 %)
	Non-clear cell	247 (76 %)
Lesion side	Left	170 (52 %)
	Right	156 (48 %)
Imaging size	CHUS	5.14 ± 2.99 [4.5; 1-15] cm
Family history	Yes	18 (5 %)
	No	308 (95 %)

Note: mean ± std [median; min-max].

cm: centimeter

CHUS: Centre hospitalier universitaire de Sherbrooke.

Table 7: Patient information – CECT-based Renal cell carcinoma cohort.

Supplementary Note 4: Visualization of the impact of features with the highest importance

Additional figures were included here, each showcasing two representative images corresponding to the two classes of the binary problem studied. These images were automatically chosen from patients with the highest difference in the feature with the highest importance during the model’s training process.
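
One plausible reading of this selection rule is sketched below: among all pairs made of one patient per class, keep the pair whose values of the most important feature differ the most. The function name, patient identifiers and feature values are hypothetical, and the exact criterion used by the software may differ.

```python
def pick_representative_pair(values, labels):
    """Return (class-0 patient, class-1 patient) with the largest absolute
    difference in the value of the most important feature."""
    class0 = [(v, pid) for pid, v in values.items() if labels[pid] == 0]
    class1 = [(v, pid) for pid, v in values.items() if labels[pid] == 1]
    # Exhaustive search over cross-class pairs; fine for cohort-sized data.
    best = max((abs(v0 - v1), p0, p1)
               for v0, p0 in class0 for v1, p1 in class1)
    return best[1], best[2]

# Hypothetical values of the top-ranked feature and binary outcomes.
top_feature = {"P01": 0.12, "P02": 0.35, "P03": 0.80, "P04": 0.95}
outcome = {"P01": 0, "P02": 0, "P03": 1, "P04": 1}
pair = pick_representative_pair(top_feature, outcome)
print(pair)
```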

Supplementary Note 5: Highlighting the potential of textural filters in the lung cohort

To the best of our knowledge, our work is the first to leverage textural filters for extracting radiomics features. In the histological subtype classification of lung cancer based on CT, features derived from textural filters matched the AUC (0.76; 95% CI: 0.72, 0.80) and the sensitivity (0.73) of the selected optimal level, which was the model based on intensity features, with a specificity 3% higher, underscoring their considerable potential.

To identify filters, feature families and features that contribute most to the decision-making process for a given complexity level, we used feature importance, a measure of how much each feature contributes to the predictive power of the model. In XGBoost, feature importance is computed based on how often and how significantly a feature is used to split data across all trees in the model. We used this measure and created the feature importance tree (see Supplementary Figure 13), a plot that breaks down the complexity level in a hierarchical fashion. The tree branches from top to bottom, starting with the filter type used (if applicable), followed by the filter name (if applicable), then the feature families for texture features, and finally, individual features. Branch thickness (green lines) indicates the relative importance of different features or feature groups. Solid lines show selected features or paths, while black dotted lines represent features included in the set but not selected for the final model. The orange line traces the path from the root to the leaf node with the highest accumulated feature importance in the model’s decision-making process. According to the plot, the Cluster Shade filter and the High Dependence Low Gray Level Emphasis (HDLGE) feature had the highest importance.
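
The hierarchical accumulation behind such a tree can be sketched as follows. The importance values and feature paths are made up for illustration, the function names are hypothetical, and tracing the highlighted path by greedy descent along the heaviest branch is an assumption about how the plot is built, not a description of the MEDimage code.

```python
from collections import defaultdict

def accumulate(importances):
    """Sum each leaf importance onto every node (path prefix) of the tree."""
    totals = defaultdict(float)
    for path, value in importances.items():
        for depth in range(1, len(path) + 1):
            totals[path[:depth]] += value
    return dict(totals)

def heaviest_path(totals):
    """Descend from the root, following the heaviest child at each level."""
    path = ()
    while True:
        children = {p: v for p, v in totals.items()
                    if len(p) == len(path) + 1 and p[:len(path)] == path}
        if not children:
            return path
        path = max(children, key=children.get)

# Hypothetical importances keyed by (filter type, filter, family, feature).
importances = {
    ("textural filter", "cluster shade", "ngldm", "hdlge"): 0.30,
    ("textural filter", "cluster shade", "glcm", "contrast"): 0.15,
    ("textural filter", "correlation", "glrlm", "sre"): 0.25,
    ("textural filter", "correlation", "glcm", "energy"): 0.10,
}
totals = accumulate(importances)
best_path = heaviest_path(totals)
print(" > ".join(best_path))
```

Node totals correspond to branch thickness in the plot, and the returned path corresponds to the highlighted route from the root to a single feature.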

Supplementary media

•

A MEDimage promotional video to highlight how this platform allows the graphical customization of radiomics studies through node movement and linking: https://youtu.be/h38vEpkHSpc?feature=shared
•

The MEDimage package documentation: https://medimage.readthedocs.io

•

The MEDimage app documentation: https://medomics-udes.gitbook.io/medimage-app-docs

References

[1] Gillies, Robert J. ; Kinahan, Paul E. ; Hricak, Hedvig: Radiomics: Images Are More than Pictures, They Are Data. In: Radiology 278 (2016), February, No. 2, 563–577. http://dx.doi.org/10.1148/radiol.2015151169. – ISSN 0033–8419. – Publisher: Radiological Society of North America
[2] Lambin, Philippe ; Leijenaar, Ralph T. ; Deist, Timo M. ; Peerlings, Jurgen ; De Jong, Evelyn E. ; Van Timmeren, Janita ; Sanduleanu, Sebastian ; Larue, Ruben T. ; Even, Aniek J. ; Jochems, Arthur ; Van Wijk, Yvonka ; Woodruff, Henry ; Van Soest, Johan ; Lustberg, Tim ; Roelofs, Erik ; Van Elmpt, Wouter ; Dekker, Andre ; Mottaghy, Felix M. ; Wildberger, Joachim E. ; Walsh, Sean: Radiomics: the bridge between medical imaging and personalized medicine. In: Nature Reviews Clinical Oncology 14 (2017), December, No. 12, 749–762. http://dx.doi.org/10.1038/nrclinonc.2017.141. – ISSN 1759–4774, 1759–4782
[3] Whybra, Philip ; Zwanenburg, Alex ; Andrearczyk, Vincent ; Schaer, Roger ; Apte, Aditya P. ; Ayotte, Alexandre ; Baheti, Bhakti ; Bakas, Spyridon ; Bettinelli, Andrea ; Boellaard, Ronald ; Boldrini, Luca ; Buvat, Irène ; Cook, Gary J. R. ; Dietsche, Florian ; Dinapoli, Nicola ; Gabryś, Hubert S. ; Goh, Vicky ; Guckenberger, Matthias ; Hatt, Mathieu ; Hosseinzadeh, Mahdi ; Iyer, Aditi ; Lenkowicz, Jacopo ; Loutfi, Mahdi A. L. ; Löck, Steffen ; Marturano, Francesca ; Morin, Olivier ; Nioche, Christophe ; Orlhac, Fanny ; Pati, Sarthak ; Rahmim, Arman ; Rezaeijo, Seyed M. ; Rookyard, Christopher G. ; Salmanpour, Mohammad R. ; Schindele, Andreas ; Shiri, Isaac ; Spezi, Emiliano ; Tanadini-Lang, Stephanie ; Tixier, Florent ; Upadhaya, Taman ; Valentini, Vincenzo ; Griethuysen, Joost J. M. ; Yousefirizi, Fereshteh ; Zaidi, Habib ; Müller, Henning ; Vallières, Martin ; Depeursinge, Adrien: The Image Biomarker Standardization Initiative: Standardized Convolutional Filters for Reproducible Radiomics and Enhanced Clinical Insights. In: Radiology 310 (2024), February, No. 2, e231319. http://dx.doi.org/10.1148/radiol.231319. – ISSN 0033–8419. – Publisher: Radiological Society of North America
[4] Zwanenburg, Alex ; Vallières, Martin ; Abdalah, Mahmoud A. ; Aerts, Hugo J. W. L. ; Andrearczyk, Vincent ; Apte, Aditya ; Ashrafinia, Saeed ; Bakas, Spyridon ; Beukinga, Roelof J. ; Boellaard, Ronald ; Bogowicz, Marta ; Boldrini, Luca ; Buvat, Irène ; Cook, Gary J. R. ; Davatzikos, Christos ; Depeursinge, Adrien ; Desseroit, Marie-Charlotte ; Dinapoli, Nicola ; Dinh, Cuong V. ; Echegaray, Sebastian ; El Naqa, Issam ; Fedorov, Andriy Y. ; Gatta, Roberto ; Gillies, Robert J. ; Goh, Vicky ; Götz, Michael ; Guckenberger, Matthias ; Ha, Sung M. ; Hatt, Mathieu ; Isensee, Fabian ; Lambin, Philippe ; Leger, Stefan ; Leijenaar, Ralph T. ; Lenkowicz, Jacopo ; Lippert, Fiona ; Losnegård, Are ; Maier-Hein, Klaus H. ; Morin, Olivier ; Müller, Henning ; Napel, Sandy ; Nioche, Christophe ; Orlhac, Fanny ; Pati, Sarthak ; Pfaehler, Elisabeth A. ; Rahmim, Arman ; Rao, Arvind U. ; Scherer, Jonas ; Siddique, Muhammad M. ; Sijtsema, Nanna M. ; Socarras Fernandez, Jairo ; Spezi, Emiliano ; Steenbakkers, Roel J. ; Tanadini-Lang, Stephanie ; Thorwarth, Daniela ; Troost, Esther G. ; Upadhaya, Taman ; Valentini, Vincenzo ; Dijk, Lisanne V. ; Griethuysen, Joost van ; Velden, Floris H. ; Whybra, Philip ; Richter, Christian ; Löck, Steffen: The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. In: Radiology 295 (2020), May, No. 2, 328–338. http://dx.doi.org/10.1148/radiol.2020191145. – ISSN 0033–8419. – Publisher: Radiological Society of North America
[5] Vallières, Martin ; Kay-Rivest, Emily ; Perrin, Léo J. ; Liem, Xavier ; Furstoss, Christophe ; Aerts, Hugo J. W. L. ; Khaouam, Nader ; Nguyen-Tan, Phuc F. ; Wang, Chang-Shu ; Sultanem, Khalil ; Seuntjens, Jan ; El Naqa, Issam: Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer. In: Scientific Reports 7 (2017), August, No. 1, 10117. http://dx.doi.org/10.1038/s41598-017-10371-5. – ISSN 2045–2322. – Publisher: Nature Publishing Group
[6] Vallières, M. ; Freeman, C. R. ; Skamene, S. R. ; Naqa, I. E.: A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities. In: Physics in Medicine & Biology 60 (2015), June, No. 14, 5471. http://dx.doi.org/10.1088/0031-9155/60/14/5471. – ISSN 0031–9155. – Publisher: IOP Publishing
[7] Timmeren, Janita E. ; Cester, Davide ; Tanadini-Lang, Stephanie ; Alkadhi, Hatem ; Baessler, Bettina: Radiomics in medical imaging—“how-to” guide and critical reflection. In: Insights into Imaging 11 (2020), December, No. 1, 91. http://dx.doi.org/10.1186/s13244-020-00887-2. – ISSN 1869–4101
[8] Welch, Mattea L. ; McIntosh, Chris ; Haibe-Kains, Benjamin ; Milosevic, Michael F. ; Wee, Leonard ; Dekker, Andre ; Huang, Shao H. ; Purdie, Thomas G. ; O’Sullivan, Brian ; Aerts, Hugo J. ; Jaffray, David A.: Vulnerabilities of radiomic signature development: The need for safeguards. In: Radiotherapy and Oncology 130 (2019), January, 2–9. http://dx.doi.org/10.1016/j.radonc.2018.10.027. – ISSN 01678140
[9] Depeursinge, Adrien ; Sage, Daniel ; Hidki, Asmaa ; Platon, Alexandra ; Poletti, Pierre-Alexandre ; Unser, Michael ; Muller, Henning: Lung Tissue Classification Using Wavelet Frames. In: 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Lyon, France : IEEE, August 2007. – ISBN 978–1–4244–0787–3 978–1–4244–0788–0, 6259–6262. – ISSN: 1557-170X
[10] Primakov, Sergey P. ; Ibrahim, Abdalla ; Timmeren, Janita E. ; Wu, Guangyao ; Keek, Simon A. ; Beuque, Manon ; Granzier, Renée W. Y. ; Lavrova, Elizaveta ; Scrivener, Madeleine ; Sanduleanu, Sebastian ; Kayan, Esma ; Halilaj, Iva ; Lenaers, Anouk ; Wu, Jianlin ; Monshouwer, René ; Geets, Xavier ; Gietema, Hester A. ; Hendriks, Lizza E. L. ; Morin, Olivier ; Jochems, Arthur ; Woodruff, Henry C. ; Lambin, Philippe: Automated detection and segmentation of non-small cell lung cancer computed tomography images. In: Nature Communications 13 (2022), June, No. 1, 3423. http://dx.doi.org/10.1038/s41467-022-30841-3. – ISSN 2041–1723. – Publisher: Nature Publishing Group
[11] Aerts, Hugo J. W. L. ; Wee, Leonard ; Rios Velazquez, Emmanuel ; Leijenaar, Ralph T. H. ; Parmar, Chintan ; Grossmann, Patrick ; Carvalho, Sara ; Bussink, Johan ; Monshouwer, René ; Haibe-Kains, Benjamin ; Rietveld, Derek ; Hoebers, Frank ; Rietbergen, Michelle M. ; Leemans, C. R. ; Dekker, Andre ; Quackenbush, John ; Gillies, Robert J. ; Lambin, Philippe: Data From NSCLC-Radiomics. http://dx.doi.org/10.7937/K9/TCIA.2015.PF0M9REI. Version: 2019
[12] Bakr, Shaimaa ; Gevaert, Olivier ; Echegaray, Sebastian ; Ayers, Kelsey ; Zhou, Mu ; Shafiq, Majid ; Zheng, Hong ; Zhang, Weiruo ; Leung, Ann ; Kadoch, Michael ; Shrager, Joseph ; Quon, Andrew ; Rubin, Daniel ; Plevritis, Sylvia ; Napel, Sandy: Data for NSCLC Radiogenomics Collection. http://dx.doi.org/10.7937/K9/TCIA.2017.7HS46ERV. Version: 2017
[13] Clark, Kenneth ; Vendt, Bruce ; Smith, Kirk ; Freymann, John ; Kirby, Justin ; Koppel, Paul ; Moore, Stephen ; Phillips, Stanley ; Maffitt, David ; Pringle, Michael ; Tarbox, Lawrence ; Prior, Fred: The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. In: Journal of Digital Imaging 26 (2013), December, No. 6, 1045–1057. http://dx.doi.org/10.1007/s10278-013-9622-7. – ISSN 0897–1889, 1618–727X
[14] Pedano, Nancy ; Flanders, Adam E. ; Scarpace, Lisa ; Mikkelsen, Tom ; Eschbacher, Jennifer M. ; Hermes, Beth ; Sisneros, Victor ; Barnholtz-Sloan, Jill ; Ostrom, Quinn: The Cancer Genome Atlas Low Grade Glioma Collection (TCGA-LGG). http://dx.doi.org/10.7937/K9/TCIA.2016.L4LTD3TK. Version: 2016
[15] Yu, Jinhua ; Shi, Zhifeng ; Lian, Yuxi ; Li, Zeju ; Liu, Tongtong ; Gao, Yuan ; Wang, Yuanyuan ; Chen, Liang ; Mao, Ying: Noninvasive IDH1 mutation estimation based on a quantitative radiomics approach for grade II glioma. In: European Radiology 27 (2017), August, Nr. 8, 3509–3522. http://dx.doi.org/10.1007/s00330-016-4653-3. – DOI 10.1007/s00330–016–4653–3. – ISSN 1432–1084
[16] Li, Zeju ; Wang, Yuanyuan ; Yu, Jinhua ; Guo, Yi ; Cao, Wei: Deep Learning based Radiomics (DLR) and its usage in noninvasive IDH1 prediction for low grade glioma. In: Scientific Reports 7 (2017), Juli, Nr. 1, 5467. http://dx.doi.org/10.1038/s41598-017-05848-2. – DOI 10.1038/s41598–017–05848–2. – ISSN 2045–2322. – Number: 1 Publisher: Nature Publishing Group
[17] Wu, Ashley ; Garcia, Michael A. ; Magill, Stephen T. ; Chen, William ; Vasudevan, Harish N. ; Perry, Arie ; Theodosopoulos, Philip V. ; McDermott, Michael W. ; Braunstein, Steve E. ; Raleigh, David R.: Presenting Symptoms and Prognostic Factors for Symptomatic Outcomes Following Resection of Meningioma. In: World Neurosurgery 111 (2018), März, e149–e159. http://dx.doi.org/10.1016/j.wneu.2017.12.012. – DOI 10.1016/j.wneu.2017.12.012. – ISSN 1878–8750
[18] Vasudevan, Harish N. ; Braunstein, Steve E. ; Phillips, Joanna J. ; Pekmezci, Melike ; Tomlin, Bryan A. ; Wu, Ashley ; Reis, Gerald F. ; Magill, Stephen T. ; Zhang, Jie ; Feng, Felix Y. ; Nicholaides, Theodore ; Chang, Susan M. ; Sneed, Penny K. ; McDermott, Michael W. ; Berger, Mitchel S. ; Perry, Arie ; Raleigh, David R.: Comprehensive Molecular Profiling Identifies FOXM1 as a Key Transcription Factor for Meningioma Proliferation. In: Cell Reports 22 (2018), March, No. 13, 3672–3683. http://dx.doi.org/10.1016/j.celrep.2018.03.013. – DOI 10.1016/j.celrep.2018.03.013. – ISSN 2211-1247
[19] Gennatas, Efstathios D. ; Wu, Ashley ; Braunstein, Steve E. ; Morin, Olivier ; Chen, William C. ; Magill, Stephen T. ; Gopinath, Chetna ; Villanueva-Meyer, Javier E. ; Perry, Arie ; McDermott, Michael W. ; Solberg, Timothy D. ; Valdes, Gilmer ; Raleigh, David R.: Preoperative and postoperative prediction of long-term meningioma outcomes. In: PLOS ONE 13 (2018), September, No. 9, e0204161. http://dx.doi.org/10.1371/journal.pone.0204161. – DOI 10.1371/journal.pone.0204161. – ISSN 1932-6203
[20] Morin, Olivier ; Chen, William C. ; Nassiri, Farshad ; Susko, Matthew ; Magill, Stephen T. ; Vasudevan, Harish N. ; Wu, Ashley ; Vallières, Martin ; Gennatas, Efstathios D. ; Valdes, Gilmer ; Pekmezci, Melike ; Alcaide-Leon, Paula ; Choudhury, Abrar ; Interian, Yannet ; Mortezavi, Siavash ; Turgutlu, Kerem ; Bush, Nancy Ann O. ; Solberg, Timothy D. ; Braunstein, Steve E. ; Sneed, Penny K. ; Perry, Arie ; Zadeh, Gelareh ; McDermott, Michael W. ; Villanueva-Meyer, Javier E. ; Raleigh, David R.: Integrated models incorporating radiologic and radiomic features predict meningioma grade, local failure, and overall survival. In: Neuro-Oncology Advances 1 (2019), May, No. 1, vdz011. http://dx.doi.org/10.1093/noajnl/vdz011. – DOI 10.1093/noajnl/vdz011. – ISSN 2632-2498
[21] Xi, Ianto L. ; Zhao, Yijun ; Wang, Robin ; Chang, Marcello ; Purkayastha, Subhanik ; Chang, Ken ; Huang, Raymond Y. ; Silva, Alvin C. ; Vallières, Martin ; Habibollahi, Peiman ; Fan, Yong ; Zou, Beiji ; Gade, Terence P. ; Zhang, Paul J. ; Soulen, Michael C. ; Zhang, Zishu ; Bai, Harrison X. ; Stavropoulos, S. W.: Deep Learning to Distinguish Benign from Malignant Renal Lesions Based on Routine MR Imaging. In: Clinical Cancer Research 26 (2020), April, No. 8, 1944–1952. http://dx.doi.org/10.1158/1078-0432.CCR-19-0374. – DOI 10.1158/1078-0432.CCR-19-0374. – ISSN 1078-0432
[22] Akin, Oguz ; Elnajjar, Pierre ; Heller, Matthew ; Jarosz, Rose ; Erickson, Bradley J. ; Kirk, Shanah ; Lee, Yueh ; Linehan, Marston W. ; Gautam, Rabindra ; Vikram, Raghu ; Garcia, Kimberly M. ; Roche, Charles ; Bonaccio, Ermelinda ; Filippini, Joe: The Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma Collection (TCGA-KIRC). http://dx.doi.org/10.7937/K9/TCIA.2016.V6PBVTDR. Version: 2016
[23] Mayerhoefer, Marius E. ; Materka, Andrzej ; Langs, Georg ; Häggström, Ida ; Szczypiński, Piotr ; Gibbs, Peter ; Cook, Gary: Introduction to Radiomics. In: Journal of Nuclear Medicine 61 (2020), April, No. 4, 488–495. http://dx.doi.org/10.2967/jnumed.118.222893. – DOI 10.2967/jnumed.118.222893. – ISSN 0161-5505, 2159-662X
[24] Deasy, Joseph O. ; Blanco, Angel I. ; Clark, Vanessa H.: CERR: A computational environment for radiotherapy research. In: Medical Physics 30 (2003), April, No. 5, 979–985. http://dx.doi.org/10.1118/1.1568978. – DOI 10.1118/1.1568978. – ISSN 0094-2405
[25] Chatterjee, Avishek ; Vallières, Martin ; Dohan, Anthony ; Levesque, Ives R. ; Ueno, Yoshiko ; Bist, Vipul ; Saif, Sameh ; Reinhold, Caroline ; Seuntjens, Jan: An Empirical Approach for Avoiding False Discoveries When Applying High-Dimensional Radiomics to Small Datasets. In: IEEE Transactions on Radiation and Plasma Medical Sciences 3 (2019), March, No. 2, 201–209. http://dx.doi.org/10.1109/TRPMS.2018.2880617. – DOI 10.1109/TRPMS.2018.2880617. – ISSN 2469-7303
[26] Wang, Hui ; Duentsch, Ivo ; Guo, Gongde ; Khan, Sadiq A.: Special issue on small data analytics. In: International Journal of Machine Learning and Cybernetics 14 (2023), January, No. 1, 1–2. http://dx.doi.org/10.1007/s13042-022-01699-0. – DOI 10.1007/s13042-022-01699-0. – ISSN 1868-808X
[27] Yates, Luke A. ; Aandahl, Zach ; Richards, Shane A. ; Brook, Barry W.: Cross validation for model selection: A review with examples from ecology. In: Ecological Monographs 93 (2023), No. 1, e1557. http://dx.doi.org/10.1002/ecm.1557. – DOI 10.1002/ecm.1557. – ISSN 1557-7015
[28] Sun, Xu ; Xu, Weichao: Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves. In: IEEE Signal Processing Letters 21 (2014), November, No. 11, 1389–1393. http://dx.doi.org/10.1109/LSP.2014.2337313. – DOI 10.1109/LSP.2014.2337313. – ISSN 1558-2361
[29] Zwanenburg, Alex ; Leger, Stefan ; Vallières, Martin ; Löck, Steffen: Image biomarker standardisation initiative. In: Radiology 295 (2020), May, No. 2, 328–338. http://dx.doi.org/10.1148/radiol.2020191145. – DOI 10.1148/radiol.2020191145. – ISSN 0033-8419, 1527-1315. – arXiv:1612.07003
[30] Reiazi, Reza ; Abbas, Engy ; Famiyeh, Petra ; Rezaie, Aria ; Kwan, Jennifer Y. Y. ; Patel, Tirth ; Bratman, Scott V. ; Tadic, Tony ; Liu, Fei-Fei ; Haibe-Kains, Benjamin: The impact of the variation of imaging parameters on the robustness of Computed Tomography radiomic features: A review. In: Computers in Biology and Medicine 133 (2021), June, 104400. http://dx.doi.org/10.1016/j.compbiomed.2021.104400. – DOI 10.1016/j.compbiomed.2021.104400. – ISSN 0010-4825
[31] Zhang, Jing ; Yao, Kuan ; Liu, Panpan ; Liu, Zhenyu ; Han, Tao ; Zhao, Zhiyong ; Cao, Yuntai ; Zhang, Guojin ; Zhang, Junting ; Tian, Jie ; Zhou, Junlin: A radiomics model for preoperative prediction of brain invasion in meningioma non-invasively based on MRI: A multicentre study. In: eBioMedicine 58 (2020), August. http://dx.doi.org/10.1016/j.ebiom.2020.102933. – DOI 10.1016/j.ebiom.2020.102933. – ISSN 2352-3964
[32] Wang, Wei ; Cao, KaiMing ; Jin, ShengMing ; Zhu, XiaoLi ; Ding, JianHui ; Peng, WeiJun: Differentiation of renal cell carcinoma subtypes through MRI-based radiomics analysis. In: European Radiology 30 (2020), October, No. 10, 5738–5747. http://dx.doi.org/10.1007/s00330-020-06896-5. – DOI 10.1007/s00330-020-06896-5. – ISSN 0938-7994, 1432-1084
[33] E, Linning ; Lu, Lin ; Li, Li ; Yang, Hao ; Schwartz, Lawrence H. ; Zhao, Binsheng: Radiomics for Classifying Histological Subtypes of Lung Cancer Based on Multiphasic Contrast-Enhanced Computed Tomography. In: Journal of Computer Assisted Tomography 43 (2019), March, No. 2, 300–306. http://dx.doi.org/10.1097/RCT.0000000000000836. – DOI 10.1097/RCT.0000000000000836. – ISSN 1532-3145, 0363-8715
[34] Zwanenburg, Alex ; Leger, Stefan ; Agolli, Linda ; Pilz, Karoline ; Troost, Esther G. C. ; Richter, Christian ; Löck, Steffen: Assessing robustness of radiomic features by image perturbation. In: Scientific Reports 9 (2019), January, No. 1, 614. http://dx.doi.org/10.1038/s41598-018-36938-4. – DOI 10.1038/s41598-018-36938-4. – ISSN 2045-2322
[35] Collewet, G. ; Strzelecki, M. ; Mariette, F.: Influence of MRI acquisition protocols and image intensity normalization methods on texture classification. In: Magnetic Resonance Imaging 22 (2004), January, No. 1, 81–91. http://dx.doi.org/10.1016/j.mri.2003.09.001. – DOI 10.1016/j.mri.2003.09.001. – ISSN 0730-725X
[36] Horng, Hannah ; Singh, Apurva ; Yousefi, Bardia ; Cohen, Eric A. ; Haghighi, Babak ; Katz, Sharyn ; Noël, Peter B. ; Kontos, Despina ; Shinohara, Russell T.: Improved generalized ComBat methods for harmonization of radiomic features. In: Scientific Reports 12 (2022), November, No. 1, 19009. http://dx.doi.org/10.1038/s41598-022-23328-0. – DOI 10.1038/s41598-022-23328-0. – ISSN 2045-2322
[37] Abler, Daniel ; Schaer, Roger ; Oreiller, Valentin ; Verma, Himanshu ; Reichenbach, Julien ; Aidonopoulos, Orfeas ; Evéquoz, Florian ; Jreige, Mario ; Prior, John O. ; Depeursinge, Adrien: QuantImage v2: a comprehensive and integrated physician-centered cloud platform for radiomics and machine learning research. In: European Radiology Experimental 7 (2023), March, No. 1, 16. http://dx.doi.org/10.1186/s41747-023-00326-z. – DOI 10.1186/s41747-023-00326-z. – ISSN 2509-9280