Abstract
The aim of this study was to develop an open-source, modular system for 3D radiomics feature computation that runs locally or on a server, works on any computer system, and fits into existing workflows for finding associations and building predictive models between image features and clinical data, such as survival. The Quantitative Image Feature Engine (QIFE) exploits various levels of parallelization for use on multiprocessor systems. It consists of a managing framework and four stages: input, pre-processing, feature computation, and output. Each stage contains one or more swappable components, allowing run-time customization. We benchmarked the engine using various levels of parallelization on a cohort of CT scans presenting 108 lung tumors. Two versions of the QIFE have been released: (1) the open-source MATLAB code, posted to GitHub, and (2) a compiled version loaded in a Docker container, posted to DockerHub, which can be easily deployed on any computer. The QIFE processed 108 objects (tumors) in 2:12 (h:mm) using one core and in 1:04 (h:mm) using four cores with object-level parallelization. The QIFE is an open-source feature-extraction framework that focuses on modularity, standards, parallelism, provenance, and integration. Researchers can easily integrate it with their existing segmentation and imaging workflows by creating input and output components that implement their existing interfaces. Computational efficiency can be improved by parallelizing execution at the cost of memory usage. Different parallelization levels provide different trade-offs, and the optimal setting will depend on the size and composition of the dataset to be processed.
Acknowledgements
This research was funded in part by the following grants from the National Institutes of Health: R01 CA160251, U24 CA180927, U01 CA187947, and U01 CA190214.
Funding
This work was supported by National Institutes of Health grants R01 CA160251, U01 CA187947, U01 CA190214, and U24 CA180927.
Ethics declarations
Conflict of interest
Dr. Napel is a consultant for Carestream, Inc. and is on the scientific advisory boards of Echo Pixel, Inc., Fovia, Inc., and RadLogics, Inc.
Appendices
Appendix 1 Configuration Parameters for each component
Global
Parameter name | Default value | Description |
---|---|---|
inputRoot | N/A | Root directory for input. (All input folders are relative to this directory) |
outputRoot | N/A | Root directory for output. (All output folders are relative to this directory) |
parallelMode | none | What parallelization strategy to use (None, Object, Feature, Internal) |
numberOfProcessors | max | If parallelMode is other than None, the software creates a processing pool with numberOfProcessors processors. “Max” uses all available processors |
uidToProcess | all | A list of UIDs to be processed by the QIFE. If “all”, it processes all volumes loaded by the input stage |
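
The interaction between parallelMode and numberOfProcessors can be illustrated with a short sketch. This is not the QIFE's MATLAB implementation: the function names are hypothetical, only object-level parallelization is shown, and Python threads merely stand in for the engine's processing pool.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_pool_size(parallel_mode, number_of_processors):
    """Translate the two global settings into a worker count (hypothetical helper)."""
    if str(parallel_mode).lower() == "none":
        return 1  # parallelization disabled: run serially
    if str(number_of_processors).lower() == "max":
        return os.cpu_count() or 1  # "max": every processor the OS reports
    return max(1, int(number_of_processors))


def run_object_level(uids, compute_features, parallel_mode="object",
                     number_of_processors="max"):
    """Compute features with one task per object (UID)."""
    uids = list(uids)
    workers = resolve_pool_size(parallel_mode, number_of_processors)
    if workers == 1:
        return {uid: compute_features(uid) for uid in uids}
    # Threads are an assumption of this sketch, standing in for a processing pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(uids, pool.map(compute_features, uids)))
```

Because each object is handled by a separate worker, several volumes are held in memory at once, which is the memory cost of parallel execution that the abstract mentions.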
Input Stage
DSO/DICOM loader component
Parameter name | Default value | Description |
---|---|---|
dicomFolder | N/A | Folder relative to inputRoot where the DICOM sets are stored |
dsoFolder | N/A | Folder relative to inputRoot where the DSOs are stored |
recomputeHashTable | false | Recomputes the UID hash tables even if a cache index is found in the directory |
saveHashTable | true | Saves a cache of the UID hash tables in their root directories |
padding | 10 | Millimeters to go outside the VOI when loading the VOI |
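
Since padding is given in millimeters while volumes are indexed in voxels, a loader has to convert between the two. The sketch below shows one plausible conversion, assuming the padding is rounded up to whole voxels per axis and clipped to the volume; the function name and the rounding rule are assumptions, not taken from the QIFE source.

```python
import math


def padded_bounds(lo, hi, padding_mm, spacing_mm, shape):
    """Expand an inclusive voxel bounding box by padding_mm on every side."""
    new_lo, new_hi = [], []
    for low, high, spacing, size in zip(lo, hi, spacing_mm, shape):
        pad = math.ceil(padding_mm / spacing)    # millimeters -> whole voxels
        new_lo.append(max(0, low - pad))         # clip at the volume start
        new_hi.append(min(size - 1, high + pad)) # clip at the volume end
    return tuple(new_lo), tuple(new_hi)
```

With the default padding of 10 mm and a voxel spacing of 1.0 x 1.0 x 2.5 mm, this adds 10 voxels in-plane and 4 slices on each side of the VOI.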
Preprocessing stage
Segmentation deformation
Parameter name | Default value | Description |
---|---|---|
operation | N/A | Operation to perform on the VOI (“erosion” or “dilation”). |
sizeOfElement | N/A | Size of the ball used to perform the operation specified. |
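
As an illustration of the two operations, the following sketch erodes or dilates a segmentation stored as a set of voxel coordinates, using a ball-shaped structuring element. It assumes sizeOfElement is a radius in voxel units, which the table does not state, and the function names are hypothetical.

```python
from itertools import product


def ball_offsets(radius):
    """Integer offsets lying inside a ball of the given radius (in voxels)."""
    r = int(radius)
    return [(dx, dy, dz)
            for dx, dy, dz in product(range(-r, r + 1), repeat=3)
            if dx * dx + dy * dy + dz * dz <= radius * radius]


def dilate(voxels, radius):
    """Grow the segmentation: add every voxel the ball touches."""
    offsets = ball_offsets(radius)
    return {(x + dx, y + dy, z + dz)
            for x, y, z in voxels for dx, dy, dz in offsets}


def erode(voxels, radius):
    """Shrink the segmentation: keep voxels whose whole ball lies inside it."""
    offsets = ball_offsets(radius)
    return {(x, y, z) for x, y, z in voxels
            if all((x + dx, y + dy, z + dz) in voxels for dx, dy, dz in offsets)}
```

Dilating a single voxel with radius 1 yields that voxel plus its six face neighbors, and eroding the result with the same radius recovers the original voxel.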
Topology preservation
Parameter name | Default value | Description |
---|---|---|
sizeOfGap | N/A | Maximum size of gaps to be bridged. |
Maximum connected volume selection
Parameter name | Default value | Description |
---|---|---|
connectivity | 26 | Voxel connectivity used to decide whether neighboring voxels belong to the same volume (possible values: 6, 18, 26) |
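
The three connectivity values correspond to neighbors that share a face (6), a face or an edge (18), or a face, an edge, or a corner (26). The sketch below generates those neighborhoods and uses them to keep the largest connected volume; it is a generic breadth-first labeling with hypothetical function names, not the QIFE's own code.

```python
from collections import deque
from itertools import product


def neighbor_offsets(connectivity):
    """Offsets of the 6-, 18-, or 26-connected neighborhood of a voxel."""
    max_changed_axes = {6: 1, 18: 2, 26: 3}[connectivity]
    return [o for o in product((-1, 0, 1), repeat=3)
            if 0 < sum(abs(c) for c in o) <= max_changed_axes]


def largest_component(voxels, connectivity=26):
    """Return the largest connected set of voxels under the given connectivity."""
    offsets = neighbor_offsets(connectivity)
    remaining = set(voxels)
    best = set()
    while remaining:
        seed = remaining.pop()
        component, queue = {seed}, deque([seed])
        while queue:  # breadth-first flood fill from the seed
            x, y, z = queue.popleft()
            for dx, dy, dz in offsets:
                n = (x + dx, y + dy, z + dz)
                if n in remaining:
                    remaining.remove(n)
                    component.add(n)
                    queue.append(n)
        if len(component) > len(best):
            best = component
    return best
```

Two voxels that touch only at a corner form a single volume with connectivity 26 but two separate volumes with connectivity 6 or 18.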
Hole filling
Parameter name | Default value | Description |
---|---|---|
connectivity | 26 | Voxel connectivity used to decide whether neighboring voxels belong to the same volume (possible values: 6, 18, 26) |
Feature Computation Stage
Size distribution features
Parameter name | Default value | Description |
---|---|---|
featureRootName | size | The prefix to add to all results generated by this component |
Intensity distribution features
Parameter name | Default value | Description |
---|---|---|
featureRootName | intensity | The prefix to add to all results generated by this component |
Edge sharpness features
Parameter name | Default value | Description |
---|---|---|
featureRootName | edge | The prefix to add to all results generated by this component |
normalLength | 5 | Length in millimeters of normals in each direction. |
numberOfNormals | 600 | Number of normals after triangulation and decimation |
numberOfSamplingPoints | 21 | Number of intensity samples along a normal |
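
To make the defaults concrete: a normal that extends 5 mm in each direction and carries 21 samples has a sample every 0.5 mm if the samples are evenly spaced and include both endpoints. The sketch below computes those positions and a generic four-parameter sigmoid of the kind that could be fitted to the sampled intensities. The even spacing, the parameterization, and all names are assumptions rather than the QIFE's documented behavior.

```python
import math


def sampling_positions(normal_length_mm=5.0, number_of_sampling_points=21):
    """Signed distances (mm) along a normal, centered on the surface at 0."""
    step = 2 * normal_length_mm / (number_of_sampling_points - 1)
    return [-normal_length_mm + i * step
            for i in range(number_of_sampling_points)]


def sigmoid(x, low, high, center, slope):
    """Intensity model across an edge: low on one side, high on the other."""
    return low + (high - low) / (1.0 + math.exp(-slope * (x - center)))
```

In such a model the slope parameter describes how abruptly intensity changes across the boundary, and the difference between high and low describes the contrast.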
Local volume invariant integral (LVII) feature
Parameter name | Default value | Description |
---|---|---|
featureRootName | lvii | The prefix to add to all results generated by this component |
sphereRadius | 1,2,3,4,5 | Comma-separated list of radii of the sphere used to calculate intersections |
Roughness feature
Parameter name | Default value | Description |
---|---|---|
featureRootName | roughness | The prefix to add to all results generated by this component |
patchSize | 3 | Maximum distance in mm for a voxel to be considered in the same patch when roughness is computed |
Sphericity feature
Parameter name | Default value | Description |
---|---|---|
featureRootName | sphericity | The prefix to add to all results generated by this component |
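
For orientation, the classical definition of sphericity (Wadell) is the surface area of a sphere with the same volume as the object, divided by the object's own surface area. The sketch below evaluates that formula; whether the QIFE component uses exactly this form, and how it measures volume and surface area, is an assumption here.

```python
import math


def sphericity(volume, surface_area):
    """Wadell sphericity: pi^(1/3) * (6V)^(2/3) / A, equal to 1 for a sphere."""
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area
```

A perfect sphere scores 1, and less compact shapes score lower; a unit cube, for example, scores about 0.806.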
Haralick’s texture features
Parameter name | Default value | Description |
---|---|---|
featureRootName | haralick | The prefix to add to all results generated by this component |
distance | 1,2,3 | Distances in mm at which to calculate the GLCM. |
grayLevels | 16 | Number of gray levels to quantize intensity values to. |
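
The two parameters map onto the two steps of building a gray-level co-occurrence matrix (GLCM): quantizing intensities to a fixed number of levels, then counting pairs of levels at a given offset. The sketch below does this on a small 2D image for brevity. The QIFE works on 3D volumes with distances in mm, so the 2D layout, the voxel-unit offset, the min-max binning, and the function names are all simplifying assumptions.

```python
def quantize(image, gray_levels):
    """Map intensities to integer bins 0..gray_levels-1 by min-max scaling."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [[0 for _ in row] for row in image]  # flat image: one bin
    scale = gray_levels / (hi - lo)
    return [[min(gray_levels - 1, int((v - lo) * scale)) for v in row]
            for row in image]


def glcm(image, gray_levels=16, offset=(0, 1)):
    """Symmetric co-occurrence counts for pixel pairs separated by offset."""
    q = quantize(image, gray_levels)
    rows, cols = len(q), len(q[0])
    matrix = [[0] * gray_levels for _ in range(gray_levels)]
    dr, dc = offset
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = q[r][c], q[r2][c2]
                matrix[a][b] += 1  # count the pair in both directions
                matrix[b][a] += 1
    return matrix
```

Haralick's texture features are statistics computed from such a matrix after normalizing it to probabilities.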
Output stage
CSV Exporter
Parameter name | Default value | Description |
---|---|---|
filename | out.csv | Filename for the CSV file, relative to the output folder |
Transpose | false | Transpose the data in the CSV (headers in the first column) |
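
The effect of the Transpose option can be sketched as follows, assuming the default layout has one row per object with feature names as the header row. The "uid" column label and the function name are hypothetical.

```python
import csv
import io


def export_csv(feature_names, rows, transpose=False):
    """Serialize features as CSV text; rows is a list of (uid, values) pairs."""
    table = [["uid"] + list(feature_names)]
    table += [[uid] + list(values) for uid, values in rows]
    if transpose:
        # Swap rows and columns so the headers end up in the first column.
        table = [list(column) for column in zip(*table)]
    buffer = io.StringIO()
    csv.writer(buffer).writerows(table)
    return buffer.getvalue()
```

The transposed layout can be easier to read when there are few objects and many features.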
Run information exporter
Parameter name | Default value | Description |
---|---|---|
filename | Info.txt | Filename for the run information file |
Cross-sectional image generator
Parameter name | Default value | Description |
---|---|---|
folderRoot | . | Folder relative to output root where to save the generated images. By default it uses the output root folder. |
windowLevelPreset | ctLung | Window and Level preset. By default it assumes Lung CT |
Reference Generator
Parameter name | Default value | Description |
---|---|---|
filename | references.bib | Filename for the bib file, relative to the output folder |
Appendix 2 Example Configuration File
The configuration file defines which components are loaded in each stage and can override the default parameters (shown in Appendix 1). The file uses the following syntax:
Category|ParameterName = VALUE (The separator is a pipe “|”).
Category can be:
- global (sets parameters for the whole engine)
- input (sets parameters for the input stage)
- preprocessing (sets parameters for the preprocessing stage)
- featureComputation (sets parameters for the feature computation stage)
- output (sets parameters for the output stage), or
- a specific component name to override its defaults.
Multiple values can be given for a parameter, using a comma as the separator.
Comments are defined with a semicolon at the beginning of a line. The following is an example configuration file:
; Global Parameters
; Disables parallel mode
global|parallelMode = "none"
; Use the maximum number of processors
global|numberOfProcessors = "max"
; Process all files included in the input directory
global|uidToProcess = "all"
; Components to load
; Input components to load
input|component = "dsoLoader"
; Preprocessing components to load
preprocessing|components = "maximumConnected,holeFilling"
; Feature computation components to load
featureComputation|components = "information,size,intensity,sphericity,roughness,edgeSigmoidFitting,lvii,glcm,connectedRegions"
; Output components to load
output|components = "csvOutput,maxAreaImage,references"
; Component parameters to override (See Appendix 1 for definition)
; Number of Normals in the Edge Sigmoid Feature
edgeSigmoidFitting|numberOfNormals = 1200
; Window and Level preset
maxAreaImage|windowLevelPreset = "ctLung"
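
The syntax rules above are simple enough to parse in a few lines. The following is a minimal sketch of a reader for this format, not the QIFE's own parser: the function name is hypothetical, values are kept as strings, and comma-separated values are returned as lists.

```python
def parse_config(text):
    """Parse 'Category|ParameterName = VALUE' lines into nested dictionaries."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and semicolon comments
        key, equals, value = line.partition("=")
        category, pipe, name = key.strip().partition("|")
        if not equals or not pipe:
            raise ValueError("expected Category|ParameterName = VALUE: " + line)
        # Drop surrounding straight or typographic double quotes.
        value = value.strip().strip('"\u201c\u201d')
        values = [item.strip() for item in value.split(",")]
        section = config.setdefault(category.strip(), {})
        section[name.strip()] = values if len(values) > 1 else values[0]
    return config
```

Applied to the example file, this yields the entry "none" for parallelMode in the global category and a list of two component names for components in the preprocessing category.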
Cite this article
Echegaray, S., Bakr, S., Rubin, D.L. et al. Quantitative Image Feature Engine (QIFE): an Open-Source, Modular Engine for 3D Quantitative Feature Extraction from Volumetric Medical Images. J Digit Imaging 31, 403–414 (2018). https://doi.org/10.1007/s10278-017-0019-x
DOI: https://doi.org/10.1007/s10278-017-0019-x