Machine Learning Engineer with a focus on anything related to preserving privacy in this data-hungry world. Supervisors: Dr. Spyridon Bakas and Dr. Bjoern Menze
e13538 Background: Breast density is considered a well-established breast cancer risk factor. As quasi-3D, digital breast tomosynthesis (DBT) becomes increasingly utilized for screening, there is an opportunity to routinely estimate volumetric breast density (VBD). However, current methods extrapolate VBD from 2D images acquired with DBT and/or depend on the existence of raw DBT data, which is rarely archived due to cost and storage constraints. Using a racially diverse screening cohort, this study evaluates the potential of deep learning for VBD assessment based solely on 3D reconstructed, “for presentation” DBT images. Methods: We retrospectively analyzed 1,080 negative DBT screening exams obtained between 2011 and 2016 from the Hospital of the University of Pennsylvania (racial makeup, 41.2% White, 54.2% Black, 4.6% Other; mean age ± SD, 57 ± 11 years; mean BMI ± SD, 28.7 ± 7.1 kg/m²), for which both 2D raw and 3D reconstructed DBT images (Selenia Dimensions, Hologic Inc) were av...
7zip in python3 with ZStandard, PPMd, LZMA2, LZMA1, Delta, BCJ, BZip2, and Deflate compressions, and AES encryption.
From History: Misc bugfixes; Automatic check-pointing of the model has been added; Extending the codebase has been simplified; New optimizers added; New metrics added; Affine augmentation can now be significantly fine-tuned; Update logic for penalty calculation; RGB-specific augmentation added; Cropping added
International challenges have become the standard for validation of biomedical image analysis methods. We argue, though, that the actual performance even of the winning algorithms on "real-world" clinical data often remains unclear, as the data included in these challenges are usually acquired in very controlled settings at a few institutions. The seemingly obvious solution of just collecting increasingly more data from more institutions in such challenges does not scale well due to privacy and ownership hurdles. As the first challenge to ever be proposed for federated learning in medicine, the Federated Tumor Segmentation (FeTS) challenge 2021 intends to address these hurdles, both for the creation and the evaluation of tumor segmentation models. Specifically, FeTS 2021 uses clinical, multi-institutional MRI scans from the BraTS challenge and from various remote independent institutions included in the collaborative network of a real-world federation (www.fets.ai). The FeTS cha...
This dataset comprises two paired sets of expert segmentation labels for tumor sub-compartments of the pre-operative multi-institutional scans of the Ivy Glioblastoma Atlas Project (Ivy GAP) collection of The Cancer Imaging Archive (TCIA). These labels have been approved by independent expert board-certified neuroradiologists at the Hospital of the University of Pennsylvania and at Case Western Reserve University. Furthermore, for each of the paired sets of approved labels, a diverse comprehensive panel of radiomic features is provided, along with their corresponding skull-stripped and co-registered multi-parametric magnetic resonance imaging (mpMRI) volumes (i.e. native (T1) and post-contrast T1-weighted (T1-Gd), T2, T2-FLAIR), in NIfTI format.
Changelog: Misc bugfixes for segmentation and classification; DFU 2021 parameter file added; Added SDNet for supervised learning (https://doi.org/10.1016/j.media.2019.101535); Added option to re-orient all images to canonical; Preprocessing and augmentation made into separate submodules
A multitude of image-based machine learning segmentation and classification algorithms have recently been proposed, offering diagnostic decision support for the identification and characterization of glioma, Covid-19 and many other diseases. Even though these algorithms often outperform human experts in segmentation tasks, their limited reliability, and in particular the inability to detect failure cases, has hindered translation into clinical practice. To address this major shortcoming, we propose an unsupervised quality estimation method for segmentation ensembles. Our primitive solution examines discord in binary segmentation maps to automatically flag segmentation results that are particularly error-prone and therefore require special assessment by human readers. We validate our method both on segmentation of brain glioma in multi-modal magnetic resonance images and of lung lesions in computed tomography images. Additionally, our method provides an adaptive prioritization mechanism to...
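The idea of flagging error-prone cases from discord among ensemble members can be illustrated with a minimal sketch. This is not the authors' method: it assumes that discord is measured as the mean pairwise Dice overlap between flattened binary masks, and the function names and the 0.8 threshold are hypothetical choices made only for illustration.

```python
from itertools import combinations


def dice(a, b):
    """Dice overlap between two flattened binary masks (sequences of 0/1)."""
    intersection = sum(x * y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def ensemble_agreement(masks):
    """Mean pairwise Dice over all pairs of ensemble members."""
    pairs = list(combinations(masks, 2))
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


def flag_for_review(masks, threshold=0.8):
    """Flag a case for human review when the members disagree too much.

    The threshold is an assumed value, not one taken from the paper.
    """
    return ensemble_agreement(masks) < threshold


# Three hypothetical ensemble members segmenting the same four voxels.
discordant = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
concordant = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
print(flag_for_review(discordant), flag_for_review(concordant))
```

Because no ground-truth labels enter the computation, a score of this kind can be computed at inference time, which is what makes the approach unsupervised.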
Variously stained histology slices are routinely used by pathologists to assess extracted tissue samples from various anatomical sites and determine the presence or extent of a disease. Evaluation of sequential slides is expected to enable a better understanding of the spatial arrangement and growth patterns of cells and vessels. In this paper we present a practical two-step approach based on diffeomorphic registration to align digitized sequential histopathology stained slides to each other, starting with an initial affine step followed by the estimation of a detailed deformation field.
Radiomic features are being increasingly studied for clinical applications. We aimed to assess the agreement among radiomic features when computed by several groups by using different software packages under very tightly controlled conditions, which included standardized feature definitions and common image data sets. Ten sites (9 from the NCI's Quantitative Imaging Network positron emission tomography–computed tomography working group plus one site from outside that group) participated in this project. Nine common quantitative imaging features were selected for comparison including features that describe morphology, intensity, shape, and texture. The common image data sets were: three 3D digital reference objects (DROs) and 10 patient image scans from the Lung Image Database Consortium data set using a specific lesion in each scan. Each object (DRO or lesion) was accompanied by an already-defined volume of interest, from which the features were calculated. Feature values for e...
Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, 2020
Accurate segmentation of different sub-regions of gliomas including peritumoral edema, necrotic core, enhancing and non-enhancing tumor core from multimodal MRI scans has important clinical relevance in diagnosis, prognosis and treatment of brain tumors. However, due to the highly heterogeneous appearance and shape, segmentation of the sub-regions is very challenging. Recent developments in deep learning models have proved effective in the past several brain segmentation challenges as well as other semantic and medical image segmentation problems. Most models in brain tumor segmentation use a 2D/3D patch to predict the class label for the center voxel, and various patch sizes and scales are used to improve the model performance. However, this approach has low computational efficiency and a limited receptive field. U-Net is a widely used network structure for end-to-end segmentation and can be used on the entire image or extracted patches to provide classification labels over the entire input voxels, so it is more efficient and is expected to yield better performance with larger input size. In this paper we developed a deep-learning-based segmentation method using an ensemble of 3D U-Nets with different hyper-parameters. Furthermore, we estimated the uncertainty of the segmentation from the probabilistic outputs of each network and studied the correlation between the uncertainty and the performances. Preliminary results showed effectiveness of the segmentation model. Finally, we developed a linear model for survival prediction using extracted imaging and non-imaging features, which, despite the simplicity, can effectively reduce overfitting and regression errors.
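One common way to turn the probabilistic outputs of several networks into a per-voxel uncertainty estimate is sketched below. The abstract does not state which uncertainty measure was used, so this is only an assumption: the foreground probabilities of the ensemble members are averaged, and the binary entropy of the average serves as the uncertainty. All function names are hypothetical.

```python
import math


def mean_probability(prob_maps):
    """Average per-voxel foreground probability across ensemble members.

    prob_maps is a list of flattened probability maps, one per network.
    """
    n = len(prob_maps)
    return [sum(values) / n for values in zip(*prob_maps)]


def binary_entropy(p, eps=1e-12):
    """Entropy in bits of a Bernoulli(p) variable, clipped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def uncertainty_map(prob_maps):
    """Per-voxel uncertainty: entropy of the ensemble-averaged probability."""
    return [binary_entropy(p) for p in mean_probability(prob_maps)]


# Two hypothetical networks scoring the same two voxels.
example_maps = [[0.9, 0.5], [0.7, 0.5]]
example_uncertainty = uncertainty_map(example_maps)
print(example_uncertainty)
```

Under this measure a voxel where the averaged probability is close to 0.5 receives the highest uncertainty, while voxels on which the networks confidently agree receive values near zero.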
Papers by Sarthak Pati