-
METRIK: Measurement-Efficient Randomized Controlled Trials using Transformers with Input Masking
Authors:
Sayeri Lala,
Niraj K. Jha
Abstract:
Clinical randomized controlled trials (RCTs) collect hundreds of measurements spanning various metric types (e.g., laboratory tests, cognitive/motor assessments, etc.) across 100s-1000s of subjects to evaluate the effect of a treatment, but do so at the cost of significant trial expense. To reduce the number of measurements, trial protocols can be revised to remove metrics extraneous to the study'…
▽ More
Clinical randomized controlled trials (RCTs) collect hundreds of measurements spanning various metric types (e.g., laboratory tests, cognitive/motor assessments, etc.) across 100s-1000s of subjects to evaluate the effect of a treatment, but do so at the cost of significant trial expense. To reduce the number of measurements, trial protocols can be revised to remove metrics extraneous to the study's objective, but doing so requires additional human labor and limits the set of hypotheses that can be studied with the collected data. In contrast, a planned missing design (PMD) can reduce the amount of data collected without removing any metric by imputing the unsampled data. Standard PMDs randomly sample data to leverage statistical properties of imputation algorithms, but are ad hoc, hence suboptimal. Methods that learn PMDs produce more sample-efficient PMDs, but are not suitable for RCTs because they require ample prior data (150+ subjects) to model the data distribution. Therefore, we introduce a framework called Measurement EfficienT Randomized Controlled Trials using Transformers with Input MasKing (METRIK), which, for the first time, calculates a PMD specific to the RCT from a modest amount of prior data (e.g., 60 subjects). Specifically, METRIK models the PMD as a learnable input masking layer that is optimized with a state-of-the-art imputer based on the Transformer architecture. METRIK implements a novel sampling and selection algorithm to generate a PMD that satisfies the trial designer's objective, i.e., whether to maximize sampling efficiency or imputation performance for a given sampling budget. Evaluated across five real-world clinical RCT datasets, METRIK increases the sampling efficiency of and imputation performance under the generated PMD by leveraging correlations over time and across metrics, thereby removing the need to manually remove metrics from the RCT.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
CONFINE: Conformal Prediction for Interpretable Neural Networks
Authors:
Linhui Huang,
Sayeri Lala,
Niraj K. Jha
Abstract:
Deep neural networks exhibit remarkable performance, yet their black-box nature limits their utility in fields like healthcare where interpretability is crucial. Existing explainability approaches often sacrifice accuracy and lack quantifiable measures of prediction uncertainty. In this study, we introduce Conformal Prediction for Interpretable Neural Networks (CONFINE), a versatile framework that…
▽ More
Deep neural networks exhibit remarkable performance, yet their black-box nature limits their utility in fields like healthcare where interpretability is crucial. Existing explainability approaches often sacrifice accuracy and lack quantifiable measures of prediction uncertainty. In this study, we introduce Conformal Prediction for Interpretable Neural Networks (CONFINE), a versatile framework that generates prediction sets with statistically robust uncertainty estimates instead of point predictions to enhance model transparency and reliability. CONFINE not only provides example-based explanations and confidence estimates for individual predictions but also boosts accuracy by up to 3.6%. We define a new metric, correct efficiency, to evaluate the fraction of prediction sets that contain precisely the correct label and show that CONFINE achieves correct efficiency of up to 3.3% higher than the original accuracy, matching or exceeding prior methods. CONFINE's marginal and class-conditional coverages attest to its validity across tasks spanning medical image classification to language understanding. Being adaptable to any pre-trained classifier, CONFINE marks a significant advance towards transparent and trustworthy deep learning applications in critical domains.
△ Less
Submitted 1 June, 2024;
originally announced June 2024.
-
Label-Efficient Sleep Staging Using Transformers Pre-trained with Position Prediction
Authors:
Sayeri Lala,
Hanlin Goh,
Christopher Sandino
Abstract:
Sleep staging is a clinically important task for diagnosing various sleep disorders, but remains challenging to deploy at scale because it because it is both labor-intensive and time-consuming. Supervised deep learning-based approaches can automate sleep staging but at the expense of large labeled datasets, which can be unfeasible to procure for various settings, e.g., uncommon sleep disorders. Wh…
▽ More
Sleep staging is a clinically important task for diagnosing various sleep disorders, but remains challenging to deploy at scale because it because it is both labor-intensive and time-consuming. Supervised deep learning-based approaches can automate sleep staging but at the expense of large labeled datasets, which can be unfeasible to procure for various settings, e.g., uncommon sleep disorders. While self-supervised learning (SSL) can mitigate this need, recent studies on SSL for sleep staging have shown performance gains saturate after training with labeled data from only tens of subjects, hence are unable to match peak performance attained with larger datasets. We hypothesize that the rapid saturation stems from applying a sub-optimal pretraining scheme that pretrains only a portion of the architecture, i.e., the feature encoder, but not the temporal encoder; therefore, we propose adopting an architecture that seamlessly couples the feature and temporal encoding and a suitable pretraining scheme that pretrains the entire model. On a sample sleep staging dataset, we find that the proposed scheme offers performance gains that do not saturate with amount of labeled training data (e.g., 3-5\% improvement in balanced sleep staging accuracy across low- to high-labeled data settings), reducing the amount of labeled training data needed for high performance (e.g., by 800 subjects). Based on our findings, we recommend adopting this SSL paradigm for subsequent work on SSL for sleep staging.
△ Less
Submitted 29 March, 2024;
originally announced April 2024.
-
SECRETS: Subject-Efficient Clinical Randomized Controlled Trials using Synthetic Intervention
Authors:
Sayeri Lala,
Niraj K. Jha
Abstract:
The randomized controlled trial (RCT) is the gold standard for estimating the average treatment effect (ATE) of a medical intervention but requires 100s-1000s of subjects, making it expensive and difficult to implement. While a cross-over trial can reduce sample size requirements by measuring the treatment effect per individual, it is only applicable to chronic conditions and interventions whose e…
▽ More
The randomized controlled trial (RCT) is the gold standard for estimating the average treatment effect (ATE) of a medical intervention but requires 100s-1000s of subjects, making it expensive and difficult to implement. While a cross-over trial can reduce sample size requirements by measuring the treatment effect per individual, it is only applicable to chronic conditions and interventions whose effects dissipate rapidly. Another approach is to replace or augment data collected from an RCT with external data from prospective studies or prior RCTs, but it is vulnerable to confounders in the external or augmented data. We propose to simulate the cross-over trial to overcome its practical limitations while exploiting its strengths. We propose a novel framework, SECRETS, which, for the first time, estimates the individual treatment effect (ITE) per patient in the RCT study without using any external data by leveraging a state-of-the-art counterfactual estimation algorithm, called synthetic intervention. It also uses a new hypothesis testing strategy to determine whether the treatment has a clinically significant ATE based on the estimated ITEs. We show that SECRETS can improve the power of an RCT while maintaining comparable significance levels; in particular, on three real-world clinical RCTs (Phase-3 trials), SECRETS increases power over the baseline method by $\boldsymbol{6}$-$\boldsymbol{54\%}$ (average: 21.5%, standard deviation: 15.8%).
△ Less
Submitted 8 May, 2023;
originally announced May 2023.
-
Semi-Supervised Learning for Fetal Brain MRI Quality Assessment with ROI consistency
Authors:
Junshen Xu,
Sayeri Lala,
Borjan Gagoski,
Esra Abaci Turk,
P. Ellen Grant,
Polina Golland,
Elfar Adalsteinsson
Abstract:
Fetal brain MRI is useful for diagnosing brain abnormalities but is challenged by fetal motion. The current protocol for T2-weighted fetal brain MRI is not robust to motion so image volumes are degraded by inter- and intra- slice motion artifacts. Besides, manual annotation for fetal MR image quality assessment are usually time-consuming. Therefore, in this work, a semi-supervised deep learning me…
▽ More
Fetal brain MRI is useful for diagnosing brain abnormalities but is challenged by fetal motion. The current protocol for T2-weighted fetal brain MRI is not robust to motion so image volumes are degraded by inter- and intra- slice motion artifacts. Besides, manual annotation for fetal MR image quality assessment are usually time-consuming. Therefore, in this work, a semi-supervised deep learning method that detects slices with artifacts during the brain volume scan is proposed. Our method is based on the mean teacher model, where we not only enforce consistency between student and teacher models on the whole image, but also adopt an ROI consistency loss to guide the network to focus on the brain region. The proposed method is evaluated on a fetal brain MR dataset with 11,223 labeled images and more than 200,000 unlabeled images. Results show that compared with supervised learning, the proposed method can improve model accuracy by about 6\% and outperform other state-of-the-art semi-supervised learning methods. The proposed method is also implemented and evaluated on an MR scanner, which demonstrates the feasibility of online image quality assessment and image reacquisition during fetal MR scans.
△ Less
Submitted 22 June, 2020;
originally announced June 2020.