-
Amazing Things Come From Having Many Good Models
Authors:
Cynthia Rudin,
Chudi Zhong,
Lesia Semenova,
Margo Seltzer,
Ronald Parr,
Jiachang Liu,
Srikar Katta,
Jon Donnelly,
Harry Chen,
Zachery Boner
Abstract:
The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, but mostly magic. In light of the Rashomon Effect, this perspective piece proposes reshaping the way we think about machine learning, particularly for tabular data problems in the nondeterministic (noisy) setting. We address how the Rashomon Effect impacts (1) the existence of simple-yet-accurate models, (2) flexibility to address user preferences, such as fairness and monotonicity, without losing performance, (3) uncertainty in predictions, fairness, and explanations, (4) reliable variable importance, (5) algorithm choice, specifically, providing advanced knowledge of which algorithms might be suitable for a given problem, and (6) public policy. We also discuss a theory of when the Rashomon Effect occurs and why. Our goal is to illustrate how the Rashomon Effect can have a massive impact on the use of machine learning for complex problems in society.
Submitted 9 July, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
This Looks Better than That: Better Interpretable Models with ProtoPNeXt
Authors:
Frank Willard,
Luke Moffett,
Emmanuel Mokel,
Jon Donnelly,
Stark Guo,
Julia Yang,
Giyoung Kim,
Alina Jade Barnett,
Cynthia Rudin
Abstract:
Prototypical-part models are a popular interpretable alternative to black-box deep learning models for computer vision. However, they are difficult to train, with high sensitivity to hyperparameter tuning, inhibiting their application to new datasets and our understanding of which methods truly improve their performance. To facilitate the careful study of prototypical-part networks (ProtoPNets), we create a new framework for integrating components of prototypical-part models -- ProtoPNeXt. Using ProtoPNeXt, we show that applying Bayesian hyperparameter tuning and an angular prototype similarity metric to the original ProtoPNet is sufficient to produce new state-of-the-art accuracy for prototypical-part models on CUB-200 across multiple backbones. We further deploy this framework to jointly optimize for accuracy and prototype interpretability as measured by metrics included in ProtoPNeXt. Using the same resources, this produces models with substantially superior semantics and changes in accuracy between +1.3% and -1.5%. The code and trained models will be made publicly available upon publication.
Submitted 20 June, 2024;
originally announced June 2024.
-
FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
Authors:
Julia Yang,
Alina Jade Barnett,
Jon Donnelly,
Satvik Kishore,
Jerry Fang,
Fides Regina Schwartz,
Chaofan Chen,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are unsuitable and there is a call in these fields to make interpretable models. Recent work in interpretable computer vision provides transparency to these formerly black boxes by utilizing prototypes for case-based explanations, achieving high accuracy in applications including mammography. However, these models struggle with precise feature localization, reasoning on large portions of an image when only a small part is relevant. This paper addresses this gap by proposing a novel multi-scale interpretable deep learning model for mammographic mass margin classification. Our contribution not only offers an interpretable model with reasoning aligned with radiologist practices, but also provides a general architecture for computer vision with user-configurable prototypes from coarse- to fine-grained prototypes.
Submitted 10 June, 2024;
originally announced June 2024.
-
SiamQuality: A ConvNet-Based Foundation Model for Imperfect Physiological Signals
Authors:
Cheng Ding,
Zhicheng Guo,
Zhaoliang Chen,
Randall J Lee,
Cynthia Rudin,
Xiao Hu
Abstract:
Foundation models, especially those using transformers as backbones, have gained significant popularity, particularly in language and language-vision tasks. However, large foundation models are typically trained on high-quality data, which poses a significant challenge, given the prevalence of poor-quality real-world data. This challenge is more pronounced for developing foundation models for physiological data; such data are often noisy, incomplete, or inconsistent. The present work aims to provide a toolset for developing foundation models on physiological data. We leverage a large dataset of photoplethysmography (PPG) signals from hospitalized intensive care patients. For this data, we propose SiamQuality, a novel self-supervised learning task based on convolutional neural networks (CNNs) as the backbone to enforce representations to be similar for good and poor quality signals that are from similar physiological states. We pre-trained SiamQuality on over 36 million 30-second PPG pairs and then fine-tuned and tested on six downstream tasks using external datasets. The results demonstrate the superiority of the proposed approach on all the downstream tasks, which are extremely important for heart monitoring on wearable devices. Our method indicates that CNNs can be an effective backbone for foundation models that are robust to training data quality.
Submitted 26 April, 2024;
originally announced April 2024.
-
Data Poisoning Attacks on Off-Policy Policy Evaluation Methods
Authors:
Elita Lobo,
Harvineet Singh,
Marek Petrik,
Cynthia Rudin,
Himabindu Lakkaraju
Abstract:
Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under adversarial threats to data quality is largely unexplored. In this work, we make the first attempt at investigating the sensitivity of OPE methods to marginal adversarial perturbations to the data. We design a generic data poisoning attack framework leveraging influence functions from robust statistics to carefully construct perturbations that maximize error in the policy value estimates. We carry out extensive experimentation with multiple healthcare and control datasets. Our results demonstrate that many existing OPE methods are highly prone to generating value estimates with large errors when subject to data poisoning attacks, even for small adversarial perturbations. These findings question the reliability of policy values derived using OPE methods and motivate the need for developing OPE methods that are statistically robust to train-time data poisoning attacks.
Submitted 6 April, 2024;
originally announced April 2024.
-
What is different between these datasets?
Authors:
Varun Babbar,
Zhicheng Guo,
Cynthia Rudin
Abstract:
The performance of machine learning models heavily depends on the quality of input data, yet real-world applications often encounter various data-related challenges. One such challenge could arise when curating training data or deploying the model in the real world - two comparable datasets in the same domain may have different distributions. While numerous techniques exist for detecting distribution shifts, the literature lacks comprehensive approaches for explaining dataset differences in a human-understandable manner. To address this gap, we propose a suite of interpretable methods (toolbox) for comparing two datasets. We demonstrate the versatility of our approach across diverse data modalities, including tabular data, language, images, and signals in both low and high-dimensional settings. Our methods not only outperform comparable and related approaches in terms of explanation quality and correctness, but also provide actionable, complementary insights to understand and mitigate dataset differences effectively.
Submitted 8 March, 2024;
originally announced March 2024.
-
Sparse and Faithful Explanations Without Sparse Models
Authors:
Yiyang Sun,
Zhi Chen,
Vittorio Orlandi,
Tong Wang,
Cynthia Rudin
Abstract:
Even if a model is not globally sparse, it is possible for decisions made from that model to be accurately and faithfully described by a small number of features. For instance, an application for a large loan might be denied to someone because they have no credit history, which overwhelms any evidence towards their creditworthiness. In this work, we introduce the Sparse Explanation Value (SEV), a new way of measuring sparsity in machine learning models. In the loan denial example above, the SEV is 1 because only one factor is needed to explain why the loan was denied. SEV is a measure of decision sparsity rather than overall model sparsity, and we are able to show that many machine learning models -- even if they are not sparse -- actually have low decision sparsity, as measured by SEV. SEV is defined using movements over a hypercube, allowing SEV to be defined consistently over various model classes, with movement restrictions reflecting real-world constraints. We propose algorithms that reduce SEV without sacrificing accuracy, providing sparse and completely faithful explanations, even without globally sparse models.
Submitted 8 March, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Optimal Sparse Survival Trees
Authors:
Rui Zhang,
Rui Xin,
Margo Seltzer,
Cynthia Rudin
Abstract:
Interpretability is crucial for doctors, hospitals, pharmaceutical companies and biotechnology corporations to analyze and make decisions for high-stakes problems that involve human health. Tree-based methods have been widely adopted for survival analysis due to their appealing interpretability and their ability to capture complex relationships. However, most existing methods to produce survival trees rely on heuristic (or greedy) algorithms, which risk producing sub-optimal models. We present a dynamic-programming-with-bounds approach that finds provably-optimal sparse survival tree models, frequently in only a few seconds.
Submitted 22 May, 2024; v1 submitted 27 January, 2024;
originally announced January 2024.
-
Interpretable Causal Inference for Analyzing Wearable, Sensor, and Distributional Data
Authors:
Srikar Katta,
Harsh Parikh,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
Many modern causal questions ask how treatments affect complex outcomes that are measured using wearable devices and sensors. Current analysis approaches require summarizing these data into scalar statistics (e.g., the mean), but these summaries can be misleading. For example, disparate distributions can have the same means, variances, and other statistics. Researchers can overcome the loss of information by instead representing the data as distributions. We develop an interpretable method for distributional data analysis that ensures trustworthy and robust decision-making: Analyzing Distributional Data via Matching After Learning to Stretch (ADD MALTS). We (i) provide analytical guarantees of the correctness of our estimation strategy, (ii) demonstrate via simulation that ADD MALTS outperforms other distributional data analysis methods at estimating treatment effects, and (iii) illustrate ADD MALTS' ability to verify whether there is enough cohesion between treatment and control units within subpopulations to trustworthily estimate treatment effects. We demonstrate ADD MALTS' utility by studying the effectiveness of continuous glucose monitors in mitigating diabetes risks.
Submitted 20 March, 2024; v1 submitted 16 December, 2023;
originally announced December 2023.
-
ProtoEEGNet: An Interpretable Approach for Detecting Interictal Epileptiform Discharges
Authors:
Dennis Tang,
Frank Willard,
Ronan Tegerdine,
Luke Triplett,
Jon Donnelly,
Luke Moffett,
Lesia Semenova,
Alina Jade Barnett,
Jin Jing,
Cynthia Rudin,
Brandon Westover
Abstract:
In electroencephalogram (EEG) recordings, the presence of interictal epileptiform discharges (IEDs) serves as a critical biomarker for seizures or seizure-like events. Detecting IEDs can be difficult; even highly trained experts disagree on the same sample. As a result, specialists have turned to machine-learning models for assistance. However, many existing models are black boxes and do not provide any human-interpretable reasoning for their decisions. In high-stakes medical applications, it is critical to have interpretable models so that experts can validate the reasoning of the model before making important diagnoses. We introduce ProtoEEGNet, a model that achieves state-of-the-art accuracy for IED detection while additionally providing an interpretable justification for its classifications. Specifically, it can reason that one EEG looks similar to another "prototypical" EEG that is known to contain an IED. ProtoEEGNet can therefore help medical professionals effectively detect IEDs while maintaining a transparent decision-making process.
Submitted 3 December, 2023;
originally announced December 2023.
-
Reconsideration on evaluation of machine learning models in continuous monitoring using wearables
Authors:
Cheng Ding,
Zhicheng Guo,
Cynthia Rudin,
Ran Xiao,
Fadi B Nahab,
Xiao Hu
Abstract:
This paper explores the challenges in evaluating machine learning (ML) models for continuous health monitoring using wearable devices beyond conventional metrics. We state the complexities posed by real-world variability, disease dynamics, user-specific characteristics, and the prevalence of false notifications, necessitating novel evaluation strategies. Drawing insights from large-scale heart studies, the paper offers a comprehensive guideline for robust ML model evaluation on continuous health monitoring.
Submitted 4 December, 2023;
originally announced December 2023.
-
Fast and Interpretable Mortality Risk Scores for Critical Care Patients
Authors:
Chloe Qinyu Zhu,
Muhang Tian,
Lesia Semenova,
Jiachang Liu,
Jack Xu,
Joseph Scarpa,
Cynthia Rudin
Abstract:
Prediction of mortality in intensive care unit (ICU) patients is an important task in critical care medicine. Prior work in creating mortality risk models falls into two major categories: domain-expert-created scoring systems, and black box machine learning (ML) models. Both of these have disadvantages: black box models are unacceptable for use in hospitals, whereas manual creation of models (including hand-tuning of logistic regression parameters) relies on humans to perform high-dimensional constrained optimization, which leads to a loss in performance. In this work, we bridge the gap between accurate black box models and hand-tuned interpretable models. We build on modern interpretable ML techniques to design accurate and interpretable mortality risk scores. We leverage the largest existing public ICU monitoring datasets, namely the MIMIC III and eICU datasets. By evaluating risk across medical centers, we are able to study generalization across domains. In order to customize our risk score models, we develop a new algorithm, GroupFasterRisk, which has several important benefits: (1) it uses a hard sparsity constraint, allowing users to directly control the number of features; (2) it incorporates group sparsity to allow more cohesive models; (3) it allows for monotonicity correction on models for including domain knowledge; (4) it produces many equally good models at once, which allows domain experts to choose among them. GroupFasterRisk creates its risk scores within hours, even on the large datasets we study here. GroupFasterRisk's risk scores perform better than risk scores currently used in hospitals, and have similar prediction performance to black box ML models (despite being much sparser). Because GroupFasterRisk produces a variety of risk scores and handles constraints, it allows design flexibility, which is the key enabler of practical and trustworthy model creation.
Submitted 21 November, 2023;
originally announced November 2023.
-
A Path to Simpler Models Starts With Noise
Authors:
Lesia Semenova,
Harry Chen,
Ronald Parr,
Cynthia Rudin
Abstract:
The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular datasets in criminal justice, healthcare, lending, education, and in other areas, which has practical implications about whether simpler models can attain the same level of accuracy as more complex models. An open question is why Rashomon ratios often tend to be large. In this work, we propose and study a mechanism of the data generation process, coupled with choices usually made by the analyst during the learning process, that determines the size of the Rashomon ratio. Specifically, we demonstrate that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models. Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets.
Submitted 30 October, 2023;
originally announced October 2023.
-
This Looks Like Those: Illuminating Prototypical Concepts Using Multiple Visualizations
Authors:
Chiyu Ma,
Brandon Zhao,
Chaofan Chen,
Cynthia Rudin
Abstract:
We present ProtoConcepts, a method for interpretable image classification combining deep learning and case-based reasoning using prototypical parts. Existing work in prototype-based image classification uses a "this looks like that" reasoning process, which dissects a test image by finding prototypical parts and combining evidence from these prototypes to make a final classification. However, all of the existing prototypical part-based image classifiers provide only one-to-one comparisons, where a single training image patch serves as a prototype to compare with a part of our test image. With these single-image comparisons, it can often be difficult to identify the underlying concept being compared (e.g., "is it comparing the color or the shape?"). Our proposed method modifies the architecture of prototype-based networks to instead learn prototypical concepts which are visualized using multiple image patches. Having multiple visualizations of the same prototype allows us to more easily identify the concept captured by that prototype (e.g., "the test image and the related training patches are all the same shade of blue"), and allows our model to create richer, more interpretable visual explanations. Our experiments show that our "this looks like those" reasoning process can be applied as a modification to a wide range of existing prototypical image classification networks while achieving comparable accuracy on benchmark datasets.
Submitted 28 October, 2023;
originally announced October 2023.
-
Safe and Interpretable Estimation of Optimal Treatment Regimes
Authors:
Harsh Parikh,
Quinn Lanners,
Zade Akras,
Sahar F. Zafar,
M. Brandon Westover,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
Recent statistical and reinforcement learning methods have significantly advanced patient care strategies. However, these approaches face substantial challenges in high-stakes contexts, including missing data, inherent stochasticity, and the critical requirements for interpretability and patient safety. Our work operationalizes a safe and interpretable framework to identify optimal treatment regimes. This approach involves matching patients with similar medical and pharmacological characteristics, allowing us to construct an optimal policy via interpolation. We perform a comprehensive simulation study to demonstrate the framework's ability to identify optimal policies even in complex settings. Ultimately, we operationalize our approach to study regimes for treating seizures in critically ill patients. Our findings strongly support personalized treatment strategies based on a patient's medical history and pharmacological features. Notably, we identify that reducing medication doses for patients with mild and brief seizure episodes while adopting aggressive treatment for patients in the intensive care unit experiencing intense seizures leads to more favorable outcomes.
Submitted 1 April, 2024; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Uncertainty Quantification of Bandgaps in Acoustic Metamaterials with Stochastic Geometric Defects and Material Properties
Authors:
Han Zhang,
Rayehe Karimi Mahabadi,
Cynthia Rudin,
Johann Guilleminot,
L. Catherine Brinson
Abstract:
This paper studies the utility of techniques within uncertainty quantification, namely spectral projection and polynomial chaos expansion, in reducing sampling needs for characterizing acoustic metamaterial dispersion band responses given stochastic material properties and geometric defects. A novel method of encoding geometric defects in an interpretable, resolution-independent manner is showcased in the formation of input space probability distributions. Orders-of-magnitude sampling reductions down to $\sim10^0$ and $\sim10^1$ are achieved in the 1D and 7D input space scenarios respectively while maintaining accurate output space probability distributions through combining Monte Carlo, quadrature rule, and sparse grid sampling with surrogate model fitting.
Submitted 19 October, 2023;
originally announced October 2023.
-
SiamAF: Learning Shared Information from ECG and PPG Signals for Robust Atrial Fibrillation Detection
Authors:
Zhicheng Guo,
Cheng Ding,
Duc H. Do,
Amit Shah,
Randall J. Lee,
Xiao Hu,
Cynthia Rudin
Abstract:
Atrial fibrillation (AF) is the most common type of cardiac arrhythmia. It is associated with an increased risk of stroke, heart failure, and other cardiovascular complications, but can be clinically silent. Passive AF monitoring with wearables may help reduce adverse clinical outcomes related to AF. Detecting AF in noisy wearable data poses a significant challenge, leading to the emergence of various deep learning techniques. Previous deep learning models learn from a single modality, either electrocardiogram (ECG) or photoplethysmography (PPG) signals. However, deep learning models often struggle to learn generalizable features and rely on features that are more susceptible to corruption from noise, leading to sub-optimal performance in certain scenarios, especially with low-quality signals. Given the increasing availability of ECG and PPG signal pairs from wearables and bedside monitors, we propose a new approach, SiamAF, leveraging a novel Siamese network architecture and joint learning loss function to learn shared information from both ECG and PPG signals. At inference time, the proposed model is able to predict AF from either PPG or ECG and outperforms baseline methods on three external test sets. It learns medically relevant features as a result of our novel architecture design. The proposed model also achieves comparable performance to traditional learning regimes while requiring far fewer training labels, providing a potential approach to reduce future reliance on manual labeling.
Submitted 8 March, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance
Authors:
Jon Donnelly,
Srikar Katta,
Cynthia Rudin,
Edward P. Browne
Abstract:
Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data. Additionally, even when accounting for all possible explanations for a given dataset, these insights may not generalize because not all good explanations are stable across reasonable data perturbations. We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution. Our framework is extremely flexible and can be integrated with most existing model classes and global variable importance metrics. We demonstrate through experiments that our framework recovers variable importance rankings for complex simulation setups where other methods fail. Further, we show that our framework accurately estimates the true importance of a variable for the underlying data distribution. We provide theoretical guarantees on the consistency and finite sample error rates for our estimator. Finally, we demonstrate its utility with a real-world case study exploring which genes are important for predicting HIV load in persons with HIV, highlighting an important gene that has not previously been studied in connection with HIV. Code is available at https://github.com/jdonnelly36/Rashomon_Importance_Distribution.
Submitted 1 April, 2024; v1 submitted 24 September, 2023;
originally announced September 2023.
-
Learned Kernels for Sparse, Interpretable, and Efficient Medical Time Series Processing
Authors:
Sully F. Chen,
Zhicheng Guo,
Cheng Ding,
Xiao Hu,
Cynthia Rudin
Abstract:
Background: Rapid, reliable, and accurate interpretation of medical signals is crucial for high-stakes clinical decision-making. The advent of deep learning allowed for an explosion of new models that offered unprecedented performance in medical time series processing but at a cost: deep learning models are often compute-intensive and lack interpretability.
Methods: We propose Sparse Mixture of Learned Kernels (SMoLK), an interpretable architecture for medical time series processing. The method learns a set of lightweight flexible kernels to construct a single-layer neural network, providing not only interpretability, but also efficiency and robustness. We introduce novel parameter reduction techniques to further reduce the size of our network. We demonstrate the power of our architecture on two important tasks: photoplethysmography (PPG) artifact detection and atrial fibrillation detection from single-lead electrocardiograms (ECGs). Our approach has performance similar to the state-of-the-art deep neural networks with several orders of magnitude fewer parameters, allowing for deep neural network level performance with extremely low-power wearable devices.
Results: Our interpretable method achieves greater than 99% of the performance of the state-of-the-art methods on the PPG artifact detection task, and even outperforms the state-of-the-art on a challenging out-of-distribution test set, while using dramatically fewer parameters (2% of the parameters of Segade, and about half of the parameters of Tiny-PPG). On single lead atrial fibrillation detection, our method matches the performance of a 1D-residual convolutional network, at less than 1% the parameter count, while exhibiting considerably better performance in the low-data regime, even when compared to a parameter-matched control deep network.
Submitted 2 April, 2024; v1 submitted 6 July, 2023;
originally announced July 2023.
-
A Self-Supervised Algorithm for Denoising Photoplethysmography Signals for Heart Rate Estimation from Wearables
Authors:
Pranay Jain,
Cheng Ding,
Cynthia Rudin,
Xiao Hu
Abstract:
Smart watches and other wearable devices are equipped with photoplethysmography (PPG) sensors for monitoring heart rate and other aspects of cardiovascular health. However, PPG signals collected from such devices are susceptible to corruption from noise and motion artifacts, which cause errors in heart rate estimation. Typical denoising approaches filter or reconstruct the signal in ways that eliminate much of the morphological information, even from the clean parts of the signal that would be useful to preserve. In this work, we develop an algorithm for denoising PPG signals that reconstructs the corrupted parts of the signal, while preserving the clean parts of the PPG signal. Our novel framework relies on self-supervised training, where we leverage a large database of clean PPG signals to train a denoising autoencoder. As we show, our reconstructed signals provide better estimates of heart rate from PPG signals than the leading heart rate estimation methods. Further experiments show significant improvement in Heart Rate Variability (HRV) estimation from PPG signals using our algorithm. We conclude that our algorithm denoises PPG signals in a way that can improve downstream analysis of many different health metrics from wearable devices.
Submitted 7 July, 2023;
originally announced July 2023.
-
A Double Machine Learning Approach to Combining Experimental and Observational Data
Authors:
Harsh Parikh,
Marco Morucci,
Vittorio Orlandi,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only one of these assumptions is violated, we provide semiparametrically efficient treatment effect estimators. However, our no-free-lunch theorem highlights the necessity of accurately identifying the violated assumption for consistent treatment effect estimation. Through comparative analyses, we show our framework's superiority over existing data fusion methods. The practical utility of our approach is further exemplified by three real-world case studies, underscoring its potential for widespread application in empirical research.
Submitted 2 April, 2024; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?
Authors:
Zhi Chen,
Sarah Tan,
Urszula Chajewska,
Cynthia Rudin,
Rich Caruana
Abstract:
Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have a large impact on the resulting machine learning model. In medical applications, the consequences may affect healthcare decisions. There are many methods in the literature for dealing with missing values, including state-of-the-art methods, which often depend on black-box models for imputation. In this work, we show how recent advances in interpretable machine learning provide a new perspective for understanding and tackling the missing value problem. We propose methods based on high-accuracy glass-box Explainable Boosting Machines (EBMs) that can help users (1) gain new insights into missingness mechanisms and better understand the causes of missingness, and (2) detect -- or even alleviate -- potential risks introduced by imputation algorithms. Experiments on real-world medical datasets illustrate the effectiveness of the proposed methods.
Submitted 23 April, 2023;
originally announced April 2023.
-
OKRidge: Scalable Optimal k-Sparse Ridge Regression
Authors:
Jiachang Liu,
Sam Rosen,
Chudi Zhong,
Cynthia Rudin
Abstract:
We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive the underlying dynamics. We propose a fast algorithm, OKRidge, for sparse ridge regression, using a novel lower bound calculation involving, first, a saddle point formulation, and from there, either (i) solving a linear system or (ii) using an ADMM-based approach, where the proximal operators can be efficiently evaluated by solving another linear system and an isotonic regression problem. We also propose a method to warm-start our solver, which leverages a beam search. Experimentally, our methods attain provable optimality with run times that are orders of magnitude faster than those of the existing MIP formulations solved by the commercial solver Gurobi.
Submitted 11 January, 2024; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Exploring and Interacting with the Set of Good Sparse Generalized Additive Models
Authors:
Chudi Zhong,
Zhi Chen,
Jiachang Liu,
Margo Seltzer,
Cynthia Rudin
Abstract:
In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm that usually produces only a single model does not facilitate such interaction. Approximating and exploring the Rashomon set, i.e., the set of all near-optimal models, addresses this practical challenge by providing the user with a searchable space containing a diverse set of models from which domain experts can choose. We present algorithms to efficiently and accurately approximate the Rashomon set of sparse, generalized additive models with ellipsoids for fixed support sets and use these ellipsoids to approximate Rashomon sets for many different support sets. The approximated Rashomon set serves as a cornerstone to solve practical challenges such as (1) studying the variable importance for the model class; (2) finding models under user-specified constraints (monotonicity, direct editing); and (3) investigating sudden changes in the shape functions. Experiments demonstrate the fidelity of the approximated Rashomon set and its effectiveness in solving practical challenges.
Submitted 17 November, 2023; v1 submitted 28 March, 2023;
originally announced March 2023.
-
Variable Importance Matching for Causal Inference
Authors:
Quinn Lanners,
Harsh Parikh,
Alexander Volfovsky,
Cynthia Rudin,
David Page
Abstract:
Our goal is to produce methods for observational causal inference that are auditable, easy to troubleshoot, accurate for treatment effect estimation, and scalable to high-dimensional data. We describe a general framework called Model-to-Match that achieves these goals by (i) learning a distance metric via outcome modeling, (ii) creating matched groups using the distance metric, and (iii) using the matched groups to estimate treatment effects. Model-to-Match uses variable importance measurements to construct a distance metric, making it a flexible framework that can be adapted to various applications. Concentrating on the scalability of the problem in the number of potential confounders, we operationalize the Model-to-Match framework with LASSO. We derive performance guarantees for settings where LASSO outcome modeling consistently identifies all confounders (importantly without requiring the linear model to be correctly specified). We also provide experimental results demonstrating the method's auditability, accuracy, and scalability as well as extensions to more general nonparametric outcome modeling.
Submitted 28 June, 2023; v1 submitted 22 February, 2023;
originally announced February 2023.
-
Optimal Sparse Regression Trees
Authors:
Rui Zhang,
Rui Xin,
Margo Seltzer,
Cynthia Rudin
Abstract:
Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering algorithm in 1-dimension over the set of labels. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly-correlated features.
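To make the lower-bound idea concrete: a tree with at most k leaves splits the labels into at most k groups and predicts each group's mean, so its squared error can never fall below the cost of the best k-cluster partition of the labels alone. The Python sketch below computes that optimal one-dimensional k-means cost with a simple dynamic program over the sorted labels. The function name `kmeans_1d_optimal_cost` and the toy labels are illustrative assumptions; this is not the authors' implementation and it ignores the sparsity penalty.

```python
def kmeans_1d_optimal_cost(values, k):
    """Minimum sum of squared errors when splitting 1-D values into k clusters."""
    xs = sorted(values)
    n = len(xs)
    if k >= n:
        return 0.0  # every point can sit in its own cluster

    # Prefix sums let us compute the SSE of any contiguous block in O(1).
    prefix = [0.0]
    prefix_sq = [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def sse(i, j):
        """SSE of xs[i:j] around its own mean (requires j > i)."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[i][j]: best SSE for the first i sorted points using exactly j clusters.
    cost = [[inf] * (k + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            cost[i][j] = min(cost[m][j - 1] + sse(m, i) for m in range(j - 1, i))
    return cost[n][k]


# Hypothetical labels reaching one subproblem; no 2-leaf tree can beat this SSE.
labels = [1, 1, 2, 10, 11, 12]
bound = kmeans_1d_optimal_cost(labels, 2)
```

This dynamic program runs in O(k n^2) time, which is enough for illustration; faster exact 1-D k-means algorithms exist.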
Submitted 9 April, 2023; v1 submitted 27 November, 2022;
originally announced November 2022.
-
Interpretable Machine Learning System to EEG Patterns on the Ictal-Interictal-Injury Continuum
Authors:
Alina Jade Barnett,
Zhicheng Guo,
Jin Jing,
Wendong Ge,
Cynthia Rudin,
M. Brandon Westover
Abstract:
In intensive care units (ICUs), critically ill patients are monitored with electroencephalograms (EEGs) to prevent serious brain injury. The number of patients who can be monitored is constrained by the availability of trained physicians to read EEGs, and EEG interpretation can be subjective and prone to inter-observer variability. Automated deep learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, black box deep learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of trust and adoption by clinicians. To address these challenges, we propose a novel interpretable deep learning model that not only predicts the presence of harmful brainwave patterns but also provides high-quality case-based explanations of its decisions. Our model performs better than the corresponding black box model, despite being constrained to be interpretable. The learned 2D embedded space provides the first global overview of the structure of ictal-interictal-injury continuum brainwave patterns. The ability to understand how our model arrived at its decisions will not only help clinicians to diagnose and treat harmful brain activities more accurately but also increase their trust and adoption of machine learning models in clinical practice; this could be an integral component of the ICU neurologists' standard workflow.
Submitted 11 April, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design
Authors:
Ali Behrouz,
Mathias Lecuyer,
Cynthia Rudin,
Margo Seltzer
Abstract:
Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the algorithms cannot handle weighted data samples. Specifically, they rely on the discreteness of the loss function, which means that real-valued weights cannot be directly used. For example, none of the existing techniques produce policies that incorporate inverse propensity weighting on individual data points. We present three algorithms for efficient sparse weighted decision tree optimization. The first approach directly optimizes the weighted loss function; however, it tends to be computationally inefficient for large datasets. Our second approach, which scales more efficiently, transforms weights to integer values and uses data duplication to transform the weighted decision tree optimization problem into an unweighted (but larger) counterpart. Our third algorithm, which scales to much larger datasets, uses a randomized procedure that samples each data point with a probability proportional to its weight. We present theoretical bounds on the error of the two fast methods and show experimentally that these methods can be two orders of magnitude faster than the direct optimization of the weighted loss, without losing significant accuracy.
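The two faster reductions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names, the `scale` parameter used to turn real-valued weights into integers, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def duplicate_by_integer_weights(samples, weights, scale=10):
    """Round scaled weights to integers and repeat each sample that many times.

    The result is an unweighted (but larger) dataset that any unweighted
    tree optimizer could consume.
    """
    expanded = []
    for sample, w in zip(samples, weights):
        copies = max(0, round(w * scale))
        expanded.extend([sample] * copies)
    return expanded


def sample_by_weight(samples, weights, n_draws, seed=0):
    """Draw n_draws samples with replacement, each with probability proportional to its weight."""
    rng = random.Random(seed)
    return rng.choices(samples, weights=weights, k=n_draws)


# Hypothetical weighted dataset (for example, inverse propensity weights).
samples = ["a", "b", "c"]
weights = [0.5, 1.5, 3.0]

expanded = duplicate_by_integer_weights(samples, weights, scale=2)
resampled = sample_by_weight(samples, weights, n_draws=1000)
duplicate_counts = Counter(expanded)
```

Duplication preserves the weight ratios up to rounding error but grows the dataset with `scale`, while sampling keeps the dataset size fixed at the cost of randomness.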
Submitted 25 October, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
FasterRisk: Fast and Accurate Interpretable Risk Scores
Authors:
Jiachang Liu,
Chudi Zhong,
Boxuan Li,
Margo Seltzer,
Cynthia Rudin
Abstract:
Over the last century, risk scores have been the most popular form of predictive model used in healthcare and criminal justice. Risk scores are sparse linear models with integer coefficients; often these models can be memorized or placed on an index card. Typically, risk scores have been created either without data or by rounding logistic regression coefficients, but these methods do not reliably produce high-quality risk scores. Recent work used mathematical programming, which is computationally slow. We introduce an approach for efficiently producing a collection of high-quality risk scores learned from data. Specifically, our approach produces a pool of almost-optimal sparse continuous solutions, each with a different support set, using a beam-search algorithm. Each of these continuous solutions is transformed into a separate risk score through a "star ray" search, where a range of multipliers are considered before rounding the coefficients sequentially to maintain low logistic loss. Our algorithm returns all of these high-quality risk scores for the user to consider. This method completes within minutes and can be valuable in a broad variety of applications.
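The multiplier idea can be illustrated with a simplified Python sketch: scale a continuous solution by each candidate multiplier, round to integers, and keep the multiplier whose rounded model has the lowest logistic loss. Note the assumptions: this sketch rounds all coefficients at once rather than sequentially, and the function names, toy data, and multiplier grid are hypothetical, so it should not be read as the FasterRisk algorithm itself.

```python
import math


def logistic_loss(coefs, intercept, X, y):
    """Mean logistic loss for labels in {-1, +1}, computed in a numerically stable way."""
    total = 0.0
    for row, label in zip(X, y):
        score = intercept + sum(c * v for c, v in zip(coefs, row))
        z = -label * score
        # log(1 + exp(z)) written to avoid overflow for large |z|.
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(X)


def round_with_multipliers(coefs, intercept, X, y, multipliers):
    """Try each multiplier, round the scaled model to integers, keep the best."""
    best = None
    for m in multipliers:
        int_coefs = [round(c * m) for c in coefs]
        int_intercept = round(intercept * m)
        # The integer score is divided by m before entering the logistic function.
        loss = logistic_loss([c / m for c in int_coefs], int_intercept / m, X, y)
        if best is None or loss < best[0]:
            best = (loss, m, int_coefs, int_intercept)
    return best


# Hypothetical continuous sparse solution and toy binary data.
X = [[1, 0], [0, 1], [1, 1], [0, 0]]
y = [1, -1, 1, -1]
coefs = [1.3, -0.4]
intercept = -0.2
multipliers = [1, 2, 3, 4, 5]

best = round_with_multipliers(coefs, intercept, X, y, multipliers)
```

Searching over multipliers matters because rounding at scale 1 can zero out small coefficients that a larger scale would preserve as nonzero integers.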
Submitted 11 October, 2022;
originally announced October 2022.
-
TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization
Authors:
Zijie J. Wang,
Chudi Zhong,
Rui Xin,
Takuya Takagi,
Zhi Chen,
Duen Horng Chau,
Cynthia Rudin,
Margo Seltzer
Abstract:
Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees--a huge set of almost-optimal interpretable ML models. To help ML practitioners identify models with desirable properties from this Rashomon set, we develop TimberTrek, the first interactive visualization system that summarizes thousands of sparse decision trees at scale. Two usage scenarios highlight how TimberTrek can empower users to easily explore, compare, and curate models that align with their domain knowledge and values. Our open-source tool runs directly in users' computational notebooks and web browsers, lowering the barrier to creating more responsible ML models. TimberTrek is available at the following public demo link: https://poloclub.github.io/timbertrek.
Submitted 19 September, 2022;
originally announced September 2022.
-
Exploring the Whole Rashomon Set of Sparse Decision Trees
Authors:
Rui Xin,
Chudi Zhong,
Zhi Chen,
Takuya Takagi,
Margo Seltzer,
Cynthia Rudin
Abstract:
In any given machine learning problem, there may be many models that could explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what could be expressed within a loss function. The Rashomon set is the set of all of these almost-optimal models. Rashomon sets can be extremely complicated, particularly for highly nonlinear function classes that allow complex interaction terms, such as decision trees. We provide the first technique for completely enumerating the Rashomon set for sparse decision trees; in fact, our work provides the first complete enumeration of any Rashomon set for a non-trivial problem with a highly nonlinear discrete function class. This allows the user an unprecedented level of control over model choice among all models that are approximately equally good. We represent the Rashomon set in a specialized data structure that supports efficient querying and sampling. We show three applications of the Rashomon set: 1) it can be used to study variable importance for the set of almost-optimal trees (as opposed to a single tree), 2) the Rashomon set for accuracy enables enumeration of the Rashomon sets for balanced accuracy and F1-score, and 3) the Rashomon set for a full dataset can be used to produce Rashomon sets constructed with only subsets of the dataset. Thus, we are able to examine Rashomon sets across problems with a new lens, enabling users to choose models rather than be at the mercy of an algorithm that produces only a single model.
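The definition of a Rashomon set can be shown on a toy model class small enough to enumerate by brute force. In the Python sketch below, the model class is three single-feature rules, and the set keeps every rule whose error is within `epsilon` of the best one. The function name, the data, and the threshold are illustrative assumptions; enumerating sparse decision trees requires the specialized technique described above, which this sketch does not attempt.

```python
def rashomon_set(models, X, y, epsilon):
    """Return {model name: error} for all models within epsilon of the best error."""
    errors = {
        name: sum(predict(row) != label for row, label in zip(X, y)) / len(y)
        for name, predict in models.items()
    }
    best = min(errors.values())
    return {name: err for name, err in errors.items() if err <= best + epsilon}


# Hypothetical binary dataset with three features.
X = [(1, 1, 0), (1, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0), (0, 0, 0)]
y = [1, 1, 0, 0, 1, 0]

# Toy model class: "predict the value of feature j" for each feature j.
# The default argument j=j binds each rule to its own feature index.
models = {f"x{j}": (lambda row, j=j: row[j]) for j in range(3)}

good_models = rashomon_set(models, X, y, epsilon=0.34)
```

Here the rule on `x0` is perfect, the rule on `x1` makes two mistakes out of six, and the rule on `x2` makes three, so only the first two fall inside the chosen tolerance. A user could then pick between them on grounds other than accuracy.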
Submitted 25 October, 2022; v1 submitted 16 September, 2022;
originally announced September 2022.
-
There is no Accuracy-Interpretability Tradeoff in Reinforcement Learning for Mazes
Authors:
Yishay Mansour,
Michal Moshkovitz,
Cynthia Rudin
Abstract:
Interpretability is an essential building block for trustworthiness in reinforcement learning systems. However, interpretability might come at the cost of deteriorated performance, leading many researchers to build complex models. Our goal is to analyze the cost of interpretability. We show that in certain cases, one can achieve policy interpretability while maintaining its optimality. We focus on a classical problem from reinforcement learning: mazes with $k$ obstacles in $\mathbb{R}^d$. We prove the existence of a small decision tree with a linear function at each inner node and depth $O(\log k + 2^d)$ that represents an optimal policy. Note that for the interesting case of a constant $d$, we have $O(\log k)$ depth. Thus, in this setting, there is no accuracy-interpretability tradeoff. To prove this result, we use a new "compressing" technique that might be useful in additional settings.
Submitted 9 June, 2022;
originally announced June 2022.
-
SegDiscover: Visual Concept Discovery via Unsupervised Semantic Segmentation
Authors:
Haiyang Huang,
Zhi Chen,
Cynthia Rudin
Abstract:
Visual concept discovery has long been deemed important to improve interpretability of neural networks, because a bank of semantically meaningful concepts would provide us with a starting point for building machine learning models that exhibit an intelligible reasoning process. Previous methods have disadvantages: either they rely on labelled support sets that incorporate human biases for objects that are "useful," or they fail to identify multiple concepts that occur within a single image. We reframe the concept discovery task as an unsupervised semantic segmentation problem, and present SegDiscover, a novel framework that discovers semantically meaningful visual concepts from imagery datasets with complex scenes without supervision. Our method contains three important pieces: generating concept primitives from raw images, discovering concepts by clustering in the latent space of a self-supervised pretrained encoder, and concept refinement via neural network smoothing. Experimental results provide evidence that our method can discover multiple concepts within a single image and outperforms state-of-the-art unsupervised methods on complex datasets such as Cityscapes and COCO-Stuff. Our method can be further used as a neural network explanation tool by comparing results obtained by different encoders.
Submitted 22 April, 2022;
originally announced April 2022.
-
Effects of Epileptiform Activity on Discharge Outcome in Critically Ill Patients
Authors:
Harsh Parikh,
Kentaro Hoffman,
Haoqi Sun,
Wendong Ge,
Jin Jing,
Rajesh Amerineni,
Lin Liu,
Jimeng Sun,
Sahar Zafar,
Aaron Struck,
Alexander Volfovsky,
Cynthia Rudin,
M. Brandon Westover
Abstract:
Epileptiform activity (EA) is associated with worse outcomes including increased risk of disability and death. However, the effect of EA on the neurologic outcome is confounded by the feedback between treatment with anti-seizure medications (ASM) and EA burden. A randomized clinical trial is challenging due to the sequential nature of EA-ASM feedback, as well as ethical reasons. However, some mechanistic knowledge is available, e.g., how drugs are absorbed. This knowledge together with observational data could provide a more accurate effect estimate using causal inference. We performed a retrospective cross-sectional study with 995 patients with the modified Rankin Scale (mRS) at discharge as the outcome and the EA burden defined as the mean or maximum proportion of time spent with EA in six-hour windows in the first 24 hours of electroencephalography as the exposure. We estimated the change in discharge mRS if everyone in the dataset had experienced a certain EA burden and were untreated. We combined pharmacological modeling with an interpretable matching method to account for confounding and EA-ASM feedback. Our matched groups' quality was validated by the neurologists. Having a maximum EA burden greater than 75% when untreated had a 22% increased chance of a poor outcome (severe disability or death), and mild but long-lasting EA increased the risk of a poor outcome by 14%. The effect sizes were heterogeneous depending on pre-admission profile, e.g., patients with hypoxic-ischemic encephalopathy (HIE) or acquired brain injury (ABI) were more affected. Interventions should put a higher priority on patients with an average EA burden higher than 10%, while treatment should be more conservative when the maximum EA burden is low.
Submitted 11 March, 2023; v1 submitted 9 March, 2022;
originally announced March 2022.
-
Fast Sparse Classification for Generalized Linear and Additive Models
Authors:
Jiachang Liu,
Chudi Zhong,
Margo Seltzer,
Cynthia Rudin
Abstract:
We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.
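To see why the exponential loss admits a closed-form line search, consider updating the coefficient of a single feature while holding the rest of the model fixed. Assuming the feature is binary (0/1), which is an assumption of this sketch rather than something stated above, the optimal step is half the log-ratio of the current exponential weights of positive and negative examples that have the feature. The function names and toy data below are hypothetical.

```python
import math


def exp_loss_after_step(scores, y, feature, delta):
    """Exponential loss after adding delta to the coefficient of a 0/1 feature."""
    return sum(
        math.exp(-yi * (s + delta * xi))
        for s, yi, xi in zip(scores, y, feature)
    )


def closed_form_step(scores, y, feature):
    """Optimal change to the coefficient of a 0/1 feature under exponential loss.

    Only samples with feature == 1 are affected by the step. Requires at
    least one positive and one negative sample among them.
    """
    w_pos = sum(
        math.exp(-s) for s, yi, xi in zip(scores, y, feature) if xi == 1 and yi == 1
    )
    w_neg = sum(
        math.exp(s) for s, yi, xi in zip(scores, y, feature) if xi == 1 and yi == -1
    )
    return 0.5 * math.log(w_pos / w_neg)


# Hypothetical current model scores, labels in {-1, +1}, and one binary feature.
scores = [0.2, -0.1, 0.4, 0.0]
y = [1, -1, 1, -1]
feature = [1, 1, 0, 1]

step = closed_form_step(scores, y, feature)
```

Because the loss is convex in the step size, comparing `exp_loss_after_step` at `step` against nearby values is a quick sanity check that the closed form lands at the minimum. The logistic loss has no such closed form, which is why it needs an iterative line search.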
Submitted 29 October, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
Fast Sparse Decision Tree Optimization via Reference Ensembles
Authors:
Hayden McTavish,
Chudi Zhong,
Reto Achermann,
Ilias Karimalis,
Jacques Chen,
Cynthia Rudin,
Margo Seltzer
Abstract:
Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960s, breakthroughs have only been made on the problem within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude, while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: when you are having trouble optimizing, just guess.
Submitted 5 July, 2022; v1 submitted 1 December, 2021;
originally announced December 2021.
-
How to See Hidden Patterns in Metamaterials with Interpretable Machine Learning
Authors:
Zhi Chen,
Alexander Ogren,
Chiara Daraio,
L. Catherine Brinson,
Cynthia Rudin
Abstract:
Machine learning models can assist with metamaterials design by approximating computationally expensive simulators or solving inverse design problems. However, past work has usually relied on black box deep neural networks, whose reasoning processes are opaque and require enormous datasets that are expensive to obtain. In this work, we develop two novel machine learning approaches to metamaterials discovery that have neither of these disadvantages. These approaches, called shape-frequency features and unit-cell templates, can discover 2D metamaterials with user-specified frequency band gaps. Our approaches provide logical rule-based conditions on metamaterial unit-cells that allow for interpretable reasoning processes, and generalize well across design spaces of different resolutions. The templates also provide design flexibility where users can almost freely design the fine resolution features of a unit-cell without affecting the user's desired band gap.
Submitted 1 October, 2022; v1 submitted 10 November, 2021;
originally announced November 2021.
-
BacHMMachine: An Interpretable and Scalable Model for Algorithmic Harmonization for Four-part Baroque Chorales
Authors:
Yunyao Zhu,
Stephen Hahn,
Simon Mak,
Yue Jiang,
Cynthia Rudin
Abstract:
Algorithmic harmonization - the automated harmonization of a musical piece given its melodic line - is a challenging problem that has garnered much interest from both music theorists and computer scientists. One genre of particular interest is the four-part Baroque chorales of J.S. Bach. Methods for algorithmic chorale harmonization typically adopt a black-box, "data-driven" approach: they do not explicitly integrate principles from music theory but rely on a complex learning model trained with a large amount of chorale data. We propose instead a new harmonization model, called BacHMMachine, which employs a "theory-driven" framework guided by music composition principles, along with a "data-driven" model for learning compositional features within this framework. As its name suggests, BacHMMachine uses a novel Hidden Markov Model based on key and chord transitions, providing a probabilistic framework for learning key modulations and chordal progressions from a given melodic line. This allows for the generation of creative, yet musically coherent chorale harmonizations; integrating compositional principles allows for a much simpler model that results in vast decreases in computational burden and greater interpretability compared to state-of-the-art algorithmic harmonization methods, at no penalty to quality of harmonization or musicality. We demonstrate this improvement via comprehensive experiments and Turing tests comparing BacHMMachine to existing methods.
Submitted 22 February, 2022; v1 submitted 15 September, 2021;
originally announced September 2021.
-
Interpretable Mammographic Image Classification using Case-Based Reasoning and Deep Learning
Authors:
Alina Jade Barnett,
Fides Regina Schwartz,
Chaofan Tao,
Chaofan Chen,
Yinhao Ren,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
When we deploy machine learning models in high-stakes medical settings, we must ensure these models make accurate predictions that are consistent with known medical science. Inherently interpretable networks address this need by explaining the rationale behind each decision while maintaining equal or higher accuracy compared to black-box models. In this work, we present a novel interpretable neural network algorithm that uses case-based reasoning for mammography. Designed to aid a radiologist in their decisions, our network presents both a prediction of malignancy and an explanation of that prediction using known medical features. In order to yield helpful explanations, the network is designed to mimic the reasoning processes of a radiologist: our network first detects the clinically relevant semantic features of each image by comparing each new image with a learned set of prototypical image parts from the training images, then uses those clinical features to predict malignancy. Compared to other methods, our model detects clinical features (mass margins) with equal or higher accuracy, provides a more detailed explanation of its prediction, and is better able to differentiate the classification-relevant parts of the image.
Submitted 4 October, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
Multitask Learning for Citation Purpose Classification
Authors:
Alex Oesterling,
Angikar Ghosal,
Haoyang Yu,
Rui Xin,
Yasa Baig,
Lesia Semenova,
Cynthia Rudin
Abstract:
We present our entry into the 2021 3C Shared Task Citation Context Classification based on Purpose competition. The goal of the competition is to classify a citation in a scientific article based on its purpose. This task is important because it could potentially lead to more comprehensive ways of summarizing the purpose and uses of scientific articles, but it is also difficult, mainly due to the limited amount of available training data in which the purposes of each citation have been hand-labeled, along with the subjectivity of these labels. Our entry in the competition is a multi-task model that combines multiple modules designed to handle the problem from different perspectives, including hand-generated linguistic features, TF-IDF features, and an LSTM-with-attention model. We also provide an ablation study and feature analysis whose insights could lead to future work.
Submitted 24 June, 2021;
originally announced June 2021.
-
A Holistic Approach to Interpretability in Financial Lending: Models, Visualizations, and Summary-Explanations
Authors:
Chaofan Chen,
Kangcheng Lin,
Cynthia Rudin,
Yaron Shaposhnik,
Sijia Wang,
Tong Wang
Abstract:
Lending decisions are usually made with proprietary models that provide minimally acceptable explanations to users. In a future world without such secrecy, what decision support tools would one want to use for justified lending decisions? This question is timely, since the economy has dramatically shifted due to a pandemic, and a massive number of new loans will be necessary in the short term. We propose a framework for such decisions, including a globally interpretable machine learning model, an interactive visualization of it, and several types of summaries and explanations for any given decision. The machine learning model is a two-layer additive risk model, which resembles a two-layer neural network, but is decomposable into subscales. In this model, each node in the first (hidden) layer represents a meaningful subscale model, and all of the nonlinearities are transparent. Our online visualization tool allows exploration of this model, showing precisely how it came to its conclusion. We provide three types of explanations that are simpler than, but consistent with, the global model: case-based reasoning explanations that use neighboring past cases, a set of features that were the most important for the model's prediction, and summary-explanations that provide a customized sparse explanation for any particular lending decision made by the model. Our framework earned the FICO recognition award for the Explainable Machine Learning Challenge, which was the first public challenge in the domain of explainable machine learning.
Submitted 4 June, 2021;
originally announced June 2021.
-
Playing Codenames with Language Graphs and Word Embeddings
Authors:
Divya Koyyalagunta,
Anna Sun,
Rachel Lea Draelos,
Cynthia Rudin
Abstract:
Although board games and video games have been studied for decades in artificial intelligence research, challenging word games remain relatively unexplored. Word games are not as constrained as games like chess or poker. Instead, word game strategy is defined by the players' understanding of the way words relate to each other. The word game Codenames provides a unique opportunity to investigate common sense understanding of relationships between words, an important open challenge. We propose an algorithm that can generate Codenames clues from the language graph BabelNet or from any of several embedding methods - word2vec, GloVe, fastText or BERT. We introduce a new scoring function that measures the quality of clues, and we propose a weighting term called DETECT that incorporates dictionary-based word representations and document frequency to improve clue selection. We develop BabelNet-Word Selection Framework (BabelNet-WSF) to improve BabelNet clue quality and overcome the computational barriers that previously prevented leveraging language graphs for Codenames. Extensive experiments with human evaluators demonstrate that our proposed innovations yield state-of-the-art performance, with up to 102.8% improvement in precision@2 in some cases. Overall, this work advances the formal study of word games and approaches for common sense language understanding.
Submitted 12 May, 2021;
originally announced May 2021.
-
Ethical Implementation of Artificial Intelligence to Select Embryos in In Vitro Fertilization
Authors:
Michael Anis Mihdi Afnan,
Cynthia Rudin,
Vincent Conitzer,
Julian Savulescu,
Abhishek Mishra,
Yanhe Liu,
Masoud Afnan
Abstract:
AI has the potential to revolutionize many areas of healthcare. Radiology, dermatology, and ophthalmology are some of the areas most likely to be impacted in the near future, and they have received significant attention from the broader research community. But AI techniques are now also starting to be used in in vitro fertilization (IVF), in particular for selecting which embryos to transfer to the woman. The contribution of AI to IVF is potentially significant, but must be done carefully and transparently, as the ethical issues are significant, in part because this field involves creating new people. We first give a brief introduction to IVF and review the use of AI for embryo selection. We discuss concerns with the interpretation of the reported results from scientific and practical perspectives. We then consider the broader ethical issues involved. We discuss in detail the problems that result from the use of black-box methods in this context and advocate strongly for the use of interpretable models. Importantly, there have been no published trials of clinical effectiveness, a problem in both the AI and IVF communities, and we therefore argue that clinical implementation at this point would be premature. Finally, we discuss ways for the broader AI community to become involved to ensure scientifically sound and ethically responsible development of AI in IVF.
Submitted 30 April, 2021;
originally announced May 2021.
-
IAIA-BL: A Case-based Interpretable Deep Learning Model for Classification of Mass Lesions in Digital Mammography
Authors:
Alina Jade Barnett,
Fides Regina Schwartz,
Chaofan Tao,
Chaofan Chen,
Yinhao Ren,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
Interpretability in machine learning models is important in high-stakes decisions, such as whether to order a biopsy based on a mammographic exam. Mammography poses important challenges that are not present in other computer vision tasks: datasets are small, confounding information is present, and it can be difficult even for a radiologist to decide between watchful waiting and biopsy based on a mammogram alone. In this work, we present a framework for interpretable machine learning-based mammography. In addition to predicting whether a lesion is malignant or benign, our work aims to follow the reasoning processes of radiologists in detecting clinically relevant semantic features of each image, such as the characteristics of the mass margins. The framework includes a novel interpretable neural network algorithm that uses case-based reasoning for mammography. Our algorithm can incorporate a combination of data with whole image labelling and data with pixel-wise annotations, leading to better accuracy and interpretability even with a small number of images. Our interpretable models are able to highlight the classification-relevant parts of the image, whereas other methods highlight healthy tissue and confounding information. Our models are decision aids, rather than decision makers, aimed at better overall human-machine collaboration. We do not observe a loss in mass margin classification accuracy over a black box neural network trained on the same data.
Submitted 23 March, 2021;
originally announced March 2021.
-
Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges
Authors:
Cynthia Rudin,
Chaofan Chen,
Zhi Chen,
Haiyang Huang,
Lesia Semenova,
Chudi Zhong
Abstract:
Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. We also identify 10 technical challenge areas in interpretable machine learning and provide history and background on each problem. Some of these problems are classically important, and some are recent problems that have arisen in the last few years. These problems are: (1) Optimizing sparse logical models such as decision trees; (2) Optimization of scoring systems; (3) Placing constraints into generalized additive models to encourage sparsity and better interpretability; (4) Modern case-based reasoning, including neural networks and matching for causal inference; (5) Complete supervised disentanglement of neural networks; (6) Complete or even partial unsupervised disentanglement of neural networks; (7) Dimensionality reduction for data visualization; (8) Machine learning models that can incorporate physics and other generative or causal constraints; (9) Characterization of the "Rashomon set" of good models; and (10) Interpretable reinforcement learning. This survey is suitable as a starting point for statisticians and computer scientists interested in working in interpretable machine learning.
Submitted 9 July, 2021; v1 submitted 20 March, 2021;
originally announced March 2021.
-
There Once Was a Really Bad Poet, It Was Automated but You Didn't Know It
Authors:
Jianyou Wang,
Xiaoxuan Zhang,
Yuren Zhou,
Christopher Suh,
Cynthia Rudin
Abstract:
Limerick generation exemplifies some of the most difficult challenges faced in poetry generation, as the poems must tell a story in only five lines, with constraints on rhyme, stress, and meter. To address these challenges, we introduce LimGen, a novel and fully automated system for limerick generation that outperforms state-of-the-art neural network-based poetry models, as well as prior rule-based poetry models. LimGen consists of three important pieces: the Adaptive Multi-Templated Constraint algorithm that constrains our search to the space of realistic poems, the Multi-Templated Beam Search algorithm which searches efficiently through the space, and the probabilistic Storyline algorithm that provides coherent storylines related to a user-provided prompt word. The resulting limericks satisfy poetic constraints and have thematically coherent storylines, which are sometimes even funny (when we are lucky).
Submitted 5 March, 2021;
originally announced March 2021.
-
dame-flame: A Python Library Providing Fast Interpretable Matching for Causal Inference
Authors:
Neha R. Gupta,
Vittorio Orlandi,
Chia-Rui Chang,
Tianyu Wang,
Marco Morucci,
Pritam Dey,
Thomas J. Howell,
Xian Sun,
Angikar Ghosal,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
dame-flame is a Python package for performing matching for observational causal inference on datasets containing discrete covariates. This package implements the Dynamic Almost Matching Exactly (DAME) and Fast Large-Scale Almost Matching Exactly (FLAME) algorithms, which match treatment and control units on subsets of the covariates. The resulting matched groups are interpretable, because the matches are made on covariates, and high-quality, because machine learning is used to determine which covariates are important to match on. DAME solves an optimization problem that matches units on as many covariates as possible, prioritizing matches on important covariates. FLAME approximates the solution found by DAME via a much faster backward feature selection procedure. The package provides several adjustable parameters to adapt the algorithms to specific applications, and can calculate treatment effect estimates after matching. Descriptions of these parameters, details on estimating treatment effects, and further examples, can be found in the documentation at https://almost-matching-exactly.github.io/DAME-FLAME-Python-Package/
Submitted 2 April, 2023; v1 submitted 5 January, 2021;
originally announced January 2021.
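The idea of matching treatment and control units on subsets of discrete covariates, as described in the abstract above, can be illustrated with a toy sketch. This is not the dame-flame API and not the DAME or FLAME algorithm itself: the function names are hypothetical, matched units are removed after each round, and the order in which covariates are dropped is supplied by the caller, whereas the abstract says the package uses machine learning to decide which covariates matter.

```python
from collections import defaultdict

def match_dropping_covariates(units, covariates, drop_order):
    """Exact-match units on all active covariates, then drop one and repeat.

    units: list of dicts holding covariate values plus 'treated' (0/1)
    and 'outcome'. drop_order is an assumed, caller-supplied ordering.
    Returns (groups, unmatched); each group is (covariates_used, unit_indices).
    """
    unmatched = list(range(len(units)))
    active = list(covariates)
    to_drop = list(drop_order)
    groups = []
    while unmatched and active:
        # Bucket the remaining units by their values on the active covariates.
        buckets = defaultdict(list)
        for i in unmatched:
            buckets[tuple(units[i][c] for c in active)].append(i)
        still_unmatched = []
        for members in buckets.values():
            arms = {units[i]["treated"] for i in members}
            if arms == {0, 1}:  # a valid group needs both treated and control
                groups.append((tuple(active), members))
            else:
                still_unmatched.extend(members)
        unmatched = still_unmatched
        if not to_drop:
            break
        active.remove(to_drop.pop(0))
    return groups, unmatched

def group_effect(units, members):
    """Difference in mean outcome, treated minus control, within one group."""
    t = [units[i]["outcome"] for i in members if units[i]["treated"] == 1]
    c = [units[i]["outcome"] for i in members if units[i]["treated"] == 0]
    return sum(t) / len(t) - sum(c) / len(c)

units = [
    {"a": 1, "b": 0, "treated": 1, "outcome": 5.0},
    {"a": 1, "b": 0, "treated": 0, "outcome": 3.0},
    {"a": 0, "b": 1, "treated": 1, "outcome": 4.0},
    {"a": 0, "b": 0, "treated": 0, "outcome": 1.0},
]
groups, leftover = match_dropping_covariates(units, ["a", "b"], ["b"])
```

Each group records which covariates its members agree on, which is what makes the matches inspectable: a reader can see exactly why two units were compared.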
-
Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMAP, and PaCMAP for Data Visualization
Authors:
Yingfan Wang,
Haiyang Huang,
Cynthia Rudin,
Yaron Shaposhnik
Abstract:
Dimension reduction (DR) techniques such as t-SNE, UMAP, and TriMAP have demonstrated impressive visualization performance on many real world datasets. One tension that has always faced these methods is the trade-off between preservation of global structure and preservation of local structure: these methods can either handle one or the other, but not both. In this work, our main goal is to understand what aspects of DR methods are important for preserving both local and global structure: it is difficult to design a better method without a true understanding of the choices we make in our algorithms and their empirical impact on the lower-dimensional embeddings they produce. Towards the goal of local structure preservation, we provide several useful design principles for DR loss functions based on our new understanding of the mechanisms behind successful DR methods. Towards the goal of global structure preservation, our analysis illuminates that the choice of which components to preserve is important. We leverage these insights to design a new algorithm for DR, called Pairwise Controlled Manifold Approximation Projection (PaCMAP), which preserves both local and global structure. Our work provides several unexpected insights into what design choices both to make and avoid when constructing DR algorithms.
Submitted 24 August, 2021; v1 submitted 8 December, 2020;
originally announced December 2020.
-
Cryo-ZSSR: multiple-image super-resolution based on deep internal learning
Authors:
Qinwen Huang,
Ye Zhou,
Xiaochen Du,
Reed Chen,
Jianyou Wang,
Cynthia Rudin,
Alberto Bartesaghi
Abstract:
Single-particle cryo-electron microscopy (cryo-EM) is an emerging imaging modality capable of visualizing proteins and macro-molecular complexes at near-atomic resolution. The low electron doses used to prevent sample radiation damage result in images where the power of the noise is 100 times greater than the power of the signal. To overcome the low SNR, hundreds of thousands of particle projections acquired over several days of data collection are averaged in 3D to determine the structure of interest. Meanwhile, recent image super-resolution (SR) techniques based on neural networks have shown state-of-the-art performance on natural images. Building on these advances, we present a multiple-image SR algorithm based on deep internal learning designed specifically to work under low-SNR conditions. Our approach leverages the internal image statistics of cryo-EM movies and does not require training on ground-truth data. When applied to a single-particle dataset of apoferritin, we show that the resolution of 3D structures obtained from SR micrographs can surpass the limits imposed by the imaging system. Our results indicate that the combination of low magnification imaging with image SR has the potential to accelerate cryo-EM data collection without sacrificing resolution.
Submitted 22 November, 2020;
originally announced November 2020.
-
Bandits for BMO Functions
Authors:
Tianyu Wang,
Cynthia Rudin
Abstract:
We study the bandit problem where the underlying expected reward is a Bounded Mean Oscillation (BMO) function. BMO functions are allowed to be discontinuous and unbounded, and are useful in modeling signals with infinities in the domain. We develop a toolset for BMO bandits, and provide an algorithm that can achieve poly-log $δ$-regret -- a regret measured against an arm that is optimal after removing a $δ$-sized portion of the arm space.
Submitted 16 July, 2020;
originally announced July 2020.