Skip to main content

David Watson

King's College London, Informatics, Faculty Member

University of Oxford, Oxford Internet Institute, Graduate Student

Queen Mary, University of London, William Harvey Research Institute, Post-Doc

University College London, Statistical Science, Post-Doc

Followers

123

Following

104

Co-authors

10

Public Views

Early career researcher with a long, interdisciplinary publication track record. I am passionate about the theory and practice of artificial intelligence – from the ethical impact of algorithms to their efficient design and deployment to tackle complex problems in the natural and social sciences.

less

Interests

Uploads

Papers by David Watson

Explaining Predictive Uncertainty with Information Theoretic Shapley Values

Advances in Neural Information Processing Systems, 2023

Researchers in explainable artificial intelligence have developed numerous methods for helping us... more Researchers in explainable artificial intelligence have developed numerous methods for helping users understand the predictions of complex supervised learning models. By contrast, explaining the uncertainty of model outputs has received relatively little attention. We adapt the popular Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's contribution to the conditional entropy of individual model outputs. We consider games with modified characteristic functions and find deep connections between the resulting Shapley values and fundamental quantities from information theory and conditional independence testing. We outline inference procedures for finite sample error rate control with provable guarantees, and implement efficient algorithms that perform well in a range of experiments on real and simulated data. Our method has applications to covariate shift detection, active learning, feature selection, and active feature-value acquisition.

Intervention Generalization: A View from Factor Graph Models

Advances in Neural Information Processing Systems, 2023

One of the goals of causal inference is to generalize from past experiments and observational dat... more One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to eventually learn a mapping from a novel experimental condition to an outcome of interest, provided a sufficient variety of experiments is available in the training data, coping with a large combinatorial space of possible interventions is hard. Under a typical sparse experimental design, this mapping is ill-posed without relying on heavy regularization or prior distributions. Such assumptions may or may not be reliable, and can be hard to defend or test. In this paper, we take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system, communicated in the well-understood language of factor graph models. A postulated interventional factor model (IFM) may not always be informative, but it conveniently abstracts away a need for explicitly modeling unmeasured confounding and feedback mechanisms, leading to directly testable claims. Given an IFM and datasets from a collection of experimental regimes, we derive conditions for identifiability of the expected outcomes of new regimes never observed in these training data. We implement our framework using several efficient algorithms, and apply them on a range of semi-synthetic experiments.

The Ethics of Online Controlled Experiments (A/B Testing

Minds and Machines, 2023

Online controlled experiments, also known as A/B tests, have become ubiquitous. While many practi... more Online controlled experiments, also known as A/B tests, have become ubiquitous. While many practical challenges in running experiments at scale have been thoroughly discussed, the ethical dimension of A/B testing has been neglected. This article fills this gap in the literature by introducing a new, soft ethics and governance framework that explicitly recognizes how the rise of an experimentation culture in industry settings brings not only unprecedented opportunities to businesses but also significant responsibilities. More precisely, the article (a) introduces a set of principles to encourage ethical and responsible experimentation to protect users, customers, and society; (b) argues that ensuring compliance with the proposed principles is a complex challenge unlikely to be addressed by resorting to a one-solution response; (c) discusses the relevance and effectiveness of several mechanisms and policies in educating, governing, and incentivizing companies conducting online controlled experiments; and (d) offers a list of prompting questions specifically designed to help and empower practitioners by stimulating specific ethical deliberations and facilitating coordination among different groups of stakeholders.

A Genealogical Approach to Algorithmic Bias

Minds and Machines, 2024

The Fairness, Accountability, and Transparency (FAccT) literature tends to focus on bias as a pro... more The Fairness, Accountability, and Transparency (FAccT) literature tends to focus on bias as a problem that requires ex post solutions (e.g. fairness metrics), rather than addressing the underlying social and technical conditions that (re)produce it. In this article, we propose a complementary strategy that uses genealogy as a constructive, epistemic critique to explain algorithmic bias in terms of the conditions that enable it. We focus on XAI feature attributions (Shapley values) and counterfactual approaches as potential tools to gauge these conditions and offer two main contributions. One is constructive: we develop a theoretical framework to classify these approaches according to their relevance for bias as evidence of social disparities. We draw on Pearl's ladder of causation (Causality: models, reasoning, and inference.

Conditional feature importance for mixed data

AStA Advances in Statistical Analysis

Despite the popularity of feature importance (FI) measures in interpretable machine learning, the... more Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analysing a variable’s importance before and after adjusting for covariates—i.e., between marginal and conditional measures. Our work draws attention to this rarely acknowledged, yet crucial distinction and showcases its implications. We find that few methods are available for testing conditional FI and practitioners have hitherto been severely restricted in method application due to mismatched data requirements. Most real-world data exhibits complex feature dependencies and incorporates both continuous and categorical features (i.e., mixed data). Both properties are oftentimes neglected by conditional FI measures. To fill this gap, we propose to combine the conditional predictive impact (CPI) framework with sequential knockoff sampling. The CPI enables conditional F...

Molecular Portraits of Early Rheumatoid Arthritis Identify Clinical and Treatment Response Phenotypes

Cell Reports, 2019

On the Philosophy of Unsupervised Learning

Philosophy & Technology, 2023

Unsupervised learning algorithms are widely used for many important statistical tasks with numero... more Unsupervised learning algorithms are widely used for many important statistical tasks with numerous applications in science and industry. Yet despite their prevalence, they have attracted remarkably little philosophical scrutiny to date. This stands in stark contrast to supervised and reinforcement learning algorithms, which have been widely studied and critically evaluated, often with an emphasis on ethical concerns. In this article, I analyze three canonical unsupervised learning problems: clustering, abstraction, and generative modeling. I argue that these methods raise unique epistemological and ontological questions, providing data-driven tools for discovering natural kinds and distinguishing essence from contingency. This analysis goes some way toward filling the lacuna in contemporary philosophical discourse on unsupervised learning, as well as bringing conceptual unity to a heterogeneous field more often described by what it is not (i.e., supervised or reinforcement learning) than by what it is. I submit that unsupervised learning is not just a legitimate subject of philosophical inquiry but perhaps the most fundamental branch of all AI. However, an uncritical overreliance on unsupervised methods poses major epistemic and ethical risks. I conclude by advocating for a pragmatic, error-statistical approach that embraces the opportunities and mitigates the challenges posed by this powerful class of algorithms.

Adversarial Random Forests for Density Estimation and Generative Modeling

AISTATS, 2023

We propose methods for density estimation and data synthesis using a novel form of unsupervised r... more We propose methods for density estimation and data synthesis using a novel form of unsupervised random forests. Inspired by generative adversarial networks, we implement a recursive procedure in which trees gradually learn structural properties of the data through alternating rounds of generation and discrimination. The method is provably consistent under minimal assumptions. Unlike classic tree-based alternatives, our approach provides smooth (un)conditional densities and allows for fully synthetic data generation. We achieve comparable or superior performance to state-ofthe-art probabilistic circuits and deep learning models on various tabular data benchmarks while executing about two orders of magnitude faster on average. An accompanying R package, arf, is available on CRAN.

Correction to: The Switch, the Ladder, and the Matrix: Models for Classifying AI Systems

Minds and Machines

The Switch, the Ladder, and the Matrix: Models for Classifying AI Systems

Minds and Machines

Organisations that design and deploy artificial intelligence (AI) systems increasingly commit the... more Organisations that design and deploy artificial intelligence (AI) systems increasingly commit themselves to high-level, ethical principles. However, there still exists a gap between principles and practices in AI ethics. One major obstacle organisations face when attempting to operationalise AI Ethics is the lack of a well-defined material scope. Put differently, the question to which systems and processes AI ethics principles ought to apply remains unanswered. Of course, there exists no universally accepted definition of AI, and different systems pose different ethical challenges. Nevertheless, pragmatic problem-solving demands that things should be sorted so that their grouping will promote successful actions for some specific end. In this article, we review and compare previous attempts to classify AI systems for the purpose of implementing AI governance in practice. We find that attempts to classify AI systems proposed in previous literature use one of three mental models: the S...

Operationalizing Complex Causes: A Pragmatic View of Mediation

arXiv (Cornell University), Jun 9, 2021

No Explanation without Inference

Complex algorithms are increasingly used to automate high-stakes decisions in sensitive areas lik... more Complex algorithms are increasingly used to automate high-stakes decisions in sensitive areas like healthcare and finance. However, the opacity of such models raises problems of intelligibility and trust. Researchers in interpretable machine learning (iML) have proposed a number of solutions, including local linear approximations, rule lists, and counterfactuals. I argue that all three methods share the same fundamental flaw – namely, a disregard for severe testing. Techniques for quantifying uncertainty and error are central to scientific explanation, yet iML has largely ignored this methodological imperative. I consider examples that illustrate the dangers of such negligence, with an emphasis on issues of scoping and confounding. Drawing on recent work in philosophy of science, I conclude that there can be no explanation – algorithmic or otherwise – without inference. I propose several ways to severely test existing iML methods and evaluate the resulting trade-offs.

Models for Classifying AI Systems: the Switch, the Ladder, and the Matrix

2022 ACM Conference on Fairness, Accountability, and Transparency

Rational Shapley Values

2022 ACM Conference on Fairness, Accountability, and Transparency

The epistemological foundations of data science: a critical analysis

SSRN Electronic Journal, 2022

Causal Feature Learning for Utility-Maximizing Agents

ArXiv, 2020

Discovering high-level causal relations from low-level data is an important and challenging probl... more Discovering high-level causal relations from low-level data is an important and challenging problem that comes up frequently in the natural and social sciences. In a series of papers, Chalupka et al. (2015, 2016a, 2016b, 2017) develop a procedure for causal feature learning (CFL) in an effort to automate this task. We argue that CFL does not recommend coarsening in cases where pragmatic considerations rule in favor of it, and recommends coarsening in cases where pragmatic considerations rule against it. We propose a new technique, pragmatic causal feature learning (PCFL), which extends the original CFL algorithm in useful and intuitive ways. We show that PCFL has the same attractive measure-theoretic properties as the original CFL algorithm. We compare the performance of both methods through theoretical analysis and experiments.

Local Explanations Via Necessity and Sufficiency: Unifying Theory and Practice

SSRN Electronic Journal, 2021

Testing conditional independence in supervised learning algorithms

Machine Learning, 2021

We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the as... more We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (J R Stat Soc Ser B 80:551–577, 2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regres...

Interpretable machine learning for genomics

Human Genetics, 2021

High-throughput technologies such as next-generation sequencing allow biologists to observe cell ... more High-throughput technologies such as next-generation sequencing allow biologists to observe cell function with unprecedented resolution, but the resulting datasets are too large and complicated for humans to understand without the aid of advanced statistical methods. Machine learning (ML) algorithms, which are designed to automatically find patterns in data, are well suited to this task. Yet these models are often so complex as to be opaque, leaving researchers with few clues about underlying mechanisms. Interpretable machine learning (iML) is a burgeoning subdiscipline of computational statistics devoted to making the predictions of ML models more intelligible to end users. This article is a gentle and critical introduction to iML, with an emphasis on genomic applications. I define relevant concepts, motivate leading methodologies, and provide a simple typology of existing approaches. I survey recent examples of iML in genomics, demonstrating how such techniques are increasingly in...

The paradox of poor representation: How voter–party incongruence curbs affective polarisation

The British Journal of Politics and International Relations, 2021

Research on the relationship between ideology and affective polarisation highlights ideological d... more Research on the relationship between ideology and affective polarisation highlights ideological disagreement as a key driver of animosity between partisan groups. By operationalising disagreement on the left–right dimension, however, existing studies often overlook voter–party incongruence as a potential determinant of affective evaluations. How does incongruence on policy issues impact affective evaluations of mainstream political parties and their leaders? We tackle this question by analysing data from the British Election Study collected ahead of the 2019 UK General Election using an instrumental variable approach. Consistent with our expectations, we find that voter–party incongruence has a significant causal impact on affective evaluations. Perceived representational gaps between party and voter drive negative evaluations of the in-party and positive evaluations of the opposition, thus lowering affective polarisation overall. The results offer a more nuanced perspective on the ...

Explaining Predictive Uncertainty with Information Theoretic Shapley Values

Advances in Neural Information Processing Systems, 2023

Researchers in explainable artificial intelligence have developed numerous methods for helping us... more Researchers in explainable artificial intelligence have developed numerous methods for helping users understand the predictions of complex supervised learning models. By contrast, explaining the uncertainty of model outputs has received relatively little attention. We adapt the popular Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's contribution to the conditional entropy of individual model outputs. We consider games with modified characteristic functions and find deep connections between the resulting Shapley values and fundamental quantities from information theory and conditional independence testing. We outline inference procedures for finite sample error rate control with provable guarantees, and implement efficient algorithms that perform well in a range of experiments on real and simulated data. Our method has applications to covariate shift detection, active learning, feature selection, and active feature-value acquisition.

Intervention Generalization: A View from Factor Graph Models

Advances in Neural Information Processing Systems, 2023

One of the goals of causal inference is to generalize from past experiments and observational dat... more One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to eventually learn a mapping from a novel experimental condition to an outcome of interest, provided a sufficient variety of experiments is available in the training data, coping with a large combinatorial space of possible interventions is hard. Under a typical sparse experimental design, this mapping is ill-posed without relying on heavy regularization or prior distributions. Such assumptions may or may not be reliable, and can be hard to defend or test. In this paper, we take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system, communicated in the well-understood language of factor graph models. A postulated interventional factor model (IFM) may not always be informative, but it conveniently abstracts away a need for explicitly modeling unmeasured confounding and feedback mechanisms, leading to directly testable claims. Given an IFM and datasets from a collection of experimental regimes, we derive conditions for identifiability of the expected outcomes of new regimes never observed in these training data. We implement our framework using several efficient algorithms, and apply them on a range of semi-synthetic experiments.

The Ethics of Online Controlled Experiments (A/B Testing

Minds and Machines, 2023

Online controlled experiments, also known as A/B tests, have become ubiquitous. While many practi... more Online controlled experiments, also known as A/B tests, have become ubiquitous. While many practical challenges in running experiments at scale have been thoroughly discussed, the ethical dimension of A/B testing has been neglected. This article fills this gap in the literature by introducing a new, soft ethics and governance framework that explicitly recognizes how the rise of an experimentation culture in industry settings brings not only unprecedented opportunities to businesses but also significant responsibilities. More precisely, the article (a) introduces a set of principles to encourage ethical and responsible experimentation to protect users, customers, and society; (b) argues that ensuring compliance with the proposed principles is a complex challenge unlikely to be addressed by resorting to a one-solution response; (c) discusses the relevance and effectiveness of several mechanisms and policies in educating, governing, and incentivizing companies conducting online controlled experiments; and (d) offers a list of prompting questions specifically designed to help and empower practitioners by stimulating specific ethical deliberations and facilitating coordination among different groups of stakeholders.

A Genealogical Approach to Algorithmic Bias

Minds and Machines, 2024

The Fairness, Accountability, and Transparency (FAccT) literature tends to focus on bias as a pro... more The Fairness, Accountability, and Transparency (FAccT) literature tends to focus on bias as a problem that requires ex post solutions (e.g. fairness metrics), rather than addressing the underlying social and technical conditions that (re)produce it. In this article, we propose a complementary strategy that uses genealogy as a constructive, epistemic critique to explain algorithmic bias in terms of the conditions that enable it. We focus on XAI feature attributions (Shapley values) and counterfactual approaches as potential tools to gauge these conditions and offer two main contributions. One is constructive: we develop a theoretical framework to classify these approaches according to their relevance for bias as evidence of social disparities. We draw on Pearl's ladder of causation (Causality: models, reasoning, and inference.

Conditional feature importance for mixed data

AStA Advances in Statistical Analysis

Despite the popularity of feature importance (FI) measures in interpretable machine learning, the... more Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analysing a variable’s importance before and after adjusting for covariates—i.e., between marginal and conditional measures. Our work draws attention to this rarely acknowledged, yet crucial distinction and showcases its implications. We find that few methods are available for testing conditional FI and practitioners have hitherto been severely restricted in method application due to mismatched data requirements. Most real-world data exhibits complex feature dependencies and incorporates both continuous and categorical features (i.e., mixed data). Both properties are oftentimes neglected by conditional FI measures. To fill this gap, we propose to combine the conditional predictive impact (CPI) framework with sequential knockoff sampling. The CPI enables conditional F...

Molecular Portraits of Early Rheumatoid Arthritis Identify Clinical and Treatment Response Phenotypes

Cell Reports, 2019

On the Philosophy of Unsupervised Learning

Philosophy & Technology, 2023

Unsupervised learning algorithms are widely used for many important statistical tasks with numero... more Unsupervised learning algorithms are widely used for many important statistical tasks with numerous applications in science and industry. Yet despite their prevalence, they have attracted remarkably little philosophical scrutiny to date. This stands in stark contrast to supervised and reinforcement learning algorithms, which have been widely studied and critically evaluated, often with an emphasis on ethical concerns. In this article, I analyze three canonical unsupervised learning problems: clustering, abstraction, and generative modeling. I argue that these methods raise unique epistemological and ontological questions, providing data-driven tools for discovering natural kinds and distinguishing essence from contingency. This analysis goes some way toward filling the lacuna in contemporary philosophical discourse on unsupervised learning, as well as bringing conceptual unity to a heterogeneous field more often described by what it is not (i.e., supervised or reinforcement learning) than by what it is. I submit that unsupervised learning is not just a legitimate subject of philosophical inquiry but perhaps the most fundamental branch of all AI. However, an uncritical overreliance on unsupervised methods poses major epistemic and ethical risks. I conclude by advocating for a pragmatic, error-statistical approach that embraces the opportunities and mitigates the challenges posed by this powerful class of algorithms.

Adversarial Random Forests for Density Estimation and Generative Modeling

AISTATS, 2023

We propose methods for density estimation and data synthesis using a novel form of unsupervised r... more We propose methods for density estimation and data synthesis using a novel form of unsupervised random forests. Inspired by generative adversarial networks, we implement a recursive procedure in which trees gradually learn structural properties of the data through alternating rounds of generation and discrimination. The method is provably consistent under minimal assumptions. Unlike classic tree-based alternatives, our approach provides smooth (un)conditional densities and allows for fully synthetic data generation. We achieve comparable or superior performance to state-ofthe-art probabilistic circuits and deep learning models on various tabular data benchmarks while executing about two orders of magnitude faster on average. An accompanying R package, arf, is available on CRAN.

Correction to: The Switch, the Ladder, and the Matrix: Models for Classifying AI Systems

Minds and Machines

The Switch, the Ladder, and the Matrix: Models for Classifying AI Systems

Minds and Machines

Organisations that design and deploy artificial intelligence (AI) systems increasingly commit the... more Organisations that design and deploy artificial intelligence (AI) systems increasingly commit themselves to high-level, ethical principles. However, there still exists a gap between principles and practices in AI ethics. One major obstacle organisations face when attempting to operationalise AI Ethics is the lack of a well-defined material scope. Put differently, the question to which systems and processes AI ethics principles ought to apply remains unanswered. Of course, there exists no universally accepted definition of AI, and different systems pose different ethical challenges. Nevertheless, pragmatic problem-solving demands that things should be sorted so that their grouping will promote successful actions for some specific end. In this article, we review and compare previous attempts to classify AI systems for the purpose of implementing AI governance in practice. We find that attempts to classify AI systems proposed in previous literature use one of three mental models: the S...

Operationalizing Complex Causes: A Pragmatic View of Mediation

arXiv (Cornell University), Jun 9, 2021

No Explanation without Inference

Complex algorithms are increasingly used to automate high-stakes decisions in sensitive areas lik... more Complex algorithms are increasingly used to automate high-stakes decisions in sensitive areas like healthcare and finance. However, the opacity of such models raises problems of intelligibility and trust. Researchers in interpretable machine learning (iML) have proposed a number of solutions, including local linear approximations, rule lists, and counterfactuals. I argue that all three methods share the same fundamental flaw – namely, a disregard for severe testing. Techniques for quantifying uncertainty and error are central to scientific explanation, yet iML has largely ignored this methodological imperative. I consider examples that illustrate the dangers of such negligence, with an emphasis on issues of scoping and confounding. Drawing on recent work in philosophy of science, I conclude that there can be no explanation – algorithmic or otherwise – without inference. I propose several ways to severely test existing iML methods and evaluate the resulting trade-offs.

Models for Classifying AI Systems: the Switch, the Ladder, and the Matrix

2022 ACM Conference on Fairness, Accountability, and Transparency

Rational Shapley Values

2022 ACM Conference on Fairness, Accountability, and Transparency

The epistemological foundations of data science: a critical analysis

SSRN Electronic Journal, 2022

Causal Feature Learning for Utility-Maximizing Agents

ArXiv, 2020

Discovering high-level causal relations from low-level data is an important and challenging probl... more Discovering high-level causal relations from low-level data is an important and challenging problem that comes up frequently in the natural and social sciences. In a series of papers, Chalupka et al. (2015, 2016a, 2016b, 2017) develop a procedure for causal feature learning (CFL) in an effort to automate this task. We argue that CFL does not recommend coarsening in cases where pragmatic considerations rule in favor of it, and recommends coarsening in cases where pragmatic considerations rule against it. We propose a new technique, pragmatic causal feature learning (PCFL), which extends the original CFL algorithm in useful and intuitive ways. We show that PCFL has the same attractive measure-theoretic properties as the original CFL algorithm. We compare the performance of both methods through theoretical analysis and experiments.

Local Explanations Via Necessity and Sufficiency: Unifying Theory and Practice

SSRN Electronic Journal, 2021

Testing conditional independence in supervised learning algorithms

Machine Learning, 2021

We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the as... more We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (J R Stat Soc Ser B 80:551–577, 2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regres...

Interpretable machine learning for genomics

Human Genetics, 2021

High-throughput technologies such as next-generation sequencing allow biologists to observe cell ... more High-throughput technologies such as next-generation sequencing allow biologists to observe cell function with unprecedented resolution, but the resulting datasets are too large and complicated for humans to understand without the aid of advanced statistical methods. Machine learning (ML) algorithms, which are designed to automatically find patterns in data, are well suited to this task. Yet these models are often so complex as to be opaque, leaving researchers with few clues about underlying mechanisms. Interpretable machine learning (iML) is a burgeoning subdiscipline of computational statistics devoted to making the predictions of ML models more intelligible to end users. This article is a gentle and critical introduction to iML, with an emphasis on genomic applications. I define relevant concepts, motivate leading methodologies, and provide a simple typology of existing approaches. I survey recent examples of iML in genomics, demonstrating how such techniques are increasingly in...

The paradox of poor representation: How voter–party incongruence curbs affective polarisation

The British Journal of Politics and International Relations, 2021

Research on the relationship between ideology and affective polarisation highlights ideological d... more Research on the relationship between ideology and affective polarisation highlights ideological disagreement as a key driver of animosity between partisan groups. By operationalising disagreement on the left–right dimension, however, existing studies often overlook voter–party incongruence as a potential determinant of affective evaluations. How does incongruence on policy issues impact affective evaluations of mainstream political parties and their leaders? We tackle this question by analysing data from the British Election Study collected ahead of the 2019 UK General Election using an instrumental variable approach. Consistent with our expectations, we find that voter–party incongruence has a significant causal impact on affective evaluations. Perceived representational gaps between party and voter drive negative evaluations of the in-party and positive evaluations of the opposition, thus lowering affective polarisation overall. The results offer a more nuanced perspective on the ...

Conceptual Challenges for Interpretable Machine Learning

SSRN preprint, 2020

As machine learning has gradually entered into ever more sectors of public and private life, ther... more As machine learning has gradually entered into ever more sectors of public and private life, there has been a growing demand for algorithmic explainability. How can we make the predictions of complex statistical models more intelligible to end users? A subdiscipline of computer science known as interpretable machine learning (iML) has emerged to address this urgent question. Numerous influential methods have been proposed, from local linear approximations to rule lists and counterfactuals. In this article, I highlight three conceptual challenges that are largely overlooked by authors in this area. I argue that the vast majority of iML algorithms are plagued by (1) ambiguity with respect to their true target; (2) a disregard for error rates and severe testing; and (3) an emphasis on product over process. Each point is developed at length, drawing on relevant debates in epistemology and philosophy of science. Examples and counterexamples from iML are considered, demonstrating how failure to acknowledge these problems can result in counterintuitive and potentially misleading explanations. Without greater care for the conceptual foundations of iML, future work in this area is doomed to repeat the same mistakes.

Rational Shapley Values

arXiv preprint, 2021

Explaining the predictions of opaque machine learning algorithms is an important and challenging ... more Explaining the predictions of opaque machine learning algorithms is an important and challenging task, especially as complex models are increasingly used to assist in high-stakes decisions such as those arising in healthcare and finance. Most popular tools for post-hoc explainable artificial intelligence (XAI) are either insensitive to context (e.g., feature attributions) or difficult to summarize (e.g., counterfactuals). In this paper, I introduce rational Shapley values, a novel XAI method that synthesizes and extends these seemingly incompatible approaches in a rigorous, flexible manner. I leverage tools from decision theory and causal modeling to formalize and implement a pragmatic approach that resolves a number of known challenges in XAI. By pairing the distribution of random variables with the appropriate reference class for a given explanation task, I illustrate through theory and experiments how user goals and knowledge can inform and constrain the solution set in an iterative fashion. The method compares favorably to state of the art XAI tools in a range of quantitative and qualitative comparisons.

Testing Conditional Independence in Supervised Learning Algorithms

arXiv preprint, 2019

We propose a general test of conditional independence. The conditional predictive impact (CPI) is... more We propose a general test of conditional independence. The conditional predictive impact (CPI) is a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability with greater power and speed than the original knockoff filter. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi.

Causal Discovery Under a Confounder Blanket

38th Conference on Uncertainty in Artificial Intelligence, 2022

Inferring causal relationships from observational data is rarely straightforward, but the problem... more Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering a directed acyclic subgraph of variables known to be causally descended from some (possibly large) set of confounding covariates, i.e. a confounder blanket. This is useful in many settings, for example when studying a dynamic biomolecular subsystem with genetic data providing causally relevant background information. Under a structural assumption that, we argue, must be satisfied in practice if informative answers are to be found, our method accommodates graphs of low or high sparsity while maintaining polynomial time complexity. We derive a sound and complete algorithm for identifying causal relationships under these conditions and implement testing procedures with provable error control for linear and nonlinear systems. We demonstrate our approach on a range of simulation settings.

Rational Shapley Values

ACM Conference on Fairness, Accountability, and Transparency, 2022

Explaining the predictions of opaque machine learning algorithms is an important and challenging ... more Explaining the predictions of opaque machine learning algorithms is an important and challenging task, especially as complex models are increasingly used to assist in high-stakes decisions such as those arising in healthcare and finance. Most popular tools for post-hoc explainable artificial intelligence (XAI) are either insensitive to context (e.g., feature attributions) or difficult to summarize (e.g., counterfactuals). In this paper, I introduce rational Shapley values, a novel XAI method that synthesizes and extends these seemingly incompatible approaches in a rigorous, flexible manner. I leverage tools from decision theory and causal modeling to formalize and implement a pragmatic approach that resolves a number of known challenges in XAI. By pairing the distribution of random variables with the appropriate reference class for a given explanation task, I illustrate through theory and experiments how user goals and knowledge can inform and constrain the solution set in an iterative fashion. The method compares favorably to state of the art XAI tools in a range of quantitative and qualitative comparisons.

Local Explanations via Necessity and Sufficiency: Unifying Theory and Practice

by David Watson, Limor Gultchin, and Luciano Floridi

37th Conference on Uncertainty in Artificial Intelligence, 2021

Necessity and sufficiency are the building blocks of all successful explanations. Yet despite the... more Necessity and sufficiency are the building blocks of all successful explanations. Yet despite their importance, these notions have been conceptually underdeveloped and inconsistently applied in explainable artificial intelligence (XAI), a fast-growing research area that is so far lacking in firm theoretical foundations. Building on work in logic, probability, and causality, we establish the central role of necessity and sufficiency in XAI, unifying seemingly disparate methods in a single formal framework. We provide a sound and complete algorithm for computing explanatory factors with respect to a given context, and demonstrate its flexibility and competitive performance against state of the art alternatives on various tasks.

Operationalizing Complex Causes: A Pragmatic View of Mediation

by Limor Gultchin and David Watson

International Conference on Machine Learning, 2021

We examine the problem of causal response estimation for complex objects (e.g., text, images, gen... more We examine the problem of causal response estimation for complex objects (e.g., text, images, genomics). In this setting, classical atomic interventions are often not available (e.g., changes to characters, pixels, DNA base-pairs). Instead, we only have access to indirect or crude interventions (e.g., enrolling in a writing program, modifying a scene, applying a gene therapy). In this work, we formalize this problem and provide an initial solution. Given a collection of candidate mediators, we propose (a) a two-step method for predicting the causal responses of crude interventions; and (b) a testing procedure to identify mediators of crude interventions. We demonstrate, on a range of simulated and real-world-inspired examples, that our approach allows us to efficiently estimate the effect of crude interventions with limited data from new treatment regimes.

Causal Feature Learning for Utility-Maximizing Agents

International Conference on Probabilistic Graphical Models, 2020

Discovering high-level causal relations from low-level data is an important and challenging probl... more Discovering high-level causal relations from low-level data is an important and challenging problem that comes up frequently in the natural and social sciences. In a series of papers, Chalupka et al. (2015, 2016a, 2016b, 2017) develop a procedure for causal feature learning (CFL) in an effort to automate this task. We argue that CFL does not recommend coarsening in cases where pragmatic considerations rule in favor of it, and recommends coarsening in cases where pragmatic considerations rule against it. We propose a new technique, pragmatic causal feature learning (PCFL), which extends the original CFL algorithm in useful and intuitive ways. We show that PCFL has the same attractive measure-theoretic properties as the original CFL algorithm. We compare the performance of both methods through theoretical analysis and experiments.