arXiv:2402.01972 [pdf, other]

Combining T-learning and DR-learning: a framework for oracle-efficient estimation of causal contrasts

Authors: Lars van der Laan, Marco Carone, Alex Luedtke

Abstract: We introduce efficient plug-in (EP) learning, a novel framework for the estimation of heterogeneous causal contrasts, such as the conditional average treatment effect and conditional relative risk. The EP-learning framework enjoys the same oracle-efficiency as Neyman-orthogonal learning strategies, such as DR-learning and R-learning, while addressing some of their primary drawbacks, including that (i) their practical applicability can be hindered by loss function non-convexity; and (ii) they may suffer from poor performance and instability due to inverse probability weighting and pseudo-outcomes that violate bounds. To avoid these drawbacks, EP-learner constructs an efficient plug-in estimator of the population risk function for the causal contrast, thereby inheriting the stability and robustness properties of plug-in estimation strategies like T-learning. Under reasonable conditions, EP-learners based on empirical risk minimization are oracle-efficient, exhibiting asymptotic equivalence to the minimizer of an oracle-efficient one-step debiased estimator of the population risk function. In simulation experiments, we illustrate that EP-learners of the conditional average treatment effect and conditional relative risk outperform state-of-the-art competitors, including T-learner, R-learner, and DR-learner. Open-source implementations of the proposed methods are available in our R package hte3.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2302.14011 [pdf, other]

Causal isotonic calibration for heterogeneous treatment effects

Authors: Lars van der Laan, Ernesto Ulloa-Pérez, Marco Carone, Alex Luedtke

Abstract: We propose causal isotonic calibration, a novel nonparametric method for calibrating predictors of heterogeneous treatment effects. Furthermore, we introduce cross-calibration, a data-efficient variant of calibration that eliminates the need for hold-out calibration sets. Cross-calibration leverages cross-fitted predictors and generates a single calibrated predictor using all available data. Under weak conditions that do not assume monotonicity, we establish that both causal isotonic calibration and cross-calibration achieve fast doubly-robust calibration rates, as long as either the propensity score or outcome regression is estimated accurately in a suitable sense. The proposed causal isotonic calibrator can be wrapped around any black-box learning algorithm, providing robust and distribution-free calibration guarantees while preserving predictive performance.

Submitted 5 June, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: Accepted to ICML2023

arXiv:2010.04805 [pdf, other]

Discussion of Kallus (2020) and Mo, Qi, and Liu (2020): New Objectives for Policy Learning

Authors: Sijia Li, Xiudi Li, Alex Luedtke

Abstract: We discuss the thought-provoking new objective functions for policy learning that were proposed in "More efficient policy learning via optimal retargeting" by Nathan Kallus and "Learning optimal distributionally robust individualized treatment rules" by Weibin Mo, Zhengling Qi, and Yufeng Liu. We show that it is important to take the curvature of the value function into account when working within the retargeting framework, and we introduce two ways to do so. We also describe more efficient approaches for leveraging calibration data when learning distributionally robust policies.

Submitted 9 October, 2020; originally announced October 2020.

Comments: Submitted to the Journal of the American Statistical Association as an invited discussion

arXiv:2002.11275 [pdf, other]

Adversarial Monte Carlo Meta-Learning of Optimal Prediction Procedures

Authors: Alex Luedtke, Incheoul Chung, Oleg Sofrygin

Abstract: We frame the meta-learning of prediction procedures as a search for an optimal strategy in a two-player game. In this game, Nature selects a prior over distributions that generate labeled data consisting of features and an associated outcome, and the Predictor observes data sampled from a distribution drawn from this prior. The Predictor's objective is to learn a function that maps from a new feature to an estimate of the associated outcome. We establish that, under reasonable conditions, the Predictor has an optimal strategy that is equivariant to shifts and rescalings of the outcome and is invariant to permutations of the observations and to shifts, rescalings, and permutations of the features. We introduce a neural network architecture that satisfies these properties. The proposed strategy performs favorably compared to standard practice in both parametric and nonparametric experiments.

Submitted 25 September, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

MSC Class: 62C20 ACM Class: G.3

arXiv:1911.04869 [pdf, other]

doi 10.4204/EPTCS.308.2

Extending Causal Models from Machines into Humans

Authors: Severin Kacianka, Amjad Ibrahim, Alexander Pretschner, Alexander Trende, Andreas Lüdtke

Abstract: Causal Models are increasingly suggested as a means to reason about the behavior of cyber-physical systems in socio-technical contexts. They allow us to analyze courses of events and reason about possible alternatives. Until now, however, such reasoning is confined to the technical domain and limited to single systems or at most groups of systems. The humans that are an integral part of any such socio-technical system are usually ignored or dealt with by "expert judgment". We show how a technical causal model can be extended with models of human behavior to cover the complexity and interplay between humans and technical systems. This integrated socio-technical causal model can then be used to reason not only about actions and decisions taken by the machine, but also about those taken by humans interacting with the system. In this paper we demonstrate the feasibility of merging causal models about machines with causal models about humans and illustrate the usefulness of this approach with a highly automated vehicle example.

Submitted 30 October, 2019; originally announced November 2019.

Comments: In Proceedings CREST 2019, arXiv:1910.13641

Journal ref: EPTCS 308, 2019, pp. 17-31

arXiv:1902.04929 [pdf]

Integrating Neurophysiological Sensors and Driver Models for Safe and Performant Automated Vehicle Control in Mixed Traffic

Authors: Werner Damm, Martin Fränzle, Andreas Lüdtke, Jochem W. Rieger, Alexander Trende, Anirudh Unni

Abstract: In future mixed traffic Highly Automated Vehicles (HAV) will have to resolve interactions with human operated traffic. A particular problem for HAVs is detection of human states influencing safety critical decisions and driving behavior of humans. We demonstrate the value proposition of neurophysiological sensors and driver models for optimizing performance of HAVs under safety constraints in mixed traffic applications.

Submitted 13 February, 2019; originally announced February 2019.

Comments: 8 pages, 6 Figures, submitted to HFIV'19

arXiv:1606.09388 [pdf, other]

Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits

Authors: Alexander Luedtke, Emilie Kaufmann, Antoine Chambaz

Abstract: We study a generalization of the multi-armed bandit problem with multiple plays where there is a cost associated with pulling each arm and the agent has a budget at each time that dictates how much she can expect to spend. We derive an asymptotic regret lower bound for any uniformly efficient algorithm in our setting. We then study a variant of Thompson sampling for Bernoulli rewards and a variant of KL-UCB for both single-parameter exponential families and bounded, finitely supported rewards. We show these algorithms are asymptotically optimal, both in rate and leading problem-dependent constants, including in the thick margin setting where multiple arms fall on the decision boundary.

Submitted 12 September, 2019; v1 submitted 30 June, 2016; originally announced June 2016.

Journal ref: Machine Learning Journal, Springer, In press

Showing 1–7 of 7 results for author: Lüdtke, A