Extreme Risks by Christian Tarsney
Should you be willing to forego any sure good for a tiny probability of a vastly greater good? Fanatics say you should, anti-fanatics say you should not. Anti-fanaticism has great intuitive appeal. But, I argue, these intuitions are untenable, because satisfying them in their full generality is incompatible with three very plausible principles: acyclicity, a minimal dominance principle, and the principle that any outcome can be made better or worse. This argument against anti-fanaticism can be turned into a positive argument for a weak version of fanaticism, but only from significantly more contentious premises. In combination, these facts suggest that those who find fanaticism counterintuitive should favor not anti-fanaticism, but an intermediate position that permits agents to have incomplete preferences that are neither fanatical nor anti-fanatical.
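To make the fanatic's commitment concrete (an illustrative gloss, not drawn from the paper): under expected-value reasoning, a sure good of value $v$ is outweighed by a gamble that delivers a good of value $V$ with probability $p$ whenever

```latex
p \cdot V > v, \quad \text{i.e.} \quad V > \frac{v}{p},
```

so for any sure good and any $p > 0$, however tiny, some sufficiently large $V$ makes the gamble preferable. Anti-fanaticism denies that rationality ever requires this trade.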
The principle that rational agents should maximize expected utility or choiceworthiness is intuitively plausible in many ordinary cases of decision-making under uncertainty. But it is less plausible in cases of extreme, low-probability risk (like Pascal's Mugging), and intolerably paradoxical in cases like the St. Petersburg and Pasadena games. In this paper I show that, under certain conditions, stochastic dominance reasoning can capture most of the plausible implications of expectational reasoning while avoiding most of its pitfalls. Specifically, given sufficient background uncertainty about the choiceworthiness of one's options, many expectation-maximizing gambles that do not stochastically dominate their alternatives "in a vacuum" become stochastically dominant in virtue of that background uncertainty. But, even under these conditions, stochastic dominance will generally not require agents to accept extreme gambles like Pascal's Mugging or the St. Petersburg game. The sort of background uncertainty on which these results depend looks unavoidable for any agent who measures the choiceworthiness of her options in part by the total amount of value in the resulting world. At least for such agents, then, stochastic dominance offers a plausible general principle of choice under uncertainty that can explain more of the apparent rational constraints on such choices than has previously been recognized.
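For readers unfamiliar with the technical notion (a standard textbook definition, not quoted from the paper): an option $A$ first-order stochastically dominates an option $B$ just in case, for every threshold of choiceworthiness $t$,

```latex
\Pr[A \geq t] \;\geq\; \Pr[B \geq t],
```

with strict inequality for at least some $t$. Adding sufficiently dispersed background uncertainty to both options' payoffs smooths their distributions, which is how a gamble that fails this condition "in a vacuum" can come to satisfy it.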
arXiv: Theoretical Economics, 2018
Population Ethics by Christian Tarsney
We provide a general argument against value incomparability, based on a new style of impossibility result. In particular, we show that, against plausible background assumptions, value incomparability creates an incompatibility between two very plausible principles for ranking lotteries: a weak "negative dominance" principle (to the effect that Lottery 1 can be better than Lottery 2 only if some possible outcome of Lottery 1 is better than some possible outcome of Lottery 2) and a weak form of ex ante Pareto (to the effect that, if Lottery 1 gives an unambiguously better prospect to some individuals than Lottery 2, and equally good prospects to everyone else, then Lottery 1 is better than Lottery 2). After spelling out our results, and the arguments based on them, we consider which principle the proponent of incomparability ought to reject.
Is the overall value of a world just the sum of values contributed by each value-bearing entity in that world? Additively separable axiologies (like total utilitarianism, prioritarianism, and critical level views) say 'yes', but non-additive axiologies (like average utilitarianism, rank-discounted utilitarianism, and variable value views) say 'no'. This distinction is practically important: additive axiologies support 'arguments from astronomical scale' which suggest (among other things) that it is overwhelmingly important for humanity to avoid premature extinction and ensure the existence of a large future population, while non-additive axiologies need not. We show, however, that when there is a large enough 'background population' unaffected by our choices, a wide range of non-additive axiologies converge in their implications with some additive axiology -- for instance, average utilitarianism converges to critical-level utilitarianism and various egalitarian theories converge to prioritarianism. We further argue that real-world background populations may be large enough to make these limit results practically significant. This means that arguments from astronomical scale, and other arguments in practical ethics that seem to presuppose additive separability, may be truth-preserving in practice whether or not we accept additive separability as a basic axiological principle.
arXiv: Theoretical Economics, 2020
Average utilitarianism and several related axiologies, when paired with the standard expectational theory of decision-making under risk and with reasonable empirical credences, can find their practical prescriptions overwhelmingly determined by the minuscule probability that the agent assigns to solipsism -- i.e., to the hypothesis that there is only one welfare subject in the world, viz., herself. This either (i) constitutes a reductio of these axiologies, (ii) suggests that they require bespoke decision theories, or (iii) furnishes a novel argument for ethical egoism.
Australasian Journal of Philosophy, 2021
Utilitas
The Repugnant Conclusion is an implication of some approaches to population ethics. It states, in Derek Parfit's original formulation: For any possible population of at least ten billion people, all with a very high quality of life, there must be some much larger imaginable population whose existence, if other things are equal, would be better, even though its members have lives that are barely worth living. (Parfit 1984: 388)
Utilitas, 2021
Philosophy of AI by Christian Tarsney
Large language models now possess human-level linguistic abilities in many contexts. This raises the concern that they can be used to deceive and manipulate on a large scale, for instance spreading political misinformation on social media. In future, agentic AI systems might also deceive and manipulate humans for their own ends. In this paper, first, I argue that given its risks, AI-generated content should be held to stricter standards against deception and manipulation than we ordinarily apply to humans. Second, I offer characterizations of AI deception and manipulation meant to support such standards, while avoiding reference to the putative beliefs or intentions of AIs or to third-party judgments about what human users rationally ought to believe or do. Third, I propose two measures to guard against AI deception and manipulation, inspired by this characterization: "extreme transparency" requirements for AI-generated content and defensive systems that annotate AI-generated statements with contextualizing information. Finally, I consider to what extent these methods can protect against deceptive behavior in future, agentic AIs.
Ethics of the Long-Term Future by Christian Tarsney
Economics and Philosophy, 2017
Longtermism holds that what we ought to do is mainly determined by effects on the far future. A natural objection is that these effects may be nearly impossible to predict -- perhaps so close to impossible that, despite the astronomical importance of the far future, the expected value of our present options is mainly determined by short-term considerations. This paper aims to precisify and evaluate (a version of) this epistemic objection. To that end, I develop two simple models for comparing "longtermist" and "short-termist" interventions, incorporating the idea that, as we look further into the future, the effects of any present intervention become progressively harder to predict. These models yield mixed conclusions: If we simply aim to maximize expected value, and don't mind premising our choices on minuscule probabilities of astronomical payoffs, the case for longtermism looks robust. But on some prima facie plausible empirical worldviews, the expectational superiority of longtermist interventions depends heavily on these "Pascalian" probabilities. So the case for longtermism may depend either on plausible but non-obvious empirical claims or on a tolerance for Pascalian fanaticism.
I argue that the use of a social discount rate to assess the costs and benefits of policy responses to climate change is unhelpful and misleading. I consider two lines of justification for discounting, one ethical and the other economic, connected to the two terms of the standard formula for the discount rate. Concerning the former, I examine some arguments recently put forward by Joseph Heath and others for a "pure rate of time preference" and conclude that they fail to overcome standard ethical arguments for temporal neutrality. Concerning the latter, I consider whether the standard economic rationale for discounting, based on the diminishing marginal utility of consumption, is relevant to the specific costs and benefits at stake in climate policy. I argue that it is not, since the unusually long time horizons and the nature of the costs and benefits in the climate context mean that a great many of the idealizing assumptions required by this economic rationale do not adequately approximate the underlying reality. The unifying theme of my objections is that all extant rationales for time discounting, both ethical and economic, justify discounting only as a proxy for normative concerns that have no intrinsic connection to the passage of time, and that, in consequence, for any proposed application of a discount rate to ethical or public policy questions, it must be asked whether that approximation is useful to the case at hand. Where it is not, other means must be found to represent the concerns that motivate discounting, and in the concluding section I sketch such an alternative for the case of climate change.
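The "standard formula" alluded to here is the Ramsey discounting equation (stated as background, not quoted from the paper):

```latex
\rho \;=\; \delta \;+\; \eta g,
```

where $\rho$ is the social discount rate, $\delta$ is the pure rate of time preference (the ethical term), $\eta$ is the elasticity of the marginal utility of consumption, and $g$ is the growth rate of consumption ($\eta g$ together forming the economic term).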
Philosophy of Time by Christian Tarsney
I describe a thought experiment in which an agent must choose between suffering a greater pain in the past or a lesser pain in the future. This case demonstrates that the "temporal value asymmetry" — our disposition to attribute greater significance to future pleasures and pains than to past — can have consequences for the rationality of actions as well as attitudes. This fact, I argue, blocks attempts to vindicate the temporal value asymmetry as a useful heuristic tied to the asymmetry of causation. Since the two standard arguments for the rationality of the temporal value asymmetry appeal to causal asymmetry and the passage of time respectively, the failure of the causal asymmetry explanation suggests that the B-theory, which rejects temporal passage, has substantial revisionary implications concerning our attitudes toward past and future experience.
Analysis, 2017
This dissertation has two principal objectives. First, I argue that, contra recent objections, an agent's moral beliefs and uncertainties are relevant to what she rationally ought to do and, more particularly, that agents are at least sometimes rationally required to hedge for their moral uncertainties. My principal argument for these claims appeals to the enkratic conception of rationality, according to which the requirements of practical rationality derive from an agent's beliefs about the objective, desire-independent value or choiceworthiness of her options. Second, I outline a new general theory of rational choice under moral uncertainty. Central to this theory is the idea of content-based aggregation: the principles according to which an agent should compare and aggregate rival moral theories are grounded in the content of those theories themselves, including not only their value assignments but also the metaethical and other non-surface-level propositions that underlie, justify, or explain those value assignments.