The Refined Counterfactual Prisoner’s Dilemma: An Attempt to Explode Decision-Theoretic Consequentialism
I was inspired to revise my formulation of this thought experiment by Ihor Kendiukhov's post On The Independence Axiom. Kendiukhov quotes Scott Garrabrant:

"My take is that the concept of expected utility maximization is a mistake. […] As far as I know, every argument for utility assumes (or implies) that whenever you make an observation, you stop caring about the possible worlds where that observation went differently. […] Von Neumann did not notice this mistake because he was too busy inventing the entire field. The point where we discover updatelessness is the point where we are supposed to realize that all of utility theory is wrong. I think we failed to notice."

Apparently, "stopping caring about the possible worlds where that observation went differently" is known as (decision-theoretic) consequentialism.

Thinking this through, I realised that the (potential) disadvantage of not caring about worlds where the observation went differently can be cleanly illustrated by the following thought experiment:

The Refined Counterfactual Prisoner's Dilemma: Omega, a perfect predictor, flips a coin and tells you the result. Regardless of whether it comes up heads or tails, Omega asks you for $1. Before asking you to decide, Omega predicted what you would have chosen if the coin had come up the other way. If it predicted that you wouldn't have paid, then it inflicts $1 million worth of damage on you as punishment.

This attempts to explode consequentialism by constructing a situation where refusing to give up a trivial amount of value symmetrically burns a large amount of value in the other counterfactual. Refusing to pay is effectively pressing a button that destroys value in the world where the coin came up the other way. If you don't care about that world, you'd press the button; and because you'd press it in both counterfactuals, you end up worse off regardless of which way the coin lands. Now you might be skeptical that such a button could exist because you doubt the possibility of perfect predictors, but if that doubt were assuaged, this thought experiment would bite. In fact, I would argue that it would be quite surprising if a proposed decision theory were to fail for perfect predictors without having deeper issues.

Additional information: This is an improved version of a thought experiment that was independently discovered by Cousin_It and me:

The Original Counterfactual Prisoner's Dilemma: Omega, a perfect predictor, flips a coin and tells you how it came up. If it comes up heads, Omega asks you for $100, then pays you $10,000 if it predicts you would have paid had it come up tails. If it comes up tails, Omega asks you for $100, then pays you $10,000 if it predicts you would have paid had it come up heads. In this case, the coin came up heads, and Omega makes its prediction before you decide.

The changes I've made for this version may seem trivial, but if you want a thought experiment to spread, small details like this matter. The original version was just a symmetric version of Counterfactual Mugging, but that turned out to be less helpful in explaining it than I originally hoped.
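To make the payoff arithmetic explicit, here is a minimal sketch — not from the original post — that treats a strategy as a function from the announced coin result to a pay/refuse decision, and models Omega's perfect prediction of your counterfactual choice as that same function evaluated on the other result (an assumption that is only reasonable because Omega is stipulated to be a perfect predictor):

```python
# Minimal payoff sketch for the Refined Counterfactual Prisoner's Dilemma.
# Assumption (not in the post): a "policy" maps the announced coin result
# to a decision, and Omega's prediction of your counterfactual choice is
# the policy evaluated on the other result.

PAYMENT = 1
PUNISHMENT = 1_000_000

def payoff(policy, coin):
    """Net dollars for an agent following `policy` when the coin shows `coin`."""
    other = "tails" if coin == "heads" else "heads"
    net = -PAYMENT if policy(coin) else 0
    if not policy(other):  # Omega predicted you wouldn't have paid counterfactually
        net -= PUNISHMENT
    return net

always_pay = lambda coin: True   # the updateless policy
never_pay = lambda coin: False   # "don't care about the other world"

for name, policy in [("always pay", always_pay), ("never pay", never_pay)]:
    for coin in ("heads", "tails"):
        print(f"{name}, coin = {coin}: net = ${payoff(policy, coin):,}")
# always pay -> net = $-1 on either result
# never pay  -> net = $-1,000,000 on either result
```

Under these assumptions, always paying nets -$1 whichever way the coin lands, while refusing nets -$1,000,000 either way. The original version has the same shape: replace the $1 payment with $100, and replace the punishment with a $10,000 reward for a predicted counterfactual payment.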
