Chess bots do not have goals
Published on February 4, 2026 9:11 PM GMT

I see the opposite claim made in The Problem, and I see it implied alongside most mentions of AlphaGo. I also see some people who might agree with me, e.g. here, or here, but they don’t get convincing responses.

It’s an odd thing to claim that a chess bot is “trying to win”, given that, after training, the bot receives no reward for winning and no feedback for losing. It doesn’t even know that the sequence of boards it is given comes from the same game. It does not react to the opponent making illegal moves, either by insisting that it won or by making illegal moves of its own. It does not try to frustrate human opponents or bait weaker opponents into making mistakes. It does not seek out more games in order to win more games. It is entirely incapable of considering any such actions, or the other actions you’d expect if it were “trying to win”, regardless of how deep its networks are or how long it has been trained, because the training environment did not reward them.

It is certainly true that in the narrow domain of valid chess moves, the bot does optimize “winning”, or some proxy of it. But once the bot enters the domain of the real world, the utility function must be extended, and the description “trying to win” no longer needs to apply; nor does any other simple description of a goal. There are many utility functions that look like “trying to win” when restricted to valid chess moves, and only a narrow subset of them look like “trying to win” in the real world. There is no mechanism by which training would produce functions that extend in that particular way. In fact, any neurons spent considering real-world context are neurons not spent considering valid chess moves, and therefore a waste of compute.

People seem to believe that a bot trained to “win” in a narrow domain will extend to a bot that “tries to win” in the real world, but I have seen no such argument, certainly nothing justifying the high confidence needed for a high p(doom).
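The statelessness point can be made concrete with a toy sketch. This is not a real engine; `make_policy`, the `weights` lookup, and the `"null-move"` fallback are all hypothetical stand-ins. The point is only that a deployed bot is, functionally, a fixed mapping from position to move: no reward channel, no memory across calls, no concept of a game or an opponent.

```python
# Toy sketch (hypothetical, not a real chess engine): after training,
# the deployed bot is just a frozen function from position to move.
def make_policy(weights):
    """Freeze `weights` at deploy time; return a pure position -> move map."""
    def policy(position):
        # A deterministic function of the position alone. There is no
        # reward input, no record of past calls, no notion of "this game".
        return weights.get(position, "null-move")
    return policy

policy = make_policy({"start": "e2e4"})

# Same input, same output, regardless of anything that happened before:
assert policy("start") == "e2e4"
assert policy("start") == "e2e4"

# Fed a nonsense "position" (say, an opponent's illegal board), it emits
# a move rather than any reaction -- it has no concept of cheating.
assert policy("not-a-chess-position") == "null-move"
```

A real engine differs in how the mapping is computed, but not in this structural respect: nothing in the deployed artifact represents "winning" as a thing to pursue outside the mapping itself.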
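The underdetermination point can also be sketched. Assuming a hypothetical set `VALID` standing in for legal chess states, two utility functions can agree on every in-domain input, so training on that domain cannot tell them apart, while diverging arbitrarily off-domain:

```python
# Hypothetical stand-in for the set of legal chess states.
VALID = {"p1", "p2", "p3"}

def u_a(state):
    # Looks like "trying to win" on-domain; also "tries to win" off-domain.
    return len(state) if state in VALID else 1

def u_b(state):
    # Identical on-domain, but does something else entirely off-domain.
    return len(state) if state in VALID else -1

# Indistinguishable on every state the training environment can present:
assert all(u_a(s) == u_b(s) for s in VALID)

# Yet they disagree on real-world states the training never touched:
assert u_a("real-world-state") != u_b("real-world-state")
```

Training only ever scores behavior on `VALID`, so it exerts no selection pressure whatsoever between `u_a` and `u_b`; which extension you get off-domain is not something the reward signal determines.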
You’re very welcome to point me to arguments I may have missed.

