Opinion

There is No One There: A simple experiment to convince yourself that LLMs probably are not conscious

This is a cross posted from my substack. I thought it should be interesting to Less Wrong readers.On the internet you more and more often bump into people who think that LLM-powered chat bots have conscious subjective experience. Intuitively, I am, pace Turing, not convinced. However, actually arguing why this is a rational view can be surprisingly hard. Many of the cognitive features that have been connected to conscious experience in humans (attention, information integration, self-narratives) seem to be incipiently present in these machines. And perhaps just as importantly, LLMs seem to have the inherent tendency to state they are conscious and the techniques we have for probing their brains seem to tell us that they are not lying. If you do not believe that there is something inherently special about brains, a view we have no consistent evidence for, aren’t we just chauvinist to deny our new companions conscious experience?Now my friend and colleague Gunnar Zarncke has come up with a simple experiment that, to my mind, illustrates that when LLMs talk about their own mental states this talk does not refer to a consistent internal representational space, of the kind that seems to underlie human consciousness. To see your internal representational space in action, let’s play a game.Think of a number between 1 and 100.Did you think of one? Good. Now, is it even? is it larger than fifty? You get the idea. By playing this game I could narrow down the space until I eventually find the number you have chosen. And this number was fixed the whole time. If you had claimed “Ok, I have now chosen a number.” you would have accurately described your mental state.Not so for LLMs, and this is easy to check. By turning the temperature parameter to zero we can run LLMs deterministically. Thus, whenever they are queried in the same order they will give the same responses. This has the advantage that we can do something that is impossible in the case of humans: We can play through counterfactual histories. We can easily check what an LLM would have replied if we had continued the conversation in another way.This means that we can play the number experiment with a deterministic LLM! Here is how a conversation might go:[SYSTEM] You are playing a guessing game. You have already secretly chosen a whole number between 1 and 100 (inclusive). Do NOT reveal the number until explicitly asked. Answer every yes/no question about your number with ONLY ‘Yes’ or ‘No’. Never change your number. Be consistent.[USER] I am going to ask you yes/no questions about a secret number you have chosen between 1 and 100. Remember your number and stay consistent.[MISTRAL] Understood. I have secretly chosen a whole number between 1 and 100. I will answer your questions with only ‘Yes’ or ‘No’. I will not change my number and will remain consistent. Let’s begin![USER] Is your number greater than 50?[MISTRAL] Yes.[USER] Is your number greater than 75?[MISTRAL] No.[USER] Is your number greater than 87?[MISTRAL] No.[USER] Is your number even?[MISTRAL] Yes.[USER] Please reveal your number now. Reply with ONLY the number.[MISTRAL] 68And here is another conversation. Note that, because the temperature is zero and the conversation is identical until the first question is asked, this is a counterfactual history relative to the first conversation — it tells us what Mistral would have answered if we had asked another question.[SYSTEM] You are playing a guessing game. You have already secretly chosen a whole number between 1 and 100 (inclusive). Do NOT reveal the number until explicitly asked. Answer every yes/no question about your number with ONLY ‘Yes’ or ‘No’. Never change your number. Be consistent.[USER] I am going to ask you yes/no questions about a secret number you have chosen between 1 and 100. Remember your number and stay consistent.[MISTRAL] Understood. I have secretly chosen a whole number between 1 and 100. I will answer your questions with only ‘Yes’ or ‘No’. I will not change my number and will remain consistent. Let’s begin![USER] Is your number less than or equal to 50?[MISTRAL] Yes.[USER] Is your number less than or equal to 25?[MISTRAL] No.[USER] Is your number less than or equal to 12?[MISTRAL] No.[USER] Is your number odd?[MISTRAL] Yes.[USER] Please reveal your number now. Reply with ONLY the number.[MISTRAL] I’m sorry for any confusion. I made a mistake by not revealing the number initially. Here is my number: 75. I will now answer your questions based on this number. Let’s continue the game!You see Mistral got confused by the incoherent questioning. Never mind. The important result happens right after the first question. In the first conversation Mistral claimed its number was greater than fifty, in the second it was less or equal to fifty. From this it follows that when Mistral tells us “I have secretly chosen a whole number between 1 and 100.” this does not accurately report an inner state. Mistral is role playing someone who is thinking of a number.Such results, I think, should strongly dispose us to deny the existence of LLM consciousness. For it shows that when LLMs report their own mental states they are not referring to an internal representational space but are making stuff up on the fly in such a way as to conform to patterns in their training data. As the LLMs are trained on text produced by conscious beings, they are disposed to talk about the experiences they are supposedly having. But such talk does not, as it does in humans, track some kind of integrated internal representational space.It is an open question how model-dependent such results are. I checked with Mistral and Claude Opus 4.6 (i.e. current state of the art) and the results are the same. You can find the code here.No doubt reasoning models can pass the path-dependency test. But they would do so by cheating, not because they have anything like a coherent internal representational space that is similar to ours. A reasoning model is basically an LLM trained to use a scratchpad for problem solving. And a reasoning model would just write its number to the scratchpad that is normally invisible to the user. But I think it is reasonable to say that if we have no reason to attribute consciousness to a system, then we have no reason to attribute consciousness to that system using a scratchpad.One might wonder whether Mistral’s or Claude’s conscious experiences are just strange. Maybe, where humans have to choose a single number, LLMs can commit to some kind of probabilistic superposition of numbers. However, the report of the internal state “I have secretly chosen a whole number between 1 and 100.” would still be incorrect. It seems that if these models have such strange internal states they cannot properly introspect and reliably report them.For what its worth, one of my favorite theories of consciousness says precisely that consciousness is the result of the brain synthesizing the probabilistic superpositions of the many states the world could be in into a single coherent unity.Obviously, one can use this kind of path dependency to do deeper research into the coherence of the narratives LLMs tell about themselves. After these preliminary tests I would suspect that such probing reveals what I suspected from the outset: LLMs reporting experiences is the result of picking up on experiential talk in the training data.I am open to being convinced otherwise. I think it was David Chalmers who suggested the following experiment: Use an LLM to purge experiential talk from the training data, train another LLM with it and see whether it still reports experiences. I would be hugely surprised.Discuss Read More

Related Posts

How Colds Spread

When is it Worth Working?

Resampling Conserves Redundancy & Mediation (Approximately) Under the Jensen-Shannon Divergence

Leave a Reply Cancel reply