Opinion

The fall of the theorem economy (David Bessis)

I found this post from mathematician David Bessis very interesting. It explains that while AI can be trained to prove mathematical theorems in Lean, AI-written proofs are often poor at conveying useful mathematical insights. Bessis argues that the human-usable intuitions gained in the process of proving a theorem are more important than the proof itself, and that AI provers do not address this need.

It’s known that AIs do sloppy work on hard-to-verify tasks, but I was surprised that similar issues arise even in ungameable domains like math. Even when an AI’s work is provably correct, its sloppiness may make it much harder for humans to benefit from.

I definitely recommend reading the whole post, but I’ll pull out a few key quotes here.

Mathematics is meant to improve human understanding

The product of mathematics is clarity and understanding. Not theorems, by themselves. Is there, for example, any real reason that even such famous results as Fermat’s Last Theorem, or the Poincaré conjecture, really matter? Their real importance is not in their specific statements, but their role in challenging our understanding, presenting challenges that led to mathematical developments that increased our understanding…

Mathematics only exists in a living community of mathematicians that spreads understanding and breathes life into ideas both old and new. The real satisfaction from mathematics is in learning from others and sharing with others. All of us have clear understanding of a few things and murky concepts of many more. There is no way to run out of ideas in need of clarification…

Lean proofs from AI don’t contain useful abstractions

A few weeks ago, Math Inc, one of the best-funded AI-for-math startups, produced a Lean formalization of Maryna Viazovska’s spectacular work on the sphere packing problem in dimensions 8 and 24, results which earned her the Fields Medal in 2022. That was impressive in its own right: never before had theorems of this caliber been autoformalized.
Yet this clear success was met by massive pushback from the “formal mathematics” community, the very people who lead the effort to port “human mathematics” into machine-verified code.

Luddites?

Well, it’s more complicated, as emerges from their actual conversation. The thing is that autoformalization isn’t a full solution to the problem of formalizing mathematics, just like Tesla’s Full Self-Driving isn’t a full solution to the problem of driving cars.

Yes, I know, that sounds counterintuitive. This is why outsiders are likely to miss the nuance and label the pushback as Luddism.

A clear explanation can be found in Alex Kontorovich’s account of his own learning curve with formalized mathematics. In a nutshell: Mathlib, the dominant Lean library, is a human-curated formalization of an ever-growing fraction of existing human mathematics. It exposes clean APIs and abstractions, without which no autoformalization could take place. By contrast, Math Inc’s autoformalized proof of Viazovska’s results exposes no intelligible interface. Who in their right mind would merge a 200,000-line unaudited vibe-coded blob into the master branch of global human science?

Mathematicians aren’t rewarded for writing better explanations of existing proofs

Math Inc captured the prize (“first formalization of a Fields-medal-level theorem”) and there is no social reward left for cleaning up after them. Hence this comment by Patrick Massot, a non-Luddite expert in formalized mathematics:

I think the situation is pretty clear: AI companies, and especially Math Inc, will indeed thoroughly bomb this area to turn it into a giant radioactive wasteland that will never be able to sustain life again, so we will never get the benefits expected from formalization (improved understanding and accessibility).
I strongly advise young people to contribute to less shiny projects that are less likely to be destroyed.

What makes the situation really tricky is that unintelligible formal proofs may hold significant residual value, even if they aren’t accretive to the canonized corpus.

And, to be honest, the issue existed well before AI, with the four-color theorem, with the classification of finite simple groups, or with Tom Hales’s monumental work on the Kepler conjecture (which led him to seek a formalized proof).

The likely outcome is that formalized mathematics will now develop in two separate layers: an intelligible layer embodied by Mathlib, and an unintelligible layer we might call Mathslop, a library of results that are known to be correct via proofs that no human has ever understood.

If AI proved the Riemann hypothesis, what would be the point?

Here is what the nuclear scenario looks like. A year from now, an AI lab assembles a secret A-team, throws a few billion dollars of compute at the Riemann hypothesis, and comes up with a 2,000,000-line Lean proof. What are mathematicians going to say?

That the proof is unintelligible? Right, but who cares? Isn’t the whole of math already unintelligible to most people?

That the theorem was still proven by the humans in the loop? Yes and no—and in that specific case I would probably vote no.

That a few billion dollars represents decades of the entire mathematical sciences budget at the National Science Foundation? Correct, so what?
And what if the conjecture happens to be much easier, accessible with only nominal resources?

That the conjecture itself was still produced by a human brain, along with the entire mathematical stack underneath, starting from the very language in which the conjecture was written, the concepts of infinite sums, complex numbers, meromorphic functions and analytic continuation, the interpretation of the conjecture as a profound statement about the distribution of prime numbers, and the prophecy that it should be true? And that, at this point, humans are the only beings who care about it?

This objection is profound and valid. But when you look like a sore loser, nobody cares what you say.

Personally, I’m hopeful that we can find ways to motivate AIs to make their work easier for humans to understand. For example, we could train AIs to prefer simple solutions over complex ones, or to start by producing general insights that apply to problems beyond the specific one they’re working on. I speculate that the math domain could be an especially good place to start researching ways to make AI outputs more legible, before moving on to domains that are harder to verify.

Read the full post here.
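To make the “prefer simple solutions” idea above concrete, one toy way to shape a prover’s training signal is to gate everything on verification, then rank verified proofs by legibility: shorter proofs that reuse existing library lemmas score higher than long, self-contained blobs of machine-named steps. The sketch below is purely my own illustration (the function name, inputs, and weights are hypothetical, not from Bessis’s post or any actual training setup):

```python
def legibility_reward(proof_checks: bool, proof_lines: int,
                      reused_lemmas: int, new_unnamed_steps: int) -> float:
    """Toy reward for a proof-generating AI.

    Correctness is a hard gate: an unverified proof earns nothing.
    Among verified proofs, shorter ones and ones that reuse named
    library lemmas (Mathlib-style abstractions) score higher, while
    opaque auto-generated intermediate steps are penalized.
    All weights are illustrative.
    """
    if not proof_checks:
        return 0.0
    base = 1.0
    brevity_bonus = 1.0 / (1.0 + proof_lines / 100)  # shorter is better
    reuse_bonus = 0.1 * reused_lemmas                # reward library reuse
    slop_penalty = 0.05 * new_unnamed_steps          # penalize opaque steps
    return base + brevity_bonus + reuse_bonus - slop_penalty
```

Note the design choice: legibility terms never substitute for correctness, they only break ties among proofs that already check, so the incentive is to be both right and reusable.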
