Half-understood mathematics

ai, mathematics, essay

Mathematicians like speaking of ‘beautiful’ and ‘ugly’ proofs. A beautiful proof isn’t just a certificate of truth: it typically triggers some insight for the reader. Aesthetic judgement seems to correlate with understanding.

Historically, mathematicians’ sense of beauty has shaped the discipline. Authors of elegant proofs often receive significant credit, even for proving known results: for example, Selberg and Erdős’ elementary proof of the Prime Number Theorem caused a sensation, though the result was more than 50 years old.

However, with increasingly capable computer systems and AI, we’re seeing more ugly, poorly understood mathematics. In fact, within just 3-5 years, it’s quite plausible a large proportion of novel results might be fully AI-generated and AI-verified ― true, but only half-understood.

Understanding less… #

This trend arguably began back in the 1970s, with the proof of the Four Color Theorem (1976). This was the first major result with a computer-assisted proof: Apel and Haken reduced the proof to case exhaustion, after which a computer checked 1,200 cases manually to establish the result.

Another milestone in the history of computer-assisted proofs was the classification of finite simple groups (1981). With a proof 15,000 pages long, the theorem was dubbed ‘The Enormous Theorem’.

The Kepler Conjecture, which concerns the densest packing of spheres, was proved in 1998: 3GB of computer calculations and results. Referees gave up on verifying the proof manually, but said they were ‘99% certain’ of its correctness. The proof was published in Annals of Mathematics, one of the world’s most prestigious journals, in 2005.

And less… #

Now large language models (LLMs) are pushing the trend of half-understood mathematics further. Amateurs have elicited proofs to open problems from LLMs which they are unable to verify, while many professional mathematicians have AI-generated results lying around which they have no time checking. In the words of Terence Tao, we’re entering an era of ‘proof abundance’.

My impression is that LLMs and autoformalisation agents still aren’t capable enough for reliable peer review (where I’ll take human peer review as the golden standard for reliability). However, I expect this to change well within 2-5 years, with AIs verifying informal mathematics or checking autoformalisations. If AI peer review becomes reliable, we’re in the Kepler Conjecture situation, with humans being 99% certain of a proof’s correctness but unable to ascertain all details.

The price #

Yet, absolute certainty isn’t possible, not even within entirely human-powered research. All this to say that AIs could, in principle, autonomously conduct cutting-edge research. Since human experts cannot be duplicated and need to eat and sleep, AIs might soon account for a huge proportion of all high-quality research output. But it will be half-understood and ― if I’m right about ’this is elegant’ being a proxy for ’now I understand’ ― it won’t be as pretty.

This might lead to a sharper distinction between recreational and research-level maths, with maths departments seeing two new kinds of personas: the luddite aesthete and the LLM orchestrator1. This is only partly a joke ― I actually think it’s worth taking this scenario seriously.

In the marvelous essay The Fall of the Theorem Economy, where David Bessis asks how the mathematical community should adapt to AI, he calls for mathematicians to ‘come clean about the nature of mathematics’, i.e. being honest about their motivations to do maths.

While I’m repelled by the idea of lots of ugly mathematics, I remain hopeful. Perhaps mathematics can experience a boom similar to chess? People didn’t stop playing chess after DeepBlue’s victory over Garry Kasparov; on the contrary, chess has never been more popular. In mathematics, perhaps AIs could take care of the uglier aspects of maths, leaving humans to do mathematics for its own sake.

This essay was inspired by discussions with Yaël Dillies and Geby Jaff.