Where Is Mathematics Going? Large Language Models And Lean Proof Assistant

If you’re a hacker you may well have a passing interest in math, and if you have an interest in math you might like to hear about the direction of mathematical research. In a talk on this topic [Kevin Buzzard], professor of pure mathematics at Imperial College London, asks the question: Where is Mathematics Going?

The talk starts with him explaining that in 2017 he had a mid-life crisis of sorts: he became disillusioned with the way mathematics research was being done, and started looking to computer science for solutions.

He credits Euclid, as many do, with writing down some axioms and starting mathematics over 2,000 years ago. From axioms came deductions, deductions became mathematical facts, and math has proceeded in this fashion ever since; it continues to be the way research is done in mathematics departments around the world. The consequence is that mathematics is now incomprehensibly large, and the proofs themselves are exceedingly large as well: he gives the example of one proof that runs to 10,000 pages and still hasn’t been completely written down, more than 20 years after it was announced.

The conclusion from this is that mathematics has become so complex that traditional methods of documenting it struggle to cope. He says that a tertiary education in mathematics aims to “get students to the 1940s”, whereas a tertiary education in computer science will expose students to the state of the art.

He examines the effect the “computer as calculator” has had on mathematics since the middle of the 20th century, arguing that it is less than you might think. More recently, though, we have large language models (LLMs) giving us the “computer as generator of mathematics” and interactive theorem provers (ITPs) giving us the “computer as checker of mathematics”, both new ways to use computers in mathematics research. He notes that each of these technologies has flaws and that neither has, so far, told us anything profound that we didn’t already know. As he puts it, mathematics has not yet had its “Deep Blue moment”.

The point is then made that the problem with LLMs is that they hallucinate statements, which introduces errors, while the problem with ITPs is that all the code needs to be hand-written by humans. He floats the “no-brainer idea” of combining the two technologies: the LLM proposes mathematics and the ITP verifies it.
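
As a concrete illustration of what “checker of mathematics” means, here is a minimal Lean 4 sketch (my own example, not from the talk). The proof assistant mechanically accepts a true statement and rejects a hallucinated false one:

```lean
-- Lean accepts this: both sides reduce to the same value, so `rfl` works.
theorem two_add_two : 2 + 2 = 4 := rfl

-- A hallucinated claim fails to type-check. Uncommenting the line below
-- produces a compile error, which is exactly the safety net an ITP provides:
-- theorem two_add_two_wrong : 2 + 2 = 5 := rfl
```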

He concludes with the idea that LLM + ITP is the best future for mathematics, enabling mathematics to go from “mostly right” to “definitely right”.

If you have a passing interest in math you might also like to read Getting The Most Out Of ISM Transceivers Using Math and Design Scanimations In A Snap With The Right Math.

31 thoughts on “Where Is Mathematics Going? Large Language Models And Lean Proof Assistant”

    1. It’s sometimes true, depending on which “2” you are talking about. Not true for integers. Sometimes true if the numbers have been rounded and are at the higher end of what we round to 2:
      2.4+2.4=4.8
      Round to one significant figure.
      2+2=5

        1. The idea is that you don’t know the precision of the data you are given. So adding two data points that are both recorded as ‘2’ could give you a true physical sum of anywhere between ‘3’ and ‘5’.

          1. No, he didn’t: he rounded the end result, and did not do so ‘prior to computation’.

            Incidentally, two units of 1.5 plus two units of 1 equals 5 units, shipping is free, local taxes may apply.

        2. It’s operator overloading.

          You are all assuming scalar Cartesian addition.

          Granting that ‘2’ is a scalar, fancy math uses jargon and conventions that mathematicians make up as they go along.

      1. 1.5 + 1.5 = 3
        So by your logic 2 + 2 can be anywhere in [3,5] if the numbers were rounded for writing down but not for addition, ignoring that there are also different types of rounding.

        You are basically redefining the addition operator by adding a deround functionality.

        What you want can be done by extending the numbers with a tolerance. You can use a square, Gaussian, or any other distribution for that tolerance, or simply leave the distribution out:
        2±0.5 + 2±0.5 = 4±1

        Another way is to note the numbers are rounded:
        2.— + 2.— = [4,6)
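
The tolerance idea in this thread amounts to interval arithmetic. A minimal Python sketch (the names here are mine, not from any library): a number rounded half-up to the integer 2 really stands for the half-open interval [1.5, 2.5), so adding two of them gives [3, 5).

```python
from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Interval addition: add the corresponding endpoints.
        return Interval(self.lo + other.lo, self.hi + other.hi)

def rounded(n: int) -> Interval:
    """The set of reals that round (half-up) to the integer n."""
    return Interval(n - 0.5, n + 0.5)

two = rounded(2)
print(two + two)  # Interval(lo=3.0, hi=5.0): "2 + 2" is anywhere in [3, 5)
```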

  1. Theorem provers are great because they work well with what traditional computers do best: combining many different truth values together. Good for verification, but not for innovation.

    LLMs aren’t really good at innovation either, unless you turn their temperature up high, but that has a negative effect on the truth value of what comes out. To get something new, you have to take less probable tokens and just roll with them.

    So the idea of combining with a solver sounds good, but it’s also a little bit like adding a proofreader to the infinite monkeys working on Shakespeare. I’m not 100% sure it’s a good use of either the monkeys’ or the proofreaders’ time, but I guess it’s better than just letting them go wild? If they are all willing to work for bananas?

    1. I don’t really use chat, but I find image generation with AI is most fun when it gets a bit weird.
      Realistic and compliant with reality is a bit boring; we already have that.
      The fun part is when the elements are real-ish but the image is weird.

      I guess Dali would have agreed.

      But does that apply to LLMs? I mean, if you have one write a story or script, then I guess you would want some interesting plot twists and not ‘Bob went to the store and bought some bread’.

    2. To be clear, Kevin’s not advocating for using LLMs to do new math – at least not in the next decade or two. I’m not aware of many mathematicians who want LLMs to make substantial new math. We don’t exactly want innovation here – the beauty and draw of mathematics is not getting answers, but understanding them, so a machine that just spits out answers is no fun. The primary interest right now is in using LLMs to assist with formalizing mathematics in a language like Lean. After all, mathematical proofs are written in natural language, and each field is subject to its own colloquialisms. Authors will omit detailed proofs and indicate the style with which one proves a lemma if it’s a sufficiently standard technique. But these are details that need to be filled in when formalizing a proof.

      LLMs seem to be pretty good at generating code from natural language when given precise instructions. I suppose this should not be a surprise, given the transformer architecture originated in research on machine translation. So I think there are reasons to be hopeful that this application of LLMs could be productive. The big challenge that Kevin’s identified is that even formalizing statements of theorems can be a large task – the same is true for definitions of mathematical objects. These are instances where, if the LLM hallucinates or makes an error in translating, the theorem prover cannot automatically catch this error and any formalized proofs building off of these flawed definitions are useless.

  2. My understanding is that this is similar to how many modern chess engines work: a neural network and an exact analysis step working in tandem. The neural network performs a qualitative assessment much faster (or with more subtlety, take your pick) than the exact analysis could, and the exact analysis then uses that to decide which branches are worth investigating rigorously. A composite heuristic.

  3. This makes a lot of sense.

    Proof assistants like Lean, Coq, or Mizar all take as their input formal proof languages which are far more verbose and tedious than textbook math proofs.
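
    To give a flavor of that verbosity, here is a sketch (my own, in Lean 4 syntax using core `Nat` lemmas, and not checked against any particular Lean version) of a fact a textbook would dismiss as obvious:

```lean
-- Textbook: "addition of naturals is commutative (obvious)".
-- Formally, every step is spelled out and machine-checked:
theorem my_add_comm (a b : Nat) : a + b = b + a := by
  induction b with
  | zero      => rw [Nat.add_zero, Nat.zero_add]
  | succ n ih => rw [Nat.add_succ, ih, Nat.succ_add]
```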

    Using an AI to translate a proof written by a human in a textbook into a program to be read by a formal proof checker is going to take far less time than having humans do the translation.

    Similarly, asking a language model to explain a formal digital proof in Lean in math-ese is going to be faster than reading the program and trying to understand the code yourself.

  4. A few months ago Terence Tao live-streamed himself doing a proof in Lean, using an LLM-based coding assistant to do as much of the work for him as he could get it to do. If I remember correctly, he initially made lots of progress letting the coding assistant do most of the work, then got stuck, had to reread and get a better understanding of what was being done in the proof attempt, and then redirect it in the correct direction. I suspect that is where LLM + ITP mathematics is going: it will help you make progress on the small details, but it won’t get you to a completed proof without sufficient understanding.

  5. My experience is similar to Tao’s. I use Claude Code to assist me with Lean proofs. While it’s true that Claude rarely comes up with genuinely new ideas for how to tackle a proof (though I have seen that, in small doses), it does a great job of turning my paper-and-pencil proof sketches into working Lean code, and if the first version of the code doesn’t work, it’s usually able to correct type errors and flesh out proofs in a second or third round. I’ve even had the experience of Claude repeatedly struggling to get a proof over the finish line, despite much assistance from me, leading me to eventually realize that there was an assumption missing in the theorem I was proving. Claude has also found outright inconsistencies (not just typos) in my formalizations.

  6. As a mathematician myself, I must disagree. The future isn’t LLMs + ITPs, but a combination of philosophy, mathematics, and engineering with a large sprinkle of computer science.
    After all, artificial intelligence is NOT equivalent to machine learning and deep learning.
    My personal motto is “Nulla veritas sine arte” (“no truth without art”), and curiosity, creativity, and the sense of wonder and discovery of artful patterns in nature are not easily replicated in (sentient) machine intelligence.
