TL;DR
Researchers and hobbyists report that recent large language models, notably GPT 5.2, are producing correct, checkable solutions to research-level math problems. Several Erdős conjectures have recently been reclassified as solved, with formalization tools and proof assistants playing a key role in verifying the results.
What happened
Over the weekend, Neel Somani, a software engineer and former quant researcher, tested OpenAI’s latest model and found that it produced a full solution to a challenging math problem after an extended run. He used a formalization tool from Harmonic to convert and verify the model’s output; the formalization reportedly checked out. The model’s internal reasoning referenced results such as Legendre’s formula and Bertrand’s postulate, and it located a relevant MathOverflow post by Noam Elkies before producing a proof that differed in substantive ways and addressed a version of a problem attributed to Paul Erdős.

Since the release of GPT 5.2, people working in mathematics have reported a noticeable uptick in model competence. An earlier Gemini-powered system, AlphaEvolve, produced a batch of autonomous solutions last year. In the weeks since Christmas, 15 items on the online Erdős problem list moved from “open” to “solved,” and 11 of those entries credited AI involvement. Experts including Terence Tao note multiple cases of meaningful autonomous progress, as well as instances where models primarily located existing research.
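For context on the two results the model’s reasoning cited (these are standard number-theory facts, stated here for the reader; they are not reproduced in the source):

```latex
% Legendre's formula: the exponent of a prime p in the
% factorization of n! is a finite sum of floor terms.
% Bertrand's postulate: for every integer n > 1 there is
% a prime strictly between n and 2n.
\[
  v_p(n!) \;=\; \sum_{i=1}^{\infty} \left\lfloor \frac{n}{p^{i}} \right\rfloor,
  \qquad
  \forall\, n > 1 \;\; \exists\, p \text{ prime} : \; n < p < 2n .
\]
```

For example, Legendre’s formula gives the exponent of 2 in 10! as ⌊10/2⌋ + ⌊10/4⌋ + ⌊10/8⌋ = 5 + 2 + 1 = 8.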
Why it matters
- Large language models are beginning to contribute to research-level mathematics, shifting parts of the problem-solving workflow.
- Automated formalization and proof assistants make it easier to verify machine-generated proofs, potentially increasing trust and reuse.
- AI’s scalability could systematically tackle the long tail of less-studied conjectures, changing how certain collections of problems are addressed.
- Wider adoption by established researchers signals a shift from experimental tools to accepted parts of mathematical practice.
Key facts
- Neel Somani used GPT 5.2 to obtain a full solution to a mathematical problem and formalized it with a Harmonic tool.
- ChatGPT’s chain-of-thought during the attempt referenced results like Legendre’s formula and Bertrand’s postulate and found a 2013 MathOverflow post by Noam Elkies.
- The model’s final proof differed from Elkies’ solution and reportedly supplied a more complete solution to a version of an Erdős problem.
- Since Christmas, 15 problems on the online Erdős list moved from “open” to “solved,” with 11 of those entries acknowledging AI involvement.
- Terence Tao identified eight cases where models made meaningful autonomous progress on Erdős problems and six cases where they aided by locating prior work.
- An earlier batch of autonomous solutions came from AlphaEvolve, a Gemini-powered system, in November.
- The Lean proof assistant, developed at Microsoft Research in 2013, is widely used for formalizing proofs; newer tools aim to automate more of that process.
- Harmonic founder Tudor Achim emphasized that established academics are beginning to use AI tools for formalization and research workflows.
What to watch next
- Whether the number of Erdős problems reclassified as solved with AI involvement continues to rise.
- How widely proof assistants and formalization tools like Lean and Harmonic’s systems are adopted across mathematics departments.
- Progress toward truly autonomous, end-to-end machine proofs versus systems that still require significant human guidance.
Quick glossary
- Large language model (LLM): A machine learning system trained on large text datasets to generate and analyze natural language, often used for tasks like problem-solving and summarization.
- Proof assistant: Software that helps users write and check formal mathematical proofs, ensuring each step follows strict logical rules.
- Formalization: The process of converting informal mathematical arguments into a precise, machine-checkable format.
- Erdős problems: A collection of conjectures and questions compiled from the work of mathematician Paul Erdős, varying widely in difficulty and topic.
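To make “formalization” concrete, here is a minimal Lean 4 sketch (illustrative only; the theorem names are invented for this example and have nothing to do with the Erdős problems discussed above). A proof assistant accepts a file like this only if every step type-checks against strict logical rules:

```lean
-- Informal claim: for every natural number n, n + 0 = n.
-- In Lean 4 this holds by definitional reduction, so `rfl` closes it.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl

-- A slightly less trivial claim: addition of naturals is commutative.
-- Here the proof reuses a lemma from Lean's standard library.
theorem add_comm_example (a b : Nat) : a + b = b + a := Nat.add_comm a b
```

Tools like Harmonic’s aim to automate the translation from an informal, prose-style proof into statements of this machine-checkable form.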
Reader FAQ
Did GPT 5.2 actually solve open math problems on its own?
The source reports cases of meaningful autonomous progress with GPT 5.2, but also notes instances where models built on existing research; fully independent, general-purpose autonomy is not claimed.
Were the machine-generated proofs verified?
At least one example was formalized with a Harmonic tool and reportedly checked out; broader verification practices vary.
Are top mathematicians accepting AI-produced results?
The source says some prominent researchers are taking these tools seriously and using them, though acceptance is heterogeneous.
Will AI replace human mathematicians soon?
Not confirmed in the source.
Sources
- AI models are starting to crack high-level math problems
- Can Large Language Models Solve Easy Conjectures?
- Reconstructing Mathematics from the Ground Up with … – gekko
- Benchmarking LLMs on Advanced Mathematical Reasoning