Erdős Problem #728 Largely Solved Autonomously by AI and Formalized

TL;DR

Terence Tao reported that AI systems played the central role in resolving Erdős problem #728, after the problem was reconstructed to reflect its intended meaning. The workflow combined multiple AI tools (including ChatGPT and a proof assistant pipeline) with human oversight to produce a Lean-verified argument and successive human- and AI-edited writeups.

What happened

Researchers working from the Erdős problems repository found that recent AI tools could produce a full solution to problem #728 once the statement was clarified. An early AI attempt by a team using AlphaProof revealed trivial solutions when the variables a or b were allowed to be large, prompting participants to add a restriction keeping a and b bounded away from n. On Jan. 4, ChatGPT generated a proof for the version with a small constant C; that argument was then formalized in the Lean proof assistant by the tool Aristotle. ChatGPT later adapted the approach to cover the case where C is large, and Aristotle repaired minor flaws to yield a Lean-verified proof. Multiple people iterated on the Lean output, using ChatGPT to translate formal code into natural-language expositions and to tighten the narrative and references. Terence Tao characterized the final writeup as approaching an acceptable research-paper standard, while emphasizing continued human responsibility for essential portions of the work.

Why it matters

Demonstrates recent gains in AI capability to both produce mathematical arguments and convert formal proofs into readable exposition.
Shows how formal proof assistants can be paired with generative models to detect and repair gaps in reasoning.
Highlights a potential shift in how mathematical manuscripts are produced and revised, enabling many alternate expositions alongside a principal paper.
Raises questions about evaluating novelty and overlap with existing literature when AI rapidly reproduces or adapts known methods.

Key facts

The episode centers on Erdős problem #728 as posted on the Erdős problems website.
Terence Tao reported the developments in a public thread on Mathstodon.
An initial AI approach (AlphaProof) found trivial solutions until participants imposed the constraint a, b ≤ (1−ε)n to stay within the spirit of the question.
On Jan. 4, ChatGPT produced a proof for the small‑C case; the argument was formalized into Lean by a tool named Aristotle.
ChatGPT later extended the argument to the large‑C version; Aristotle patched minor errors and produced a Lean-verified proof.
Community participants ran the Lean formalization through ChatGPT to obtain improved natural-language writeups and a more polished article.
Tao noted that similar results exist in the literature for the small‑C claim, but the reconstructed intended problem and the large‑C result do not appear to have been established in prior papers.
Tao expressed a preference for core human-authored exposition while acknowledging utility in delegating routine proofs to AI/formal methods.

What to watch next

Efforts to develop objective measures of similarity between AI-produced results and existing literature, as raised in community discussion.
How journals, authors, and reviewers will treat AI-generated drafts and formally verified proofs versus a single authoritative human-written paper.
Wider adoption of workflows that combine generative models with proof assistants to produce, verify, and rewrite mathematical arguments.

Quick glossary

Erdős problems: A catalog of open questions and conjectures attributed to mathematician Paul Erdős and collaborators; used by the community to track and discuss unsolved problems.
Lean: A formal proof assistant used to write machine-checkable mathematical proofs and to verify the correctness of formalized arguments.
Formal verification: The process of using formal logic and software tools to check mechanically that a proof or system satisfies a specification, leaving no room for ambiguous human interpretation.
Generative language model (e.g., ChatGPT): An AI system that produces text in response to prompts, often used to draft explanations, proofs, or other natural-language material.
Aristotle (in this context): An AI-assisted tool referenced in the community that was used to convert informal proofs into Lean formalizations, repair errors in them, and produce verified proofs.
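
For readers who have not seen Lean, the short sketch below shows what a machine-checkable statement and proof look like in Lean 4. It is a toy illustration only: the theorem name toy_add_comm is hypothetical, and nothing here is taken from the actual formalization of problem #728.

```lean
-- Toy example, unrelated to problem #728.
-- Statement: addition of natural numbers is commutative.
-- The proof term cites a lemma from Lean's core library; Lean accepts
-- the theorem only if the term's type matches the statement exactly.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact that Lean checks by direct computation.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong, Lean would reject the file with an error. That is the sense in which a "Lean-verified" argument has been checked by software and does not rest on a reader's judgment.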

Reader FAQ

Did AI fully solve the problem without human help?
Not fully; AI systems generated and formalized key arguments but participants supplied feedback, constraints, and further editing.

Is the result entirely new to the mathematical literature?
Tao reported that, to the best of the community's knowledge, the reconstructed intended result, especially for large C, was not found in existing literature, though similar small‑C results do exist.

Who verified the proof?
A tool called Aristotle produced a Lean-verified version of the proof after AI-generated drafts were refined; community members also reviewed and rewrote expositions.

Will future papers be written by AI?
Tao indicated a preference for human-authored core expositions but suggested AI could be useful for routine proofs and producing multiple alternate writeups; broader policy outcomes are not confirmed in the source.

Terence Tao (@tao@mathstodon.xyz): "Recently, the application of AI tools to Erdos problems passed a milestone: an Erdos problem (#728, https://www.erdosproblems.com/728) was solved more or less autonomously by…"

Sources

"Erdos problem #728 was solved more or less autonomously by AI"
