AI tools autonomously generate and verify a solution to Erdős Problem #728

TL;DR

Terence Tao reported that AI systems, working with human guidance and community reconstruction of the question, produced and formalized a solution to Erdős problem #728. Multiple AI tools (ChatGPT and a prover called Aristotle) were used to generate, adapt, and Lean-verify proofs, and iterative AI edits produced readable writeups.

What happened

Terence Tao described on Mathstodon a recent sequence of events in which AI systems produced a solution to Erdős problem #728 after the problem statement was reconstructed by the Erdős problems community. Earlier runs by the tool AlphaProof had exposed trivial solutions that violated the intended spirit of the question, prompting an added constraint on the parameters. On Jan. 4 ChatGPT produced a proof under the tightened constraint for a small value of the parameter C; that proof was formalized in the Lean proof assistant by a tool called Aristotle. Subsequent prompting adapted the approach to handle the large-C interpretation believed to be the intended one, and Aristotle repaired minor gaps to yield a Lean-verified proof. Participants then used Aristotle and ChatGPT iteratively to shorten the formal proof and rewrite it as more polished natural-language expositions. Tao highlighted the novelty not only of the solution but also of the AI-driven ability to rapidly generate, refine, and reframe formal proofs and their accompanying writeups.

Why it matters

AI generated both new arguments and multiple natural-language expositions, demonstrating faster cycles for producing and revising mathematical writeups.
Formal proof assistants (Lean) were used to check and repair AI-produced proofs, showing a workflow that pairs automated reasoning with verification.
The episode raises questions about measuring whether an AI-produced result is genuinely new versus already present in the literature.
It illustrates a potential shift in the division of labor between human authors and AI for routine proofs and for drafting research exposition.

Key facts

Terence Tao posted a detailed thread on Mathstodon describing the events around Erdős problem #728.
The original problem statement was vague (notably about whether C should be small or large).
AlphaProof found trivial solutions when a or b was allowed to be large relative to n, leading to an added constraint a, b ≤ (1−ε)n to match the intended spirit.
On Jan. 4 ChatGPT produced an initial proof for the small-C case; that proof was formalized in Lean by Aristotle.
ChatGPT later adapted the argument to treat the large-C interpretation; Aristotle fixed minor errors and produced a Lean-verified proof.
Community participants iteratively used Aristotle and ChatGPT to shorten, polish and expand the proof into multiple natural-language writeups.
Tao noted the final AI-assisted writeup reached a level close to acceptable research writing but still lacked some human touches and references.
Tao and forum participants discussed the need for objective measures to assess similarity between new AI-generated results and existing literature.

What to watch next

Development of objective metrics or tooling to assess how much an AI-produced proof overlaps with prior literature (raised by forum discussion).
How journals and referees will treat AI-assisted and AI-generated formal proofs and multiple machine-produced expositions (not confirmed in the source).
Further integration of formal proof assistants with large-language models to streamline repair and verification of informal arguments.

Quick glossary

Erdős problems: A catalog of mathematical questions associated with the mathematician Paul Erdős and related collaborators, often posed as open problems.
Large language model (LLM): A class of AI systems trained on large text corpora that can generate and transform natural-language text based on prompts.
Lean: A formal proof assistant that encodes mathematical statements and proofs so they can be mechanically checked for logical correctness.
Formal verification: The process of using formal logic and automated tools to check that a proof or program satisfies a precise specification.
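
To give a sense of what "mechanically checked" means in the glossary entries above, here is a toy Lean 4 sketch. It is purely illustrative and unrelated to the actual proof of Erdős problem #728; the theorem name `add_comm_example` is made up for this example, and only the core library lemma `Nat.add_comm` is assumed.

```lean
-- Illustrative only: a toy statement, not part of the Erdős #728 proof.
-- Lean accepts this file only if every proof term type-checks
-- against the statement it claims to prove.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, checked by direct computation.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong (say, the statement claimed `2 + 2 = 5`), Lean would reject it, which is the sense in which a formalized proof is verified rather than merely read.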

Reader FAQ

Was the solution produced entirely by AI without human input?
Not entirely; the effort involved human guidance, community reconstruction of the intended problem statement, iterative prompting, and human participants running tools and selecting outputs.

Did Lean formally verify the final proof?
Yes; the tool Aristotle produced a Lean-verified proof after repairing minor errors in earlier drafts.

Is this result already published in the mathematical literature?
Tao wrote that, to the best of his knowledge, this particular result had not been found in the existing literature, though similar results proved by related methods were located.

Does this mean humans will stop writing research papers?
Tao expressed a preference for human-authored final manuscripts for essential portions, suggesting AI could be used for routine proofs and drafting rather than replacing human authorship.

Terence Tao (@tao@mathstodon.xyz): Recently, the application of AI tools to Erdos problems passed a milestone: an Erdos problem (#728, https://www.erdosproblems.com/728) was solved more or less autonomously by…

Sources

“Erdos problem #728 was solved more or less autonomously by AI”
