GPT-5.4 Solved a 60-Year Math Problem: What Happened

May 1, 2026
18 min read

GPT-5.4 Just Solved a 60-Year Math Problem - And the Method It Used Is What Makes It Remarkable

On a Monday afternoon in April 2026, a 23-year-old named Liam Price typed a math problem into ChatGPT.

He did not know the problem had resisted professional mathematicians for 60 years. He did not know it came from Paul Erdős — the most prolific mathematician in history. He had no advanced mathematics training. He was just "vibe maths-ing," as Scientific American put it — casually feeding problems into an AI to see what happened.

What happened: GPT-5.4 Pro solved it. In 80 minutes. From a single prompt. Using a proof method that no human mathematician had thought to apply in 90 years of working on problems of this type.

Within 24 hours, Fields Medalist Terence Tao — one of the greatest living mathematicians — had read the proof, called it a "meaningful contribution to the anatomy of integers that goes well beyond the solution of this particular Erdős problem," and extended it into the seed of a new mathematical theory.

This is the story of what happened, why the proof method matters more than the result itself, and what it actually means — both for AI and for mathematics.

What Is an Erdős Problem (And Why Does It Matter)?

Paul Erdős died in 1996 having published more than 1,500 mathematical papers — more than any mathematician in history. He spent his life traveling between universities with two suitcases, collaborating with anyone who would work with him on mathematics, and leaving behind a catalogue of unsolved problems that read like a to-do list for the next century of mathematical research.

The Erdős problems are a collection of over 1,000 conjectures he posed during his lifetime, maintained today on erdosproblems.com (curated by mathematician Thomas Bloom). They vary enormously in difficulty and significance. Some are straightforward. Some have stymied the best mathematical minds on earth for decades.

Erdős famously offered cash prizes for solutions, with amounts ranging from $25 for the easiest to $10,000 for the hardest. He called the best proofs worthy of inclusion in "The Book," an imaginary volume in which, as he described it, God keeps the most beautiful proof of every theorem. A proof in The Book is not just correct: it is illuminating and elegant, the proof that reveals why something is true in a way that seems almost inevitable in hindsight.

Since January 2026, 15 Erdős problems have moved from "open" to "solved," with 11 of them specifically crediting AI models as involved in the process, a number that would have seemed like science fiction five years ago.

What Is Primitive Set Problem #1196? (Explained for Non-Mathematicians)

A primitive set is a collection of integers greater than 1 where no number in the set divides evenly into any other. Prime numbers — 2, 3, 5, 7, 11 — form a primitive set automatically, because primes have no factors except themselves and 1 (so no prime can divide another prime). But you can also build primitive sets from non-primes. The set {6, 10, 15} is primitive: 6 doesn't divide 10, 10 doesn't divide 15, 15 doesn't divide 6.
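The definition is easy to check by machine for small sets. Here is a minimal Python sketch; the function name `is_primitive` is made up for this illustration and does not come from any library.

```python
from itertools import combinations

def is_primitive(numbers):
    """Return True if no element of `numbers` divides a different element."""
    values = sorted(set(numbers))
    if any(v <= 1 for v in values):
        raise ValueError("primitive sets contain only integers greater than 1")
    # After sorting, only the smaller number of a pair can divide the larger.
    return all(big % small != 0 for small, big in combinations(values, 2))

print(is_primitive([6, 10, 15]))        # True
print(is_primitive([2, 3, 5, 7, 11]))   # True
print(is_primitive([6, 10, 30]))        # False, because 6 divides 30
```

Note that the check covers every pair, including the three pairs the prose above skips (10 into 6, 15 into 10, 6 into 15), which can never divide because the divisor would be larger or the remainder is nonzero.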

In 1935, Erdős proved something beautiful: if you calculate the sum of 1/(a · log a) across all numbers in any primitive set, that sum is always finite and stays below a fixed bound, no matter which primitive set you choose or how many numbers it contains. This quantity became known as the Erdős sum. The set of all primes gives the largest possible such sum, approximately 1.6366; that stronger statement is the Erdős Primitive Set Conjecture, which Jared Lichtman later proved.
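To get a feel for the Erdős sum, you can compute it numerically. The sketch below uses natural logarithms and two helper names invented for this example (`primes_up_to`, `erdos_sum`). It only evaluates partial sums over the primes up to a cutoff, so it illustrates the quantity rather than proving anything about it.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (assumes limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(2, limit + 1) if sieve[n]]

def erdos_sum(numbers):
    """Sum of 1 / (a * log a) over the given integers, all greater than 1."""
    return sum(1.0 / (a * math.log(a)) for a in numbers)

print(erdos_sum([6, 10, 15]))            # a small primitive set of non-primes
for cutoff in (10**2, 10**3, 10**4):
    print(cutoff, erdos_sum(primes_up_to(cutoff)))
```

The partial sums over the primes creep upward slowly as the cutoff grows, because the missing tail only shrinks roughly like 1/log(cutoff). They stay below the limiting value of about 1.6366 quoted above.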

Problem #1196 — posed by Erdős, Sárközy, and Szemerédi in 1968 — asked a sharper question: as the numbers in a primitive set get larger and larger, what happens to the Erdős sum? Specifically, can you prove that for any primitive set containing only numbers larger than x, the sum must be smaller than 1 + some function that shrinks as x grows?

The conjecture was that the answer is yes — and that the exact asymptotics are 1 + O(1/log x). Proving this requires showing not just that the sum is bounded, but that you can describe exactly how fast it approaches 1 as the numbers get larger. Jared Lichtman — an Oxford mathematician who spent seven years on problems in this family and proved the related Erdős Primitive Set Conjecture in his doctoral thesis — tried to prove this sharper bound and got stuck, as did every mathematician before him.
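In symbols, the statements above can be written as follows. The notation f(A) for the Erdős sum is introduced here for convenience; the formulas only restate the prose and are not quoted from the problem page or the proof.

```latex
% The Erdős sum of a primitive set A of integers greater than 1
f(A) \;=\; \sum_{a \in A} \frac{1}{a \log a}

% The sharper bound discussed above: for every primitive set A
% whose elements all exceed x,
f(A) \;\le\; 1 + O\!\left(\frac{1}{\log x}\right)
\qquad \text{as } x \to \infty .
```

Compare this with the primes, whose sum is about 1.6366: most of that value comes from the small primes, so once every element is forced to be large, the ceiling drops toward 1.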

What GPT-5.4 Did: The Method That Changed Everything

GPT-5.4 Pro produced the correct exact asymptotics — 1 + O(1/log x) — in a single reasoning session of approximately 80 minutes. Price then had the model format the proof as a LaTeX paper, which he posted to the Erdős Problems discussion forum.

But the result is not the remarkable part. The method is.

Since 1935, every mathematician working on primitive set problems had taken the same approach: translating the problem from number theory into probability theory, then analyzing it in that framework. This was so natural for human mathematical thinking — Erdős himself worked this way — that no one had seriously looked for an alternative route in nearly 90 years of effort.

GPT-5.4 Pro did not take that route. Instead, it approached the problem through the von Mangoldt function — an object from analytic number theory that encodes the fundamental theorem of arithmetic (the fact that every integer has a unique prime factorization). From there, it used a Markov chain technique — modeling the multiplicative structure of integers as a random process in which prime factors are gradually extracted — to analyze how the Erdős sum behaves.
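For readers who have not met the von Mangoldt function: Λ(n) equals log p when n is a power of a prime p, and 0 otherwise. Its key property is that summing Λ(d) over all divisors d of n gives exactly log n, which is unique factorization written in logarithms. The sketch below checks that identity and then builds one toy "extract a prime power" step. The function `transition_probabilities` is a hypothetical illustration of the kind of random process described above, under the assumption that a step moves from n to n/d with weight Λ(d)/log n; it is not a reconstruction of the chain used in the actual proof.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power p**k of a prime p (k >= 1), else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; n is a prime power only if
            # dividing out every factor of p leaves 1.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no factor up to sqrt(n), so n itself is prime

def transition_probabilities(n):
    """Hypothetical chain step: move from n to n // d with weight Lambda(d) / log n."""
    return {
        n // d: von_mangoldt(d) / math.log(n)
        for d in range(2, n + 1)
        if n % d == 0 and von_mangoldt(d) > 0
    }

# Identity check: the sum of Lambda(d) over the divisors d of 360 equals log 360.
total = sum(von_mangoldt(d) for d in range(1, 361) if 360 % d == 0)
print(math.isclose(total, math.log(360)))

# From 12 the toy chain can move to 6, 4, or 3, and the weights sum to 1.
print(transition_probabilities(12))
```

The weights sum to 1 precisely because of the divisor identity, which is why Λ is a natural way to turn factorization into a probability distribution.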

Kevin Barreto, who will soon join OpenAI's AI for Science team, described the Markov chain approach as "a creative step human mathematicians had overlooked despite years of work on the problem." Lichtman compared it directly to AlphaGo's Move 37 — the 2016 Go game move that looked wrong by every human convention, turned out to be a masterstroke, and has since been studied extensively as a genuinely new idea that rewrote the theory of the game.

"This one is a bit different because people did look at it, and the humans that looked at it just collectively made a slight wrong turn at move one." — Terence Tao, speaking to Scientific American

Lichtman was explicit about the significance: he called the result "the first AI proof at the level of Erdős's Book." Not the first AI proof. The first at the level of The Book — the level of elegant inevitability that Erdős described as the highest standard in mathematics.

Why Terence Tao's Reaction Is the Real Story

Proofs get announced every week. Many mathematical claims turn out to be wrong, incomplete, or trivial extensions of prior work. The reason this result matters is not the claim — it's who validated it and what they said after.

Terence Tao is arguably the best mathematician alive. He won the Fields Medal in 2006 (the highest honor in mathematics), has made fundamental contributions to number theory, harmonic analysis, and partial differential equations, and is known specifically for work in additive combinatorics — the very area that overlaps with primitive set problems. He is not someone who gives out praise lightly.

Tao read the GPT-5.4 proof and commented in the forum that the work reveals "a previously undescribed connection between the anatomy of integers and Markov process theory." He went further: this connection "would be a meaningful contribution to the anatomy of integers that goes well beyond the solution of this particular Erdős problem."

That is the key sentence. Tao was not saying "the proof is correct." He was saying the proof opened a door to a broader theory that mathematicians had not seen before. Within 24 hours, he had extended the argument himself, turning the original proof into the beginning of a more general framework.

This is the difference between a solution and a discovery. The Erdős problem got solved. A new connection in number theory got discovered. The GPT-5.4 model that produced this proof also scores 99.2% on AIME 2026 (advanced math competition) and 92.8% on GPQA Diamond (graduate-level science reasoning) — context that makes clear this was not an isolated accident.

The Formal Verification: How We Know the Proof Is Real

In the history of AI math claims, not all results have held up to scrutiny. Earlier in 2026, several widely-reported AI Erdős solutions turned out to be sophisticated literature searches — the model found existing published papers the database maintainer wasn't aware of, rather than producing original arguments. The AI in mathematics community has learned to be skeptical until formal verification is complete.

This proof passed that test. The Erdős Problem #1196 solution has been formally verified in Lean, an interactive proof assistant that enforces mathematical rigor at the level of symbolic logic. In Lean, every logical step must be explicitly justified and is checked by the machine. If there is a gap in the reasoning, Lean rejects the proof. There is no room for hand-waving.
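To give a flavor of what "machine-checked" means, here is a tiny Lean 4 sketch that uses only the core language (no Mathlib). It has nothing to do with the actual formalization of Problem #1196. It simply verifies the remainder facts behind the {6, 10, 15} example from earlier in the article, and shows a false claim that Lean would refuse to accept.

```lean
-- 6 does not divide 10: the remainder is nonzero, and `decide`
-- confirms this by direct computation.
example : 10 % 6 ≠ 0 := by decide
example : 15 % 10 ≠ 0 := by decide
example : 6 % 15 ≠ 0 := by decide

-- A false claim is rejected. Uncommenting the next line makes the
-- file fail to check, because 12 % 6 really is 0.
-- example : 12 % 6 ≠ 0 := by decide
```

A full formal proof is the same idea at scale: thousands of small steps, each of which either checks or stops the whole proof from being accepted.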

The erdosproblems.com website now officially marks Problem #1196 as "PROVED," crediting GPT-5.4 Pro (prompted by Liam Price). The mathematical community — including the researchers who worked on this problem for years — has accepted the result.

For readers who want to experiment with GPT-5.4's mathematical reasoning capabilities on their own problems, the Build Fast with AI gen-ai-experiments repository contains reasoning and problem-solving notebooks across OpenAI, Anthropic, and Gemini APIs that make a practical starting point.

This Was Not a One-Off: AI's Math Scorecard in 2026

The media coverage of GPT-5.4 and Erdős #1196 makes it sound like an isolated event. It is not. It is the most striking result in a sustained, accelerating pattern.

Since January 2026, 15 Erdős problems have moved from "open" to "solved," 11 of them with AI models specifically credited. The momentum is not slowing — it is accelerating. Several mathematicians have predicted 2026 will be the first year AI contributions make it through peer review in major math journals. Mehtaab Sawhney, a Columbia mathematician who worked directly on Erdős problems with GPT, has taken an academic leave to join OpenAI. The GPT-5.5 review notes that GPT-5.5 Pro — the successor model — was specifically designed to push further on research tasks and hard mathematics.

The Democratization Angle Nobody Is Talking About

Most coverage of this story focuses on the mathematics. I want to make the other point clearly: Liam Price is 23. He has no advanced mathematics degree. He was not doing research. He entered this problem into ChatGPT on a Monday afternoon because he occasionally feeds Erdős problems to AI to see what happens. His words when asked about the significance: "I don't even know what this problem is."

The solution to a 60-year-old mathematical conjecture was produced by a person with a consumer AI subscription and a casual interest in math puzzles. Jared Lichtman, a professional Oxford mathematician, spent seven years of dedicated work on this family of problems and got stuck on this sharper bound; an amateur who didn't fully understand what he was solving obtained it in an afternoon.

I find this more important than the mathematical result itself. Mathematics has historically been one of the most credential-gated fields in human knowledge. You need the right PhD, from the right institution, with the right advisor, to work on the right problems. The barriers to entry are enormous. GPT-5.4 is not eliminating those barriers for professional mathematicians — but it is creating a path for people who never had access to professional mathematics to contribute to it.

That is not a small thing. And it connects to something broader: the GPT-5.4 vs Gemini 3.1 Pro comparison shows that Gemini is also solving Erdős problems through a different architecture. The mathematical capability is not locked in one model — it's becoming a property of frontier AI systems broadly.

What This Means for AGI — The Honest Answer

Here is where I want to be careful, because the internet's takes on this have been extreme in both directions.

On the optimistic end, some commentators have declared that this proves AGI is imminent, that AI has surpassed human mathematical intelligence, and that the era of human mathematicians is ending. None of these claims are supported by what happened.

On the skeptical end, some mathematicians have argued that the Erdős problems are an imperfect benchmark — they vary wildly in difficulty, some are straightforward, and some earlier AI "solutions" turned out to be literature searches. One mathematician quoted in Scientific American said: "Every individual result has been vastly overhyped by certain corners of the Internet." This critique is fair.

The honest reading sits between these extremes, and it's more interesting than either:

  • What is true: GPT-5.4 Pro produced an original proof method that human mathematicians had not discovered in 90 years of working on a related problem class. That is not a retrieval. That is not a rephrasing of known results. It is something new.
  • What is also true: The Erdős problems represent an 'accessible tail' of open problems. The hardest unsolved problems in mathematics, such as the Riemann Hypothesis, the Birch and Swinnerton-Dyer Conjecture, and the Navier-Stokes existence and smoothness problem, are qualitatively different. GPT-5.4 is nowhere near those.
  • What is also true: The most important impact of AI on mathematics in 2026 may not be the headline problem-solving. MIT's Andrew Sutherland argues it's the integration into daily workflows — AI helping mathematicians write Lean proofs faster, check literature they've missed, explore variant approaches, and speed up the tedious parts of research so humans can focus on the creative parts.
  • What is also true: The trajectory is clear. IMO silver in 2024, IMO gold in 2025, original research-level proofs with novel techniques in 2026. If that trajectory continues for another two years, the framing of these questions will look very different.

The question worth sitting with is not "has AI matched human mathematicians" — it hasn't, not at the frontier. The question is: "what does it mean that an AI discovered a mathematical connection that the world's best human researchers missed?" That question does not have a comfortable answer. For context on the current state of AI reasoning models that make this possible, the best AI models April 2026 comparison puts GPT-5.4's reasoning benchmarks in full perspective alongside Claude and Gemini.

Frequently Asked Questions

What is Erdős Problem #1196?

Erdős Problem #1196 is a 1968 conjecture by mathematicians Paul Erdős, A. Sárközy, and E. Szemerédi about the asymptotic behavior of the Erdős sum over primitive sets — collections of integers where no element divides another. Specifically, it asked whether, for any primitive set containing only numbers larger than x, the sum of 1/(a·log a) must be bounded by 1 + O(1/log x). GPT-5.4 Pro proved this conjecture on April 13, 2026, and the result has been formally verified in Lean.

Who is Liam Price and how did he solve the problem?

Liam Price is a 23-year-old with no advanced mathematics training who occasionally enters Erdős problems into ChatGPT Pro to see what results the AI produces. He entered Problem #1196 as a single prompt to GPT-5.4 Pro on April 13, 2026. The model spent approximately 80 minutes reasoning through the problem and produced a proof, which Price then had the model format as a LaTeX paper and posted to the Erdős Problems forum. Price later told Scientific American: "I don't even know what this problem is," describing the discovery as accidental.

What is a primitive set in mathematics?

A primitive set is a collection of integers greater than 1 where no element divides evenly into any other. Prime numbers automatically form a primitive set because primes have no factors except themselves and 1. The set {6, 10, 15} is also primitive because none of those numbers divides another. Erdős proved in 1935 that the sum Σ 1/(a·log a) across any primitive set is always finite. The set of all prime numbers achieves the maximum possible value of this sum (approximately 1.6366); this is the Erdős Primitive Set Conjecture, later proved by Jared Lichtman.

What did Terence Tao say about the GPT-5.4 proof?

Terence Tao — a Fields Medalist and one of the leading mathematicians in the world — commented in the Erdős Problems forum that the proof "reveals a previously undescribed connection between the anatomy of integers and Markov process theory" and that this would be "a meaningful contribution to the anatomy of integers that goes well beyond the solution of this particular Erdős problem." Within 24 hours of reading the proof, Tao had extended it into the seed of a new mathematical theory.

What was novel about the proof method GPT-5.4 used?

Every mathematician since 1935 had approached primitive set problems by translating them from number theory into probability theory. GPT-5.4 Pro instead used the von Mangoldt function — an analytic number theory object that encodes the prime factorization structure of integers — and built a Markov chain technique that modeled the multiplicative structure of integers as a random process. Oxford mathematician Jared Lichtman, who spent seven years on related problems, compared this to AlphaGo's Move 37 in 2016: a move that violated human convention but turned out to reveal a deeper principle.

Is the GPT-5.4 Erdős proof formally verified?

Yes. The proof has been formally verified in Lean, an interactive proof assistant that enforces mathematical rigor at the symbolic logic level. Every logical step is machine-checked, and any gap in reasoning causes the proof to be rejected. The erdosproblems.com website now officially marks Problem #1196 as "PROVED," credited to GPT-5.4 Pro prompted by Liam Price. This formal verification distinguishes this result from earlier AI math claims that turned out to be sophisticated literature searches rather than original proofs.

Is this the first time AI solved an Erdős problem?

No. The first AI solution to an Erdős problem was Problem #728, solved in January 2026 by a combination of GPT-5.2 Pro and Harmonic's Aristotle system (a specialized Lean prover). Three Erdős problems were solved with AI assistance in a single seven-day period in January 2026, with all proofs verified by Terence Tao. Google's Gemini Deep Think solved Erdős Problem #1051 autonomously and contributed to a published research paper. What distinguishes Problem #1196 is the originality of the proof method — using a mathematical technique that no human had previously applied to this problem class — rather than being the first AI Erdős solution.

Will AI replace mathematicians?

Not based on current evidence. What AI is demonstrating is the ability to solve specific types of problems and discover novel connections at the research level, a capability that was not expected this soon. But the hardest unsolved problems in mathematics (the Riemann Hypothesis, the Birch and Swinnerton-Dyer Conjecture, and the other open Millennium Prize Problems) require levels of creative reasoning and new conceptual frameworks that current AI systems cannot produce. MIT's Andrew Sutherland argues that AI's greatest near-term impact on mathematics will be integration into daily research workflows (helping mathematicians write formal proofs faster, explore literature more thoroughly, and check more variant approaches) rather than replacing the creative work of research mathematics itself.

Recommended Blogs

Related reading from Build Fast with AI:

  • GPT-5.4 Review: Features, Benchmarks & Access (2026)
  • GPT-5.5 Review: Benchmarks, Pricing & Vs Claude (2026)
  • GPT-5.4 vs Gemini 3.1 Pro (2026): Which AI Wins?
  • Best AI Models April 2026: GPT-5.5, Claude & Gemini Compared
  • Kimi K2.6 vs GPT-5.4 vs Claude Opus: Who Wins? (2026)
  • GPT-5.3-Codex vs Claude Opus 4.6 vs Kimi K2.5 (2026)

References

  • Scientific American — Amateur Armed with ChatGPT 'Vibe Maths' a 60-Year-Old Problem (April 28, 2026)
  • The Decoder — OpenAI's GPT-5.4 Pro Reportedly Solves a Longstanding Open Erdős Math Problem in Under Two Hours
  • abit.ee — GPT-5.4 Pro Solves Erdős Problem Using a Method Mathematicians Overlooked for 90 Years
  • heise online — Creative Solution: AI Solves 60-Year-Old Erdős Problem
  • Erdős Problems Forum — Discussion Thread for Problem #1196
  • GitHub (teorth) — AI Contributions to Erdős Problems (Community Tracking Database)
  • Scientific American — AI Uncovers Solutions to Erdős Problems, Moving Closer to Transforming Math (February 2026)
  • TechCrunch — AI Models Are Starting to Crack High-Level Math Problems (January 14, 2026)
  • Nature — Olympiad-Level Formal Mathematical Reasoning with Reinforcement Learning (AlphaProof, November 2025)
  • Google DeepMind — AI for Math Initiative (Gemini Deep Think, AlphaProof, AlphaEvolve)
  • Build Fast with AI — gen-ai-experiments: Reasoning and Problem-Solving Notebooks