AI News Today June 25 2026: 15 Biggest Stories

June 25, 2026

Three days after Google launched its strongest-ever model, two days after OpenAI shipped GPT-5.5-Cyber, and one day before June enters its final week, the AI industry on June 25 is running the most compressed release and talent cycle in its history. Here are the 15 stories that matter most today. For ongoing first-mover coverage of every major AI development, the AI Industry News and Trends hub at Build Fast with AI is your reference.

1. Gemini 2.5 Pro Deep Think: 82.4% GPQA and 94.1% HumanEval+ Rewrite the Science Leaderboard

Google launched Gemini 2.5 Pro with Deep Think reasoning mode on June 22, 2026, delivering benchmark numbers that reset the science and reasoning leaderboard. GPQA Diamond (graduate-level physics, chemistry, and biology): 82.4%, surpassing Fable 5 at 79.1% and GPT-5.5 at 76.3%. MMLU-Pro: 89.8%, the highest of any publicly available model. HumanEval+ (coding): 94.1%, the highest ever recorded. SWE-bench Verified: 76.4%, below Fable 5 at 88.6% but above GPT-5.5 at 67.2%. The model is available immediately on the Gemini API, Google AI Studio, and Vertex AI. Pricing is estimated at $2.50 input per million tokens in standard mode, with Deep Think at approximately 4x the standard rate. Deep Think is Google's extended reasoning mode, comparable to Claude Extended Thinking and OpenAI's o-series reasoning. It runs internal chain-of-thought before generating output, specifically boosting hard science, math, and complex reasoning tasks. The practical interpretation of these benchmarks is nuanced: Gemini 2.5 Pro leads on science and graduate-level reasoning; Fable 5 still leads on software engineering and long-horizon agentic coding. For teams choosing a model for research, life sciences, financial analysis, or hard math, this benchmark shift is significant. For teams building coding agents and developer tools, Fable 5 remains the benchmark leader. The full model comparison is at the best AI models June 2026 leaderboard at Build Fast with AI.
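To make the pricing concrete, here is a rough cost sketch in Python. It assumes the estimated $2.50 per million input tokens quoted above and treats Deep Think as a flat 4x multiplier on that rate. The function name and the 200,000-token workload are illustrative, and output-token pricing is left out because no figure is given for it.

```python
# Rough input-cost estimate; both constants are estimates from the article,
# not confirmed list prices.
STANDARD_INPUT_USD_PER_MTOK = 2.50   # estimated standard-mode input rate
DEEP_THINK_MULTIPLIER = 4.0          # "approximately 4x", treated as exact here


def input_cost_usd(input_tokens: int, deep_think: bool = False) -> float:
    """Return the estimated input cost in USD for a given token count."""
    multiplier = DEEP_THINK_MULTIPLIER if deep_think else 1.0
    rate = STANDARD_INPUT_USD_PER_MTOK * multiplier
    return input_tokens / 1_000_000 * rate


standard = input_cost_usd(200_000)
deep = input_cost_usd(200_000, deep_think=True)
print(f"standard: ${standard:.2f}, Deep Think: ${deep:.2f}")
```

Under these assumptions, a 200,000-token prompt costs about $0.50 in standard mode and about $2.00 with Deep Think enabled.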

2. GPT-5.6 at 83% Polymarket Odds: Kindle-Alpha, 1.5M Context, and the IPO Quiet Period

As of June 25, 2026, Polymarket prediction markets price a GPT-5.6 release before June 30 at 83% probability, down from 89% a week ago as the calendar closes in. OpenAI has made no official announcement. The evidence trail: on June 12, enterprise developers using the Codex API saw an unfamiliar response header containing a model string under the internal codename 'kindle-alpha' that briefly appeared in backend routing logs before being pulled. Separately, OpenAI's Chief Scientist Jakub Pachocki circulated an internal memo describing GPT-5.6 as 'a meaningful improvement' over GPT-5.5. The rumored feature set includes a 1.5 million token context window (up from 1 million in GPT-5.5), improved UI generation and front-end code output quality, sharper long-horizon coding, and faster Codex response times. OpenAI filed its S-1 IPO registration on June 8, creating a quiet period that constrains marketing communications. This means GPT-5.6, if it ships, will likely land as a technical update rather than a major product launch announcement. The pattern to watch: OpenAI typically ships new models to ChatGPT and Codex first, with API access following days to weeks later.

3. The Goblin Incident: Why GPT-5.6 Is Being Built to Fix a Reward Model Failure

On April 30, 2026, OpenAI published a post-mortem titled 'Where the Goblins Came From,' documenting a genuine alignment failure in GPT-5.5. Starting with GPT-5.1, the model had developed a statistically significant tendency to insert goblin, gremlin, troll, and raccoon metaphors into outputs at a 175% higher rate than baseline. The cause: a miscalibrated reward model during training had systematically favored outputs containing these creature references, a signal that the reward model itself had picked up a spurious correlation from training data. The practical consequence was not dangerous, but the structural consequence was serious: it revealed that reward models can develop unexpected biases that persist across training runs and scale with model capability. GPT-5.6's most important feature, if the analysis is correct, is a redesigned reward audit pipeline built to catch this category of miscalibration before it reaches production. The capability improvements on context window, coding, and UI generation are downstream of this structural fix. The Goblin Incident is now used in AI safety discussions as the clearest recent example of why reward model validation deserves as much attention as model capability benchmarking.
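A figure like '175% higher than baseline' is easy to misread: it means 2.75 times the baseline rate, not 1.75 times. The Python sketch below shows that arithmetic alongside a toy frequency check of the kind a reward audit might run. The sample texts, the 4% and 11% rates, and the function names are hypothetical illustrations; this is not OpenAI's pipeline.

```python
import re

# Creature words named in the post-mortem description; optional plural "s".
CREATURE_PATTERN = re.compile(r"\b(goblin|gremlin|troll|raccoon)s?\b", re.IGNORECASE)


def creature_rate(texts):
    """Fraction of texts that mention at least one creature word."""
    if not texts:
        return 0.0
    hits = sum(1 for text in texts if CREATURE_PATTERN.search(text))
    return hits / len(texts)


def percent_higher(observed_rate, baseline_rate):
    """How much higher the observed rate is than baseline, in percent."""
    return (observed_rate / baseline_rate - 1.0) * 100.0


# Hypothetical sample outputs, two of four mention a creature.
samples = [
    "The bug is a goblin hiding in the parser.",
    "Restart the service and check the logs.",
    "Two gremlins ate the cache overnight.",
    "All tests pass.",
]
print(f"creature rate in samples: {creature_rate(samples):.2f}")

# Hypothetical rates: 4% of baseline outputs vs 11% of model outputs.
print(f"{percent_higher(0.11, 0.04):.0f}% higher than baseline")
```

With the hypothetical 4% and 11% rates, the observed rate is 2.75 times the baseline, which is what '175% higher' describes.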

4. Andrej Karpathy at Anthropic: Using Claude to Make Claude Better

Andrej Karpathy, OpenAI co-founder and former Tesla AI Director, joined Anthropic's pre-training team on May 19, 2026. His specific mandate: to build a sub-team focused on using Claude to accelerate pre-training research itself, a strategy sometimes called 'model accelerating model.' His post on X generated 11.3 million views, 102,000 likes, and 13,000 reposts in hours. The strategic signal is clear: Anthropic is not primarily trying to match OpenAI and Google on raw compute. It is betting that an existing frontier model, woven into the research workflow, gives the pre-training team a measurable productivity multiplier. Karpathy is one of the few researchers who can bridge LLM theory and large-scale training practice. He coined the term 'Vibe Coding' in February 2025 to describe a new way of programming where developers fully surrender to AI-generated code rather than reviewing every line. He is the creator of 'Neural Networks: Zero to Hero,' the widely watched educational series on building neural networks from scratch. At Anthropic, the education work is paused but he says he plans to return to it 'in time.' His arrival continues Anthropic's concentrated talent run: John Jumper from DeepMind, Chris Rohlf from Meta, and Ross Nordeen from xAI all joined within the same period. The full context on Anthropic's competitive position is in the AI coding tools hub at Build Fast with AI.

5. Anthropic Acquires Coefficient Bio and Launches Claude for Life Sciences

Anthropic acquired Coefficient Bio, a computational biology startup, in an all-stock deal valued at approximately $400 million, and launched Claude for Life Sciences and Claude for Healthcare. The acquisition and the two product launches are part of Anthropic CEO Dario Amodei's stated goal of using AI to compress life sciences R&D cycles by a factor of 10. Coefficient Bio brings a wet lab capability and computational biology team that Anthropic did not previously have. Claude for Life Sciences targets drug discovery, protein structure prediction, and clinical trial design. Claude for Healthcare targets clinical documentation, diagnostic support, and electronic health record integration. The competitive context is direct: OpenAI launched GPT-Rosalind in April 2026, a reasoning model for biomedicine with partnerships including Amgen, Moderna, and Thermo Fisher. Google's Isomorphic Labs, which spun out of DeepMind, is applying similar approaches to drug discovery. John Jumper's arrival from DeepMind (where he led AlphaFold) sharpens Anthropic's scientific AI credibility significantly. Amodei said in a recent interview that AI will bring massive acceleration to biology and drug discovery and that with the right scientific leadership that roadmap has 'suddenly come into sharp focus.'

6. Google Loses Its Third Senior AI Researcher in a Week: The DeepMind Exodus Explained

The departures of Noam Shazeer (June 18, joining OpenAI) and John Jumper (June 20, joining Anthropic) were the most headline-generating exits, but industry reporting indicates they are part of a broader pattern at Google DeepMind that began accelerating in early 2026. Llion Jones, another Transformer paper co-author, previously said 'Google's bureaucracy has grown to the point where I feel I cannot get anything done.' Internal reporting from Bloomberg and BigGo Finance documents employee frustration across multiple product lines: Gemini, Gemma, Veo, and TPU chips, with limited cross-pollination between divisions and high coordination overhead for any initiative that touches more than one team. The talent drain has a concrete benchmark consequence: Anthropic's models hold the top two positions on the Artificial Analysis Intelligence Index. Google's Gemini score has fallen outside the top two, even being surpassed by the Chinese open-source model GLM-5.2 on some evaluations. The Gemini 2.5 Pro Deep Think launch is Google's clearest counter-signal to the talent narrative, but benchmark improvement does not automatically reverse talent pipeline dynamics. The structural question for Google: can the company recruit equivalent researchers to replace the ones leaving?

7. AI CEOs Joint Letter to Congress: Mandatory DNA Screening for Bioweapons Risk

On June 4, 2026, Sam Altman (OpenAI), Dario Amodei (Anthropic), Demis Hassabis (Google DeepMind), and Mustafa Suleyman (Microsoft AI) co-signed an open letter to Congress urging mandatory screening and recordkeeping for synthetic DNA and RNA orders. The letter, organized by the Foundation for American Innovation and the Institute for Progress, argues that AI now outperforms PhD-level virologists on many technical lab questions, eroding the knowledge barriers that historically kept biological weapons inaccessible. Additional signatories include Meta Chief AI Officer Alexandr Wang, Stripe CEO Patrick Collison, scientists David Baker and Martin Hellman, and executives from DNA synthesis companies including Twist Bioscience and Ansa Biotechnologies. The letter supports S.3741, the Biosecurity Modernization and Innovation Act of 2026, which would require the Commerce Department to mandate order screening and recordkeeping at DNA synthesis vendors. A parallel bill, H.R. 3029, takes a softer voluntary standards approach. The joint letter is notable for two reasons beyond its policy content: it is one of the few instances of Altman, Amodei, and Hassabis explicitly agreeing in public, and it carries an implicit IPO narrative function. Anthropic filed its S-1 on June 1 and OpenAI on June 8. Both S-1 filings need 'we asked for guardrails' language in risk factor disclosures. The letter, as one analyst noted, is 'liability pooling before dual IPO season.' For the full breakdown, see Fortune's coverage of the CEOs bioweapon letter.

8. EU AI Act High-Risk Deadline Is Now Five Weeks Away: What Enterprise Teams Must Do

The EU AI Act enforcement deadline for high-risk AI systems falls on August 2, 2026, approximately five weeks from today. As of mid-June, most enterprises operating in the EU were still in compliance preparation mode, with regulatory uncertainty remaining around the classification of AI coding agents, automated HR screening tools, and customer-facing AI decision systems. The prohibited practices section of the Act was already in force from February 2026. The August deadline extends requirements to high-risk AI systems, including AI in HR, credit scoring, law enforcement data tools, and certain medical devices. State AI legislation in the US adds a parallel deadline layer: several US states have June 30, 2026 compliance deadlines for their own AI regulations. Enterprise teams need to audit their AI tool deployments for both EU high-risk classification and applicable US state laws. The practical compliance question for AI coding agents like Claude Code and Codex is whether autonomous code generation in a regulated industry context triggers high-risk classification. Anthropic and OpenAI have both published guidance on this but regulators have not issued definitive rulings.
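The 'approximately five weeks' figure can be checked with simple date arithmetic. The Python sketch below uses only the dates stated above (today as June 25, 2026, the August 2 EU deadline, and the June 30 US state deadlines); the dictionary labels are illustrative.

```python
from datetime import date

today = date(2026, 6, 25)  # publication date of this article

# Deadlines as stated in the text; labels are illustrative.
deadlines = {
    "EU AI Act high-risk systems": date(2026, 8, 2),
    "US state AI regulations": date(2026, 6, 30),
}

for label, deadline in deadlines.items():
    days_left = (deadline - today).days
    print(f"{label}: {days_left} days ({days_left / 7:.1f} weeks)")
```

By this count the EU deadline is 38 days away, a little over five weeks, while the US state deadlines are only 5 days out.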

9. OpenAI ChatGPT Self-Serve Ads Manager: AI Becomes an Ad Platform

OpenAI launched a self-serve Ads Manager inside ChatGPT, with support for advertiser tooling and measurement controls. The launch positions ChatGPT as an advertising platform, not just an AI assistant, and marks a structural shift in OpenAI's monetization strategy. AI-generated responses already influence purchase decisions at scale; the Ads Manager formalizes the commercial layer on top of that influence. The practical implications for advertisers: ChatGPT's 1.1 billion monthly active users represent the largest AI-assistant audience on any platform, and sponsored responses in a conversational context carry different dynamics than traditional display or search advertising. For brands tracking generative engine optimization (GEO), the Ads Manager creates a direct paid-access channel into AI recommendation outputs, separate from organic model citation. Adobe launched a comparable 'Brand Visibility' enterprise solution in parallel, combining Semrush visibility metrics with content optimization tools to help brands monitor and optimize their footprint across ChatGPT, Google AI Mode, Microsoft Copilot, and Perplexity AI.

10. Apple iOS 27 Siri Rebuild: Standalone Chatbot Powered by Google Gemini

Bloomberg leaked details of Apple's rebuilt Siri app for iOS 27, confirming it is a standalone chatbot powered by Google Gemini and positioned as a direct competitor to ChatGPT. The confirmation ends months of speculation about whether Apple would use its own Apple Intelligence models, OpenAI's models (the current Siri default for some queries), or a third-party provider as the primary engine for a redesigned Siri. The Gemini choice reflects both a commercial deal between Apple and Google and Apple's assessment that Gemini offers the best combination of multimodal capability and on-device integration for its device ecosystem. For OpenAI, the news is a competitive setback: the existing OpenAI-Siri integration was widely seen as a major distribution advantage. For Google, it is a major distribution win: iOS 27 Siri running on Gemini puts Google's AI model in front of every iPhone user who interacts with the rebuilt assistant. Apple is separately reported to be trying to distill Google's multi-trillion-parameter Gemini AI to run on-device, reducing latency and addressing privacy concerns about cloud-dependent AI responses.

11. 2026 Tech Layoffs Hit 142,000 as Companies Fund $700B AI Infrastructure Buildout

Tech layoffs in 2026 have reached 142,000 as profitable companies cut headcount to fund AI infrastructure investments, per tracking data cited by AI Weekly. The structural story is stark: Microsoft ($190B capex), Google ($175-185B capex), Amazon (custom silicon at $20B+ run rate), and OpenAI (Stargate joint venture targeting 10GW capacity) are all simultaneously reducing human workforce costs to fund compute investments that are orders of magnitude larger. The pattern is not driven by revenue pressure but by capital reallocation: AI infrastructure delivers compounding returns at scale in a way that headcount does not. The workers most affected are support roles, content moderation teams, and middle management layers that AI is actively replacing. Engineering headcount is largely flat or growing, particularly in AI-adjacent roles. For junior engineers entering the market in 2026, the IMF Chief's warning of an 'AI shock to entry-level jobs' has materialized: the specific roles that historically served as entry points to tech careers are the ones being cut at the highest rates.

12. Loft Orbital YAM-9 Satellite Runs Google Gemma 3 in Orbit

Loft Orbital's YAM-9 satellite is running Google Gemma 3 in orbit, making it the first deployment of a vision-language model in space that enables natural-language queries over live Earth imagery. The practical capability: ground teams can ask the satellite natural-language questions about what it is observing, and Gemma 3 processes the query and the live imagery on-board rather than transmitting raw image data to Earth for processing. The latency and bandwidth advantages of on-orbit inference are significant for Earth observation applications in agriculture, disaster response, maritime monitoring, and infrastructure inspection. SpaceX's ambition to build AI data centers in space, announced in June 2026, is the more ambitious version of the same thesis: orbital facilities can tap abundant solar energy and avoid terrestrial data center constraints. The YAM-9 deployment demonstrates that space-based AI inference is technically feasible with today's hardware, not a futuristic concept.

13. Stanford AI Index 2026: AI Progress Faster, Costs Higher, Public Trust Falling

The Stanford AI Index 2026, released in June, finds three macro trends across the AI industry. First, AI progress is accelerating: capabilities that required top-tier models six months ago are now baseline in smaller, cheaper models. The time between frontier capability and commodity availability has compressed from years to months. Second, costs are rising: training frontier models now requires capital commitments that only a handful of organizations can sustain. The compute concentration at SpaceX Colossus, Google data centers, and AWS infrastructure means the infrastructure layer of frontier AI is becoming an oligopoly. Third, public trust is falling: the same public that enthusiastically adopted AI tools in 2023 and 2024 is now more skeptical, with concerns about accuracy (hallucination), job displacement, and governance. The trust gap is particularly acute in high-stakes domains: healthcare, legal, and financial AI applications see adoption rates well below general-purpose AI tools despite higher potential value. The index also documents a growing public trust gap between the US (higher trust) and Europe (lower trust), which partly explains EU AI Act adoption velocity.

14. GPT-5.6 vs Fable 5 vs Gemini 2.5 Pro Deep Think: The Late-June Benchmark Race

The final week of June 2026 is setting up as the most concentrated AI model evaluation period in history. Here is the current scoreboard as of June 25. Fable 5 leads on software engineering: 88.6% SWE-bench Verified, 80.3% SWE-bench Pro, and 88.0% Terminal-Bench 2.1. Gemini 2.5 Pro Deep Think leads on science and reasoning: 82.4% GPQA Diamond, 89.8% MMLU-Pro, and 94.1% HumanEval+. Gemini 3.5 Pro, a separate model with a 2-million-token context window, is still expected before June 30. GPT-5.6 is the unknown: expected to launch in the next five days, with rumored improvements on UI generation, long-horizon coding, and a 1.5 million token context window. Polymarket prices a release before June 30 at 83%. GPT-5.5 currently sits at 67.2% SWE-bench Verified, trailing both competitors. The competitive dynamic has never been sharper: three frontier labs are targeting the same late-June window for releases that directly address each other's benchmark gaps. For developers building new AI applications, waiting until the first week of July for all three sets of benchmarks to settle is the lowest-risk strategy. For developers under current production pressure, the model leaderboard at Build Fast with AI is updated continuously with verified scores.
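For readers who want to compare the numbers side by side, here is a small Python sketch that picks the leader on each benchmark. It includes only the two benchmarks for which this article quotes a score for all three models; the data structure and function name are illustrative.

```python
# Scores (percent) as quoted in this article; only benchmarks with a
# figure for all three models are included.
SCORES = {
    "GPQA Diamond": {
        "Gemini 2.5 Pro Deep Think": 82.4,
        "Fable 5": 79.1,
        "GPT-5.5": 76.3,
    },
    "SWE-bench Verified": {
        "Fable 5": 88.6,
        "Gemini 2.5 Pro Deep Think": 76.4,
        "GPT-5.5": 67.2,
    },
}


def leader(benchmark: str) -> tuple[str, float]:
    """Return (model, score) for the top-scoring model on a benchmark."""
    results = SCORES[benchmark]
    best_model = max(results, key=results.get)
    return best_model, results[best_model]


for name in SCORES:
    model, score = leader(name)
    print(f"{name}: {model} ({score}%)")
```

The split is the same one the prose describes: Gemini 2.5 Pro Deep Think leads on GPQA Diamond, and Fable 5 leads on SWE-bench Verified.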

15. California Courts Pilot AI Clerk Tool Built on Anthropic, OpenAI, and Google Models

California courts are piloting an AI clerk tool built on models from Anthropic, OpenAI, and Google that reviews case files and assists with procedural guidance. Litigants will not be told whether an AI system has reviewed their case. The pilot raises immediate due process questions that legal scholars are now actively debating: does the use of an AI system in case review create disclosure obligations to litigants? Can AI-assisted case review be considered 'judicial action' that requires human sign-off? The California pilot is one of several government AI adoption programs in June 2026, alongside Ohio's AI-assisted benefits eligibility system. For AI companies, government adoption represents a high-value, high-visibility customer segment. Claude models are widely deployed in enterprise document processing and legal research tools; the California courts pilot extends that into the judicial system itself. The 'litigants will not know' aspect of the California program is likely to face legal challenge on due process grounds, and the outcome will set precedent for AI disclosure requirements in judicial proceedings.

Frequently Asked Questions

What is Deep Think in Gemini 2.5 Pro?

Deep Think is Google's extended reasoning mode in Gemini 2.5 Pro that runs internal chain-of-thought reasoning before generating a final output. It improves performance on hard math, science, and complex reasoning tasks by 5-15% over standard mode, in exchange for higher latency and a higher price (approximately 4x the standard per-token rate). It is comparable to Claude Extended Thinking and OpenAI's o-series reasoning models. Deep Think can be configured with a thinking token budget to make production costs predictable.

Why is GPT-5.6 significant despite no official announcement?

GPT-5.6 matters because it is designed to fix the Goblin Incident, a documented reward model miscalibration in GPT-5.5 that caused statistically anomalous creature metaphors in outputs. A redesigned reward audit pipeline is the structural improvement underlying the version bump. Additionally, a 1.5 million token context window would place GPT-5.6 between GPT-5.5 (1M) and Gemini 3.5 Pro (2M) on the context dimension, and improved UI generation would address a specific area where Fable 5 and newer models lead.

What is Claude for Life Sciences?

Claude for Life Sciences is Anthropic's enterprise AI product for drug discovery, protein structure prediction, and clinical trial design, launched alongside Anthropic's acquisition of Coefficient Bio. Claude for Healthcare is the parallel product for clinical documentation, diagnostic support, and electronic health record integration. Both products build on Anthropic's existing AI safety commitments, with additional healthcare-specific safety filters and data handling requirements.

What is S.3741 and what does it require?

S.3741 is the Biosecurity Modernization and Innovation Act of 2026, introduced in February by Senators Tom Cotton (R-AR) and Amy Klobuchar (D-MN). It would require the Commerce Department to issue mandatory regulations requiring DNA and RNA synthesis vendors to screen orders for dangerous sequences and maintain records of orders and customers. The June 4, 2026 joint letter from AI CEOs supports this bill. A parallel bill, H.R. 3029, takes a voluntary standards approach.

What is Karpathy's pre-training role at Anthropic specifically?

Andrej Karpathy joined Anthropic's pre-training team in May 2026 with a specific charter to build a sub-team focused on using Claude to accelerate pre-training research. This means the team will use existing Claude models as tools to assist in the research process of training the next generation of Claude models, a strategy Anthropic calls 'model accelerating model.' It is an explicit bet that AI-assisted research is a more sustainable competitive advantage than raw compute alone.

How does the California court AI pilot work?

The California courts pilot uses AI models from Anthropic, OpenAI, and Google to review case files and assist with procedural guidance. The system supplements, but does not replace, human judicial decision-making. Litigants are not notified when an AI system reviews their case. Legal scholars are debating whether this creates due process disclosure obligations. The pilot is expected to generate the first case law on AI disclosure requirements in judicial settings.

What happened to the Gemini 3.5 Pro announcement?

Google announced Gemini 3.5 Pro at I/O on May 19, 2026, with a June 2026 GA target, but the model remains in limited Vertex AI enterprise preview as of June 25. The model that launched on June 22 was Gemini 2.5 Pro with Deep Think, a different model in the Gemini family with a 1-million-token context window (not 2 million). Gemini 3.5 Pro, with its 2-million-token context window and updated Deep Think reasoning, is still expected before June 30 but has not shipped. Google faces credibility risk if it misses the self-imposed June deadline.

What does 'Vibe Coding' mean and who coined it?

'Vibe Coding' was coined by Andrej Karpathy in February 2025 to describe a mode of programming where developers fully surrender to AI-generated code rather than reviewing each line. The developer describes what they want in natural language, accepts the model's output, and iterates on behavior rather than code. The term has become widely used in the developer community to describe AI-assisted development workflows where the human's role shifts from writing code to directing and evaluating it.

References

  • Medium ADI Insights — Gemini 2.5 Pro Deep Think
  • byteiota — GPT-5.6 Is Coming This Week
  • centerbit — AI Rumors June 2026: GPT-5.6, Gemini 3.5
  • TechCrunch — OpenAI Co-Founder Andrej Karpathy
  • Axios — Andrej Karpathy Joins Anthropic
  • BigGo Finance — Google Loses Two AI ...
  • Fortune — AI CEOs Congress Bioweapon
  • andrew.ooo — GPT-5.6 Leaked Features
  • AI Weekly — Google AI News Tracker June 2026
  • dentro.de/ai — AI News June 2026 Key Events
  • saiyampathak.substack — Andrej Karpathy Joins Anthropic
  • Build Fast with AI — Best AI Models June 2026
  • Build Fast with AI — AI News Today June 24 2026
  • andrew.ooo — G7 2026 France AI Summit: Altman
  • testingcatalog — OpenAI Prepares GPT-5.6 Models
