Google I/O 2026: Gemini 3.5 Flash, Spark & Agentic AI

Google just shipped the most AI-dense I/O keynote in its history. Held on May 19, 2026 in Mountain View, the two-hour event covered Gemini 3.5 Flash, a new 24/7 personal AI agent called Gemini Spark, Antigravity 2.0, and a fundamental reimagining of Search — all within a single sitting. The Gemini app now has over 900 million monthly active users, up from 400 million a year ago. That growth is the backdrop to every announcement on this list.

This guide covers everything developers and builders need to know: what shipped, what it benchmarks at, what it costs, and what it means for the broader model landscape.

1. Gemini 3.5 Flash: The New Default Model

Gemini 3.5 Flash is Google's biggest Flash-tier model launch ever — and it shipped generally available the same day it was announced. As of May 19, 2026, it is the default model in the Gemini app and AI Mode in Google Search worldwide. If you opened Gemini today, you are already running it.

The headline claim is unusual for a Flash release: 3.5 Flash beats Gemini 3.1 Pro on coding and agentic benchmarks, while running roughly 4x faster than comparable frontier models. That inverts the historical Pro/Flash hierarchy in the areas that most developers care about. For the full picture of where this fits in the current landscape, see our May 2026 AI model leaderboard at Build Fast with AI.

The stable API model ID is gemini-3.5-flash (no preview suffix). It replaces the gemini-3-flash-preview identifier used during the preview window. Gemini 3.5 Pro is confirmed and rolling out next month.

Key specs at a glance:

2. Benchmarks: Flash vs. Pro vs. the Field

On Google's published benchmark table, 3.5 Flash leads every reported model, Claude Opus 4.7 and GPT-5.5 included, on five separate evaluations. It trails Gemini 3.1 Pro on long-context retrieval (MRCR v2 at 128k) and on Humanity's Last Exam, where raw knowledge depth matters more than agentic capability. Read the rows, not the headline.

My honest read: the agentic and coding results are real. The 14.9-point gap on Finance Agent v2 is not noise. But if you are doing long-document retrieval or Humanity's Last Exam-style knowledge work, Gemini 3.1 Pro still leads on those dimensions and remains the safer choice until 3.5 Pro ships next month.

3. API Pricing and the New thinking_level Surface

Gemini 3.5 Flash is priced at $1.50 per million input tokens and $9.00 per million output tokens, roughly 40% cheaper than Gemini 3.1 Pro on both input and output. Cached input tokens cost $0.15 per million, a tenth of the standard input rate. Non-global regions run 10% higher at $1.65/$9.90.
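To make those rates concrete, here is a minimal per-request cost sketch. The prices are the ones quoted above. The function name, the region keys, and the billing model (cached tokens treated as a subset of input tokens, billed at the cached rate instead of the standard rate) are assumptions for illustration, not Google's documented billing rules.

```python
# USD per million tokens, as quoted in this article.
PRICES = {
    "global": {"input": 1.50, "cached_input": 0.15, "output": 9.00},
    # No regional cached-input price is quoted, so none is listed here.
    "regional": {"input": 1.65, "output": 9.90},
}


def estimate_cost(input_tokens, output_tokens, cached_tokens=0, region="global"):
    """Estimate the USD cost of one request.

    Assumption: cached tokens are a subset of input_tokens and are billed
    at the cached rate instead of the standard input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    rates = PRICES[region]
    if cached_tokens and "cached_input" not in rates:
        raise ValueError(f"no cached-input price known for region {region!r}")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * rates["input"]
        + cached_tokens * rates.get("cached_input", 0.0)
        + output_tokens * rates["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A hypothetical request: 20k tokens in, 2k tokens out.
    print(f"uncached: ${estimate_cost(20_000, 2_000):.4f}")
    print(f"75% cached: ${estimate_cost(20_000, 2_000, cached_tokens=15_000):.4f}")
```

Output tokens dominate quickly at a 6:1 output-to-input price ratio, so caching helps most on prompts that are long relative to the response.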

The honest comparison: this is 3x the price of the previous Gemini 3 Flash Preview ($0.50 in / $3.00 out) and 6x the price of 3.1 Flash-Lite. The capability jump is real, but the price bump is real too. This fits a broader trend — OpenAI's GPT-5.5 was 2x the price of GPT-5.4, and Claude Opus 4.7 is roughly 1.46x Opus 4.6 on a per-token basis.
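A quick way to sanity-check the 3x figure against your own bill is to price the same workload under both rate cards. This is a rough sketch: the prices come from the paragraph above, while the workload numbers and the function name are made up for illustration.

```python
# USD per million tokens (input, output), as quoted in this article.
FLASH_3_PREVIEW = (0.50, 3.00)
FLASH_35 = (1.50, 9.00)


def monthly_cost(prices, input_millions, output_millions):
    """Cost in USD for a month's traffic, measured in millions of tokens."""
    input_price, output_price = prices
    return input_price * input_millions + output_price * output_millions


# Hypothetical workload: 200M input tokens and 40M output tokens per month.
old_bill = monthly_cost(FLASH_3_PREVIEW, 200, 40)
new_bill = monthly_cost(FLASH_35, 200, 40)
multiple = new_bill / old_bill

print(f"preview: ${old_bill:,.2f}  3.5 Flash: ${new_bill:,.2f}  multiple: {multiple:.1f}x")
```

Because the input and output prices both tripled, the multiple stays at 3x whatever your input/output mix is. Only a change in token counts, for example from a different thinking level, would move it.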

The change developers will feel most is the thinking control surface. The integer thinking_budget parameter that shipped with Gemini 3 Flash Preview is replaced by a string enum called thinking_level, with four values: minimal, low, medium (default), and high. Critical note: the default dropped from high to medium. A naive port from gemini-3-flash-preview will silently reason less than your old preview code did unless you explicitly set thinking_level to high. For API migration details, see our Gemini Deep Research API tutorial, which covers the full SDK migration path.
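One way to guard against the silent downgrade is to make the thinking level explicit at the point where you port each request. The sketch below works on a plain dictionary, not on any real SDK object: the flat request shape and the function name are hypothetical, and only the model IDs, the parameter names, and the four enum values come from this article.

```python
THINKING_LEVELS = ("minimal", "low", "medium", "high")
OLD_MODEL = "gemini-3-flash-preview"
NEW_MODEL = "gemini-3.5-flash"


def migrate_request(old_request, thinking_level="high"):
    """Return a copy of a preview-era request dict ported to the stable model.

    The dict layout is a stand-in for however your code stores request
    settings. It is not the wire format of the Gemini API.
    """
    if thinking_level not in THINKING_LEVELS:
        raise ValueError(f"thinking_level must be one of {THINKING_LEVELS}")
    if old_request.get("model") != OLD_MODEL:
        raise ValueError(f"expected a request for {OLD_MODEL}")
    # Drop the retired integer budget; everything else carries over unchanged.
    new_request = {k: v for k, v in old_request.items() if k != "thinking_budget"}
    new_request["model"] = NEW_MODEL
    # Set the level explicitly so the new "medium" default never applies by accident.
    new_request["thinking_level"] = thinking_level
    return new_request


if __name__ == "__main__":
    preview = {"model": OLD_MODEL, "thinking_budget": 8192, "temperature": 0.2}
    print(migrate_request(preview))
```

Defaulting the helper to "high" preserves the old preview behavior described above. Lowering it then becomes a deliberate cost and latency decision per call site, not an accident of the port.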

4. Gemini Spark: Your 24/7 Agentic Assistant

Gemini Spark is the headline consumer announcement. It is a personal AI agent inside the Gemini app that runs on dedicated Google Cloud virtual machines — which means it keeps working when your laptop is closed and you are asleep. Sundar Pichai described it as "your personal AI agent that helps you navigate your digital life, taking action on your behalf and under your direction."

Practical capabilities on launch: pulling facts from your emails, Docs, Sheets, and Slides to draft status updates; watching your inbox so small businesses never miss a customer question; emailing Spark directly through a dedicated Gmail address; and interacting with the web directly through Chrome. An Android Halo UI space for tracking agent progress is coming later this year.

Spark is in beta. It is rolling out to trusted testers this week, with a broader beta for Google AI Ultra subscribers in the US next week. Google AI Ultra is a new $100/month tier announced at I/O aimed at developers, creators, and power users.

Spark competes directly with Anthropic's Claude Cowork and OpenAI's ChatGPT agent. Google's underrated advantage here is distribution: 3 billion active Android devices and native access to Gmail data are not capabilities any standalone AI startup can replicate quickly. For an honest assessment of how the broader agent landscape shook out this month, see our Claude Sonnet 4.6 vs GPT-5.5 vs Gemini 3.1 Pro deep-dive.

5. Antigravity 2.0 and the Developer Stack

Antigravity 2.0 is a new standalone desktop application that acts as a central hub for agent interaction. It supports parallel subagent execution, scheduled tasks for background automation, and ecosystem integrations across AI Studio, Android, and Firebase. The internally optimized build of Gemini 3.5 Flash inside Antigravity 2.0 reportedly runs at 12x the speed of comparable frontier models, versus the 4x figure for the public API.

The full developer release at I/O 2026 includes: Antigravity 2.0 desktop app; Managed Agents in the Gemini API (a single API call that spins up a full agent with persistent state across calls); native Android vibe coding support in AI Studio; Google Workspace integrations directly from AI Studio-built apps; and an AI Studio mobile app.

The Interactions API (beta) ships alongside the model — Google's equivalent of OpenAI's server-side history management from the Responses API. For developers already building on Google AI Studio's vibe coding environment, the full Antigravity + AI Studio guide at Build Fast with AI covers what changed in this update and what to migrate.
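For readers who have not used server-side history before, the idea is that the client sends only the new turn plus a pointer to the previous one, and the server reconstructs the conversation. The toy below illustrates that pattern with an in-memory store. It is not the Interactions API: every class, method, and parameter name here is invented for the illustration.

```python
import uuid


class ToyInteractionStore:
    """In-memory stand-in for a server that keeps conversation history."""

    def __init__(self):
        # Maps interaction id -> (previous interaction id or None, user text).
        self._turns = {}

    def create(self, user_text, previous_id=None):
        """Record a new turn and return its id. The client keeps only this id."""
        if previous_id is not None and previous_id not in self._turns:
            raise KeyError(f"unknown previous interaction: {previous_id}")
        new_id = uuid.uuid4().hex
        self._turns[new_id] = (previous_id, user_text)
        return new_id

    def history(self, interaction_id):
        """Walk the chain of previous ids and return the turns oldest first."""
        chain = []
        current = interaction_id
        while current is not None:
            previous_id, user_text = self._turns[current]
            chain.append(user_text)
            current = previous_id
        chain.reverse()
        return chain


if __name__ == "__main__":
    store = ToyInteractionStore()
    first = store.create("Summarize the I/O keynote.")
    second = store.create("Now make it shorter.", previous_id=first)
    print(store.history(second))
```

The practical effect for a client is smaller request payloads and no local transcript to keep in sync, at the cost of depending on the server to retain state.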

6. Search, Shopping, and the Agentic Web

Google announced information agents in Search — personalized AI agents you can configure to run in the background, 24/7, to surface information at the right moment and take action on your behalf. This is the clearest signal yet of what AI Mode in Search is becoming: less a search box, more an autonomous research layer. For context on what Gemini's role in Search looked like before I/O, the Gemini in Google Workspace guide covers the pre-I/O baseline.

On the shopping side: a Universal Cart feature combines products added across Search, Gemini, Gmail, and YouTube into a single hub powered by Google Wallet. The Universal Commerce Protocol (UCP) allows AI agents to complete purchases and bookings directly through partners including Amazon, Walmart, Shopify, and Meta.

My contrarian take: the agentic shopping layer is the announcement that Wall Street cares about most, and that most developers are underestimating. Search queries generate ad revenue. Agents that complete transactions generate a cut of GMV. If UCP adoption is real — and the partner list suggests it is — Google just put a direct monetization layer on top of its entire user base.

7. Gemini Omni and Creative Tools

Gemini Omni is a new multimodal world model designed for advanced video generation and editing. Google positioned it as capable of creating and editing videos, images, and simulations from any input. It expands the creative generation stack beyond Nano Banana (the image generation model that has produced over 50 billion images since launch) into video and simulation territory.

Google Flow and Google Pix (a new image editor) also received updates. Audio glasses with Gemini integration — deeply embedded voice access throughout the day without pulling out your phone — were shown off in hardware demos. Three smart glasses partnerships were announced: Samsung, Warby Parker, and Gentle Monster, with Xreal's Project Aura (display glasses with a Qualcomm Snapdragon puck) also shown.

8. My Honest Take: What Actually Matters

Three things are actually new here, and two are mostly noise.

Actually new: (1) A Flash model that legitimately beats Pro on coding and agentic benchmarks — that is a real inversion of the model tier hierarchy and it changes the routing decision for most production systems. (2) Gemini Spark running on dedicated cloud VMs that persist without your device open — that is the architecture that separates a genuine agent from an always-on chatbot. (3) UCP as a transaction layer — if partners actually implement it, this is Google's biggest monetization expansion in a decade.

Mostly noise: The smart glasses are real hardware, but they were shown at CES and at the Android Show before this keynote, so there is nothing new in the I/O demos. Gemini Omni video generation needs independent benchmarks before it deserves to be in the same sentence as Sora or Veo. One aside: the Gemini 3.2 Flash pre-I/O leak post correctly predicted the model structure and consumer-first rollout strategy, which suggests the signal-to-noise ratio on I/O leaks has improved considerably.

The right decision for developers right now: update your production routing to gemini-3.5-flash for coding and agentic tasks, keep deep-reasoning workloads on Gemini 3.1 Pro until Gemini 3.5 Pro ships next month, and test your gemini-3-flash-preview prompts against the new thinking_level defaults before migrating, because the silent reasoning downgrade from high to medium is a real production risk.
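That routing advice can be sketched as a small lookup table. Only gemini-3.5-flash is a model ID given in this article. The Pro identifier and the task-type labels below are placeholders, so substitute whatever your stack actually uses.

```python
FLASH = "gemini-3.5-flash"       # stable ID quoted in this article
PRO_FALLBACK = "gemini-3.1-pro"  # hypothetical ID; check the real one before use

# Task labels are illustrative. Map them to your own request categories.
ROUTES = {
    "coding": FLASH,
    "agentic": FLASH,
    "long_context": PRO_FALLBACK,
    "deep_reasoning": PRO_FALLBACK,
}


def pick_model(task_type):
    """Return the model ID for a task type, failing loudly on unknown types."""
    try:
        return ROUTES[task_type]
    except KeyError:
        raise ValueError(f"no route configured for task type {task_type!r}") from None


if __name__ == "__main__":
    for task in ("coding", "long_context"):
        print(task, "->", pick_model(task))
```

Raising on unknown task types, instead of quietly falling back to one model, keeps new workloads from being routed by accident. When Gemini 3.5 Pro ships, the change is a one-line edit to the table.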

Frequently Asked Questions

What is Gemini 3.5 Flash?

Gemini 3.5 Flash is Google's newest Flash-tier model, launched at Google I/O 2026 on May 19, 2026. It is the first model in the Gemini 3.5 family, outperforms Gemini 3.1 Pro on coding and agentic benchmarks (76.2% Terminal-Bench 2.1 vs. 70.3%), runs 4x faster than comparable frontier models, and is priced at $1.50/$9.00 per million tokens.

Is Gemini 3.5 Flash better than GPT-5.5?

On agentic and coding benchmarks, Gemini 3.5 Flash leads GPT-5.5 on MCP Atlas (where it scores 83.6%) and on Finance Agent v2. GPT-5.5 still leads on reasoning benchmarks and on Terminal-Bench 2.0 (82.7%). The honest answer: Gemini 3.5 Flash is faster and cheaper; GPT-5.5 is stronger on reasoning-heavy workflows.

What is the Gemini 3.5 Flash API pricing?

$1.50 per million input tokens and $9.00 per million output tokens. Cached input tokens cost $0.15 per million. Non-global regions are $1.65/$9.90. This is roughly 40% cheaper than Gemini 3.1 Pro on both input and output.

What is Gemini Spark?

Gemini Spark is a 24/7 personal AI agent inside the Gemini app. It runs on dedicated Google Cloud VMs, which means it keeps working when your device is off. It integrates with Gmail, Docs, Sheets, Slides, and Chrome, and can complete long-horizon research and task management autonomously.

When does Gemini 3.5 Pro release?

Google confirmed Gemini 3.5 Pro is in development and rolling out next month (June 2026). No specific date was given at the I/O keynote.

What is Antigravity 2.0?

Antigravity 2.0 is Google's new standalone desktop application for building and managing AI agents. It supports parallel subagent execution, scheduled background tasks, and ecosystem integrations with AI Studio, Android, and Firebase. It is co-optimized with Gemini 3.5 Flash and reportedly runs the model at 12x the speed of comparable frontier models, versus 4x for the public API.

What is the Google AI Ultra subscription?

Google AI Ultra is a new $100/month tier announced at Google I/O 2026, aimed at developers, creators, and power users. It includes access to Gemini Spark beta, 20TB of cloud storage, and priority access to new model releases.

How does the new thinking_level API work?

The integer thinking_budget parameter has been replaced with a string enum: thinking_level with values minimal, low, medium (default), and high. Critical migration note: the default dropped from high to medium. If you are porting from gemini-3-flash-preview, your prompts will silently reason less unless you explicitly set thinking_level to high.
