AI Workshops All blogs Agentic AI Launchpad

Mentorship

Agentic AI Launchpad

Go from user to builder in 6 weeks.

Explore Program

Back to blogs

LLMs

Reviews

Sakana AI Fugu Review: The Orchestration Model That Routes Around Export Controls

June 22, 2026

13 min read

Sakana AI Fugu Review: The Orchestration Model That Routes Around Export Controls

On June 22, 2026, Tokyo-based Sakana AI did something unusual: instead of announcing a bigger model, they announced a smarter coordinator. Sakana Fugu is a multi-agent orchestration system that presents itself as a single API, yet internally routes tasks across a pool of the world's best models — dynamically, adaptively, and without hardcoded rules. Fugu Ultra, its flagship variant, benchmarks shoulder-to-shoulder with Anthropic's Fable 5 and Mythos Preview — the very models that became inaccessible to most of the world due to national-security-based export controls on June 12, 2026. Whether that claim holds up in production is a separate question. But the idea — build an orchestration layer that routes around vendor lock-in — is the right idea at exactly the right time.

1. What Is Sakana AI Fugu?

Sakana Fugu is a multi-agent orchestration system that behaves like a foundation model. You send a request to one endpoint — one OpenAI-compatible API call — and Fugu decides internally how to handle it. Simple tasks get answered directly. Complex, multi-step tasks trigger the assembly of a coordinated team of expert models: one plans, one executes, one verifies, one synthesizes. The result reaches you as a single, coherent answer. None of the coordination complexity ever touches your code.

What makes this technically interesting is that Fugu itself is a trained language model, not a router built with if/else logic. The orchestration is learned, not hardcoded. Fugu has been trained to understand when to delegate, how agents should communicate, and how to combine their outputs into something reliable. This is grounded in Sakana AI's two ICLR 2026 papers: TRINITY (an evolved LLM coordinator) and the Conductor (learning to orchestrate agents in natural language). The academic lineage matters — this is not prompt engineering dressed up as a product.

For the AI agent frameworks ecosystem context, this represents a philosophical departure from tools like LangGraph — production agent framework from LangChain or CrewAI multi-agent orchestration. Those frameworks require you to build the orchestration layer in your code. Fugu internalizes it into the model itself.

2. The Geopolitical Context: Why Orchestration Matters Now

Fugu Ultra matches the performance of Fable 5 and Mythos Preview — and that framing is deliberate. On June 12, 2026, Anthropic's most capable models became subject to national-security-based export controls, making them inaccessible to organizations in a broad set of countries. For enterprises that had built critical workflows on top of these models, access disappeared overnight.

David Ha, Sakana AI's CEO, framed the Fugu launch explicitly around this risk: "Relying on a single company's APIs for critical infrastructure, finance, or governance is a material vulnerability. This risk is no longer a hypothetical possibility, but a reality." Fugu's agent pool is explicitly swappable. If one provider restricts access, Fugu dynamically routes around the disruption. As new models arrive — including Sakana's own — they fold into the pool and pass gains to users automatically.

For a deeper understanding of AI industry shifts and model releases happening across the stack in 2026, our AI Industry News & Trends collection tracks the most important developments as they happen.

Hot take: AI sovereignty is a real problem, and Fugu is one of the first products to treat it as a first-class design constraint rather than an afterthought. Whether Sakana delivers on the resilience promise in production remains to be seen, but the framing alone changes how enterprises should be thinking about model dependency.

3. Fugu vs Fugu Ultra - Which One Should You Use?

At launch, Sakana Fugu ships as two models, both accessible through the same OpenAI-compatible API:

My take: start with Fugu for latency-sensitive workflows. Reach for Fugu Ultra only when tasks are genuinely multi-step and deep — the cost difference (reportedly up to $10 per message for heavy Ultra tasks) is significant enough to be selective.

4. Benchmark Performance: What the Numbers Actually Say

Fugu Ultra's benchmark claims are aggressive. Sakana reports top-of-leaderboard performance across:

GPQA-Diamond (PhD-level scientific reasoning)
SWE-Pro (real-world software engineering)
LiveCodeBench (live coding evaluation)
ALE-Bench (autonomous learning and experimentation)

In the AutoResearch experiment — an almost fully automated ML research workflow running on a single H100 GPU over ~14 hours — Fugu Ultra achieved the best mean bits-per-byte (0.9774 ± 0.0019), ahead of three undisclosed frontier model baselines.

The comparison baselines that are named publicly are Gemini 3.1 Pro, Opus 4.8 (max), and GPT-5.5 (xhigh). Fable 5 and Mythos Preview are not in Fugu's agent pool — they're export-controlled and inaccessible. So Sakana's claim that Fugu Ultra 'matches Fable 5 and Mythos' is based on provider-reported benchmark scores, not direct head-to-head testing in the same evaluation environment. That's a meaningful caveat.

For developers wanting to see where Fugu fits in the broader model landscape, our Best AI Models & Leaderboards hub tracks cross-model benchmark comparisons updated as providers publish new scores.

5. How the Technology Works

Fugu is built on Sakana AI's ICLR 2026 research papers: TRINITY (an evolved LLM coordinator) and the Conductor (learning to orchestrate agents in natural language). Both papers show how AI systems can learn to assemble, route, and coordinate expert agents for each task — rather than relying on hand-designed workflows.

The core architecture works like this:

Fugu is itself a small language model trained to call other LLMs — including instances of itself recursively.
TRINITY assigns Thinker, Worker, and Verifier roles across a multi-model pool, adaptively delegating across coding, math, and reasoning tasks.
The Conductor learns to orchestrate agents in natural language — enabling coordination without hardcoded routing rules.
Recursive self-calling enables a new form of test-time scaling: Fugu reads its own prior output and launches corrective workflows when a first attempt falls short.

This means the system improves as better models enter the pool — without any changes to the caller's code. For developers building on top of Fugu via the OpenAI-compatible API, this is the equivalent of moving from managing your own servers to a managed cloud service — except the managed layer is intelligence routing, not infrastructure routing. If you want to build multi-agent systems yourself, the gen-ai-experiments cookbook on multi-agent patterns has hands-on implementations worth exploring.

🚀 Cohort Waitlist Open

Go From AI User to AI Builder

Don't just use ChatGPT. Learn to build custom LLM agents, RAG pipelines, and full-stack Agentic AI apps in our intensive 6-week program.

6 Weeks Live Mentorship

Deploy 5+ Real-world Apps

Weekly App Templates & Code

No Coding Experience Required

Explore Program

Join 1,000+ graduates•Free Registration

6. Real-World Use Cases from Early Users

Sakana ran a beta program with close to 500 early users before the June 22 launch. The patterns that emerged are more telling than the benchmark scores.

A software engineer reported that Fugu Ultra surfaced more than 20 issues in code review, where other frontier models flag roughly three. A cybersecurity engineer noted that Fugu kept scoped assessments within defined bounds while producing evidence and retest steps — a detail that matters in professional security workflows. An enterprise platform executive highlighted persona stability across long sessions: Fugu maintained its identity and context where other models drift, which is critical for agent products running multi-hour tasks.

The AutoResearch case is the most interesting. Fugu Ultra was used in a near-fully automated research mode — planning experiments, running them, interpreting failures, revising hypotheses, and continuing without human intervention. That's not a demo. That's the actual long-horizon agentic capability that most people have been waiting for.

7. Pricing and Access

Fugu launched on June 22, 2026 with a subscription model. Every tier includes both Fugu and Fugu Ultra. Sakana is offering a free second month at the initial subscription tier for anyone who subscribes before the end of July 2026.

There are three tiers:

Personal — for individuals, occasional API calls, small experiments, personal workflows
Professional — for regular coding, review, research, and analysis throughout the week
Team/Enterprise — for larger deployments with compliance and data controls

The cost concern flagged by skeptics is real. For Fugu Ultra on demanding tasks, per-message costs can reach $10. For teams running research pipelines or heavy automation, that adds up quickly. Fugu (non-Ultra) will be significantly cheaper and suitable for most interactive use cases.

Important note: the current launch appears to exclude EU users — full general availability details beyond the initial non-EU rollout remain unconfirmed at time of writing.

8. Honest Criticisms and Transparency Gaps

This is the section of every AI launch that most blogs skip. Let's not.

The most substantive criticism floating around the developer community: Fugu is a closed-source orchestrator that relies partly on closed-source model APIs, and Sakana has not disclosed what percentage of closed vs open models the system uses to achieve its benchmark scores. One researcher put it bluntly on X: "this is very misleading" because the system's performance depends heavily on the underlying models in the pool, and the composition of that pool is opaque.

A second concern: the launch site had errors reported by multiple users on day one. For an infrastructure product positioning itself as production-ready, that's not a good sign. It suggests the launch was rushed, which may mean the product needs more polish before enterprise deployments

A third concern: Fugu Ultra's cost. At up to $10 per message for heavy tasks, Fugu Ultra is expensive for high-volume workflows. Teams need to be selective about when to engage Fugu Ultra versus the standard Fugu model

My honest take: the technology direction is correct, and the ICLR 2026 papers give it real academic credibility. But the transparency around pool composition needs to improve before this is something I'd recommend for enterprise production workloads without testing. Run your own evaluations. Don't rely solely on provider-reported benchmarks.

Frequently Asked Questions

What is Sakana AI Fugu?

Sakana Fugu is a multi-agent orchestration system released by Sakana AI on June 22, 2026. It presents itself as a single foundation model accessible via an OpenAI-compatible API, but internally coordinates a pool of expert models to handle complex tasks. You send one request; Fugu decides whether to answer directly or assemble a team of specialists.

How does Fugu Ultra compare to GPT-5.5 and Claude Opus 4.8?

Sakana's benchmarks show Fugu Ultra outperforming publicly accessible frontier models — including GPT-5.5 (xhigh) and Opus 4.8 (max) — across coding (SWE-Pro), scientific reasoning (GPQA-Diamond), and agentic research benchmarks. These scores are Sakana-reported; independent third-party verification is not yet available. Fugu Ultra is not directly tested against Fable 5 or Mythos Preview because those models are under export controls and not in Fugu's pool.

Is Sakana Fugu open source?

No. Fugu is a closed-source commercial product. The underlying research (TRINITY and Conductor papers) is published and peer-reviewed, but the orchestration model, agent pool composition, and training details are proprietary. Sakana has not disclosed what percentage of the pool consists of open vs closed-source models.

Why did Sakana AI build an orchestration model instead of a frontier model?

Sakana AI's core thesis since founding is that the most powerful AI systems will be collaborative ecosystems rather than isolated monoliths. The Fugu launch also has a practical motivation: Anthropic's export controls on Fable 5 and Mythos Preview (effective June 12, 2026) demonstrated that single-vendor dependency is a real operational risk. Fugu's swappable agent pool is designed to route around such disruptions automatically.

How much does Sakana Fugu cost?

Fugu offers subscription tiers covering personal, professional, and team use. All tiers include both Fugu and Fugu Ultra. For heavy Fugu Ultra tasks, per-message costs can reach $10. Exact subscription pricing is available at sakana.ai/fugu. Sakana is offering a free second month for subscribers who sign up before end of July 2026.

Is Sakana Fugu available outside Japan?

At launch, Fugu is available internationally with a noted exclusion of EU users. Full general availability details beyond the initial non-EU rollout have not been confirmed. Check sakana.ai/fugu for the most current availability information.

What benchmarks does Fugu Ultra top?

Fugu Ultra reports leading or near-leading performance on GPQA-Diamond (PhD-level scientific reasoning), SWE-Pro (software engineering), LiveCodeBench (live coding), ALE-Bench (autonomous learning and experimentation), and AutoResearch (AI-driven ML research). Note that all baseline scores for competing models are provider-reported, not independently verified in the same evaluation environment.

Can I control which models Fugu uses in its pool?

Yes. Both Fugu and Fugu Ultra allow teams to opt specific agents out of the pool for data, privacy, or compliance reasons. This is a meaningful enterprise feature — it means organizations can exclude providers that don't meet their data-residency or contractual requirements while still benefiting from the orchestrated system.

🚀 Cohort Program Open

Claude Mastery: Cowork & Code

The only comprehensive program designed to take you from basic prompting to building interactive Artifacts, custom integrations, and deploying production-ready code with Claude Code.

No coding experience needed

Build interactive Artifacts & Agents

Deploy apps with Claude Code

Cohort-based learning & mentorship

Explore Program

Cohort-based training•Register Now

Recommended Blogs

Resources & Community

Join our community of 70,000+ AI enthusiasts and learn to build powerful AI applications! Whether you're a beginner or an experienced developer, Build Fast with AI helps you understand and implement AI in your projects.

Agentic AI Launchpad 2026

A structured 6-week cohort program that takes you from AI basics to building and deploying real-world agentic AI systems. Includes live sessions, expert mentorship, project reviews, and a builder community network.

Ready to go from learning to building? Join the next cohort → Agentic AI Launchpad 2026

Free AI Resources

Access free tools, workshops, and micro-learning to keep building:

Orchestration models are the next frontier in AI. Follow @BuildFastWithAI on X to stay ahead of every launch that matters.

References

Enjoyed this article? Share it →

Mentorship

Agentic AI Launchpad

Go from user to builder in 6 weeks.

Explore Program

Back to blogs

LLMs

Reviews

Sakana AI Fugu Review: The Orchestration Model That Routes Around Export Controls

June 22, 2026

13 min read

1. What Is Sakana AI Fugu?

2. The Geopolitical Context: Why Orchestration Matters Now

For a deeper understanding of AI industry shifts and model releases happening across the stack in 2026, our AI Industry News & Trends collection tracks the most important developments as they happen.

3. Fugu vs Fugu Ultra - Which One Should You Use?

At launch, Sakana Fugu ships as two models, both accessible through the same OpenAI-compatible API:

4. Benchmark Performance: What the Numbers Actually Say

Fugu Ultra's benchmark claims are aggressive. Sakana reports top-of-leaderboard performance across:

GPQA-Diamond (PhD-level scientific reasoning)
SWE-Pro (real-world software engineering)
LiveCodeBench (live coding evaluation)
ALE-Bench (autonomous learning and experimentation)

For developers wanting to see where Fugu fits in the broader model landscape, our Best AI Models & Leaderboards hub tracks cross-model benchmark comparisons updated as providers publish new scores.

5. How the Technology Works

The core architecture works like this:

Fugu is itself a small language model trained to call other LLMs — including instances of itself recursively.
TRINITY assigns Thinker, Worker, and Verifier roles across a multi-model pool, adaptively delegating across coding, math, and reasoning tasks.
The Conductor learns to orchestrate agents in natural language — enabling coordination without hardcoded routing rules.
Recursive self-calling enables a new form of test-time scaling: Fugu reads its own prior output and launches corrective workflows when a first attempt falls short.

🚀 Cohort Waitlist Open

Go From AI User to AI Builder

Don't just use ChatGPT. Learn to build custom LLM agents, RAG pipelines, and full-stack Agentic AI apps in our intensive 6-week program.

6 Weeks Live Mentorship

Deploy 5+ Real-world Apps

Weekly App Templates & Code

No Coding Experience Required

Explore Program

Join 1,000+ graduates•Free Registration

6. Real-World Use Cases from Early Users

Sakana ran a beta program with close to 500 early users before the June 22 launch. The patterns that emerged are more telling than the benchmark scores.

7. Pricing and Access

There are three tiers:

Personal — for individuals, occasional API calls, small experiments, personal workflows
Professional — for regular coding, review, research, and analysis throughout the week
Team/Enterprise — for larger deployments with compliance and data controls

Important note: the current launch appears to exclude EU users — full general availability details beyond the initial non-EU rollout remain unconfirmed at time of writing.

8. Honest Criticisms and Transparency Gaps

This is the section of every AI launch that most blogs skip. Let's not.

Frequently Asked Questions

What is Sakana AI Fugu?

How does Fugu Ultra compare to GPT-5.5 and Claude Opus 4.8?

Is Sakana Fugu open source?

Why did Sakana AI build an orchestration model instead of a frontier model?

How much does Sakana Fugu cost?

Is Sakana Fugu available outside Japan?

What benchmarks does Fugu Ultra top?

Can I control which models Fugu uses in its pool?

🚀 Cohort Program Open

Claude Mastery: Cowork & Code

The only comprehensive program designed to take you from basic prompting to building interactive Artifacts, custom integrations, and deploying production-ready code with Claude Code.

No coding experience needed

Build interactive Artifacts & Agents

Deploy apps with Claude Code

Cohort-based learning & mentorship

Explore Program

Cohort-based training•Register Now

Recommended Blogs

Resources & Community

Agentic AI Launchpad 2026

Ready to go from learning to building? Join the next cohort → Agentic AI Launchpad 2026

Free AI Resources

Access free tools, workshops, and micro-learning to keep building:

Orchestration models are the next frontier in AI. Follow @BuildFastWithAI on X to stay ahead of every launch that matters.

References

Enjoyed this article? Share it →