Anthropic Launches Advisor Tool Cutting AI Agent Costs by 85%

WikiBit 2026-04-11 08:00

Lawrence Jengar Apr 09, 2026 18:50 Anthropic's new advisor tool pairs Opus with cheaper models, delivering near-premium

Anthropic just dropped a tool that could reshape how developers budget for AI agents. The advisor tool, announced April 9, lets cheaper Claude models tap into Opus-level intelligence only when needed—cutting costs by up to 85% while maintaining competitive performance.

The mechanic is straightforward: Sonnet or Haiku runs your agent end-to-end, handling tool calls and iterations. When it hits a wall, it escalates to Opus for guidance. Opus never touches tools or user output directly—it just advises and hands control back.

The Numbers That Matter

Anthropic‘s benchmarks tell an interesting story. Sonnet with an Opus advisor scored 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, while actually costing 11.9% less per task. That’s better performance for less money—not a tradeoff most developers expect.

Haiku users see even more dramatic shifts. On BrowseComp, Haiku with Opus advisor hit 41.2%—more than double its solo score of 19.7%. Yes, it still trails Sonnet‘s standalone performance by 29%, but here’s the kicker: it costs 85% less per task. For high-volume operations where youre burning through thousands of agent calls daily, that math gets very attractive very fast.

Why This Matters Now

The timing isnt accidental. Anthropic shipped Sonnet 4.6 in mid-February, which already matched Opus-level performance in many tasks. OpenAI countered with GPT-5.4 in early March, unifying their Codex and GPT lines with million-token context. The AI agent space is getting crowded, and cost efficiency is becoming the battleground.

The advisor tool flips the typical orchestration pattern. Instead of a big model delegating to smaller workers, a cheap model drives everything and only escalates when stuck. No decomposition logic, no worker pools—just a single API call with built-in handoffs.

Implementation Details

Developers add one tool declaration to their Messages API request. The advisor_20260301 type routes context to Opus automatically when the executor model decides it needs help. A max_uses parameter caps advisor calls per request, and tokens bill at each models respective rate.

Since Opus typically generates just 400-700 tokens of guidance per consultation while the executor handles full output at lower rates, overall spend stays well below running Opus end-to-end.

The tool slots alongside existing capabilities—web search, code execution, whatever youre already using. No architectural overhaul required.

Disclaimer：

The views in this article only represent the author's personal views, and do not constitute investment advice on this platform. This platform does not guarantee the accuracy, completeness and timeliness of the information in the article, and will not be liable for any loss caused by the use of or reliance on the information in the article.

Related exchange