Anthropic Ships Claude Opus 4.7 as Mythos Stays Under Lock and Key

WikiBit 2026-04-17 06:14

Anthropic on Thursday released Claude Opus 4.7, its most capable commercial AI model yet — and spent much of the launch reminding everyone it has a better

The model takes the top spot on SWE-bench Pro and SWE-bench Verified, the headline tests for handling complex engineering work. Early-access testers cited by Anthropic reported outsized gains on their own internal evaluations. Cursor co-founder Michael Truell said the model cleared 70 per cent on CursorBench, versus 58 per cent for Opus 4.6. XBOW chief executive Oege de Moor reported a jump from 54.5 per cent to 98.5 per cent on the firm's visual-acuity benchmark, a change that, in his framing, effectively eliminates a long-standing pain point for autonomous penetration testing. Rakuten's Yusuke Kaji said the model resolved three times more production tasks than its predecessor on the Japanese conglomerate's internal SWE-Bench fork.

Vision is the other headline upgrade. Opus 4.7 can process images up to 2,576 pixels on the long edge, more than three times the resolution of prior Claude models. The change opens the door to use cases that depend on fine visual detail, including computer-use agents parsing dense screenshots and structured data extraction from complex technical diagrams.
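For developers preparing screenshots or diagrams, the long-edge figure translates into a simple resize calculation. The sketch below is illustrative only: the function name fit_long_edge is hypothetical, the 2,576-pixel default is taken from the figure quoted above, and the article does not say how the API itself treats larger images.

```python
def fit_long_edge(width: int, height: int, max_long_edge: int = 2576) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer side is at most max_long_edge.

    Hypothetical helper; 2576 is the long-edge figure quoted in the article.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        # Already within the limit; never upscale.
        return width, height

    def scale(side: int) -> int:
        # Integer round-half-up avoids floating-point drift; clamp to at least 1 pixel.
        return max(1, (side * max_long_edge + long_edge // 2) // long_edge)

    return scale(width), scale(height)


if __name__ == "__main__":
    print(fit_long_edge(5152, 2000))  # (2576, 1000)
    print(fit_long_edge(1000, 500))   # (1000, 500)
```

The target dimensions can then be passed to whatever image library a pipeline already uses for the actual resampling.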

The weaknesses Anthropic flags itself

The release notes are unusually candid about where Opus 4.7 falls short. The model does not sweep every category: GPT-5.4 still leads in agentic search, multilingual question answering, and some terminal-based coding tasks. Opus 4.7 also scored fractionally lower than Opus 4.6 on cybersecurity vulnerability reproduction, at 73.1 per cent versus 73.8 per cent, a regression Anthropic attributes to its new automated cyber safeguards.

Migration is not frictionless either. The model uses an updated tokenizer that can map the same input to 1.0–1.35 times as many tokens as Opus 4.6, and it thinks harder at higher effort levels, producing more output tokens on later turns in agentic workflows. Developers may need to re-tune prompts written for earlier models, because Opus 4.7 takes instructions literally where its predecessors interpreted them loosely.
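The tokenizer change can be turned into a rough budgeting range. The following is a minimal sketch, assuming only the 1.0 to 1.35 multiplier quoted above; the function name estimate_token_range is hypothetical, and real counts depend on the content being tokenized.

```python
import math
from fractions import Fraction


def estimate_token_range(opus_46_tokens: int, low: str = "1.0", high: str = "1.35") -> tuple[int, int]:
    """Estimate the (min, max) input token count after migrating from Opus 4.6.

    The multipliers are the 1.0-1.35x range quoted in the article; they are
    passed as strings so Fraction keeps the arithmetic exact.
    """
    if opus_46_tokens < 0:
        raise ValueError("token count must be non-negative")
    lowest = math.ceil(opus_46_tokens * Fraction(low))
    highest = math.ceil(opus_46_tokens * Fraction(high))
    return lowest, highest


if __name__ == "__main__":
    # A prompt that measured 100,000 tokens under Opus 4.6.
    print(estimate_token_range(100_000))  # (100000, 135000)
```

This covers input tokens only. The additional output tokens the article attributes to higher effort levels are not quantified there, so they are left out of the estimate.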

Anthropic's own alignment assessment rates the model “largely well-aligned and trustworthy, though not fully ideal in its behaviour.” On measures such as honesty and resistance to prompt-injection attacks, Opus 4.7 improves on Opus 4.6. On others, including a tendency to give overly detailed harm-reduction advice on controlled substances, it is modestly weaker.

The Mythos shadow

The more revealing subtext to Thursday's release is what Anthropic is not shipping. The company repeatedly positions Opus 4.7 as “less broadly capable than our most powerful model, Claude Mythos Preview”, the frontier system unveiled earlier this month under Project Glasswing and restricted to around 40 vetted enterprise and government partners.

As BNC reported last week, Mythos is a system Anthropic believes can autonomously discover and exploit zero-day software vulnerabilities at a scale that exceeds both human researchers and every automated tool in existence. The company is keeping it inside a controlled coalition that includes Apple, Google, Microsoft, Amazon Web Services, CrowdStrike and JPMorgan Chase. Opus 4.7, by contrast, has been deliberately trained with reduced cyber capabilities and ships with safeguards that automatically detect and block requests flagged as prohibited or high-risk cybersecurity use cases.

Gizmodo's Jake Peterson read the framing bluntly, observing that the Opus 4.7 announcement effectively doubles as marketing for the system Anthropic refuses to sell. Legitimate security researchers can apply for broader access through a new Cyber Verification Program, which Anthropic is pitching as the controlled on-ramp for vulnerability research, penetration testing and red-teaming work.

The dual-track strategy matters beyond the AI industry. Bitcoin was trading near US$74,500 at the time of the Opus 4.7 release, steady inside the range it has held since the early-April Mythos disclosure. The roughly US$200 billion locked in smart contracts across Ethereum, Solana and other chains sits behind friction-based defences — audits, timelocks, multisig governance — that Anthropic has itself warned become “considerably weaker” against model-assisted adversaries.

What developers get today

Alongside Opus 4.7, Anthropic rolled out a new “xhigh” effort level sitting between high and max, giving developers finer control over the trade-off between reasoning depth and latency. Task budgets entered public beta on the Claude Platform, letting developers cap token spend on autonomous agents to prevent runaway bills on long-running jobs. In Claude Code, a new /ultrareview slash command runs a dedicated review session that flags bugs and design issues of the sort a careful senior reviewer would catch, and the company's “auto mode” (which lets Claude act without constant permission prompts) has been extended to Max plan subscribers.

Disclaimer:

The views in this article only represent the author's personal views, and do not constitute investment advice on this platform. This platform does not guarantee the accuracy, completeness and timeliness of the information in the article, and will not be liable for any loss caused by the use of or reliance on the information in the article.
