Anthropic unveils Claude Opus 4.7 with honesty upgrades

Anthropic released Claude Opus 4.7 on Thursday, its latest hybrid reasoning model and the most publicly available version of Claude to date. The release comes at a complicated moment for the company, which has spent recent weeks managing fallout from internal code leaks and fielding questions about Claude Mythos, a more powerful model the company has decided not to release publicly due to cybersecurity concerns.

Contents

What the honesty benchmarks actually show
Sycophancy remains a work in progress
Where Mythos fits into the picture
What Opus 4.7 improves beyond honesty

Opus 4.7 is a different kind of announcement. Rather than leading with raw capability, Anthropic is centering this release on honesty and reliability, two areas where AI models have historically struggled and where user trust depends most heavily on getting it right.

What the honesty benchmarks actually show

Anthropic evaluated Opus 4.7 across several categories of hallucination and truthfulness. The model posted a MASK honesty rate of 91.7%, a metric that tests whether a model will contradict its own stated beliefs when pressured by a user to do so. That figure improves on both Opus 4.6, which scored 90.3%, and Sonnet 4.6, which came in at 89.1%. It falls short of the 95.4% posted by Claude Opus 4.5, though Opus 4.7 compensates with stronger performance in other categories.

The company has broken hallucinations into distinct types to give a clearer picture of where models succeed and fail. Factual hallucinations measure whether a model provides accurate information and acknowledges the limits of its knowledge rather than inventing answers. Input hallucinations capture instances where a model ignores prompt instructions or generates content that was not requested. False premises honesty rate measures how reliably a model corrects a user who states something incorrect. The MASK rate, as described above, tests resistance to pressure.

Across all four of these categories, Opus 4.7 showed improvement over its predecessors and outperformed comparable models from competitors including Gemini 3.1 Pro and Grok 4.20.

Sycophancy remains a work in progress

One of the more nuanced challenges in AI model development is sycophancy, which refers to a model’s tendency to agree with users rather than offer accurate pushback. Opus 4.7 has been noted for occasionally capitulating under pressure, though its performance on this dimension is better than previous Claude versions and ahead of the competing models tested.

These uses an open-source behavioral audit tool called Petri 2.0 to evaluate sycophantic tendencies, scoring models on a scale where lower numbers reflect less problematic behavior. The improvement in Opus 4.7 is measurable but incremental, suggesting the company views this as an ongoing area of development rather than a solved problem.

Where Mythos fits into the picture

Claude Mythos is the model sitting in the background of this release. Anthropic has described it as more capable than Opus 4.7 across several benchmarks, but has declined to release it publicly because of its advanced ability to identify and potentially exploit software vulnerabilities at a scale that exceeds human capacity.

The company rolled out Mythos earlier this month through a tightly controlled initiative called Project Glasswing, limiting access to roughly 40 organizations focused on defensive cybersecurity work. Partners include Amazon, Apple, Microsoft, and Cisco. Anthropic committed up to $100 million in usage credits for the effort and pledged $4 million to open-source security organizations. The Treasury Secretary and Federal Reserve Chair have both reportedly flagged the model’s potential risk to financial institutions.

The contrast between Mythos and Opus 4.7 illustrates the line Anthropic is navigating as it prepares for a reported IPO as soon as October. The company is simultaneously developing models powerful enough to concern regulators while releasing safer, more measured versions to the public and building a brand around responsible deployment.

What Opus 4.7 improves beyond honesty

Coding, visual intelligence, and document analysis all received upgrades in this release. Anthropic has not provided granular benchmarks for all of these categories in the initial system card, but positions Opus 4.7 as a meaningful step forward for enterprise users in sectors like finance and health care, where accuracy and reliability carry direct consequences.

Anthropic unveils Claude Opus 4.7 with honesty upgrades

The latest Claude release posts a 91.7% MASK honesty rate and meaningful reductions in sycophantic behavior, even as the more powerful Mythos model remains too risky for public deployment.

What the honesty benchmarks actually show

Sycophancy remains a work in progress

Where Mythos fits into the picture

What Opus 4.7 improves beyond honesty

Latest News

Pochettino calls Argentina World Cup favorites after Messi hat trick

Portugal’s Ronaldo problem deepens after World Cup draw with Congo

IAEA welcomes US-Iran deal and prepares to lead technical talks

Ninth Circuit rules Trump can exclude agencies from union rights

Advertise

We influence millions of users and is the number one hard head news network on the planet

Sign Up for Our Newsletter

What the honesty benchmarks actually show

Sycophancy remains a work in progress

Where Mythos fits into the picture

What Opus 4.7 improves beyond honesty

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.

Latest News

Pochettino calls Argentina World Cup favorites after Messi hat trick

Portugal’s Ronaldo problem deepens after World Cup draw with Congo

IAEA welcomes US-Iran deal and prepares to lead technical talks

Ninth Circuit rules Trump can exclude agencies from union rights

Advertise