Anthropic's Mythos Found Zero-Days in Every Major OS. What Does That Actually Mean for AI Safety?

Mythos found thousands of vulnerabilities that humans missed — across Linux, Windows, macOS, iOS, Android, and every major browser. The AI safety debate just got a lot more complicated.

May 13, 2026 — On April 7, Anthropic announced something it had never announced before: a model so good at finding security flaws that the company wouldn’t release it to the public.

They called it Claude Mythos. And according to their own reporting, it found thousands of zero-day vulnerabilities across every major operating system and browser in existence.

Here’s the problem: that’s not the scary part.

What Mythos Actually Found

Anthropic set Mythos loose on Linux, Windows, macOS, iOS, Android, and every major browser. The model found vulnerabilities in all of them. Not a few. Thousands.

Some of these flaws had been sitting in production code for decades — completely undetected. Mozilla’s own analysis showed Mythos found 271 security vulnerabilities in Firefox 150. For comparison, Anthropic’s Opus 4.6 — their previous best model — found only 22 in Firefox 148. That’s roughly a 12x improvement in a single generation.

The UK’s AI Security Institute (AISI) put Mythos through its own tests and called it a “step up” from prior models. Notably, Mythos became the first AI model to successfully complete a 32-step simulated cyberattack in an AISI test environment — without any human guidance.

The National Cyber Security Centre’s chief executive, Richard Horne, said at the CyberUK conference that Mythos would “drive the urgency” for companies to replace obsolete tech. That’s not hype. That’s a government body saying the threat is real and imminent.

So Where’s the Scary Part?

Here’s the thing: the AI safety debate usually frames the threat as AI going rogue. Malicious intent. Skynet. HAL 9000.

That’s not what’s happening here.

The scary part is simpler: AI finds vulnerabilities that humans miss. That means attackers can use it too.

If a model that isn’t even publicly released can find thousands of unpatched flaws across every major platform, what happens when that capability is replicated in open-source models? When it’s available to anyone with a decent GPU and a grudge?

Anthropic’s own announcement acknowledged this. They described it as a “watershed moment for cybersecurity” — one that cuts both ways. They stood up Project Glasswing, a program that gives ~40 partners (Apple, Goldman Sachs, Google, JP Morgan, and others) early access to Mythos so they can find and patch vulnerabilities before attackers get there.

That’s the defensive play. Get there first. Patch faster than the other side can exploit.

The Flip Side (Because There Always Is One)

Here’s what the AI-skeptic crowd is quick to point out: other, cheaper models can find these problems too.

Aisle, a company that works specifically in AI cybersecurity, independently analyzed Anthropic’s claims. Their finding? Less dramatic than the announcement implied. Yes, Mythos found thousands of zero-days. But other models — models that are already widely available — also found them. The capability isn’t as exclusive as Anthropic made it sound.

That doesn’t mean Mythos isn’t significant. A 12x improvement in vulnerability detection is a big deal, even if the underlying vulnerabilities aren’t exclusive to Mythos. What it means is that the race is already on. Defenders and attackers both have access to increasingly capable tools. The question is who gets there first.
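That race has a simple intuition behind it. Here’s a toy sketch — my own illustration, not anything from Anthropic’s or AISI’s analysis — of why a tooling advantage matters: if defender and attacker each search for a given flaw independently at constant rates, the chance the defender finds it first is just the defender’s share of the combined rate. The exponential-search assumption and the mapping of the Firefox numbers onto a rate ratio are both simplifications.

```python
def defender_wins_probability(defender_rate: float, attacker_rate: float) -> float:
    """Probability the defender finds (and can patch) a flaw before the
    attacker does, assuming both search independently at constant rates
    (i.e., exponentially distributed discovery times)."""
    return defender_rate / (defender_rate + attacker_rate)

# If a 12x jump in findings (Firefox: 22 -> 271) translated into a 12:1
# discovery-rate advantage for defenders, they'd win most races:
print(round(defender_wins_probability(12.0, 1.0), 3))  # 0.923

# But if attackers get the same tooling, the advantage evaporates:
print(round(defender_wins_probability(1.0, 1.0), 3))   # 0.5
```

The second line is the whole point of the Aisle finding: when capability is widely replicated, raw detection power stops being an edge, and patch-and-remediate speed becomes the deciding variable.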

Why This Should Keep You Up At Night

Let’s be concrete about the threat model:

The UK government has modeled worst-case bank-hack scenarios — analyses produced before Mythos existed. The findings: direct debits fail, rents and mortgages don’t get paid, online banking goes dark, cash machines stop dispensing. Commuters can’t pay bus fares or buy petrol. Panic sets in. Bank runs follow.

That’s the scenario where a vulnerability in critical financial infrastructure gets exploited. And now we have a model — even a restricted one — that finds those vulnerabilities faster than any human security team.

US Treasury Secretary Scott Bessent called a meeting with bosses from Goldman, Citi, and other major banks in April specifically to discuss the cyber risk from Mythos. That’s not standard procedure. That’s the government being alarmed.

What This Actually Means for AI Safety

The AI safety debate has largely focused on alignment — making sure AI systems do what we intend them to do. That’s the right question. But Mythos highlights a different dimension: capability saturation.

When a single model can autonomously find thousands of unpatched vulnerabilities across every major OS and browser, the bottleneck isn’t capability — it’s intent. And intent is hard to control.

Anthropic kept Mythos behind closed doors. Fine. But the AISI already reported that a “handful” of users in a private online forum had gained unauthorized access to the model. That was weeks after the announcement. The containment is already leaking.

The harder question isn’t “how do we build safe AI?” It’s: what happens when safe AI becomes the norm, and unsafe actors have access to the same capabilities through replication, open-source models, or leaked weights?

The Real Answer

Defenders need to use these tools too. Project Glasswing is a step in the right direction — but it’s limited to ~40 US companies. The vulnerabilities Mythos finds exist in infrastructure that the whole world relies on.

The right answer isn’t to hide these models. It’s to build the defensive infrastructure to act on what they find — faster patches, faster disclosure, faster remediation — than the other side can move.

That’s the actual race. Not AI vs. humans. AI-assisted attackers vs. AI-assisted defenders.

And right now, the defenders are still catching up.

