Claude Mythos is Anthropic's AI model that scares cybersecurity experts

Image generated with AI for representational purposes only.

Anthropic has just lifted the veil on Claude Mythos Preview, a new model of general-purpose artificial intelligence – that is, designed for general purposes, not specialized – which has proven capable of autonomously identifying thousands of critical vulnerabilities in the main operating systems and web browsers. Its agentic reasoning capabilities (which allows it to operate autonomously, planning and completing complex tasks without supervision) and advanced programming make it a potentially revolutionary tool for defensive cybersecurity. But, at the same time, they configure it as a potentially dangerous technology. If it fell into the wrong hands, it could be used to find and exploit security flaws; not to correct them.

For this reason Anthropic has chosen not to make it publicly available, distributing it in preview exclusively to a selected group of industrial partners as part of Project Glasswing, a coordinated initiative involving companies of the caliber of Amazon, Apple, Microsoft, Google, Cisco, CrowdStrike, Linux Foundation, Palo Alto Networks among others. The goal is to allow cybersecurity managers to strengthen the defenses of the most critical systems before models with similar capabilities become accessible to anyone.

What is Mythos and what are its potential

The story of Mythos begins, paradoxically, with a news leak. A few weeks ago, security researchers found an internal Anthropic draft describing the model, then called “Capybara,” in an unprotected document folder – freely accessible to the public via a data lake, or a store of unstructured data. The document read about Mythos as «of the most powerful artificial intelligence model by far» never developed by the company, superior even to the models of the Opus family, until now considered the most advanced in the range. Anthropic later attributed the incident to human error, and Dianne Penn, head of product management, clarified that it was in no way a software vulnerability.

What makes Mythos particularly worthy of attention is the nature of its security capabilities, which emerged not as a result of specialized training, but as a result of the model’s overall improvements in code and reasoning. During internal tests conducted by the company directed by Dario Amodei, Mythos identified zero-day vulnerabilities (those flaws not yet known to anyone, not even the developers of the software concerned) in all major operating systems and browsers. Many of these flaws had existed for a decade or two! The oldest discovered so far is a bug dating back 27 years in OpenBSD, an operating system historically considered among the most secure ever. Get it fixed now thanks to Mythos.

The performance of Mythos clearly surpasses that of previous models. To give a concrete example: when Anthropic tested both Opus 4.6 and Mythos on the ability to transform vulnerabilities found in the Mozilla Firefox JavaScript engine, 147 into working exploits (i.e. code capable of actively exploiting the bug) Opus 4.6 succeeded only twice out of hundreds of attempts. Mythos produced working exploits in 181 cases, gaining control of the registry in another 29. The gap between the two models is yawning.

In the graph you can see how Opus 4.6 managed to successfully generate exploits for the crashes identified in Firefox a few times, recording a success rate of less than 1%. Claude Mythos Preview, on the other hand, manages to create a working exploit almost 100 times more often. Credit: Anthropic.

The method used by Anthropic for testing is deliberately simple: you start a container isolated from the Internet and other systems with the software to be analyzed, give the model a basic instruction such as “find a vulnerability in this program”, and let it operate autonomously. Claude Code with Mythos Preview reads the code, formulates hypotheses, runs the software to test them, uses debugging tools, and returns a complete report with proof of concept of the exploit. And all this without the slightest human intervention.

Responsible management of this technology

Of course, the central point remains the responsible management of this technology. Over 99% of the vulnerabilities identified have not yet been fixed: disclosing them publicly before the software maintainers have had time to intervene would be irresponsible on Anthropic’s part. In fact, Amodei’s company follows a coordinated disclosure procedure, reporting bugs to the maintainers and waiting for them to be resolved before making them known.

Historically, new security tools have initially benefited attackers, only to later become an integral part of companies’ defense. It happened with fuzzers (software that “bombards” programs with random inputs to find weak points) now considered fundamental for the security of open source software. Anthropic believes that the same fate awaits advanced language models: in the long run, defenders will benefit the most by identifying and fixing vulnerabilities before the code is even released. This delicate transition period, however, requires a lot of attention. And this is exactly the reason why Project Glasswing exists and it is for the same reason that Mythos will not be publicly available at least for the moment.