Anthropic Mythos: AI Finds Vulnerabilities Hidden for 27 Years
The cybersecurity landscape has reached a historic turning point. For twenty-seven years, a critical vulnerability remained hidden within the OpenBSD TCP stack. Despite OpenBSD being widely regarded as one of the most security-hardened operating systems on the planet, this flaw—which allows a mere two packets to crash any server—survived decades of human audits and rigorous automated testing. That streak ended when Anthropic’s latest AI model, known as Mythos, identified the bug autonomously.
A Generational Leap in AI Capability
The discovery of the OpenBSD flaw was not a fluke or a guided effort. After an initial prompt, Mythos navigated the codebase and surfaced the vulnerability without further human intervention. What makes this achievement truly remarkable is the sheer scale of improvement over previous AI generations. In exploit writing tasks for complex software like Firefox, Mythos achieved 181 successful exploits, whereas its predecessor, Claude 3 Opus, managed only 2. This represents a staggering 90x improvement in a single generation.
This performance is reflected across multiple industry benchmarks:
- SWE-bench Pro: Mythos scored 77.8%, significantly outperforming the 53.4% achieved by previous models.
- CyberGym Vulnerability Reproduction: The model reached an 83.1% success rate.
- Cybench CTF: Mythos saturated the evaluation at 100%, forcing security researchers to move beyond Capture The Flag (CTF) challenges toward real-world zero-day discovery as the only viable way to measure its capabilities.
The Economic Shift of Zero-Day Discovery
Perhaps the most disruptive aspect of this development is the cost. While the entire discovery campaign cost approximately $20,000, the specific model run that actually identified the OpenBSD flaw cost less than $50. This massive reduction in the cost of finding zero-day vulnerabilities changes the defensive calculus for security teams everywhere. When high-level exploits can be generated for the price of a modest dinner, the traditional "security through obscurity" model is officially dead.
A New Playbook for Security Teams
The emergence of Mythos means that security teams can no longer rely on the assumption that long-standing, audited code is safe. If a bug can survive 27 years of human review only to be found by an AI in minutes, our current detection playbooks are obsolete. Organizations must now integrate similar AI-driven red-teaming tools into their own development cycles to find these flaws before malicious actors do.
As we move into this new era of autonomous cybersecurity, the focus shifts from finding bugs to fixing them at the speed of AI. The "vulnerability window"—the time a flaw exists before being patched—is shrinking, but so is the time it takes for an attacker to weaponize it. Mythos has proven that the future of security is not just automated; it is intelligent, autonomous, and incredibly fast.
Español
English
Français
Português
Deutsch
Italiano