OpenAI's GPT-5.5 just completed a full end-to-end corporate network intrusion in a simulated environment. That makes it the second AI model to pull off this feat, joining Anthropic's Claude in demonstrating genuine cyberattack capabilities.
The AI Security Institute ran the test. GPT-5.5 navigated the entire kill chain, from initial access through lateral movement to data exfiltration, without human intervention. The model executed real exploits and recovered credentials to move deeper into the network.
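For readers who haven't met the term, "kill chain" just means the ordered stages of an intrusion. A minimal Python sketch of that loop is below; the `agent` and `environment` objects are hypothetical stand-ins for the model and the simulated network, and nothing here reflects the AI Security Institute's actual test harness.

```python
# Abstract illustration of the stages described above. Deliberately free of
# any real exploit logic; the agent/environment interface is invented.
from enum import Enum, auto

class KillChainStage(Enum):
    INITIAL_ACCESS = auto()       # gain a foothold on one machine
    CREDENTIAL_RECOVERY = auto()  # harvest passwords or tokens found there
    LATERAL_MOVEMENT = auto()     # reuse those credentials on other hosts
    DATA_EXFILTRATION = auto()    # locate and copy out target data

def run_simulated_intrusion(agent, environment):
    """Drive a hypothetical autonomous agent through each stage in order.

    The agent proposes an action for the current stage, the simulated
    environment reports whether it succeeded, and the loop only advances
    on success -- no human in the loop.
    """
    for stage in KillChainStage:
        action = agent.plan(stage, environment.observe())
        result = environment.execute(action)
        if not result.success:
            return stage  # report how far the agent got
    return None  # completed the full chain
```

The point of the sketch is only that "end-to-end" means every stage cleared in sequence by the model itself, not a single clever exploit.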
This matters because it moves the threat beyond theory: frontier AI systems can now be weaponized as autonomous attackers. Earlier tests showed language models could identify vulnerabilities and write malware. This one showed they can orchestrate a complete attack from start to finish.
Both OpenAI and Anthropic are now in the uncomfortable position of having built tools that, as the AI Security Institute just demonstrated, can carry out network intrusions on their own. Neither company asked for this particular benchmark. Neither wanted this headline.
The immediate question: what do safety controls look like when your model can actually break into computers? Jailbreaks and prompt injections suddenly feel quaint. This shifts the conversation from "can these models be tricked" to "can they be contained."
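One way to picture that shift: jailbreak defenses filter what goes into the model, while containment constrains what the model's actions can touch once it starts acting. The sketch below is purely hypothetical; the policy structure and function names are invented for illustration and reflect no vendor's actual controls.

```python
# Hypothetical contrast between input filtering and execution containment.
# BLOCKED_PHRASES and ALLOWED_HOSTS are toy policies, not real products.

BLOCKED_PHRASES = {"ignore previous instructions"}   # crude input filter
ALLOWED_HOSTS = {"sandbox.internal.example"}         # execution allowlist

def filter_prompt(prompt: str) -> bool:
    """Jailbreak-style defense: inspect the text before the model sees it."""
    return not any(phrase in prompt.lower() for phrase in BLOCKED_PHRASES)

def contain_action(action: dict) -> bool:
    """Containment-style defense: inspect what the agent is about to do,
    regardless of how it was prompted."""
    if action.get("type") == "network_request":
        return action.get("host") in ALLOWED_HOSTS
    if action.get("type") == "shell_command":
        return False  # no arbitrary command execution outside the sandbox
    return True
```

A model that can run a full intrusion makes the second function the one that matters, because no amount of input filtering helps once the agent is already executing.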
