RANDOM CHAOS

UK AISI: GPT-5.5 matches Claude Mythos on vuln-finding, and it's shipping now

via Simon Willison

Original source

Our evaluation of OpenAI's GPT-5.5 cyber capabilities


The UK’s AI Security Institute has published its evaluation of OpenAI’s GPT-5.5, focusing on the model’s ability to discover security vulnerabilities. AISI previously ran the same battery against Anthropic’s Claude Mythos; the headline finding is that GPT-5.5 lands in roughly the same capability tier on offensive-security tasks.

The practical wrinkle is availability. Mythos remains gated behind Anthropic’s controlled-access program, while GPT-5.5 is already available through OpenAI’s general API. That collapses the gap between “frontier vuln-hunting model exists” and “any paying customer can point one at a target,” and defenders have to plan around that shift regardless of which lab’s name is on the weights.

This is an AI-generated summary. Read the original for the full story.