What an AI model “too dangerous” for public release means for directors

Recently, Anthropic, the creator of the Claude AI chatbot, announced a new model called Mythos. It generated global headlines because Anthropic said it was too powerful and dangerous for public release.

Instead, Anthropic created Project Glasswing, which gives only a small number of major US critical infrastructure organisations and technology companies access to Mythos for testing.

It’s claimed this model has reached a level of “coding capability where it can surpass all but the most skilled humans at finding and exploiting software vulnerabilities”.

Beyond the headlines, this is a big deal for directors around the world. AI is supercharging the writing and fixing of all kinds of software code, but this comes with substantial risk.

The cyber risk of Large Language Models (LLMs)

LLMs like Mythos are ‘trained’ on language and software code. Coding and software bug-finding are perhaps the most important and commercially viable applications of LLMs today. And, because LLMs learn from code written by humans, they can also learn the mistakes in that code.

The smallest error in software code – for instance, a missing semicolon – could give a cyber attacker the opening for a devastating kill chain.

Because of this, the cybersecurity industry already has a wide range of techniques and tools, such as penetration testing, for finding bugs in code. Anthropic’s previous models, when directed by an expert, are already very capable of both writing code and finding the bugs that result in software vulnerabilities – vulnerabilities that cyber-criminals can exploit for malicious gain.

Fixing bugs – or creating a bigger problem?

One of the most pervasive issues with software is that bugs an attacker could exploit must be fixed once they are found. But many organisations struggle far more with fixing bugs than with finding them.

So, let's say that Mythos accelerates the discovery of vulnerabilities and prioritises the impactful ones quickly and cheaply. This will create a massive and growing backlog of bugs to fix. It will not have fixed anything, merely shifted the problem: the discovery workload is compressed while the remediation workload expands.

A growing backlog of known but unfixed vulnerabilities creates an expanding attack surface for cyber-criminals. If Mythos is as capable as claimed, adversaries will eventually have access to equivalent tooling, and the organisation that identified 100 new bugs but only fixed three is in a worse position than one that found 10 important bugs and fixed all of them.
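The comparison above comes down to simple arithmetic, which can be sketched in a few lines of Python. This is an illustrative sketch only: the names `BacklogSnapshot` and `project_backlog` are hypothetical, and the constant find and fix rates are an assumption made for the example, not a description of how any real organisation behaves.

```python
from dataclasses import dataclass


@dataclass
class BacklogSnapshot:
    """Hypothetical record of one reporting period (not a real tool's schema)."""
    found: int  # vulnerabilities identified in the period
    fixed: int  # vulnerabilities remediated in the period

    @property
    def open_backlog(self) -> int:
        """Known but unfixed vulnerabilities left exposed."""
        return self.found - self.fixed

    @property
    def fix_rate(self) -> float:
        """Share of identified vulnerabilities that were remediated."""
        return self.fixed / self.found if self.found else 1.0


def project_backlog(found_per_period: int, fixed_per_period: int, periods: int) -> list[int]:
    """Open backlog at the end of each period, assuming constant rates."""
    backlog = 0
    history = []
    for _ in range(periods):
        # The backlog cannot drop below zero: you cannot fix bugs you have not found.
        backlog = max(0, backlog + found_per_period - fixed_per_period)
        history.append(backlog)
    return history


# The two organisations from the paragraph above.
org_a = BacklogSnapshot(found=100, fixed=3)
org_b = BacklogSnapshot(found=10, fixed=10)

print(org_a.open_backlog, org_b.open_backlog)
print(project_backlog(100, 3, periods=3))
print(project_backlog(10, 10, periods=3))
```

Under these assumptions the first organisation's known-but-unfixed backlog grows every period, while the second's stays at zero. The useful board-level measure is the gap between discovery and remediation, not the number of bugs found.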

Mythos’ potential for scaling cyber-attacks

For decades, the cybersecurity industry has faced the ongoing challenge of zero-day vulnerabilities. These are hidden security flaws in software that are found and exploited by hackers before the people who made the software even know the flaws exist.

Zero-day attacks have been increasing in frequency, speed of exploitation and mass impact. A zero-day might be discovered by a researcher and responsibly disclosed, but when one is found by an adversary, they might sell it, exploit it quietly or aim for rapid mass exploitation.

Take, for instance, Log4j – a small but widely used software library. A bug in the library was publicly disclosed in 2021. Because it was easy to exploit and built into huge numbers of systems, attackers quickly took advantage, scanning 60% of the internet within days. The vulnerable version had been in use for seven years and was embedded in millions of products, making it almost impossible to fully fix everywhere. While the bug posed no real risk when it was unknown, it became a major global threat as soon as it was exposed.

Anthropic says it is concerned about releasing Mythos because it could allow relatively low-skilled individuals to identify zero-day vulnerabilities on a similarly large scale.

This might be a legitimate concern, but if Mythos truly is a step-change, then we are already a long way down the staircase.

What directors can do

For directors of companies involved in software development, there are a few things you can do in practice to manage risk. It starts with a frank discussion with your development team.

There are many security processes that can be embedded in a software development lifecycle and the operational life of the application. Each has strengths and weaknesses, and they should be layered in a complementary way to get the best from each. The processes and tools you use will be influenced by the rate of change, criticality and architecture of your software, the size of your development team and your compliance requirements. These processes can add friction to development, so a trade-off must be found between productivity and security.

As a director, your focus will be on whether suitably skilled development or security professionals have designed and are actively managing vulnerability identification during the development and operational life of the software, and whether remediation can keep pace with discovery.

Key questions to ask include:

- Do you have a process for detecting vulnerabilities in your software applications?

- Do you know what software libraries you are using?

- When a zero-day in a software library is discovered, how do you know if you’re impacted?

- How do you prioritise fixes? Are you able to remediate vulnerabilities within an agreed time? What happens if you can’t?

- How do you know this process is working?
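For development teams, two of the questions above (whether you are affected by a newly disclosed library flaw, and whether fixes land within an agreed time) can be answered mechanically once a library inventory exists. The Python sketch below illustrates the idea. Everything in it is hypothetical: the application names, the `example-logging` library, the version range and the 30-day target are invented for illustration, and the simple dotted-number version parsing is an assumption that will not handle every real-world version scheme.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Advisory:
    """Hypothetical advisory record, not a real feed format."""
    library: str
    first_affected: str  # lowest vulnerable version (inclusive)
    fixed_in: str        # first safe version (exclusive upper bound)


def affected_apps(inventory: dict[str, dict[str, str]], advisory: Advisory) -> list[str]:
    """Return the applications that ship a vulnerable version of the library."""
    low = parse_version(advisory.first_affected)
    high = parse_version(advisory.fixed_in)
    hits = []
    for app, libraries in inventory.items():
        version = libraries.get(advisory.library)
        if version is not None and low <= parse_version(version) < high:
            hits.append(app)
    return sorted(hits)


def overdue(found_on: date, fixed_on: Optional[date], sla_days: int, today: date) -> bool:
    """True if a vulnerability was, or still is, open longer than the agreed time."""
    end = fixed_on if fixed_on is not None else today
    return (end - found_on).days > sla_days


# Hypothetical inventory: application -> {library: version in use}.
inventory = {
    "billing-portal": {"example-logging": "2.14.1", "other-lib": "1.0.0"},
    "intranet": {"example-logging": "2.15.0"},
    "reports": {"other-lib": "1.0.0"},
}
advisory = Advisory(library="example-logging", first_affected="2.0", fixed_in="2.15.0")

print(affected_apps(inventory, advisory))
print(overdue(date(2024, 1, 1), None, sla_days=30, today=date(2024, 2, 15)))
```

In practice the inventory would come from a software bill of materials and the advisory from a vulnerability feed rather than hand-written dictionaries. The point for directors is that neither question can be answered at all unless the inventory exists and is kept current.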

Anthropic is generating a lot of noise currently and we can expect a lot more disruption in future as AI continues to evolve at a rapid pace. As always, the key message for any organisation is to be prepared.