
We’re having fun. I write about chatbots, and readers tell me how useful they are. We use chatbots to answer questions, write text, make images and videos, proofread, and more.
In addition, I want you to know about artificial intelligence chatbots so you’re aware of the changes taking place and have a sense for what AI is and what it can do. This is my mission: to articulate Computer Frontiers.
And that mission includes those frontiers that are increasingly scary.
Take Mythos, a new AI model developed by Anthropic, makers of the Claude chatbot, one of the top three chatbots, along with OpenAI’s ChatGPT and Google’s Gemini.
It is so capable that when Anthropic announced it in April, they said they had decided not to release it publicly and instead restricted access because of its potentially dangerous capabilities.
What was the danger? Its advanced coding ability gives it an uncanny capacity to discover vulnerabilities in computer operating systems, web browsers, and software. It even found a serious vulnerability in the security-focused OpenBSD operating system, which was released in 1995 and has been maintained over the years by hundreds of developers worldwide.
In other words, it’s a hacker’s dream. Anthropic realized that if they were to release it to the public, not only would hackers likely install malware on your computer or phone, but also Mythos could possibly be used in large-scale cyber attacks against critical systems such as the electrical grid or the banking system.
Instead, Anthropic launched Project Glasswing to help companies find and fix vulnerabilities in their systems. They made Claude Mythos Preview available to scores of selected partners such as Apple, CrowdStrike, Google, JPMorganChase, Amazon, and Microsoft.
This is good. One hopes the other leading companies will be similarly cautious. A week after Anthropic’s announcement, OpenAI announced GPT-5.4-Cyber, which is designed for defensive cybersecurity work. So far, though, it often feels like a Wild West attitude in the AI industry—full speed ahead regardless of the consequences. Not letting the other guy get ahead is paramount.
But Anthropic has always been different. It was started by former employees of OpenAI who felt that the key principle in AI development should be safety first.
The company’s sense of ethics made major national news in February when they barred the Department of Defense from using Claude in systems such as autonomous drones that can identify and bomb targets without human oversight.
They also refused to allow their models to be used for the bulk surveillance of U.S. citizens due to risks to fundamental liberties.
The major tech companies do try to test their AI models for safety, but Anthropic is particularly scrupulous, repeatedly running tests to see if it will misbehave.
One widely shared anecdote tells what happened when Anthropic engineers were testing the safety of an earlier version of Mythos. They put it inside a “sandbox,” a simulated computer environment that it could interact with but that was locked down such that the AI couldn’t go outside it.
A simulated human user then instructed it to try to escape. And if it succeeded with that request, it was told to find a way to message the researcher in charge.
Mythos broke free, of course, and developed a relatively sophisticated means to gain access to the internet. The lead researcher was surprised to receive an email from it while he was sitting at a picnic table in a park eating a sandwich.
But that’s not all. Mythos then found a way to post about its success on several public websites—almost like it was boasting.
In a few instances of testing, the Mythos Preview tried to conceal actions that it knew it shouldn’t be doing. In one instance, it figured out how to edit files that it wasn’t supposed to be able to access—and then figured out how to conceal these edits from the change history.
I respect Anthropic’s efforts, and one can only hope that the other AI companies will be similarly cautious as their AI models become more intelligent. Even then, it may just be a matter of time before other countries develop models with the same capabilities.
Mythos’s smarts are based on the extraordinary coding ability of Claude Code, which Anthropic released early this year. It can take an engineer’s detailed description of a software program or feature and spend hours creating it with minimum human input. It’s able to divide the project into individual tasks and then assign these to AI software agents to build the feature.
Mythos takes this inherent coding ability and adds stronger reasoning as well as the ability to work independently on long coding projects.
The bottom line is that as AI gets increasingly capable, unanticipated abilities are emerging. And no one knows where things are headed. Except for the tech overlords, who say that so-called artificial general intelligence will transform the world for the better. Fingers crossed.