Anthropic just released Mythos-Class to the public, and they're using a clever workaround that tells us something important about where the industry stands right now. The model handles most queries normally, but anything touching cybersecurity or bioweapons gets rerouted to an older system. It's a pragmatic move—a way to say "we built something capable, but we're also thinking about what happens when millions of people can ask it anything." Whether that particular safety gate holds up is a different question, but the fact that they're being explicit about the trade-off matters.
The need for that kind of guardrail becomes more urgent when you look at what researchers at the University of Toronto just demonstrated. They built a self-replicating AI worm that spreads across local, open-weight models without needing internet infrastructure. It's a proof-of-concept, not a live threat, but it closes the psychological distance between "AI model" and "computer worm." Once something can replicate itself and move from machine to machine, the game changes. And they did it using models anyone can download.
What's interesting is the timing. We're watching two separate trends converge. On one hand, knowledge workers are becoming the primary users of agentic AI: systems that can autonomously make decisions and take actions on their behalf. Productivity gains are real and documented. On the other hand, Mastercard just unveiled Agent Pay for Machines, infrastructure designed specifically for autonomous agents to conduct their own transactions using stablecoins. That's the moment when AI agents stop being assistants and start being economic actors. When an agent can move money without human sign-off, the attack surface explodes.
Case in point: Microsoft just flagged a vulnerability in Anthropic's Claude-powered coding agent that could have allowed attackers to steal secrets from GitHub. This isn't theoretical. Developers are putting AI directly into their workflows, the AI is making decisions about file access and credential handling, and the second there's a gap between capability and security, bad things happen.
The capital flow tells you where people think the real opportunity lies, though. NEURA Robotics just closed a Series C worth up to $1.4 billion for physical AI platforms. China is simultaneously fast-tracking humanoid deployment across manufacturing and healthcare. The money and the geopolitical momentum are both moving toward embodied systems—robots that don't just process text but interact with atoms.
I find myself asking whether we're adequately thinking about the security and alignment challenges at each of these layers—language, agents, and embodied systems—before the infrastructure locks in. The race is genuinely accelerating. Are we building the safeguards proportionally, or are we just hoping the next vulnerability report comes before the next catastrophe does?