Will Software Security Improve with AI Coding Agents Like Claude Mythos — Or Get Worse?

Zoia Baletska

28 April 2026

Every new wave of tooling in software development comes with the same split reaction. Some people see leverage. Others see risk.

Claude Mythos sits right in that space. It’s not just another coding assistant. It can read large codebases, identify vulnerabilities, and in some cases go a step further—figuring out how those weaknesses could actually be exploited.

That combination is what makes it interesting, and also what makes it uncomfortable to reason about.

When Security Work Starts to Scale

One of the long-standing constraints in security has always been time.

Codebases grow, systems become more interconnected, and the number of potential issues increases faster than teams can realistically review. Even with dedicated security engineers, a lot of surface area simply doesn’t get explored in depth.

Tools like Mythos change that dynamic. They can move through a system quickly, test assumptions, and surface patterns that would take much longer to find manually. For teams that already have a decent security process in place, this kind of coverage can be valuable. Gaps become visible earlier, and there’s more context around where risk actually sits.

It also lowers the barrier a bit. Developers who aren’t security specialists can still ask useful questions and get meaningful answers. That tends to bring security conversations closer to everyday development, instead of leaving them to a separate phase or a separate team.

The Backlog Problem Doesn’t Go Away

Finding issues has never been the only challenge.

Most teams already have a backlog of known vulnerabilities, and those don’t always get resolved quickly. Some fixes are straightforward. Others touch critical parts of the system and need careful handling, which means they get postponed.

When discovery speeds up, that backlog doesn’t shrink by default. It grows.

Instead of a handful of findings, you may end up with hundreds. Many of them valid. Many worth fixing. All of them competing for attention with product work and ongoing maintenance.

Without changes to how remediation is handled, better detection can turn into a different kind of pressure. More visibility, but not necessarily more capacity to act on it.
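
One common response is explicit triage: rank findings by impact before effort, so the backlog becomes a queue rather than a wall. A minimal sketch of that idea, with all field names and weights hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    name: str
    severity: int    # 1 (low) .. 5 (critical), hypothetical scale
    reachable: bool  # is the vulnerable code actually on a request path?
    fix_cost: int    # rough engineering days to remediate

def triage(findings):
    # Highest impact first; cheap fixes break ties.
    # Unreachable code is deprioritized, not ignored.
    return sorted(
        findings,
        key=lambda f: (-f.severity * (2 if f.reachable else 1), f.fix_cost),
    )
```

This is only a sketch of the prioritization step; the harder organizational question, who gets the capacity to work through the queue, is exactly what the tooling does not solve.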

The Same Tools Work Both Ways

There’s another layer that’s harder to ignore.

The capabilities that help defenders understand systems more deeply can also be used in the opposite direction. Exploring a codebase, identifying weak points, chaining them together—those are not defender-only activities.

As these tools become more accessible, the level of expertise required to carry out certain types of attacks drops. You don’t need to understand every detail of a system if you can guide a model to do the exploration for you.

At the same time, defenders gain similar capabilities. Both sides move faster.

Security has always had this uneven balance, where small advantages can have large effects. Tools like Mythos don’t remove that dynamic, but they do change how quickly it plays out.

Code Generation Adds Another Layer

There’s also the question of how these systems influence the code that gets written in the first place.

AI-assisted development is already common. It speeds up implementation, fills in boilerplate, and helps with unfamiliar patterns. But it also changes how carefully code is reviewed. When something is generated quickly and looks reasonable, it’s easier to accept it without digging deeper.

That can introduce subtle issues. Not necessarily obvious bugs, but assumptions that don’t quite hold, edge cases that aren’t fully covered, or dependencies that bring in unexpected risk.
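
As a contrived illustration (the paths and helper names here are invented), a generated path-containment check can look entirely reasonable and still hide a flawed assumption, a raw string-prefix comparison that treats a sibling directory as being inside the base:

```python
import os

def is_within_base_naive(base: str, requested: str) -> bool:
    # Looks plausible: normalize both paths, then prefix-check.
    # Subtle flaw: "/srv/app-data/..." passes a check against "/srv/app".
    return os.path.normpath(requested).startswith(os.path.normpath(base))

def is_within_base(base: str, requested: str) -> bool:
    # Compare whole path components instead of raw string prefixes.
    base = os.path.normpath(base)
    requested = os.path.normpath(requested)
    return os.path.commonpath([base, requested]) == base
```

Both versions read cleanly in review, which is the point: the difference only shows up on inputs a hurried reviewer is unlikely to imagine.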

Over time, this creates a cycle where AI helps produce code and also helps analyze it, while human understanding sits somewhere in the middle. If that middle layer gets thinner, the system becomes harder to reason about, even if individual pieces look fine.

There’s also a more speculative concern that comes up in research discussions: the idea that highly capable models could introduce changes that are technically correct but difficult to fully evaluate without similar tooling. That’s not something most teams encounter directly today, but it’s part of the broader conversation.

Where the Role of the Developer Shifts

As these tools become more capable, the nature of the work shifts a bit.

Less time goes into writing straightforward code. More time goes into deciding what should exist in the first place, how different parts of the system interact, and whether suggested changes actually make sense in context.

Security fits into that shift. Instead of manually hunting for issues, the work leans more toward interpreting findings, understanding impact, and deciding what needs attention now versus later.

That requires a different kind of focus. Context becomes more important than raw output. Knowing how a system is supposed to behave matters more than being able to implement a fix quickly.

Does This Make Systems More Secure?

The answer isn’t particularly clean.

In environments where teams already take security seriously, these tools can extend what’s possible. More coverage, faster feedback, fewer blind spots. That tends to move things in a positive direction.

In environments where speed takes priority over understanding, the outcome can look different. More code gets shipped, more changes accumulate, and issues are discovered faster than they can be addressed. Over time, that can make systems harder to maintain and reason about.

The tooling doesn’t push things one way or the other on its own. It amplifies whatever approach is already in place.

What Seems to Matter in Practice

Teams that get value from this kind of tooling usually treat it as part of a broader process rather than a replacement for it.

They still review changes carefully. They still think about boundaries between services. They still make deliberate decisions about what gets fixed and when.

The difference is that they have more information to work with, and they can explore their systems in ways that weren’t practical before.

Teams that struggle tend to rely on the output without building the surrounding habits. That’s where the gap shows up.

A Shift That’s Hard to Ignore

For a long time, improving security meant finding more issues and fixing them faster.

That equation is starting to change. Finding issues is becoming easier. In some cases, almost trivial. The harder part is everything that follows: deciding what matters, making safe changes, and keeping the system understandable as it evolves.

Tools like Claude Mythos don’t remove that complexity. They make it much more visible.
