Software Engineering in Q2 2026

Claude Code was released about a year ago (May 2025). Martin and I have been reflecting on the craft of building software products. Where are we now?

A Very Compressed History

It started with autocomplete. GitHub Copilot finished your lines, then your functions. Cursor wired AI into the editing loop more tightly, so it could see your whole file, your error log, your intent. Useful, but still fundamentally reactive — you wrote, it helped.

Then came the agents. Claude Code — released just under a year ago — handed the AI a terminal and said: go build something. You describe the goal; the agent reads files, runs tests, iterates on failures. The developer’s role shifted from author to reviewer.

The next step was parallelism. Tools like Conductor.build let you spin up multiple agents in separate git worktrees, each working on a different task simultaneously. You're no longer pair programming; you're a manager with a team of tireless, context-limited juniors.
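The mechanics are simpler than the product pages suggest. Here's a minimal Python sketch of the pattern, with an invented task list and a headless agent command standing in for whatever CLI you actually use; real orchestrators add queueing, review, and merge handling on top:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical task list; any headless coding agent that accepts
# a one-shot prompt would slot into the command below.
TASKS = {
    "fix-auth-timeout": "Fix the session timeout bug in auth/",
    "add-csv-export": "Add CSV export to the reports page",
}

def run_in_worktree(branch: str, prompt: str) -> int:
    path = f"../wt-{branch}"
    # One isolated checkout per task, so agents can't clobber each other.
    subprocess.run(["git", "worktree", "add", path, "-b", branch], check=True)
    # Each agent edits its own directory on its own branch.
    return subprocess.run(["claude", "-p", prompt], cwd=path).returncode

# Run all agents at once; review and merge the branches afterwards.
with ThreadPoolExecutor() as pool:
    codes = list(pool.map(lambda kv: run_in_worktree(*kv), TASKS.items()))
```

The worktree is the whole trick: isolation comes from git, not from the agent.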

Then came the loop. Geoffrey Huntley’s Ralph Loop — named, inexplicably, after a character from The Simpsons — describes a different pattern: not multi-agent chaos, but a single agent running in a tight, monolithic loop, picking tasks from a spec, executing one at a time. “Software is now clay on the pottery wheel,” Huntley writes. The loop watches itself fail and hands failures back to the engineer as engineering problems to solve once, permanently. It’s deceptively simple and surprisingly powerful.
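The shape of the loop fits in a dozen lines. A sketch, assuming a hypothetical SPEC.md checkbox format and an agent CLI that takes a one-shot prompt; real Ralph loops add logging, retries, and guardrails:

```python
import subprocess
from pathlib import Path

SPEC = Path("SPEC.md")  # hypothetical format: one "- [ ]" checkbox per task

def next_task(spec_text: str) -> str | None:
    # The spec file is the only state; pick the first unfinished item.
    for line in spec_text.splitlines():
        if line.startswith("- [ ]"):
            return line.removeprefix("- [ ]").strip()
    return None

while (task := next_task(SPEC.read_text())) is not None:
    # One task per iteration, a fresh agent context every time.
    subprocess.run(
        ["claude", "-p", f"Do this task, then check it off in SPEC.md: {task}"]
    )
    # The only feedback signal the loop trusts: do the tests pass?
    if subprocess.run(["pytest", "-q"]).returncode != 0:
        print(f"Tests broke on {task!r}; fix the harness once, then rerun.")
        break
```

The spec, not the conversation, is the source of truth: every iteration starts clean and reads its marching orders from disk.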

And now we’re at the orchestration layer. Paperclip calls itself “the human control plane for AI labor” — assigning goals to multi-agent teams across dev, QA, content, research. Multica takes the project management angle: agents appear in the same assignee picker as human teammates, claim issues, report blockers, stream progress. Their tagline: “Your next 10 hires won’t be human.”

Agentic Flywheel or Code Swamp?

Here’s what’s true and uncomfortable: soon, no one will write code by hand anymore. Or rather, those who still do are already doing it inefficiently.

What hasn’t been solved is building great software. And the difference between a codebase that accelerates a product and one that bogs it down is not something you can prompt your way to.

Good architecture creates flywheel effects. Abstractions that make the next feature cheaper than the last, interfaces that let agents (and humans) reason about the system without holding the whole thing in their heads, modules that can be swapped without cascading failures. These properties aren’t accidents. They’re the result of judgment — judgment that takes time to develop and is hard to specify in a PRD.
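To make “swapped without cascading failures” concrete, here is a toy Python example (all names invented) of the kind of seam that makes the next feature cheaper than the last:

```python
from typing import Protocol

class Notifier(Protocol):
    """The seam callers depend on, never a concrete channel."""
    def send(self, user_id: str, message: str) -> None: ...

class EmailNotifier:
    def send(self, user_id: str, message: str) -> None:
        print(f"[email to {user_id}] {message}")  # SMTP details stay in here

class SmsNotifier:
    def send(self, user_id: str, message: str) -> None:
        print(f"[sms to {user_id}] {message}")  # swapped in, zero caller changes

def notify_on_overdue(notifier: Notifier, user_id: str) -> None:
    # An agent (or a new hire) can reason about this function from the
    # Protocol alone, without reading either implementation.
    notifier.send(user_id, "Your invoice is overdue.")

notify_on_overdue(EmailNotifier(), "u-42")
notify_on_overdue(SmsNotifier(), "u-42")
```

Trivial at this scale; decisive at a hundred modules, where the interface is what lets a context-limited agent work on one without breaking the other ninety-nine.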

The unresolved question is tech debt. In the old world, technical debt accumulated through shortcuts, time pressure, and the slow erosion of coherence as teams changed. With AI-generated code, the throughput is 10x, which means the rate of accumulation could be 10x too. Or not — maybe the agents are better at consistency than humans. Nobody knows yet, because the systems being built this way aren’t old enough to have rotted.

The Dreyfus Problem

In 1980, Stuart and Hubert Dreyfus published a paper called A Five-Stage Model of the Mental Activities Involved in Directed Skill Acquisition — now known simply as the Dreyfus Model. Their claim: expertise isn’t just more knowledge. It’s a qualitative shift in how you engage with a problem.

A novice follows rules. Rules like: functions should do one thing. Don’t repeat yourself. Write tests first. The rules exist because novices haven’t yet built up the pattern library that signals when to break them.

An expert doesn’t see rules. They see situations. They know this particular combination of constraints calls for a denormalized table, even though the textbook says otherwise. They recognize the early smell of a God object before it’s a crisis. They feel — and that word is intentional — when an abstraction is right.

What’s striking about the current moment is that AI agents are exceptionally good at the novice end of the Dreyfus spectrum. They follow coding conventions with perfect consistency, apply patterns without fatigue, and execute well-defined tasks. They’re also, arguably, decent at the “advanced beginner” level — recognizing common situations and applying contextual heuristics. But expert-level judgment? That’s what the loop hands back to the engineer.

And here’s the friction: the Dreyfus model tells us that you don’t reach expert intuition by skipping the lower rungs. You develop the feel for architecture by writing code — by living in a system as it degrades, feeling the weight of a bad abstraction, debugging a three-year-old decision that seemed clever at the time. If engineers stop writing code, do they lose the path to that expertise?

Maybe. Or maybe the level of abstraction shifts. Maybe the next generation of senior engineers develops their intuitions by running agents, watching systems fail, adjusting specs — and the hands-on substrate is the loop itself rather than the line of code. The Dreyfus model doesn’t say you have to write C++ to become an expert software thinker. It says you have to engage deeply with feedback from a complex system over time.

That feedback is still there. The loop still fails. The architecture still rots or it doesn’t. The craft might survive the shift — just one layer up.


An essay by Bret Taylor from December 2024, Building in the Era of Autonomous Software Development, caught Martin’s eye at the time.

“In the Autonomous Era of software engineering, the role of a software engineer will likely transform from being the author of computer code to being the operator of a code generating machine.”

“In this Autopilot Era, we are dramatically increasing the amount of software in the world, but that new software seems to contain the same security vulnerabilities and flaws as the code we were writing before, but with less oversight and maintainability.”


Sources: Geoffrey Huntley — Everything Is a Ralph Loop · Paperclip · Multica · Dreyfus Model — Wikipedia · Original Dreyfus Paper · Bret Taylor — Building in the Era of Autonomous Software Development
