The clearest signal that AI agents have changed development patterns comes from Emil building JustHTML, a pure Python HTML5 parser:

But “quickly” doesn’t mean “without thinking.” I spent a lot of time reviewing code, making design decisions, and steering the agent in the right direction. The agent did the typing; I did the thinking. That’s probably the right division of labor.

More than a story about productivity gains, this is one about role separation. The work that matters stayed with Emil: choosing test suites, picking API design patterns, deciding when to throw away code and start over. Implementation details got delegated. Simon Willison then ported the entire library to JavaScript in 4.5 hours while decorating a Christmas tree. That’s only possible when the architectural decisions are already made and the agent just needs to translate them.

I’ve been seeing this pattern myself porting Java and Go code to Rust. The translation works remarkably well when there are no dependencies to contend with and no external APIs with subtle behavioral differences. Breaking tasks into chunks under 400 lines of code per commit helps enormously. You verify each piece works before moving to the next, or make the verification criteria explicit if you want to automate that part as well, as in the sketch below.
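A minimal sketch of that gate in Python, assuming a Rust target with a test suite runnable via cargo test; the chunk names are hypothetical:

```python
#!/usr/bin/env python3
"""Gate each ported chunk on a green test run before starting the next."""
import subprocess
import sys

# Hypothetical chunk order for a Java/Go -> Rust port; each entry is a
# module the agent translates in one sub-400-line commit.
CHUNKS = ["tokenizer", "ast", "parser", "emitter"]

for chunk in CHUNKS:
    print(f"=== verifying chunk: {chunk} ===")
    # `cargo test <filter>` runs only tests whose names match the chunk,
    # so a failure points at the code that just landed.
    result = subprocess.run(["cargo", "test", chunk], capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stdout)
        print(result.stderr, file=sys.stderr)
        sys.exit(f"chunk '{chunk}' failed verification; fix it before continuing")
    print(f"chunk '{chunk}' passed; safe to move on")
```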

A markdown file with the plan becomes the source of truth: one doc or folder where you review progress, reorder tasks, even spawn parallel work across multiple sessions using git worktrees. Simon created a spec.md with milestones before telling his agent to proceed. Emil made design decisions and let the agent implement them. If you can’t specify what you want clearly enough for an agent to execute it, you roll back and keep decomposing until you can. That breakdown process is where architectural thinking develops. Specifications become compiler targets in this model.
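Treating the plan file as the source of truth also makes it machine-readable. A minimal sketch, assuming the plan uses GitHub-style task checkboxes (the filename follows Simon’s spec.md; everything else is illustrative):

```python
"""Report which tasks in the plan file are still open."""
import re
from pathlib import Path

PLAN = Path("spec.md")  # Simon's name for it; any plan file works

# GitHub-style checkboxes: "- [ ] open task" / "- [x] finished task"
TASK = re.compile(r"^\s*[-*]\s*\[(?P<done>[ xX])\]\s*(?P<title>.+)$")

open_tasks, done_tasks = [], []
for line in PLAN.read_text().splitlines():
    m = TASK.match(line)
    if m:
        (done_tasks if m.group("done").strip() else open_tasks).append(m.group("title"))

print(f"{len(done_tasks)} done, {len(open_tasks)} open")
for title in open_tasks:
    print(f"  todo: {title}")
```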

What the Evidence Shows

Related research from the University of Chicago quantifies what is actually happening:

One standard deviation higher work experience corresponds to 6% higher accept rates… This positive experience gradient for agent accepts contrasts with usage patterns of older AI tools like autocompletion which favored junior workers… experienced workers are better skilled at delegating tasks to agents, often by developing plans in their initial instructions.

The skill that matters is task specification. Experienced developers plan before implementing and send strategic direction rather than asking agents to explain code. They know what to verify because they’ve built a mental model of what should happen. Junior developers are still building that context. Without it, delegating end-to-end becomes risky: you accept what the agent produces without recognizing the shortcuts that’ll cause problems six months later.

Look at what Emil actually did: he used html5lib conformance tests from the start, picked the core API design himself, built comparative benchmarks, and created custom profilers and fuzzers. That’s architecture work. The agent might have typed thousands of lines of parser logic, but the major decisions came from human judgment about what would produce a maintainable, correct, performant library.
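To make the conformance-test choice concrete, here’s a sketch of a harness for the html5lib-tests tree-construction files, assuming their .dat layout (#data / #errors / #document sections) and a parse_to_test_format hook you supply to serialize your parser’s output in the same tree format:

```python
"""Diff a parser against html5lib-tests tree-construction cases."""
from pathlib import Path

def load_dat(path: Path):
    """Yield one dict per test case from a .dat file.

    Simplification: splits cases on blank lines, which breaks for the few
    cases whose #data itself contains a blank line.
    """
    for raw_case in path.read_text().split("\n\n"):
        sections, current = {}, None
        for line in raw_case.splitlines():
            if line.startswith("#"):  # section header: #data, #errors, #document, ...
                current = line[1:]
                sections[current] = []
            elif current is not None:
                sections[current].append(line)
        if "data" in sections:
            yield {name: "\n".join(lines) for name, lines in sections.items()}

def run(dat_file: str, parse_to_test_format) -> int:
    """`parse_to_test_format` is your hook: HTML in, html5lib tree format out."""
    failures = 0
    for case in load_dat(Path(dat_file)):
        got = parse_to_test_format(case["data"])
        if got != case.get("document", ""):
            failures += 1
            print(f"FAIL on input: {case['data']!r}")
    print(f"{failures} failures in {dat_file}")
    return failures
```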

That kind of judgment develops over time. The transition point is when you start recognizing agent output quality without running it. You see a test suite and know what it’s missing. You spot the performance footgun before profiling confirms it. That pattern recognition doesn’t require 15 years, but it does require deliberate practice reviewing code, yours and the agent’s, with someone who can explain what you’re missing.

You wouldn’t build a monolithic service handling everything at once. You decompose into focused chunks, verify each independently, then compose. The same principle applies here, except you’re decomposing cognitive work instead of code. The architect decides what to build and how to verify it. The implementer executes the plan. It’s incremental collaboration with a different collaborator.

Building Architectural Judgment

Which raises a question about how that architectural skill develops. If implementation work gets delegated to agents, where do people learn to make good architectural decisions? You can’t skip straight to “I know which test suite to use” without first writing enough parsers to understand why test coverage matters.

Start smaller. Use agents for bounded tasks where you already understand the domain: refactoring a function you wrote, adding tests to code you know works, and so on. Review every line the agent produces. When something feels off, that’s pattern recognition developing. The specification skill builds from noticing what breaks when you’re vague versus when you’re specific.
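“Adding tests to code you know works” is essentially characterization testing: pin today’s behavior so an agent’s rewrite can’t silently change it. A minimal pytest sketch; slugify and its module are hypothetical stand-ins:

```python
"""Pin current behavior before handing a refactor to an agent."""
import pytest

from myproject.text import slugify  # hypothetical function you wrote and trust

# Each case records what the code does *today*, so any agent rewrite
# that changes behavior fails loudly instead of slipping through review.
@pytest.mark.parametrize("raw, expected", [
    ("Hello, World!", "hello-world"),
    ("  spaced  out  ", "spaced-out"),
    ("already-a-slug", "already-a-slug"),
])
def test_slugify_pins_current_behavior(raw, expected):
    assert slugify(raw) == expected
```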

Older AI tools like Copilot-style autocomplete helped junior developers more because they assisted during execution. Agents are flipping this: experienced developers excel because they specify tasks clearly upfront. But being less prescriptive has genuine advantages. Junior developers experimenting without predetermined approaches can stumble into solutions that experienced developers wouldn’t consider because they already “know” the right architecture. Fresh perspectives combined with rapid agent iteration might surface patterns that “we’ve always done it this way” thinking blocks. The exploration path is slower at first but discovers different things than the specification path.

The division of labor is starting to solidify. Teams that recognize this early, by (1) investing in specification skills, (2) building verification infrastructure, and (3) treating planning artifacts as seriously as code, will adapt faster than those waiting for clarity. The question is whether you’re actively shaping what that means for your team or passively discovering it through hiring mismatches and productivity confusion.