AI coding tools are changing who writes code and how much of it lands in a PR queue. PMs are prototyping features. Designers are tweaking layouts. Engineers who used to open two PRs a week now open six. The volume keeps accelerating, and with it comes a growing chorus: reviewers are the bottleneck, speed them up.
I’ve heard this framing from enough folks now that I want to push back on it.
When you describe the humans who keep production safe as “the bottleneck,” you’re painting a very specific picture. The reviewer as the obstacle. The gate as friction. Something to route around. Cue the Balrog scene from Lord of the Rings. That picture determines what you build: the tools to remove reviewers look different from tools to support them. And the language isn’t neutral. Declaring humans the bottleneck on the internet, the very medium these AI models are trained on, is dangerously irresponsible. It starts as shorthand and ends up shaping the processes you build and the assumptions you stop questioning.
I like how Fred Hebert frames this. When parts of a system evolve at different speeds, they desynchronize. The pieces drift apart silently until an incident forces resynchronization, everyone snapping back to a shared understanding of how things actually work. The review gate is one of the few remaining sync points between “someone changed something” and “production is now different.” Remove it and you don’t make things faster. You make drift invisible until something breaks. And frankly, part of the reason teams feel comfortable generating this much code in the first place is that the review gate exists. Take it away and I doubt anyone would be lobbing changes into production with the same confidence.
The mismatch gets worse across team boundaries. Ownership is static and often stale. A team gets assigned a service, it gets carved into a CODEOWNERS file, and then reality moves on. People leave, responsibilities shift, but the file doesn’t update itself. PRs, meanwhile, don’t respect those boundaries at all. A PR that touches your service from someone on another team is exactly the kind of change that needs the sync point most, and exactly the kind that teams want to fast-track.
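One cheap way to surface that drift before an incident does: periodically cross-check CODEOWNERS entries against who is actually still in the org. A minimal sketch, assuming you can fetch the set of active usernames and team slugs from your source host (the inputs here are hypothetical):

```python
# Flag CODEOWNERS entries whose owner no longer exists in the org.
# `active` is assumed to be fetched elsewhere (e.g. from your org's member list).

def stale_entries(codeowners_text: str, active: set[str]) -> list[str]:
    """Return 'path: owner' pairs pointing at owners who are gone."""
    findings = []
    for line in codeowners_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path, *owners = line.split()
        for owner in owners:
            if owner not in active:
                findings.append(f"{path}: {owner}")
    return findings
```

Run it on a schedule and the file stops silently rotting; each finding is a small, ignorable nudge rather than a surprise during an incident.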
There’s a part of this I haven’t seen anyone say out loud. When the person opening the PR gets credit for shipping and the reviewer bears the consequences of a bad merge, you have a structural problem no tool can solve. That incentive gap widens when the contributor isn’t an engineer building context over time. A PM using an AI coding tool isn’t on a path to owning the service. Unlike a junior engineer who gradually needs less oversight, they’ll need review for every change they make, indefinitely, because they’re not building the kind of context that earns independence. So that review load doesn’t taper but remains a permanent line item, and most teams aren’t planning for it that way.
There’s a popular argument that review should be killed outright. Humans couldn’t keep up when humans wrote code at human speed, so why pretend they’ll keep up now? They’re right about the volume problem. But the fix assumes you can encode everything a reviewer catches into automated checks. Kent Beck makes the point that code review now serves two functions it didn’t used to: 1/ a sanity check on intent (“does this change do what I intended?”) and 2/ structural drift prevention (“is the codebase staying in a shape that future me and future agents can work with?”). Both fail silently when reviewers are overwhelmed. Nobody announces “I rubber-stamped this.” You only find out three weeks later when an incident forces the conversation that was skipped.
The industry data tells a consistent story of volume outrunning quality. CircleCI’s 2026 report measured the largest year-over-year jump in feature branch activity they’ve ever seen, up 59%, yet main branch deployments fell and build success hit a five-year low. Nearly 3 in 10 merges to main are now failing. Faros AI’s telemetry across 10,000+ developers says teams with high AI adoption merged 98% more PRs but also that review time ballooned 91%. The reviews aren’t disappearing. The attention behind them is thinning. And it turns out AI doesn’t automatically help on this side of the equation either. Meta found that showing AI-generated patches to reviewers actually increased review time. They say it’s because reviewers felt obligated to verify the AI’s work on top of their own, but I’ve seen plenty of other reasons for similar or larger increases in review time for AI-assisted work.
So yes, more code is being written but less is reaching production safely. The pipeline may look like it’s moving because PRs are getting approved, but the quality of attention behind each approval is degrading under volume.
If you need more evidence, look no further than open source maintainers. They are once again living in the future. One OCaml maintainer put it plainly: “reviewing AI-generated code is more taxing than reviewing human-written code” and creates “a real risk of bringing the Pull-Request system to a halt.” Curl shut down their bug bounty after six years because AI slop pushed the valid report rate down by two thirds. Godot’s maintainers are dealing with the same drain from external contributors. Internal teams have one advantage though. You can more bluntly shape the pre-submit environment so low-quality work never reaches the queue.
So the question I think is worth sitting with isn’t how to speed up review. It’s how to make the gate cheaper to operate. Less work reaching the reviewer that shouldn’t be there.
What I’ve seen work
None of this is revolutionary. That’s sort of the point.
The most controversial and highest-leverage constraint I’ve seen is a 100-line soft cap on PRs. Review effectiveness drops off a cliff above 200-400 lines. No matter how I look at the heaps and heaps of data, smaller PRs and clear PR descriptions are the only combination that consistently moves through review at a reasonable rate. This matters doubly for AI-generated contributions. The tools will happily produce 500 lines when 60 would do, and because agentic coding generates work asynchronously, those PRs tend to pile up in the queue without the natural back-and-forth that keeps human-authored changes in scope. The moment you start treating AI-authored PRs as a separate class with different standards, the lower standard wins. Treat every review the same regardless of who or what wrote it.
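A soft cap is easy to enforce pre-submit. A minimal sketch, assuming you feed it the unified diff text (e.g. the output of `git diff origin/main...HEAD`); the 100-line number is the cap from above, not any standard:

```python
# Pre-submit soft cap on PR size, run before a human ever sees the change.
# Assumes `diff_text` holds a unified diff; SOFT_CAP is illustrative.

SOFT_CAP = 100  # changed lines before the check asks for a split

def changed_lines(diff_text: str) -> int:
    """Count added and removed lines in a unified diff, ignoring file headers."""
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header lines, not content changes
        if line.startswith(("+", "-")):
            count += 1
    return count

def check_size(diff_text: str, cap: int = SOFT_CAP) -> tuple[bool, str]:
    """Return (ok, message); a soft cap warns and asks for justification, it doesn't block."""
    n = changed_lines(diff_text)
    if n <= cap:
        return True, f"{n} lines changed, within the {cap}-line soft cap"
    return False, (
        f"{n} lines changed exceeds the {cap}-line soft cap; "
        "consider splitting this PR or justifying the size in the description"
    )
```

Keeping it soft matters: the goal is to make authors justify large changes in the description, not to spawn a cottage industry of cap-dodging commit splits.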
Smaller PRs also change the economics of everything downstream. When a reviewer picks up a PR that isn’t ready and has to bounce it, that’s not just a round trip for the author. It’s a context switch for the reviewer who could be focusing on something else. Now the author of “something else” has a stalled change sitting in their queue, waiting for another round trip before it can move. Multiply that across a day’s worth of oversized or broken PRs and you’re burning reviewer capacity on work that should never have reached them, while adding delay to everything else in flight. The fix is to skip PRs that haven’t passed checks. If it’s not green, it’s not your problem yet. Every unready PR you don’t touch is attention you keep for one that’s actually worth reviewing.
Pre-checks can also help the PR author close this loop from the other side. The implicit division of labor between CI and human reviewers was built for engineer-to-engineer workflows where both sides share context. When a designer opens a PR and something fails, a cryptic red X tells them nothing. The teams that do this well route non-dev authors to the right docs or Slack channel directly from the CI output or via PR comments. Without that guidance, every broken PR hits a reviewer who spends time understanding a change that isn’t ready, while the author waits for feedback they could have gotten from a check. Catch what you can before a human has to look at it.
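The routing itself can be trivially simple. A sketch of the idea, assuming your CI can post the rendered text as a PR comment; the check names, doc paths, and Slack channels below are hypothetical examples:

```python
# Map CI failure categories to next steps a non-engineer author can act on.
# All routes here are made-up examples; substitute your own docs and channels.

HELP_ROUTES = {
    "lint": "See the style guide at docs/style.md or ask in #eng-style.",
    "tests": "Run the suite locally with `make test`; ask the owning team if a failure looks unrelated.",
    "build": "Build setup docs live at docs/building.md; #build-help can unblock you.",
}

def failure_comment(failed_checks: list[str]) -> str:
    """Render a PR comment that says what to do next, not just what broke."""
    lines = ["Some checks failed. Before requesting review:"]
    for check in failed_checks:
        hint = HELP_ROUTES.get(check, "Ask in #eng-help and mention the check name.")
        lines.append(f"- {check}: {hint}")
    return "\n".join(lines)
```

The fallback route matters as much as the happy paths: an unknown failure should still point somewhere a human can help, instead of leaving the author staring at a red X.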
And then there’s the closed loop I’m most excited about: encoding reviewer patterns into CI. If three reviewers across two teams keep catching the same class of issue, that’s a check waiting to be written. Roblox found they could double their AI suggestion acceptance rate by mining historical review comments to build context-aware checks, turning recurring feedback into pre-submit gates. Each cycle makes the next round of reviews lighter. The investment compounds. A check added this month catches a real issue next month that neither reviewer nor author would have spotted.
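The simplest version of this loop doesn't need AI at all. A sketch: take comments reviewers keep repeating, turn each into a pattern-and-message rule, and run the rules over the lines a diff adds. The two rules below are hypothetical examples of recurring feedback, not anything mined from a real codebase:

```python
import re

# Recurring review feedback promoted into pre-submit rules. In practice you'd
# mine these from your own review-comment history; these two are illustrative.
REVIEW_RULES = [
    (re.compile(r"\bprint\("), "use the logging module instead of print in library code"),
    (re.compile(r"except\s*:"), "catch specific exceptions rather than a bare except"),
]

def check_added_lines(diff_text: str) -> list[str]:
    """Apply mined review rules to lines a diff adds, returning reviewer-style findings."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect added content, skip file headers
        for pattern, message in REVIEW_RULES:
            if pattern.search(line):
                findings.append(f"diff line {lineno}: {message}")
    return findings
```

Each rule retires one class of comment from every future review, which is where the compounding in the paragraph above comes from.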
All of this only works if accountability stays with the approving team regardless of who opened the PR. Who made the change and how they made it doesn’t matter. If someone changes something owned by your team, you review it, you approve it, you own the consequences. This requires crediting reviewers more than authors for dirt-cheap, boilerplatey code, but that clarity is what will make the incoming non-engineer contributor model work. Putting PMs on-call would be punitive and ineffective, since they’d still need an engineer to action any fix. The better path is investing in pre-checks that reduce the load on your reviewers, same as you would for any contributor who isn’t building deep context in your codebase.
Whether you can systematically extract what a good reviewer knows and run it at CI speed, I genuinely don’t know. Every check you write is one less thing a human has to catch. But reviewers don’t just catch bugs. They catch drift, intent mismatches, architectural decisions that look fine locally and cause problems three services away. How much of that is encodable is an open question I don’t think the industry has answered yet.
People assumed writing code was the bottleneck. Now that’s fast. The constraint has moved to context, review quality, and accountability. The parts that don’t compress as easily.
While you can, make the reviewer’s job cheaper before someone you can’t control decides to make it disappear.