AI lets developers write code faster. GitHub reports 43.2 million pull requests merged per month in 2025, up 23% year-over-year. More code flowing into the system should mean faster delivery, right?

Except a controlled study of 800 developers using GitHub Copilot found essentially no improvement in PR throughput, just a 1.7-minute decrease in cycle time. The code is being written, but it’s not moving faster through the pipeline.

Little’s Law, proven in 1961, says L = λW. The average number of items in a system equals the arrival rate times the average time each item spends in the system. Double the arrival rate while the time per item holds steady and you double the work in progress. And if the service rate can’t keep up, the time per item doesn’t hold steady: wait times climb, and the queue grows faster than linearly.
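
Here’s the arithmetic with illustrative numbers (mine, not figures from any of the studies below):

```python
# Little's Law: L = lambda * W. All numbers are illustrative.
arrival_rate = 10       # PRs opened per day (lambda)
time_in_system = 2.0    # average days each PR stays open (W)

print(arrival_rate * time_in_system)      # L = 20 PRs in flight

# Double the arrival rate with W unchanged: 40 PRs in flight.
# And that's the optimistic case, since W itself grows once
# reviewers fall behind.
print(2 * arrival_rate * time_in_system)  # 40.0
```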

PRs are arriving faster (λ↑), but human review capacity hasn’t changed. The queue grows.

The service rate is getting worse, not staying constant

The simple version of Little’s Law assumes your service rate (how fast you can process work) stays fixed. But GitClear’s analysis of 211 million lines of code from 2020-2024 suggests AI isn’t just increasing the arrival rate. It’s making each PR harder to review.

Copy-pasted code blocks increased from 8.3% to 12.3% of all changed lines. That’s a 48% relative jump! Code refactoring dropped from 24.1% to 9.5%. Newly written code that gets revised within two weeks climbed from 5.5% to 7.9%. The Uplevel study found a 41% increase in bug rates for developers with Copilot access.

This isn’t just more work arriving. It’s messier work that requires more careful attention. The effective service rate (μ) is decreasing while the arrival rate increases. And in queueing theory, that combination is vicious: as the arrival rate approaches the service rate, wait times don’t grow linearly, they blow up.
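
You can see the blow-up with the simplest textbook model, an M/M/1 queue, where the average time in system is W = 1/(μ − λ). Code review isn’t really memoryless and PR arrivals aren’t really Poisson, and the numbers below are invented, but the shape of the curve is the point:

```python
# M/M/1 queue: average time in system W = 1 / (mu - lambda).
# A simplified model with invented numbers.

def time_in_system(lam: float, mu: float) -> float:
    """Average wait + service time for an M/M/1 queue."""
    if lam >= mu:
        return float("inf")  # arrivals outrun service: unbounded queue
    return 1.0 / (mu - lam)

mu = 10.0  # PRs a team can review per day
for lam in (5.0, 8.0, 9.0, 9.5, 9.9):
    print(f"lambda={lam:4.1f}  W={time_in_system(lam, mu):6.2f} days")

# lambda=5.0 -> 0.20 days; lambda=9.9 -> 10.00 days. Going from 50%
# to 99% utilization multiplies time-in-system by 50x. Now drop mu
# from 10 to 8 (messier PRs) with lambda at 7: W triples, from 0.33
# to 1.00 days. A 20% capacity loss, a 3x latency hit.
```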

Human review capacity has a hard ceiling

Even if code quality stayed constant, review capacity can’t scale linearly. A 2006 study of 2,500 code reviews at Cisco found that effectiveness drops sharply above 450 lines per hour: 87% of reviews conducted faster than that threshold had below-average defect detection. The optimal range is 200-400 lines of code per review, inspected at under 500 LOC per hour.

These numbers point to cognitive limits. Reviewing 400 lines at 300 LOC/hour takes 80 minutes of sustained attention, and that’s already at the effectiveness ceiling. You can’t just “review faster” without missing things.
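
Put rough numbers on that ceiling. Assuming, hypothetically, a team of five reviewers who each manage two focused sessions a day within the study’s limits:

```python
# Rough capacity ceiling for careful review. Team size and
# sessions-per-day are my assumptions; the per-session budget
# comes from the study's 200-400 LOC range.

reviewers = 5
sessions_per_day = 2      # focused review sessions per reviewer
loc_per_session = 400     # top of the effective 200-400 LOC range

capacity = reviewers * sessions_per_day * loc_per_session
print(f"careful-review ceiling: {capacity} LOC/day")  # 4000

# If AI-assisted authors now push 6000 changed LOC/day at this team,
# the extra 2000 LOC/day either queues up or gets rubber-stamped.
arriving = 6000
print(f"daily shortfall: {arriving - capacity} LOC")  # 2000
```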

This looks familiar if you’ve designed distributed systems. It’s the same dynamic as a load balancer sending requests faster than backend services can process them. The queue depth grows, latency spikes, and eventually timeouts start cascading. You can’t fix it by telling the backends to “work harder” when they’re already at CPU limits.

The difference is that in systems, you can add more backends (up to a limit of course). In code review, you can’t just add more senior engineers in seconds (and definitely not in this market).

The steady-state assumption doesn’t hold

Little’s Law requires a system in steady state—arrival rate equals departure rate over time. Software development rarely meets this condition. Work comes in bursts. PRs block each other. A deployment freeze pauses everything, then releases a flood.

But the principle still holds even if the formula doesn’t apply exactly. When you increase the rate of work arriving without increasing the rate at which work can be processed, queues grow. The instability just makes it worse.
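
A toy simulation shows the effect. Average arrivals sit right at review capacity, but a two-day freeze followed by a flood builds a backlog that never drains (all numbers invented):

```python
# Toy simulation: bursty PR arrivals vs. fixed review capacity.
import random

random.seed(7)
capacity = 10  # PRs the team can review per day
queue = 0
for day in range(1, 15):
    if day in (5, 6):       # deployment freeze: PRs pile up unreviewed
        arrivals, reviewed = 12, 0
    elif day == 7:          # freeze lifts: the flood
        arrivals, reviewed = 30, capacity
    else:
        arrivals = random.randint(7, 13)   # averages ~capacity
        reviewed = min(capacity, queue + arrivals)
    queue += arrivals - reviewed
    print(f"day {day:2d}: +{arrivals:2d} in, {reviewed:2d} reviewed, {queue:3d} waiting")

# After day 7 the team runs at 100% utilization just keeping pace,
# so the backlog from the burst has nowhere to go.
```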

The Uplevel study measured access to Copilot, not usage. It’s possible developers with access weren’t using it heavily, or were using it in ways that didn’t affect PR volume. But Stack Overflow’s 2024 survey found that 62% of professional developers are actively using AI tools, up from 44% the previous year. The arrival rate increase is happening.

What happens when the bottleneck shifts from “writing code” to “reviewing code”? Individual assignment helps (as we saw with the bystander effect), but it doesn’t change the fundamental capacity constraint. You’re just distributing the load more efficiently across a service fleet that’s already at capacity.

The compounding bottleneck

The bottleneck has shifted from writing code to reviewing it. And the math is worse than the simple version of Little’s Law suggests.

It’s not just that PRs are arriving faster (λ↑). It’s that each one is harder to review—more copy-paste, less refactoring, higher bug rates. The service rate is going down (μ↓) while the arrival rate goes up. In queueing theory, that’s when systems break.

The question isn’t just “how do we review faster?” It’s whether the long-term codebase health cost is worth the short-term velocity gain. If AI code needs more careful review, not less, we might be optimizing the wrong part of the system.