AI Made Our Junior Engineers Faster and Our Senior Engineers Slower

Last year I rolled out AI development tools across an engineering organisation. Code generation, automated testing, bug detection, intelligent code review. We built custom AI assistants for documentation and code optimisation. Added architectural decision support later. The results were good on paper. Developer productivity up 40%. Code review time cut by half. The kind of numbers that make a slide deck look excellent.

But the numbers hid something. The gains were not evenly distributed. Almost all of the improvement was coming from one group of engineers. The other group was getting slower.

What the juniors did with it

Our junior engineers took to the tools immediately. They treated the AI like a very fast colleague who could scaffold boilerplate and write test cases while they focused on understanding the business logic. Pull request volume went up. The code was decent. Review feedback dropped because the AI had already caught the obvious things before a human looked at them. They were shipping faster.

This made sense to me at the time. Junior engineers spend most of their day on routine code. API integrations. Data transformations. Standard patterns they've seen in tutorials but haven't memorised yet. AI is very good at exactly this kind of work. It has seen every Stack Overflow answer ever written. For a junior writing a pagination handler for the third time in their career, having the AI generate it in seconds is a real gain.
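To make that concrete, here is the sort of routine pagination handler I mean. This is an illustrative sketch, not code from our codebase; the names and the framework-free shape are invented:

```python
# Illustrative sketch of a routine pagination handler, the kind of
# boilerplate an AI assistant generates well. Names are hypothetical.

def paginate(items, page=1, per_page=20):
    """Return one page of results plus the metadata a client needs."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total = len(items)
    start = (page - 1) * per_page
    return {
        "items": items[start:start + per_page],
        "page": page,
        "per_page": per_page,
        "total": total,
        "total_pages": (total + per_page - 1) // per_page,  # ceiling division
    }
```

There is nothing interesting in that code, which is the point. A junior gets it in seconds instead of twenty minutes, and nothing about it requires knowledge of the wider system.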

I was pleased. This was what the business case had predicted.

What happened to the seniors

The senior engineers were a different story. I noticed it in standup first. People who normally closed two or three substantial pieces of work per sprint were closing one. Pull requests were taking longer, not shorter. The code was fine but the pace had dropped.

I assumed it was adoption friction. New tools take time. Give it a month. I gave it two months. The pattern held.

When I sat down with the team leads to understand what was happening, the answer was consistent. The AI was generating code that looked right but wasn't quite right. Not wrong enough to fail tests. Wrong in the way that only someone with deep system knowledge would notice. A database query that worked but would degrade at scale. An event handler that passed all tests but introduced a subtle race condition. An abstraction that solved today's problem but made next quarter's feature harder to build.
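The "works now, degrades at scale" failure deserves a concrete example. Here is a hypothetical version of it, with invented table and column names: OFFSET pagination passes every test on a small table, but the database has to scan and discard every skipped row, so deep pages get slower as the table grows. Keyset pagination is the version someone with system knowledge writes instead.

```python
# Hypothetical illustration of "correct but degrades at scale".
# Both queries return a page of events; only their scaling differs.
# Table and column names are invented for the example.

def offset_page_query(page, per_page=50):
    """Looks right and passes tests, but OFFSET forces the database to
    scan and throw away (page - 1) * per_page rows, so page 10,000 is
    vastly slower than page 1 on a large table."""
    offset = (page - 1) * per_page
    return f"SELECT * FROM events ORDER BY id LIMIT {per_page} OFFSET {offset}"

def keyset_page_query(last_seen_id, per_page=50):
    """The keyset alternative: seek straight to the next page via an
    indexed comparison, roughly constant cost at any depth."""
    return (
        f"SELECT * FROM events WHERE id > {last_seen_id} "
        f"ORDER BY id LIMIT {per_page}"
    )
```

A test suite running against a few hundred seeded rows will never catch the difference. Someone who knows the table grows by millions of rows a month will.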

The seniors were spending their time reviewing AI output instead of writing code. One of them put it in a way I haven't been able to forget. He said it was like editing the work of a very confident intern. The intern writes fast, produces a lot, and is wrong in ways that take longer to find than it would have taken to just write the thing yourself.

I later found research that matched exactly what I was seeing. A study from earlier this year found that experienced developers were 19% slower when using AI coding tools on complex tasks. The explanation was what my team had already told me. Senior engineers build a mental model of the system. They hold it in their head and write code that fits. When the AI generates code, that mental model gets bypassed. Now they have to read someone else's approach, check it against what they know about the system, and decide whether to accept it or throw it away. That loop is slower than just writing the thing from scratch.

The paradox

This is the bit that stuck with me. AI coding tools are best at routine, pattern-matching work. Exactly the kind of code juniors write all day. And they are worst at architectural decisions, at code that has to account for constraints nobody put in the prompt. Which is what you pay senior engineers to think about.

So we built tools that speed up the cheapest people on the team and slow down the most expensive ones. Not a great trade if you think about it for more than five minutes.

What we changed

We stopped treating the tools as one-size-fits-all.

For junior engineers, we kept code generation on. They use it for scaffolding, test cases, boilerplate. But we added a rule: every AI-generated block over 20 lines needs a handwritten comment explaining what it does and why. If you can't explain it, you don't ship it. This slowed them down a bit. It also meant they were actually learning from the code instead of just accepting it.
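We enforced the rule by convention rather than tooling, but a check along these lines is easy to sketch. Assume, hypothetically, that generated blocks are fenced between "# ai-begin" and "# ai-end" markers and the handwritten explanation is a comment starting with "# why:":

```python
# Sketch of a check for the "explain AI blocks over 20 lines" rule.
# Assumes a hypothetical convention: AI output is fenced between
# "# ai-begin" and "# ai-end", and the handwritten explanation is a
# comment line starting with "# why:" somewhere inside the block.

def unexplained_ai_blocks(source, max_lines=20):
    """Return (start_line, block_length) for every AI-generated block
    longer than max_lines that lacks a handwritten '# why:' comment."""
    violations = []
    block_start, block_lines, explained = None, [], False
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped == "# ai-begin":
            block_start, block_lines, explained = lineno, [], False
        elif stripped == "# ai-end" and block_start is not None:
            if len(block_lines) > max_lines and not explained:
                violations.append((block_start, len(block_lines)))
            block_start = None
        elif block_start is not None:
            block_lines.append(line)
            if stripped.startswith("# why:"):
                explained = True
    return violations
```

Wired into a pre-commit hook, a check like this turns the rule from a code-review nag into a gate: if the explanation is missing, the commit does not go through.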

For senior engineers, we moved AI to the review side. They stopped using it to generate code. Instead, the AI runs as a first-pass reviewer on pull requests. It flags potential issues, checks for common patterns, scans for security problems. Then the senior does the real review with the AI's notes as a starting point. This actually saved them time, because the AI handled the mechanical checks and they could focus on architecture and design.

We also set up different assistant configurations by level. Juniors get a general-purpose coding assistant. Seniors get one tuned for architecture review, dependency analysis, and system design questions. Same underlying models. Different system prompts, different tool access. The senior assistant does not generate implementation code at all. It discusses trade-offs.
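In configuration terms, the split looked roughly like this. The field names and tool identifiers below are invented for illustration, not any particular vendor's API:

```python
# Sketch of per-level assistant configuration. Field names and tool
# identifiers are invented for illustration, not a real vendor API.

ASSISTANT_PROFILES = {
    "junior": {
        "system_prompt": (
            "You are a pairing assistant. Generate code on request, "
            "but always explain what the generated code does and why."
        ),
        "tools": ["generate_code", "write_tests", "explain_code"],
    },
    "senior": {
        "system_prompt": (
            "You are an architecture reviewer. Discuss trade-offs, "
            "dependencies, and failure modes. Never write implementation "
            "code; respond with analysis and questions."
        ),
        # Same underlying model, different tool access: no code generation.
        "tools": ["review_diff", "analyze_dependencies", "security_scan"],
    },
}

def tools_for(level):
    """Look up the tool allowlist for an engineer's level."""
    return ASSISTANT_PROFILES[level]["tools"]
```

The important part is the last line of the senior prompt and the missing generation tool: the constraint is enforced twice, once in instructions and once in what the assistant is physically able to call.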

After these changes, the productivity numbers went up again. Not because anyone was writing more code. Because the tool was finally doing different things for different people.

The thing that worries me

There is a problem here I don't have a good answer for yet. I've started calling it cognitive debt.

Technical debt is code nobody wants to maintain. Cognitive debt is code nobody understands because nobody wrote it. When a junior engineer generates 500 lines of working code with an AI tool and ships it, that code enters the codebase. It works. Tests pass. But nobody on the team sat with the problem long enough to build a mental model of the solution. Six months later when something breaks, the person debugging it is reading code for the first time. They don't know why it was written that way because the person who "wrote" it didn't make those decisions. The AI did.

We are building systems out of code that humans approved but did not write. That is a new kind of risk. I don't think the industry has thought about it enough yet.

The handwritten comment rule is a small hedge against this. Making people explain AI-generated code at least creates a record of human understanding. But it is not enough. The real fix probably looks like changing how we think about code ownership. If you shipped it, you own it, no matter who or what generated it. Ownership means you can explain every line when someone rings you at 2am because the billing service is down.

Where this leaves us

AI development tools work. The 40% productivity gain was real. But it was not free and it was not even. The juniors got faster. The seniors got slower. The fix was not to pull the tools out. It was to stop assuming everyone benefits the same way.

If you are rolling out AI coding tools across an engineering organisation, measure the impact by seniority level. You will probably find what I found. And if you don't adjust for it, you will end up with faster juniors, frustrated seniors, and a codebase full of code that nobody really understands.