Progressive Autonomy for AI Agents: From Intern to Principal
Most companies give AI agents either full access or no access. Binary. On or off. An agent that can query a database can also delete records, because the permissions model does not know the difference between reading and destroying.
I spent the last year building an agent governance system that treats this differently. The core idea is not complicated. Agents are workers. Workers earn trust over time. So we built a system where they do exactly that.
Why RBAC falls apart for agents
RBAC was built for humans. A human with database admin access understands that you do not drop tables in production on a Friday afternoon. An agent does not have that instinct. It follows instructions. If the instructions are wrong, or if the model hallucinates a step, the permissions model will not help you.
So most teams pick one of two bad options. Give agents broad access and hope nothing goes wrong. Or lock them down so hard they cannot do anything useful.
We wanted a third option.
Agents as employees
In our platform, every agent gets an identity. A proper identity token that follows it across every system it touches. That identity carries a trust level. The trust level controls what the agent can and cannot do.
Four levels. Each one earned, not given.
Intern
Read only. Query data, read logs, look at things. Cannot change anything. Every new agent starts here. No exceptions, no fast track.
Junior
Can write to non-critical systems, but a human approves each action first. The agent proposes a change. A person reviews it. If it looks right, the person approves. Slow on purpose.
Senior
Production writes, no human in the loop per action. But everything is logged. What was called, what parameters were passed, what the system state looked like before and after. Full audit trail.
Principal
Broad tool access. Can chain actions across services and make production decisions. Runs under real time behavioural monitoring, which I will get to shortly.
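The four levels form a strict ordering, which makes them easy to represent in code. A minimal sketch (the class and its numeric values are my illustration, not the platform's actual API):

```python
from enum import IntEnum

class TrustLevel(IntEnum):
    """Ordered trust levels; a higher value means more autonomy."""
    INTERN = 1     # read only
    JUNIOR = 2     # non-critical writes, human approves each action
    SENIOR = 3     # production writes, fully logged
    PRINCIPAL = 4  # broad tool access under behavioural monitoring
```

Using an IntEnum means level comparisons come for free: `TrustLevel.INTERN < TrustLevel.SENIOR` is just integer comparison, which is all the permission check below needs.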
How trust gets earned
Agents do not get promoted because someone ticks a box. The system tracks every tool call. Success rates. Policy compliance. Whether outputs match known correct answers when we can check.
After enough correct behaviour at one level, the agent becomes eligible for the next. A human still approves the promotion. But the data backs the decision.
This is what a good manager already does with new hires. Small tasks first. Check the results. Bigger tasks if the results are good. We just wrote it down as code and made it auditable.
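The eligibility check can be sketched as a pure function over the tracked metrics. The record fields and thresholds here are illustrative assumptions, not the system's real numbers; the important property is that eligibility only flags the agent for review, because a human still approves the promotion:

```python
from dataclasses import dataclass

@dataclass
class AgentRecord:
    """Per-agent metrics accumulated from every tool call (illustrative)."""
    total_calls: int
    successful_calls: int
    policy_violations: int

def eligible_for_promotion(record: AgentRecord,
                           min_calls: int = 500,
                           min_success_rate: float = 0.98) -> bool:
    """True when the data supports promotion to the next level.
    Thresholds are hypothetical; a human approves the actual promotion."""
    if record.total_calls < min_calls:
        return False            # not enough history to judge
    if record.policy_violations > 0:
        return False            # any violation resets eligibility
    return record.successful_calls / record.total_calls >= min_success_rate
```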
The permission check
Every tool in the system has a minimum trust level attached. A read only database query might need Intern. A production deployment needs Principal. When an agent calls a tool, the system checks the agent's level against the tool's requirement. Too low? Blocked. Logged. Done.
This happens on every call. No caching, no "it worked last time." Every call is checked fresh, because the agent's level can change between calls if something goes sideways.
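The check itself is a single comparison, done fresh each time. A sketch, with hypothetical tool names and level numbers (1 = Intern through 4 = Principal):

```python
import logging

# Minimum trust level each tool requires (tool names are illustrative).
TOOL_REQUIREMENTS = {
    "db.read": 1,             # Intern
    "ticket.update": 2,       # Junior
    "db.write": 3,            # Senior
    "deploy.production": 4,   # Principal
}

def check_call(agent_level: int, tool: str) -> bool:
    """Fresh check on every call — no caching, because the agent's
    level can drop between calls."""
    required = TOOL_REQUIREMENTS[tool]
    allowed = agent_level >= required
    if not allowed:
        # Blocked calls are logged as well as denied.
        logging.warning("blocked: %s needs level %d, agent has %d",
                        tool, required, agent_level)
    return allowed
```

Because the comparison is stateless, demoting an agent takes effect on its very next call with no cache to invalidate.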
Watching for strange behaviour
Even Principal agents are not fully trusted. The system builds a behavioural baseline for each agent. How many records does it normally read? How often does it write? Which tools does it call?
An agent that normally reads 10 records suddenly trying to read 10,000 gets flagged. An agent that calls three tools on a normal day suddenly calling fifteen gets flagged. Does not always mean something is wrong. Does mean someone should look.
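One simple way to implement that kind of baseline flagging is a deviation test against the agent's own history. This is a minimal sketch, assuming a per-metric history of daily counts; the three-sigma threshold is an illustrative choice, not the system's actual rule:

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], today: float, k: float = 3.0) -> bool:
    """Flag when today's count sits more than k standard deviations
    above the agent's own baseline. A flag means 'a human should look',
    not 'something is wrong'."""
    if len(history) < 2:
        return False            # not enough data for a baseline yet
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today > mu       # perfectly steady history: any increase is notable
    return today > mu + k * sigma
```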
Kill switches
Every trust level has a kill switch. If an agent does something that worries you, drop it to Intern in milliseconds. Not minutes. The very next tool call hits the new, lower permission level. If things are bad enough, revoke the identity entirely. The agent is frozen until someone investigates.
Agent failures are not like human failures. A person makes mistakes at human speed. An agent can make a thousand bad calls in a second. You need a response mechanism that matches that speed.
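The demotion and revocation paths above can be sketched as operations on the agent's identity. Everything here is hypothetical illustration; the point is that the level change is atomic, so the very next permission check sees the new, lower level:

```python
import threading

class AgentIdentity:
    """Minimal kill-switch sketch. A lock keeps level changes atomic
    so no in-flight check reads a stale value."""

    def __init__(self, level: int):
        self._level = level
        self._frozen = False
        self._lock = threading.Lock()

    def demote_to_intern(self) -> None:
        with self._lock:
            self._level = 1     # Intern: read only

    def revoke(self) -> None:
        with self._lock:
            self._frozen = True # frozen until a human investigates

    def current_level(self) -> int:
        """Called by the permission check on every tool call."""
        with self._lock:
            if self._frozen:
                raise PermissionError("identity revoked")
            return self._level
```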
What went wrong early on
The first version had three levels. No Junior. Agents went from read only straight to production writes with logging. Too big a jump. We had an agent get promoted too quickly, make a run of bad writes, and it took hours to clean up. The Junior level exists because of that incident. The human approval step on every write slows things down, but it caught problems we would have missed.
We also did not track behavioural patterns at first. Only permission levels. An agent could go from reading 5 records a day to reading 50,000, and as long as it had the right level, nobody noticed. That gap is closed now, but it should have been there from day one.
Why regulators will ask about this
EU AI Act, Article 14. Human oversight of high-risk AI systems. Most companies treat this as a checkbox exercise. Add a human review step somewhere and move on.
Progressive autonomy gives you something real to show an auditor. At the Intern and Junior levels, humans are in the loop on every write. At Senior and Principal, oversight is structural: audit logs, behavioural monitoring, kill switches. Not bolted on after the fact. Built into how the system works.
When someone asks how you maintain human oversight, you show them the trust levels, the promotion history, the audit trail, the kill switch logs. That holds up better than a process diagram with a "human review" box in it.
What comes next
Promotions still need human approval. I think that is right for now. But the system already has enough data to recommend promotions. The question is how much friction we want to remove, and when.
I also think four levels is too coarse. A Senior agent probably should not have the same permissions for the reporting database and the payments system. Per-system trust levels are on the list.
The basic principle stays the same, though. Agents earn trust through behaviour. That trust is tracked. And it can be taken away faster than it was given.