When an AI agent deployment fails, the post-mortem almost always reaches the same conclusion: the technology wasn't ready. Sometimes that's true. But in my experience, it's the least common reason — and the one that gets blamed most often.

I've had some version of the same conversation dozens of times. A founder, an operator, or a department head tells me they tried agents and it didn't work. When I ask what happened, the answer follows a pattern I've come to recognise: they deployed, it worked okay for a while, outputs started to drift, someone complained, and eventually the whole thing was quietly shut down with a note that said something like "AI not ready for this use case."

Sometimes the AI genuinely wasn't ready. The task was too complex, the context too ambiguous, the model capability not yet there. Those failures are real and worth taking seriously.

But a lot of the time — more than most organisations are willing to admit — the AI was fine. What was missing was anyone managing it. Nobody owned the agent. Nobody reviewed its outputs. Nobody iterated on it when early results were mediocre. Nobody raised the bar. The agent was treated as a piece of software that you configure once and walk away from. And it failed the way any employee would fail if you hired them, gave them a job description, and never spoke to them again. Management failure gets misdiagnosed as AI failure because we don't yet have the language or the frameworks to recognise it for what it is.

01 / The Real Failure Modes

What bad agent management actually looks like — and why it gets misdiagnosed.

The failure modes of agent deployments are remarkably consistent once you know what to look for. They don't look like AI failures. They look like management failures — because that's what they are.

The five failure modes
What gets blamed on the AI. What it actually is.
1. The prompt written once and never touched.

The agent was configured at deployment with a set of instructions that reflected how the team understood the task in week one. Three months later the product has changed, the tone has shifted, and the ICP has evolved — but the instructions haven't. The agent is faithfully doing exactly what it was told to do at launch. That's not an AI failure. That's a management failure. Nobody reviewed the brief.

2. No success criteria defined at the start.

The agent was deployed without anyone specifying what "good" actually looks like. So when outputs were mediocre, nobody could articulate why — just that something felt off. Without clear criteria, there's no basis for iteration. The agent drifts without correction until someone loses patience and shuts it down. You wouldn't onboard an employee without defining what success looks like in their first 90 days. The same principle applies.

3. Nobody owned the improvement curve.

The agent launched at 70% quality. That was good enough to keep running, so everyone moved on. Nobody was responsible for getting it to 80%, then 90%. The 70% performance became the permanent performance — not because the agent couldn't improve, but because no human was accountable for improving it. In every organisation I've seen get this right, there was one person whose job it was to make the agents better. In every organisation I've seen get it wrong, that person didn't exist.

4. The escalation threshold was never set.

When should the agent handle something autonomously? When should it flag for human review? When should it refuse to act at all? Most deployments never answer these questions explicitly. The agent either over-escalates — becoming a glorified notification system that requires human sign-off on everything — or under-escalates, autonomously handling situations it shouldn't until something goes wrong and trust collapses entirely.

5. The deployment was treated as a project, not a function.

There was a launch. There was a go-live. There was a retrospective. And then everyone moved on to the next project. The agent became orphaned infrastructure — still running, technically, but with no owner, no review cadence, and no improvement plan. Agents aren't projects. They're operational functions. They need ongoing management the same way a team needs ongoing management. Treating deployment as a finish line is how you end up with an agent that was good in month one and embarrassing by month six.

None of these failure modes have anything to do with model quality. Every one of them is a management failure — the kind of failure that would be immediately obvious if you replaced "agent" with "employee" in the description. We wouldn't accept this standard of management for a human team. We've been accepting it for agent teams because we've been treating agents as software rather than workforce.

02 / The Analogy That Changes Everything

You wouldn't hire someone and never speak to them again.

Think about how you onboard a new employee. You don't write a job description, hand them a laptop, and return six months later to check if the output is any good. You set expectations in week one. You check in regularly. You give feedback. You raise the bar as they develop. You review performance formally. You address problems when they emerge rather than letting them compound. You invest in their development because you understand that the quality of their work improves with intentional management.

Now think about how most organisations deploy agents. They write the initial instructions. They test the agent briefly. They go live. And then they treat it as a piece of software that either works or doesn't — checking in only when something breaks badly enough that someone complains.

The instincts that make someone a good people manager are exactly the instincts that make someone a good agent manager. Set clear expectations. Review outputs regularly. Give structured feedback — in the form of prompt iteration and process refinement. Identify where the agent is underperforming and work out why. Develop it deliberately toward higher capability over time.

The difference is that for humans, we have decades of accumulated management practice — frameworks, review cycles, HR functions, performance management systems — that encode this discipline into organisational behaviour. For agents, we have nothing. Every organisation is improvising. And most are improvising badly, because the people responsible for deploying agents come from engineering or operations backgrounds that treat software as something you ship, not something you develop over time.

"The instincts that make someone a good people manager are exactly the instincts that make someone a good agent manager. We just haven't created the role yet."

03 / The Gap Nobody Has Named

The agent manager is the most important job nobody has created yet.

There is a function that needs to exist in every organisation deploying agents at scale. What form it takes depends on where you are. In a smaller organisation, it's a new dimension of every existing role — a shared responsibility embedded into how people work, not a separate job title. In a larger organisation deploying dozens of agents across multiple functions, that distributed responsibility needs to be anchored by someone whose primary job is the performance of the agent workforce. Both models are right for their context. What's wrong in every context is treating agent management as nobody's job.

Right now, in most organisations, it is nobody's job. The responsibility either sits with whoever deployed the agent in the first place — usually an engineer or an ops person who has since moved on to the next project — or it doesn't sit anywhere at all. Neither works. Ownership by someone who has already moved on means nobody is responsible for the improvement curve. No ownership at all means the agent stagnates at whatever quality it reached at launch and slowly drifts from there.

I think about the agent management function the way I think about what a great COO does for a human organisation. They don't do the individual work. They create the conditions under which the team does the work as well as it possibly can — setting standards, reviewing performance, removing friction, building systems that make good outcomes more likely and bad outcomes less likely. Whether that function lives in one person or is distributed across every person in the organisation, it needs to exist, it needs to be named, and someone needs to be accountable for it.

What the agent manager actually does

Sets performance standards before deployment.

Defines what good looks like for each agent — not in vague terms but specifically. What response quality score triggers a review? What escalation rate is acceptable? What does success look like in 30, 60, and 90 days?
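What a performance contract like this looks like in practice will vary, but it can be made concrete even in a few lines. The sketch below is illustrative only — the names, thresholds, and milestone values are hypothetical, not part of any real platform's API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentStandard:
    """Hypothetical performance contract, agreed before the agent goes live."""
    name: str
    min_quality_score: float   # e.g. average reviewer rating on a 0-1 scale
    max_escalation_rate: float  # acceptable fraction of interactions escalated
    milestones: dict = field(default_factory=dict)  # day -> target quality score

    def needs_review(self, quality_score: float, escalation_rate: float) -> bool:
        """True when observed performance breaches the agreed thresholds."""
        return (quality_score < self.min_quality_score
                or escalation_rate > self.max_escalation_rate)

standard = AgentStandard(
    name="onboarding-agent",
    min_quality_score=0.80,
    max_escalation_rate=0.15,
    milestones={30: 0.80, 60: 0.85, 90: 0.90},
)

print(standard.needs_review(quality_score=0.74, escalation_rate=0.10))  # prints True
```

The point is not the specific numbers but that they are written down before deployment, so "something feels off" can be replaced with "quality is below the 30-day target".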

Reviews outputs on a regular cadence.

Not waiting for something to break. Actively reviewing a sample of agent outputs every week — looking for drift, for edge cases being handled badly, for patterns that suggest the instructions need updating.
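The mechanics of that weekly review can be deliberately simple. One possible sketch, assuming a generic list of output records and a hypothetical drift threshold:

```python
import random

def weekly_sample(outputs, k=20, seed=None):
    """Draw a reproducible random sample of the week's agent outputs
    for human review. `k` caps the reviewer's workload."""
    rng = random.Random(seed)
    return rng.sample(outputs, min(k, len(outputs)))

def needs_brief_update(graded, drift_threshold=0.2):
    """`graded` is a list of (record, passed) pairs from the human reviewer.
    Returns True when the failure fraction suggests the instructions
    need updating, not just a one-off bad output."""
    fails = sum(1 for _, passed in graded if not passed)
    return fails / len(graded) > drift_threshold
```

A fixed seed makes the sample auditable; a fixed threshold turns "outputs feel worse lately" into a decision rule the owner can act on.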

Iterates on instructions and processes deliberately.

Treats prompt refinement the way a good manager treats a development conversation — specific, structured, and tied to observed performance gaps. Not "make it better" but "this type of input is being handled incorrectly, and here's why."

Owns the escalation framework.

Defines clearly what the agent handles autonomously, what it flags for review, and what it refuses to touch. Reviews this framework regularly as trust in the agent's judgment builds over time.
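Made explicit, an escalation framework can be as small as a single decision rule. A minimal sketch, assuming a hypothetical confidence score and a risk tier supplied by the surrounding workflow:

```python
def escalation_policy(confidence: float, risk: str) -> str:
    """Hypothetical three-tier escalation rule.
    risk is 'low', 'medium', or 'high', classified upstream."""
    if risk == "high":
        return "refuse"   # never act autonomously on high-risk inputs
    if risk == "medium" or confidence < 0.7:
        return "flag"     # queue for human review
    return "handle"       # act autonomously

print(escalation_policy(0.95, "low"))   # prints handle
print(escalation_policy(0.95, "high"))  # prints refuse
```

The thresholds are the part the agent manager revisits: as trust builds, the confidence cut-off comes down and more of the "flag" tier moves into "handle".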

Manages the human-agent interface.

Works with the human team to ensure handoffs are clean, escalations are acted on quickly, and the humans working alongside agents understand how to give feedback that improves agent performance over time.

04 / A New Job or Everyone's Job?

The right question isn't who owns this. It's whether anyone does.

When I describe the agent manager function to people, the first question is almost always the same: is this a new dedicated role, or is it something everyone takes on? It's a reasonable question. And the honest answer is that it depends on the organisation — but the form it takes matters much less than whether it exists at all.

In a smaller organisation, agent management is likely distributed — a new dimension of every existing role rather than a separate function. The sales lead is responsible for the performance of the agents running through their pipeline. The ops manager owns the agents handling their workflows. The HR coordinator is accountable for how well the onboarding agent is doing its job. In this model, agent management doesn't require a new hire. It requires a new way of thinking about every hire you already have.

In a larger organisation deploying agents at scale — across multiple functions, with hundreds of agent interactions happening daily — that distributed responsibility needs an anchor. Someone whose primary function is the performance of the agent workforce. Not IT. Not an automation team. Someone with the operational instincts of a great COO and the management discipline to hold a mixed human-agent team to a rising standard.

But here's the thing both models share — and this is the part most organisations are missing entirely. In both cases, the agents need to be on the org chart. Not as a technology project. Not as a line item in an automation budget. As named members of the workforce, with defined responsibilities, clear owners, and performance expectations. They have roles. They have owners. They get reviewed. They get better.

At Lua, every role — not just management roles — requires the person hiring to name both the human hires and the agent hires they need to win in the next quarter. It makes the mixed workforce structural rather than theoretical. You can't write a job requirement without thinking about the agent counterpart. Which means every person joining the team arrives already thinking about how humans and agents divide the work — and who is accountable for the performance of both. That's what it looks like when agents are genuinely part of the organisation, not bolted onto the side of it.

That's the model we think every organisation will eventually arrive at. Not an AI team managing agents on behalf of everyone else. A workforce where every person understands their agents the way they understand their direct reports — with clear expectations, regular review, and genuine accountability for improvement. The organisations that build this awareness into how they hire, how they set objectives, and how they review performance will develop agent management as a core organisational capability. The ones that treat it as someone else's problem will discover the cost of that decision too late.

05 / The Compounding Advantage

The organisations that get this right will be very difficult to catch.

Here is why this matters beyond the immediate question of making your current agent deployments work better.

An agent that is actively managed — reviewed regularly, iterated on deliberately, held to a rising standard — improves continuously. Month one performance is the floor, not the ceiling. The organisation accumulates institutional knowledge about how to deploy agents effectively: what works, what doesn't, where the edge cases are, how to configure for their specific operational context. That knowledge compounds in ways that are genuinely hard to replicate.

An agent that is deployed and left to run compounds in the opposite direction. Drift accumulates. Edge cases pile up unaddressed. The gap between what the agent does and what the organisation actually needs widens quietly until someone eventually shuts it down and concludes that agents don't work — when what they actually discovered is that unmanaged agents don't work.

The organisations that formalise this role first — that create the agent manager function before their competitors do — will have a 12-to-18-month head start on the improvement curve, a lead that is very hard to close. Not because their AI is better. Because their management of it is better. And management, unlike AI capability, is not something you can buy off the shelf or replicate by deploying the same platform as everyone else.

"The companies that win the agent era won't have the best AI. They'll have the best management of AI. That distinction is available to any organisation willing to take it seriously — and almost none are taking it seriously yet."

What to do with this.

If you are deploying agents and you don't have a clear answer to the question "who is responsible for making these agents better?" — that's the problem to solve first. Not the model selection, not the integration complexity, not the deployment architecture. The ownership question.

Name a person. Give them accountability. Define what success looks like. Set a review cadence. Treat the agent workforce the way you treat your human workforce — with the expectation that performance is something you develop continuously, not something you configure once and walk away from.

At Lua, this is what we built the platform for — not just to make it possible to deploy agents, but to give the people managing them the infrastructure to do it well. The visibility into agent performance, the tools to iterate on instructions, the frameworks to manage escalation and human-agent handoff. The agent manager role doesn't exist yet as a formal function in most organisations. But the organisations that create it are the ones building something that compounds. And at the rate agents are becoming central to how businesses operate, the window to be early on this is shorter than it looks.

See how Lua works →