Have You Tried Our New AI Agent?
How Architecture Can Shape the Agentic Maturity Curve
We’ve all seen the meme.
An endless row of urinals. One person standing in solitude. Another walks in and inexplicably takes the spot right next to them, breaking the unwritten rules of personal space. In the final frame, the interloper leans in and says: “Have you tried our new AI chatbot?”
It’s funny because it’s true. In enterprise technology, we’ve reached that moment.
No matter where you are, be it at a conference, in a customer meeting, or online, someone is pitching their AI. Not as a feature, but as a future. A bold promise of agentic transformation. A system that doesn’t just respond to prompts, but acts, learns, adapts, and even initiates. What they’re really describing is the top of the maturity curve: full agentic autonomy, integrated into complex workflows, spanning departments, functions, and systems.
But most organisations aren’t there. They are not even close.
To make sense of where we really are, we need to zoom out and think about the roles AI is actually playing inside the enterprise today. Not in terms of dry technical levels, but in terms of behavioral archetypes. That is, how agentic systems show up, contribute, and evolve within operational environments.
This isn’t just a metaphor. When you think about it, it’s how we already assess capability in people. Think of frameworks like SFIA, where progression is measured through increasing autonomy, complexity, and influence. Agentic systems can, and should, be viewed through the same lens. The difference is, these “digital roles” evolve fast, and organisations must be ready to architect around their growth.
Consider this maturity curve:
Level 1: The Intern
It waits to be told what to do. You type a prompt, it generates a response. It’s helpful, occasionally insightful, but entirely dependent on your initiative. It doesn’t know your business. It just listens, replies, and resets.

Level 2: The Assistant
It starts to understand the task at hand. It can reference documents, fetch status updates, and personalise answers slightly. But it’s still reactive. It doesn’t start work or close loops without being asked.

Level 3: The Advisor
Only now is it starting to show up inside workflows. It offers suggestions based on current context, flags likely next steps, and highlights gaps. It has a seat at the table, but still needs your sign-off to act.

Level 4: The Operator
The AI begins to act independently on routine tasks. It follows observed patterns, initiates tasks based on known thresholds, and performs actions within bounded autonomy. Think rules with judgment; a rough sketch of this pattern follows just after the curve. Still supervised, but operationally useful.

Level 5: The Orchestrator
It moves across systems, coordinates between teams, and links processes end-to-end. It behaves like a workflow manager with agency, making decisions, notifying stakeholders, and closing loops faster than traditional automation ever could.

Level 6: The Autonomous Partner
It sets goals, adapts to new constraints, learns from outcomes, and proactively adjusts course. This is the rare agent that doesn’t just do the work, it understands why the work matters. Few, if any, organisations are truly here, and none will arrive by accident.
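To make the jump from Advisor to Operator concrete, here is a minimal Python sketch of that bounded-autonomy pattern: actions inside a known threshold execute automatically, while anything riskier is escalated for human sign-off, Advisor-style. The threshold, the impact score, and the action names are illustrative assumptions, not a reference to any particular platform.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProposedAction:
    name: str            # e.g. "restart_service", "reassign_ticket"
    impact_score: float  # the agent's own estimate of blast radius, 0.0 to 1.0

# Illustrative policy: the bound an Operator-level agent may act within.
AUTONOMY_THRESHOLD = 0.3  # anything riskier goes back to a human

def handle(action: ProposedAction,
           execute: Callable[[ProposedAction], None],
           escalate: Callable[[ProposedAction], None]) -> str:
    """Level 4 behaviour: act within bounds, escalate beyond them."""
    if action.impact_score <= AUTONOMY_THRESHOLD:
        execute(action)   # bounded autonomy: no sign-off needed
        return "executed"
    escalate(action)      # fall back to Level 3 behaviour: advise and wait for approval
    return "escalated"

# Example usage with stand-in handlers.
do = lambda a: print(f"executing {a.name}")
ask = lambda a: print(f"asking a human about {a.name}")
print(handle(ProposedAction("restart_service", 0.2), do, ask))  # executed
print(handle(ProposedAction("delete_records", 0.8), do, ask))   # escalated
```

The point of the sketch is the boundary, not the code: the agent keeps its initiative for routine, low-impact work, and the organisation keeps the final word everywhere else.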
I think it’s a helpful frame. Because when someone pitches you “agentic AI,” what they’re really doing is pitching Level 5 or 6. The thing is, most organisations are still at Level 2 or 3. They are just trying to figure out how to manage Assistants (Level 2) and Advisors (Level 3), not Partners (Level 6). And that’s exactly where the real work is.
But this is also where architectural maturity matters.
To operationalise Agentic AI, organisations must first address where they actually are. This is not a deficiency; it’s the natural early phase of a longer journey. It’s where the foundational work happens: building structured workflows, cleaning up data debt, linking systems, and embedding machine-readable signals into processes that were never designed to be dynamic.
Yet even while focusing on these foundational steps, architectural choices must still be made with higher maturity in mind. This doesn’t mean stalling delivery while you stand up a Centre of Excellence or blueprint every possible future state. In fact, the danger is doing exactly that.
This is about applying a core TOGAF-aligned principle: just enough, just in time.
Good architecture doesn’t demand that you predict the future, but it does ensure you are ready for it. It scaffolds the business to evolve without locking it into imagined outcomes.
Because by the time you need AI to act autonomously across systems, functions, and boundaries, it’s too late to retrofit the integration fabric, policy scaffolding, and trust frameworks that should have been laid down during the so-called “basic” levels.
The real “oops” moment won’t be that your COE wasn’t ready. It will be that your architecture wasn’t. Or worse. That there wasn’t one at all.
And as architecture matures, so too must cost modelling.
At Levels 1 to 3, AI usage feels mostly invisible: prompt-response tools bundled into product licences, sporadic usage patterns, and low volumes. There’s no real cost signal and nothing to optimise.
But once you enter Levels 4 through 6, the economics of autonomy become real.
Agents aren’t just responding; they’re initiating. They’re generating, evaluating, and acting. Potentially tens of thousands of times per day. That’s when cost starts to matter. Not just to Finance, but to Architecture.
In preparation for this shift, I’ve been developing a lightweight benchmarking approach using two metrics: TRU (Token Resource Unit) and TAPS (Tokens Attributed Per Second). These are not industry gospel. Just my own working tools to help clients begin meaningful conversations about cost visibility, token efficiency, and agent value attribution.
I’ll be sharing a fuller post on this in the coming weeks, focused on how to benchmark and anticipate the operational cost (not price) of agentic systems before they become invisible and unmanageable. Because cost, in an agentic world, is no longer just a licensing construct. It is also a function of architecture, observability, and flow.
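Until that post lands, here is a rough sketch of one way those two names could be computed. The definitions are placeholders (TRU read as total tokens consumed, normalised per completed unit of agent work; TAPS read as token burn rate over a run’s wall-clock time), and the normaliser constant is purely illustrative, not a published standard.

```python
from dataclasses import dataclass

@dataclass
class AgentRun:
    """One completed unit of agent work (a ticket triaged, a report drafted)."""
    prompt_tokens: int
    completion_tokens: int
    duration_seconds: float

# Assumed normaliser: how many raw tokens make up one TRU. Illustrative only.
TOKENS_PER_TRU = 1_000

def tru(run: AgentRun) -> float:
    """Token Resource Units: total tokens consumed, normalised per unit of work."""
    return (run.prompt_tokens + run.completion_tokens) / TOKENS_PER_TRU

def taps(run: AgentRun) -> float:
    """Tokens Attributed Per Second: token burn rate over the run's wall-clock time."""
    total = run.prompt_tokens + run.completion_tokens
    return total / run.duration_seconds if run.duration_seconds > 0 else 0.0

# Example: an agent run that read 4,200 tokens and wrote 800 over 12 seconds.
run = AgentRun(prompt_tokens=4_200, completion_tokens=800, duration_seconds=12.0)
print(f"TRU:  {tru(run):.2f}")   # 5.00 TRU
print(f"TAPS: {taps(run):.1f}")  # ~416.7 tokens per second
```

Even at this level of crudeness, the value is visibility: once every agent run emits a TRU and TAPS figure, you can compare workflows, spot runaway loops, and attribute cost to the work being done rather than to the licence it ran under.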
The platforms best positioned for this journey, like ServiceNow’s NowAssist, Microsoft’s Copilot stack, and Google’s Duet AI (now evolving under Gemini), offer the ability to start small (Levels 1 and 2), embed gradually (Levels 3 and 4), and scale wisely (Levels 5 and 6). They don’t force you to leap ahead. They do allow you to build forward.
And crucially, they do so regardless of the underlying LLM instance, whether that’s OpenAI, Anthropic, Mistral, Meta, Cohere, or Google itself. Because what matters isn’t which model generates the token. It’s how that token moves through your business, and what architecture governs its use.
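To make that concrete, here is a minimal sketch of what “architecture governs the token” can look like: a provider-agnostic wrapper that the business calls, with metering and a simple budget policy applied uniformly no matter which model answers. The interface, the stub provider, and the budget rule are hypothetical illustrations, not any vendor’s actual API.

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Anything that can turn a prompt into text and report the tokens it used."""
    @abstractmethod
    def complete(self, prompt: str) -> tuple[str, int]:
        """Returns (response_text, tokens_used)."""

class GovernedLLM:
    """The layer the business actually talks to: one interface, any model,
    with metering and a budget policy applied uniformly."""
    def __init__(self, provider: LLMProvider, daily_token_budget: int):
        self.provider = provider
        self.budget = daily_token_budget
        self.tokens_used_today = 0

    def complete(self, prompt: str) -> str:
        if self.tokens_used_today >= self.budget:
            raise RuntimeError("Daily token budget exhausted; escalate or defer.")
        text, used = self.provider.complete(prompt)
        self.tokens_used_today += used  # observability lives here, not in the model
        return text

# Swapping the underlying model is a one-line change; the governance stays put.
class StubProvider(LLMProvider):
    def complete(self, prompt: str) -> tuple[str, int]:
        return (f"[stubbed answer to: {prompt}]", len(prompt.split()))

llm = GovernedLLM(StubProvider(), daily_token_budget=50_000)
print(llm.complete("Summarise today's open incidents"))
```

The design choice worth noting is where the accounting sits: in the wrapper the enterprise owns, not in any single provider’s SDK, which is exactly what keeps the architecture model-agnostic as the curve is climbed.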
But that same principle (meet the enterprise where it is, not just where you want it to go) should start applying to the vendors themselves. It’s probably time for a pivot. We get the vision. We understand the North Star. But now do a better job of lighting the path. Map it. Segment it. Make it real.
By all means, keep the platform-level vision and high-fidelity futurecasting for architects, strategists, and digital governance leaders who are building with 2030 in mind. But also come back to ground level. Meet the operators, the service managers, and the domain leads at Level 2, not just Level 6. Help them succeed in the present. Help them lay track, not just stare at the horizon.
Because while the men’s bathroom meme is light-hearted, the message beneath it is not. We are at risk of overwhelming teams and disappointing leaders by over-promising agentic transformation and under-delivering on operational value.
Worse, we risk turning AI into background noise. Just another system that talks a lot, but helps little.
It’s time to acknowledge the curve. Then respect the curve.
Design for the Partner. Deliver with the Assistant. Model the cost. Build the architecture. And maybe, let people finish their business in peace.