Stop building an AI agent army. Build one workflow that ships.


The most expensive thing in marketing right now is not a model. It is the gap between the agent stack you bought and the workflow you never finished.

I have watched founders, heads of marketing, and creators spend the last eighteen months wiring together orchestration frameworks, vector stores, MCP servers, evaluator loops, and four flavors of memory. I have done it myself. Then I sit with them on a Tuesday afternoon and ask the only question that matters: which workflow shipped this quarter, end to end, with a real metric attached? The room goes quiet. Somebody mentions an experiment. Somebody mentions a prototype. Nobody mentions a workflow that actually runs every Monday and moves a number.

The agent stack is a vanity metric. The only number that matters is how many real workflows you ship per quarter, and most operators ship zero because they keep collecting tools instead of finishing one.

This essay is the disciplined-operator counterpunch to the agent-army theatre. If you are the restless founder with seventeen tabs open right now, three of them frameworks you starred this morning, this is for you.

Where I am wrong

Before the prescription, the honest part.

There is a class of work where the agent army wins. If you are an AI lab building general-purpose agents, you should absolutely be running parallel research lanes, swapping orchestration patterns, and stress-testing agent collaboration. Anthropic’s own multi-agent research system shows that for sufficiently complex, parallelizable research tasks, multi-agent setups can outperform single-agent ones by a meaningful margin, while using roughly fifteen times the tokens (Anthropic Engineering, “How we built our multi-agent research system,” June 2025, retrieved 2026-05-10). That is real. If you are them, build the army.

Second, in some enterprises the right answer is genuinely a portfolio of small workflows, not one. McKinsey’s 2025 State of AI survey found that the organizations capturing the most EBIT impact from generative AI tend to redesign workflows across several functions at once, not one at a time (McKinsey, “The state of AI: How organizations are rewiring to capture value,” March 2025, retrieved 2026-05-10). If you sit on a real ops budget and a CTO who can stand behind redesign across three functions in parallel, ignore me.

Third, the discovery phase matters. You cannot pick the right workflow without trying five wrong ones. Some of what looks like collecting tools is actually mapping the terrain. I am not arguing against exploration. I am arguing against permanent exploration as a substitute for shipping.

If any of those three describe you, this essay is the wrong essay.

If you are a marketing leader, founder, or creator with a small team, a real metric to move, and a backlog of half-built agent experiments that haven’t touched a customer in six months, keep reading. This is the one.

The agent-army trap

Here is what is actually happening in 2026.

Bain’s 2024 Technology Report called out a specific failure mode: companies are running large numbers of generative AI pilots, but very few are scaling to enterprise impact. The bottleneck is not model quality. It is operating-model integration, the work of wiring AI into a workflow that real humans use every week (Bain & Company, “Technology Report 2024,” retrieved 2026-05-10).

MIT Sloan Management Review put a number on the same problem. Their long-running research with BCG on AI in organizations found that the gap between firms that report AI value and firms that don’t comes down to whether AI is woven into a defined process, not whether they have access to better models (MIT Sloan Management Review and BCG, “Expanding AI’s Impact With Organizational Learning,” October 2020, retrieved 2026-05-10).

Translation: the model is not the moat. The model is a commodity. The workflow is the moat. And almost nobody is building the workflow.

In my own org, when I look across the thirty people on my marketing team, the pattern is identical. The operators who shipped meaningful AI-driven outcomes in the last two quarters all did the same thing. They picked one workflow, one metric, and one owner. They did not build an agent army. They built one boring, repeatable loop. The operators who shipped nothing all did the opposite. They had stacks. They had Notion pages full of agent diagrams. They had no workflow in production.

This is not a tooling problem. It is a finishing problem. And finishing is unsexy, which is why the agent-army crowd avoids it.

The five enemies of shipping

When I dig into why a workflow doesn’t ship, it is almost always one of these five.

One: tool collection feels like progress. Wiring a new framework gives you a hit of the same dopamine that finishing a workflow would, at one tenth the effort. Your brain cannot tell the difference. Your business absolutely can.

Two: the workflow you picked is too big. “Automate content marketing” is not a workflow. “Turn one Loom from the founder into three LinkedIn posts and one blog draft, every Friday, owned by one person” is a workflow. The first one ships nothing. The second one ships fifty-two times a year.

Three: no single owner. If two people own it, nobody owns it. AI workflows in particular die fast when ownership is fuzzy because the failure modes are weird and require taste, not a ticket queue.

Four: no metric attached. If you cannot say “this workflow moves X by Y% per quarter,” you are not shipping a workflow, you are shipping a hobby.

Five: no recovery built in. The teams that ship every week are the teams that built the workflow assuming it will fail twice a quarter. The teams that ship nothing assumed it would never fail and stopped maintaining it the day it broke. Recovery is part of the work; a minimal sketch of that shape follows this list.
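
To make enemies two and five concrete, here is a minimal sketch of the Friday Loom workflow above with its failure path designed up front. loom_to_posts, record_metric, and notify_owner are hypothetical stand-ins for whatever your stack actually calls; the shape is the point: one scoped job, one metric write, and one owner alert that names the manual fallback.

```python
import logging

logger = logging.getLogger("friday_loom_workflow")

def loom_to_posts() -> int:
    """Hypothetical stand-in: transcribe the founder's Loom, draft three
    LinkedIn posts and one blog draft. Returns the number of drafts produced."""
    raise NotImplementedError("wire in your real transcription and drafting steps")

def record_metric(drafts: int) -> None:
    """Hypothetical stand-in: write one number to the weekly dashboard tile."""
    logger.info("drafts produced this week: %s", drafts)

def notify_owner(message: str) -> None:
    """Hypothetical stand-in: ping the one named owner on Slack."""
    logger.error("OWNER ALERT: %s", message)

def friday_run() -> None:
    # Built assuming it will fail twice a quarter: failure is logged, the
    # owner is told, and the manual fallback is spelled out in the alert.
    try:
        record_metric(loom_to_posts())
    except Exception:
        logger.exception("workflow broke")
        notify_owner(
            "Loom workflow down. Manual fallback: draft this week's posts "
            "by hand, then fix the pipeline before next Friday."
        )
```

Notice that the fallback message doubles as documentation: the day it breaks, the owner does not have to remember what manual looks like.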


The Single Workflow Filter (operator artifact)

Here is the filter I run with my team before we build any AI workflow. If a workflow does not pass all six gates, we do not start. We park it.

Gate 1: Metric Gate. Does this workflow move one specific number we already track on a weekly dashboard? If you fail this gate: pick a different workflow. No metric, no build.

Gate 2: Owner Gate. Is there one named human who will be on the hook for this workflow shipping every week, by name, on Slack? If you fail this gate: assign an owner first. If nobody volunteers, the workflow is not important enough.

Gate 3: Cadence Gate. Does this run on a real, repeating cadence (daily, weekly, per release), not “on demand”? If you fail this gate: find a cadence or kill it. On-demand workflows decay.

Gate 4: 30-Minute Gate. Can the owner describe the entire workflow, end to end, in under thirty minutes to a new hire, without opening a single AI tool? If you fail this gate: the workflow is too complex. Cut it in half.

Gate 5: Boring Gate. If you described this workflow at a dinner party, would people change the subject? Good. Boring is the point. If you fail this gate because it sounds exciting, you are probably building a demo, not a workflow.

Gate 6: Recovery Gate. What happens the day this breaks? Who notices in the first hour? What is the manual fallback? If the answer is “we will figure it out,” you have not finished designing it.

We use this filter on every proposed AI workflow in my org. About eighty percent get parked at Gate 1 or Gate 2. The twenty percent that pass all six are the ones that ship and stay shipped.
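
For teams that want the parking decision to be mechanical rather than a meeting, the filter compresses into a few lines of code. A minimal sketch in Python: the six gates and their order come straight from the table above, while WorkflowProposal, passes_filter, and every field and example value are illustrative assumptions, not a prescribed implementation.

```python
from dataclasses import dataclass

@dataclass
class WorkflowProposal:
    name: str
    metric: str | None            # Gate 1: the weekly-dashboard number it moves
    owner: str | None             # Gate 2: one named human, on the hook on Slack
    cadence: str | None           # Gate 3: "daily", "weekly", "per release"
    explainable_in_30_min: bool   # Gate 4: a new hire gets it, no tools open
    sounds_boring: bool           # Gate 5: the dinner-party test
    manual_fallback: str | None   # Gate 6: what the owner does the day it breaks

def passes_filter(p: WorkflowProposal) -> tuple[bool, str]:
    """Park the proposal at the first failed gate; build only on a clean pass."""
    if not p.metric:
        return False, "Gate 1 (Metric): no tracked number, no build."
    if not p.owner:
        return False, "Gate 2 (Owner): assign one named owner first."
    if p.cadence in (None, "on demand"):
        return False, "Gate 3 (Cadence): find a repeating cadence or kill it."
    if not p.explainable_in_30_min:
        return False, "Gate 4 (30-Minute): too complex, cut it in half."
    if not p.sounds_boring:
        return False, "Gate 5 (Boring): that is a demo, not a workflow."
    if not p.manual_fallback:
        return False, "Gate 6 (Recovery): design the failure day first."
    return True, "Build it."

# Example: parked at Gate 1, exactly where most proposals die.
proposal = WorkflowProposal(
    name="Loom to LinkedIn",
    metric=None,
    owner="Priya",
    cadence="weekly",
    explainable_in_30_min=True,
    sounds_boring=True,
    manual_fallback="owner drafts the posts by hand",
)
print(passes_filter(proposal))
```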

What “one workflow that ships” actually looks like

A real example, sanitized.

We pick a single, ugly, recurring marketing task. Not a moonshot. Something the team already does badly, with too many human hours. We design one workflow around it. One agent or two, not seven. One owner. One dashboard tile. One weekly cadence. One recovery plan.

The first version is bad. It saves maybe twenty percent of the time it should. It breaks the second week. The owner fixes it without escalating. By week six, it runs without supervision on Mondays. By week twelve, the metric attached to it has visibly moved on the weekly dashboard. By week twenty, nobody on the team remembers who used to do that work manually.

That is one workflow that shipped. It is not impressive in a screenshot. It is devastating over a year.

Now multiply: one workflow per quarter, per team. Four workflows a year. Twenty workflows in five years. Each one boring, each one running, each one moving a real number. That is what an AI-first marketing org actually looks like in 2026. It does not look like an agent diagram. It looks like a quietly productive Monday.

The recovery part

Last point, because I am not in the business of grindset cosplay.

Shipping one workflow per quarter sounds modest until you try to actually do it while running everything else. The reason most operators don’t ship is not laziness. It is that the work of finishing is psychologically more taxing than the work of starting. Every workflow that ships demands taste, judgement calls, ten ugly fixes, and the patience to maintain it after the novelty wears off.

If you are going to commit to one workflow per quarter, commit to the recovery cadence too. Block a real off-day after each workflow ships. Move slowly the week after. The teams I see burn out on AI adoption are the teams that treat shipping as a continuous sprint instead of a series of finishes with rest in between. Recovery is part of the work. Always has been.

The next agent framework will not save you. The next workflow you actually finish will.

Pick one. Name an owner. Attach a metric. Ship it by next Friday. Then rest. Then pick the next one.

That is the whole game.


Read the rest of the AI-First Marketer series

