Why We Stopped Letting AI Tools Decide How Our Team Codes


Madhav Haldia, Founding Engineer at Color, a published AI researcher who architects consumer AI systems from the ground up.

A few weeks ago, one of our mobile engineers spent half a day debugging an iOS distribution signing issue for an Expo app. It was the kind of problem where the answer depends on whether your team uses Xcode-managed automatic signing, what your provisioning profiles look like and a handful of other context-dependent factors. The AI coding agent confidently suggested four different fixes in sequence. Each was plausible. Each was wrong.

The engineer eventually solved it the old-fashioned way. He read Apple's documentation, checked our actual certificate setup and reasoned through it from first principles. The agent had been a distraction, not a co-pilot.

That moment forced a question I've been sitting with since. Are we choosing AI coding tools, or are AI coding tools choosing how we work?

The Workflow Inversion Most Teams Don't Notice

Walk into most engineering orgs today and you'll see something strange. Teams have reorganized their entire development process around whatever the current popular agent expects. They write specs in the format the tool prefers. They break work into the chunk size the agent handles best. They tolerate multi-hour agent runs because that's what the tool does.

The tail is wagging the dog.

This happens quietly because each individual accommodation feels minor. But the cumulative effect is significant. Your team's workflow is now a derivative of someone else's product roadmap. When that vendor has an outage, and there have been several this year that brought entire teams to a standstill, your productivity goes with it. When pricing shifts or a model gets quietly nerfed between releases, your unit economics change without warning.

There's a second-order problem too. The workflow these tools push you toward (generate, then review, then iterate) assumes the human reviewing the output can still spot what's wrong. That assumption does a lot of work. And there's growing evidence it doesn't hold up over time.

The Skill You Can't Outsource

Here's the uncomfortable part. The engineers on our team who get the most value from AI tools could already do the work without them. They use agents to accelerate the parts they understand deeply. They catch the subtle wrongness in generated code because they've internalized what right looks like.

The reverse is also true. Engineers who lean on agents to learn unfamiliar territory tend to ship code that works until it doesn't. The edge case appears. The integration breaks in production. Someone asks them to explain a design decision they didn't actually make.

Reviewing generated code is roughly half of the learning loop. The other half is the friction of writing it, getting it wrong and debugging your way out. If your team's AI workflow assumes everyone has the judgment to supervise an agent at architectural depth, you've built a workflow that only works for people who, ironically, need the agent least.

What Worked: Lean Tools, Custom Extensions

We stopped trying to find the perfect all-in-one agent platform. Instead, we picked a small set of lean, extensible tools and wrote our own thin layer around them.

Our editor is Zed. It's just an editor, fast and out of the way. Our agent is pi, which is lean enough that we don't get ten different agents collaborating for hours on something we didn't ask for. When we need pi to do something specific to our workflow, we write a custom extension for it. We don't reshape how we work to match what a tool expects. We shape the tool to match how we already work. For managing multiple agents across different repos when we're working on several features at once, we use Supacode.

This sounds like more setup, and at first it is. But the payoff is meaningful. We wrote every behavior of the system, which means we can debug, change or remove every behavior. There's no mystery layer. When something goes sideways, we know where to look. When a vendor changes its pricing, we can swap one component instead of re-platforming an entire workflow.
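To make the "swap one component" idea concrete, here is a minimal sketch of what a thin layer could look like. It is an illustration under stated assumptions, not our actual code and not pi's real API: CodingAgent, EchoAgent, Workflow and the max_task_chars limit are all hypothetical names invented for this example. The point it shows is that the team's workflow depends only on a small interface the team owns, so the agent behind it can be replaced without touching the workflow.

```python
from dataclasses import dataclass
from typing import Protocol


class CodingAgent(Protocol):
    """The only surface the team's workflow depends on (hypothetical interface)."""

    def run(self, task: str) -> str: ...


@dataclass
class EchoAgent:
    """Stand-in for a real agent adapter. A real adapter would call the
    vendor's tool here; this one just echoes so the sketch stays runnable."""

    name: str

    def run(self, task: str) -> str:
        return f"[{self.name}] {task}"


@dataclass
class Workflow:
    """Team-owned rules live here, not in the vendor's tool."""

    agent: CodingAgent
    max_task_chars: int = 200  # assumed team-defined limit, not a vendor default

    def submit(self, task: str) -> str:
        # Enforce the team's own chunk size before any agent sees the task.
        if len(task) > self.max_task_chars:
            raise ValueError("task too large; split it before handing it to an agent")
        return self.agent.run(task)


workflow = Workflow(agent=EchoAgent("vendor-a"))
first = workflow.submit("fix flaky signing test")

# Swapping vendors is a one-line change; the workflow rules are untouched.
workflow.agent = EchoAgent("vendor-b")
second = workflow.submit("fix flaky signing test")
```

The design choice worth noticing is where the rules live. The task-size limit belongs to Workflow, so changing agents does not silently change how the team works.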

I'll add an honest caveat. This approach only makes sense if your team can already code well without AI. If they can't, no amount of tool customization fixes that. It probably makes it worse.

Three Questions Before You Pick A Tool

If you're an engineering leader evaluating AI coding tools right now, I'd run any candidate through three filters before any feature comparison.

Does this tool fit our workflow, or are we about to reshape our workflow to fit it? The asymmetry of that answer tells you who has the leverage.

What skills does this tool let our team stop practicing? Some atrophy is fine. Nobody mourns hand-rolled boilerplate. But debugging, architectural reasoning and reading unfamiliar code are not skills you want your team to lose.

What happens to our work if this vendor disappears tomorrow? If the honest answer is "we stop," you've made a deeper commitment than a tooling decision. You've made a dependency decision.

The Takeaway

The teams I see thriving with AI coding tools aren't using the most agents or burning the most tokens. They're the ones who treat these tools as exactly what they are: powerful, fallible and best when they fit into a workflow the team already understood before the tools arrived.

Pick tools that bend to your team. Not the other way around.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
