My AI Development Everyday Kit

Everyday kit is not just for gear people. Here's my current AI development workflow, an honest account of what it replaced, and why I'm already working on the next version.

Everyday kit started as a community around what people put in their pockets. Flashlights, multitools, wallets, pens. The serious practitioners don't just assemble a kit; they obsess over which items, why each one earned its place, and what they'd swap out if something better came along. The gear itself is almost secondary to the discipline of continuous refinement.

I've been thinking about my development workflow the same way.

What This Is Not

My resume talks about significant productivity gains from AI development at a previous employer. Those results are real, and I'm proud of them. That work predates this workflow by a substantial margin. The approach that drove those results was structured, deliberate, and effective. It was also entirely manually coordinated, step by step, by me.

I want to say that plainly before describing what I'm doing now, because the instinct when writing about workflows is to let past results prove the concept. That's backwards. The past results proved what was possible when an experienced engineer learned to work alongside AI tooling. This workflow is what I'm using today, when the tools are considerably more capable, and when the coordination layer itself can be automated.

Three Months Is a Long Time

Twenty years of shipping software means you've accumulated a specific kind of pattern recognition. You know what a plan looks like when it's hand-waving over the hard part. You know when a proposed implementation is clean versus merely clean-looking, the kind that will cause pain in six months. You know which shortcuts are acceptable in context and which ones you'll be debugging at midnight two quarters from now.

When I was coordinating AI manually, that judgment was the coordination. I directed Claude Code to migrate an endpoint from a monolith to a microservice using a file-based plan template. Then reviewed what it produced. Then decided whether to proceed or push back. Then ran the unit tests. Then validated the contracts held. Then picked the next endpoint and started over. Nothing moved forward without me there to give the next instruction.

Think of a construction foreman who has to personally hand every tool to every worker. The crew is capable. The process is sound. You are the bottleneck at every transaction.

Over the past few months, the coordination layer got automated. The phases I was already running mentally got formalized, the transitions between them became explicit approval gates, and different model configurations took on different cognitive tasks. What I used to sequence manually now sequences itself between checkpoints.

What I have now is less a conversation partner and more a structured workflow with multiple specialized agents operating in sequence, each with defined responsibilities, and me as the approval gate at every phase boundary. I'm still the foreman. I'm just not handing out tools anymore.

The Current Kit

Here's what's in my kit. This is not a tutorial, and I'm not suggesting you copy it. This is what I'm using right now.

The core structure is a workflow orchestrator with eight potential phases. I say potential because most are conditional, and skipping them when they are not needed is part of the discipline.

Brainstorm is where vague problems get clarified. I skip this when input is specific and actionable. If I know exactly what needs to happen, spending time on brainstorming is waste. When the problem is fuzzy, or when I have multiple viable approaches I haven't thought through yet, this phase uses an Opus-class model to explore options before anything gets built.

Research is a codebase deep-dive. I skip this when working in familiar territory. I use it when touching a system I haven't been in recently, or when a feature spans multiple subsystems and I need to understand what's already there before proposing changes.

Plan creates a concrete implementation plan as a document. This phase is not optional. Every piece of work has a plan, and the plan checkpoint requires my explicit approval before anything moves forward. I can revise it. I can abort. I cannot skip it.

Design expands the plan into implementation-ready specifications: exact function signatures, test cases with inputs and outputs, edge cases documented. This phase gets skipped when the plan is already detailed enough, which is common for smaller changes.

Plans and designs are written to disk as part of the repository. This matters for two reasons: the artifacts are available across sessions, and they don't get abridged or lost when conversation context compacts over a long implementation session. The decisions made at the start of a piece of work remain accessible when you're deep in the middle of it. There's also an analog here to architectural decision records: the plan captures what was decided and why, the summary captures where implementation diverged and what drove that. Together they form a decision trail that lives with the code.
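The on-disk part can be as simple as one dated markdown file per artifact. A minimal sketch, assuming a hypothetical docs/workflow/ directory; the real layout is whatever your repository's conventions dictate:

```python
from datetime import date
from pathlib import Path


def write_artifact(repo_root: str, kind: str, slug: str, body: str) -> Path:
    """Persist a workflow artifact (plan, design, or summary) inside the repo.

    Hypothetical layout: <repo>/docs/workflow/<kind>/<date>-<slug>.md
    Committing these alongside the code is what makes them survive
    context compaction and session boundaries.
    """
    out_dir = Path(repo_root) / "docs" / "workflow" / kind
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{date.today().isoformat()}-{slug}.md"
    path.write_text(body, encoding="utf-8")
    return path
```

Because the files live in the repository, the plan, the summary, and the diff between them travel together through version control, which is what gives you the decision trail.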

Implement is where the code gets written, per the design. This runs on a Sonnet-class model, because code generation at speed matters more here than the deeper reasoning that planning requires. Tests are in scope, not an afterthought.

Review runs in two stages. First, a full build and test suite; if anything fails, nothing proceeds. Second, a code review agent (Opus again, because catching subtle security issues and performance problems requires the same quality of reasoning as planning), plus an accessibility reviewer if the change touches UI. The review agent can trigger an auto-fix loop for critical issues, but only up to two cycles before halting for me to decide.
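The gate-and-retry shape of that stage is easy to sketch. This is not the orchestrator's actual code; the function names and the findings format are stand-ins:

```python
def review_with_autofix(run_build_and_tests, review, auto_fix, max_fix_cycles=2):
    """Hard gate on build/tests, then a bounded auto-fix loop for critical findings."""
    if not run_build_and_tests():
        return "halt: build or tests failed"  # nothing proceeds past a red build
    findings = review()
    fixes = 0
    while any(f["severity"] == "critical" for f in findings):
        if fixes == max_fix_cycles:
            return "halt: critical issues remain"  # a human decides from here
        auto_fix([f for f in findings if f["severity"] == "critical"])
        fixes += 1
        findings = review()  # re-review after each fix cycle
    return "pass"
```

The bound is the important part: an unbounded fix loop is how an agent papers over a problem it doesn't understand, so after two cycles the process stops and waits for a person.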

Summary is generated after review and before commit. It captures what was built, but more importantly it captures what drifted from the plan during implementation, where the design said one thing and the code ended up somewhere slightly different, and why. That delta is often the most useful part. The summary goes to disk alongside the plan and design.

Commit is handled by a Haiku-class model, because at that point you are formatting a commit message, not thinking. Simple task, small model.

The model selection is not arbitrary. Opus for thinking, Sonnet for doing, Haiku for formatting. That maps roughly to what a thoughtful senior engineer would say about when to go slow versus when to go fast.
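As a routing table, the whole policy fits in a few lines. The assignments for research, design, and summary are my guesses at sensible defaults; the post above only pins down brainstorm, implement, review, and commit:

```python
# Tier per phase: "opus" = slow/deep, "sonnet" = fast/capable, "haiku" = cheap/simple.
PHASE_MODEL = {
    "brainstorm": "opus",    # exploring options (stated above)
    "research":   "opus",    # assumption: deep-dives want deep reasoning
    "plan":       "opus",    # assumption: planning-grade reasoning
    "design":     "opus",    # assumption: same tier as planning
    "implement":  "sonnet",  # code generation at speed (stated above)
    "review":     "opus",    # subtle security/perf issues (stated above)
    "summary":    "sonnet",  # assumption: mid-tier is enough for a write-up
    "commit":     "haiku",   # formatting a commit message (stated above)
}


def model_for(phase: str) -> str:
    # Default to the "doing" tier for any phase not in the table.
    return PHASE_MODEL.get(phase, "sonnet")
```
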

Across all of this are human checkpoints. Every phase boundary requires my explicit approval to proceed. I see what was produced, and I decide whether to continue, revise, or abort. Nothing moves forward without me saying so.
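Put together, the sequencing-with-checkpoints loop is small. This is an illustrative skeleton, not the real orchestrator: execute and approve are placeholders for the agent call and the human prompt, and conditional phases are assumed to be filtered out before the loop starts:

```python
def run_workflow(phases, execute, approve):
    """Run phases in order; every boundary needs an explicit human decision."""
    artifacts = {}
    i = 0
    while i < len(phases):
        phase = phases[i]
        artifact = execute(phase)
        decision = approve(phase, artifact)  # "continue" | "revise" | "abort"
        if decision == "abort":
            return artifacts, "aborted"
        if decision == "revise":
            continue                         # re-run the same phase
        artifacts[phase] = artifact
        i += 1                               # advance only on explicit approval
    return artifacts, "complete"
```

The point of the structure is that the default is to stop: nothing advances past a boundary unless the human at the gate explicitly says continue.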

What Actually Changed

Claude Code has been able to write good code for a while. What changed is where the expertise lives and who handles the coordination.

In the earlier workflow, both of those were me. All of that judgment lived in my head, and I applied it manually at each step. The discipline of moving through phases correctly was also mine to maintain, which meant it was mine to erode when I was tired or in a hurry.

The current workflow separates those responsibilities. The expertise now lives in the skills: they encode what a good implementation looks like, which libraries we use and why, and what I won't accept in a code review. None of it gets reconstructed from memory each time. Every project gets the same standards applied consistently, because they're written down rather than held in my head.

The orchestrator handles the coordination. Planning is done by a different agent with different instructions than implementation. Review is done by an agent specifically primed to look for security and performance issues, with no stake in defending the implementation choices. I can't skip the review because I'm tired. The process requires it.

The checkpoints are still where the twenty years matters. When I approve a plan, I'm evaluating whether the agent's approach is one I'd have chosen myself, whether it's glossing over something that will hurt later, whether the architecture will still make sense in six months. The workflow makes sure that scrutiny lands on something well-structured rather than something that already got half-built in the wrong direction.

The Honest Truth About Everyday Kits

Here's what the everyday kit people know and most tech writing omits: what's in my kit today is not what I'll be using in three months.

I'm actively working on a next-generation version of this workflow. The details aren't settled enough to write about yet. That's exactly the point.

This is the correct relationship with your tools. Use what works today, keep asking whether it could work better, and be willing to replace something you just built when something better comes along. Attachment to workflow is one of the more insidious forms of technical debt.

The everyday kit practitioners figured this out. You don't get the most out of a good tool by using it for everything. You get the most out of it by understanding exactly what it does well, pairing it with the right things, and constantly asking whether your kit could be better.