This is Part 1 of a 5-part series on mastering AI-assisted development. Each week, we’ll dive deeper into practical techniques for building production-ready applications with AI coding assistants.
A few weeks ago, I presented at a Xebia .NET synergy event on moving from ad-hoc “vibe coding” to structured, spec-driven development. The questions and discussions afterward showed that developers are wrestling with these same issues, so I decided to expand the content into this blog series.
Whether you attended the event or are discovering this topic for the first time, my goal is to give you practical, actionable guidance for building production-ready AI-assisted applications.
AI coding assistants have changed how we write software. The code they generate often works well enough to ship, but falls apart when you need to maintain or extend it. This guide shows you how to get from code that works to code that’s production-ready using specification-driven development.
Based on GitHub’s Spec-Kit and Addy Osmani’s “Beyond Vibe Coding” guide, this series walks you through a structured approach to AI-assisted development that actually works for real-world applications.
In this first post, we’ll explore the fundamental problem with unstructured AI coding and introduce you to the spec-driven approach. Next week, we’ll get hands-on with the actual workflow.
Software development has changed significantly in the past few years:
The question is how to use AI effectively while maintaining code quality.

The term “vibe coding” comes from Andrej Karpathy (former head of AI at Tesla). It describes an approach where you:
Accept AI suggestions without critical review, trusting the output completely
For prototypes and experiments? Maybe fine. For production code? We need something more structured.
Key insight: Vibe coding isn’t inherently bad code—it’s a specific approach where you trust the AI completely and don’t review what it produces.

Here’s the core problem with vibe coding: AI can get you 70% of the way incredibly fast. But that last 30%? That’s where things get difficult.
“We’ve seen apps leak database credentials because the AI ‘helpfully’ included them in client-side code.”
That’s not hypothetical—it happens.

AI-assisted development exists on a spectrum:
| Approach | Risk | Reward | Control |
|---|---|---|---|
| Autocomplete | Low | Low | High |
| Chatbot assistance | Medium | Medium | Medium |
| Agentic coding | High | High | Lower |
| Spec-driven development | Managed | High | High |
The key insight: As AI gets more capable, we need more structure, not less.

AI-assisted engineering is not about letting AI do whatever it wants. It’s about maintaining human oversight while leveraging AI capabilities.
Think of it like being the architect while AI is the contractor. You design, they build, but you review everything.

This is where the paradigm shift happens:
| OLD WAY | NEW WAY |
|---|---|
| Write code first | Write specifications first |
| Document later (maybe) | Code follows from specs |
| Specs are scaffolding | Specs are source of truth |
For decades, we treated specifications as scaffolding—useful during construction but discarded afterward. Now, specifications become the source of truth that generates the implementation.
As the Exploring Gen AI series on martinfowler.com discusses (the spec-driven development article there is by Birgitta Böckeler), spec-driven development exists on a maturity spectrum. Understanding these levels helps clarify where Spec-Kit fits.
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'fontSize':'18px'}}}%%
graph TB
subgraph Level1["<b>Level 1: Spec-First (Throwaway)</b>"]
direction TB
A1["📄 Write spec.md<br/>for feature"]
B1["⚙️ Generate<br/>code"]
C1["🗑️ Delete<br/>spec.md"]
D1[" "]
E1["📄 Write new<br/>spec.md"]
F1["⚙️ Update<br/>code"]
A1 --> B1
B1 --> C1
C1 -.->|New feature needed| D1
D1 -.-> E1
E1 --> F1
end
subgraph Level2["<b>Level 2: Spec-Anchored (Multiple Files)</b>"]
direction TB
A2["📄 Original<br/>spec.md"]
B2["⚙️ Generate<br/>code"]
D2[" "]
E2["📝 Write<br/>change-spec.md"]
F2["⚙️ Update<br/>code"]
G2["📄 spec.md stays<br/>but outdated"]
A2 --> B2
B2 -.->|Change needed| D2
D2 -.-> E2
E2 --> F2
F2 --> G2
end
subgraph Level3["<b>Level 3: Spec-as-Source (Single Truth)</b>"]
direction TB
A3["📄 spec.md<br/>is truth"]
B3["⚙️ Generate<br/>code"]
D3[" "]
E3["✏️ Edit<br/>spec.md"]
F3["♻️ Regenerate<br/>code"]
A3 --> B3
B3 -.->|Change needed| D3
D3 -.-> E3
E3 --> F3
F3 -.-> A3
end
style C1 fill:#ffdddd,stroke:#cc0000,stroke-width:3px
style E2 fill:#fff4cc,stroke:#cc9900,stroke-width:3px
style E3 fill:#ddffdd,stroke:#00cc00,stroke-width:3px
style F3 fill:#ddffdd,stroke:#00cc00,stroke-width:3px
style A3 fill:#ddffdd,stroke:#00cc00,stroke-width:3px
style Level1 fill:#f9f9f9,stroke:#666,stroke-width:2px
style Level2 fill:#f9f9f9,stroke:#666,stroke-width:2px
style Level3 fill:#f9f9f9,stroke:#666,stroke-width:2px
```
Level 1, spec-first (throwaway): You write a spec to help the AI understand what to build, then delete it once the code is generated. The spec was just scaffolding: useful temporarily, then discarded.
Problem: When you need to change the feature, you start from scratch with a new spec. No continuity, no history.
Level 2, spec-anchored (multiple files): The original spec persists, but changes are documented in separate files. You’re building a paper trail of the feature’s evolution, but the original spec becomes outdated.
Problem: You end up with spec.md, new-feature-spec.md, bug-fix-spec.md, and so on. Which one is the source of truth? You have to read them all in order.
Level 3, spec-as-source (single truth): The spec is the source of truth. When you need changes, you edit the spec and regenerate the code. The spec stays current because it’s the authoritative definition of what the system should do.
This is where Spec-Kit lives. The specification isn’t documentation of the code; the code is an implementation of the specification.
In traditional development, we wrote code and maybe documented it later. The code was the truth.
In spec-as-source development, we write specifications and generate code from them. The spec is the truth.
This isn’t just a philosophical shift—it’s practical. When bugs appear or requirements change, you update the spec and regenerate. The spec never drifts out of sync with reality.
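One way to make "the spec never drifts" checkable is to record a fingerprint of the spec when code is generated and compare it later. The sketch below is an illustration only, not something Spec-Kit provides; the function names and the idea of storing a hash alongside the generated code are assumptions.

```python
import hashlib


def spec_fingerprint(spec_text: str) -> str:
    """Hash the spec after normalizing line endings and trailing whitespace."""
    normalized = "\n".join(line.rstrip() for line in spec_text.splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def needs_regeneration(spec_text: str, recorded_fingerprint: str) -> bool:
    """True when the spec changed since the code was last generated."""
    return spec_fingerprint(spec_text) != recorded_fingerprint


spec_v1 = "## Acceptance Criteria\n- Tasks persist across browser sessions\n"
recorded = spec_fingerprint(spec_v1)  # would be stored at generation time

spec_v2 = spec_v1 + "- Works offline with sync when online\n"
print(needs_regeneration(spec_v1, recorded))  # unchanged spec
print(needs_regeneration(spec_v2, recorded))  # edited spec
```

Normalizing whitespace before hashing keeps purely cosmetic edits from triggering a regeneration; whether that is desirable depends on how strictly you want to treat the spec.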

Spec-Kit is GitHub’s open-source framework for spec-driven development.
```shell
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
specify init my-project
```

Four core principles guide Spec-Kit:
These aren’t arbitrary—they’re based on what actually works in practice.

Here’s the complete Spec-Kit workflow:
Constitution → Specification → Plan → Tasks → Implementation
Each step has a slash command. Each step produces artifacts for the next step. This chain creates accountability—you can trace any decision back to its source.
| Step | Command | Output |
|---|---|---|
| Constitution | /speckit.constitution | constitution.md |
| Specify | /speckit.specify | spec.md |
| Plan | /speckit.plan | plan.md, data-model.md, api-spec.json |
| Tasks | /speckit.tasks | tasks.md |
| Implement | /speckit.implement | Working code |
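To make the "each step produces artifacts for the next step" idea concrete, here is a small sketch that works out which step comes next from the files already present. The step order and file names come from the table above; the checking logic and the `next_step` helper are hypothetical and are not how Spec-Kit itself is implemented.

```python
# Step order and artifact names are taken from the workflow table; the
# checking logic is an illustration, not Spec-Kit's own implementation.
WORKFLOW = [
    ("constitution", ["constitution.md"]),
    ("specify", ["spec.md"]),
    ("plan", ["plan.md", "data-model.md", "api-spec.json"]),
    ("tasks", ["tasks.md"]),
]


def next_step(existing_files: set[str]) -> str:
    """Return the first step whose artifacts are missing, else 'implement'."""
    for step, outputs in WORKFLOW:
        if not all(name in existing_files for name in outputs):
            return step
    return "implement"


print(next_step({"constitution.md", "spec.md"}))
```

Because every step is gated on the artifacts of the steps before it, a missing file points directly at the stage of the chain that still needs attention.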

You might be thinking: “Wait, I already have custom instructions for Copilot or Claude Code. Isn’t this the same thing?”
Great question—and this is the heart of the confusion.
Both files are just Markdown. An LLM can read both the same way. So why does one work better?
Spec-Kit’s constitution is part of a multi-step enforced workflow:
constitution.md first
Custom instructions (like Copilot’s copilot-instructions.md):
| Spec-Kit | Custom instructions |
|---|---|
| Architect’s blueprint + construction plan + building permits checked at every phase | Style guide for a contractor + final inspection |
Both are valuable! You can even use both together—Copilot’s instructions for coding style, Spec-Kit’s workflow for complex features.

Think of this as your project’s “bill of rights”—the principles that guide all decisions.
/speckit.constitution
The AI references this during all phases. It’s your guardrail against scope creep and over-engineering.

Focus on what and why—not how.
❌ “Build me a todo app”
✅ A specification with user stories and acceptance criteria:
## User Stories
As a busy professional, I want to:
- Quickly capture tasks with minimal friction
- See my tasks organized by priority
- Mark tasks complete with a single tap
## Acceptance Criteria
- Task creation takes < 2 seconds
- Tasks persist across browser sessions
- Works offline with sync when online
The more context you provide here, the better your results throughout the entire process.
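Acceptance criteria written this precisely can be turned into executable checks. The sketch below does that for two of the criteria, using a hypothetical `TaskStore` backed by a JSON file as a stand-in for the real app's storage; a browser app would use different storage and tooling, so treat this as an illustration of the idea rather than a test suite for the todo app.

```python
import json
import tempfile
import time
from pathlib import Path


class TaskStore:
    """Hypothetical stand-in for the app's storage layer (JSON file on disk)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.tasks = json.loads(path.read_text()) if path.exists() else []

    def add(self, title: str) -> None:
        self.tasks.append({"title": title, "done": False})
        self.path.write_text(json.dumps(self.tasks))


def check_acceptance_criteria(path: Path) -> dict[str, bool]:
    """Evaluate two acceptance criteria against the stand-in store."""
    started = time.perf_counter()
    TaskStore(path).add("Write spec")
    elapsed = time.perf_counter() - started

    reopened = TaskStore(path)  # a new instance simulates a new session
    return {
        "creation under 2 seconds": elapsed < 2.0,
        "persists across sessions": reopened.tasks[0]["title"] == "Write spec",
    }


with tempfile.TemporaryDirectory() as tmp:
    results = check_acceptance_criteria(Path(tmp) / "tasks.json")
print(results)
```

A criterion that maps onto a check like this is usually specific enough; one that cannot be checked at all is often a sign the spec needs more detail.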

Now you specify the tech stack. Not before.
Why wait? Because understanding what you’re building should drive how you build it.
- plan.md: Overall architecture
- data-model.md: Your data structures
- api-spec.json: API contracts
- research.md: Framework recommendations

Pro tip: Ask the AI to research rapidly-changing frameworks. Its training data might be outdated on specific library versions.

This takes your plan and breaks it into actionable, implementable chunks.
The key is ordered execution—dependencies are respected automatically.
Review these tasks! This is your last chance to adjust scope before implementation begins.
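Ordered execution with dependencies is essentially a topological sort. The sketch below shows the idea with Python's standard `graphlib` module; the task names and their dependencies are made up for illustration, and this is not a description of how Spec-Kit orders tasks internally.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "T1 data model": set(),
    "T2 storage layer": {"T1 data model"},
    "T3 task list UI": {"T1 data model"},
    "T4 offline sync": {"T2 storage layer", "T3 task list UI"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return tasks so that every task comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

T1 always comes first and T4 last, while T2 and T3 can run in either order. A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is a useful signal that the task breakdown needs another look.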

The /speckit.implement command:
Load Constitution → Validate Spec → Review Plan → Execute Tasks → Run Tests
Critical: Test the application after completion. Feed runtime errors back to the AI.
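Feeding runtime errors back works best when you capture the full traceback rather than retyping the message. A minimal sketch of that, assuming a Python codebase; `run_and_capture` and `broken_check` are hypothetical helpers, not part of Spec-Kit.

```python
import traceback
from typing import Callable, Optional


def run_and_capture(check: Callable[[], None]) -> Optional[str]:
    """Run a check; return the formatted traceback if it fails, else None."""
    try:
        check()
    except Exception:
        return traceback.format_exc()
    return None


def broken_check() -> None:
    """Simulates a runtime failure: indexing into a missing user record."""
    user = None
    user["email"]  # raises TypeError


error_report = run_and_capture(broken_check)
if error_report is not None:
    # Paste this text into the next prompt, together with the expected behavior.
    print(error_report)
```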

What happens when specs change or bugs appear? This is frontier territory, but here’s the workflow:
- /speckit.analyze: Cross-artifact consistency check
- /speckit.checklist: Verify readiness
- /implement fix bug: [description with full context]

- spec.md
- /speckit.plan
- /speckit.tasks
- /implement

Key principle: “Specification is durable, plan/tasks are flexible”
After any fix, ask AI to: “Update plan, tasks, data-model to reflect this change”

Four reasons:
Result: You get the full 100%, not just the easy 70%.

❌ “Why is my code not working?”
✅ A report with the location, the error, and the expected behavior:
The handleSubmit function in UserForm.tsx throws
"Cannot read property 'email' of undefined" on line 47
when the form is submitted with empty fields.
Stack trace:
[full trace here]
Expected: Form validation should prevent submission
Actual: Error thrown before validation runs
The quality of AI output is directly proportional to the context you provide.
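If you find yourself writing reports like this often, a small template keeps the required context from being forgotten. The sketch below is one possible shape; the `BugReport` class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """Hypothetical template for a context-rich bug report."""

    location: str
    error: str
    trigger: str
    expected: str
    actual: str
    stack_trace: str = ""

    def to_prompt(self) -> str:
        """Render the report as text that can be pasted into a prompt."""
        lines = [
            f'{self.location} throws "{self.error}" {self.trigger}.',
            f"Expected: {self.expected}",
            f"Actual: {self.actual}",
        ]
        if self.stack_trace:
            lines += ["Stack trace:", self.stack_trace]
        return "\n".join(lines)


report = BugReport(
    location="The handleSubmit function in UserForm.tsx",
    error="Cannot read property 'email' of undefined",
    trigger="when the form is submitted with empty fields",
    expected="Form validation should prevent submission",
    actual="Error thrown before validation runs",
)
print(report.to_prompt())
```

Making every field except the stack trace mandatory means a report cannot be created without stating what was expected and what actually happened.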

This is exactly what happens when you say “build me a todo app” without planning:
You ask for a bicycle. The AI proudly presents… a massive over-engineered robot spaceship.
“Give me options, starting with the simplest. Don’t code yet.”
Ask for architecture OPTIONS first. Start with the simplest viable solution.

After every AI update:
❌ “It’s broken”
✅ “The submit button should save the form data, but instead it shows ‘TypeError: Cannot read property map of undefined’ in the console”
Small, incremental testing prevents nightmare debugging sessions.

The key question: Will someone (including future you) need to understand this code later?

Now that you understand the why behind spec-driven development, you’re ready for the how.
We’ll do a hands-on walkthrough of the complete workflow:
/speckit.implement
Each step will include real code examples, common mistakes, and troubleshooting tips.
Before next week’s post, you can:
```shell
# Install Spec-Kit CLI
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
# Verify installation
specify --version
```
We’ll use this in Part 2 to build a real application together.

Vibe coding gets you 70%: The last 30% is where real engineering happens
Specifications are the new source code: Write them first, code follows
Structure enables speed: More guardrails means less debugging
Three levels of spec-driven development: Spec-Kit operates at Level 3 (spec-as-source)
You’re the architect: AI is a tool, but you make the decisions
Next week in Part 2, we’ll put these concepts into practice with a complete walkthrough of the Spec-Kit workflow.
📍 You are here: Part 1 - The problem and the solution
This series is based on a presentation I gave about moving from ad-hoc AI-assisted coding to structured, specification-driven development. The full presentation slides are available for download.
Questions or feedback? Connect with me on LinkedIn or check out more posts at hiddedesmet.com.
Want to get notified when Part 2 drops? Follow me on LinkedIn for updates.