Doc-First Development: Programming in Markdown

I don’t write code anymore. I don’t write docs either. I work with Claude to produce both.

The .md file is the source of truth. The code is generated from it. But the doc itself? Also generated—through conversation, iteration, pushing back until it’s right.

The Workflow

Step 1: Brain dump. Stream of consciousness to Claude. Misspellings, run-on sentences, whatever. It handles the mess.

Step 2: Ask for an architecture plan. Always. Before any code. Claude produces a draft.

Step 3: Be nitpicky. LLMs are weak at slow, deliberate (system-2) thinking. They don’t always interpret instructions the way you meant them. Push back on approach and structure. This is where you catch misinterpretations.

Step 4: Iterate until it’s exactly right. Don’t accept the first plan. Keep refining.

Step 5: Let it implement. Never manually edit files. Re-prompt instead.

That last part is the hardest for most developers. Every manual edit is a bottleneck. You’re faster re-prompting than you are typing—for code and docs.

Why Planning Matters More Now

Counter-intuitive: AI makes planning more important, not less.

When implementation is instant, bad architecture costs you instantly too. You can build the wrong thing in an afternoon. The constraint isn’t coding speed anymore—it’s knowing what to build.

Spend the time upfront. Argue with the plan. Push back on approach and structure. That’s where you add value.

The Three Mistakes Everyone Makes

1. Not Asking For Enough

People are still limited by what was possible before. They ask for small changes because that’s what they’re used to.

Ask for more. The ceiling is way higher now. Describe whole features, entire systems. See what happens.

2. Under-Documenting

Documentation used to be expensive. Write the code, then spend hours documenting it. Nobody wanted to do that.

Now documentation is cheap. Claude writes it. And here’s the thing: documentation is Claude’s long-term memory.

That architecture.md file? It’s not for humans. It’s so the next Claude session knows what you built. Over-document. It pays off.

3. Conforming AI Code to Human Standards

Stop trying to make Claude write code “the way you used to write it.”

Humans aren’t going to read this code line by line anymore. It will be read and edited agentically. Your old formatting preferences don’t matter. Let go of them.

Testing Non-Deterministic Systems

The hard question: LLMs give different answers each time. How do you test?

Split deterministic from non-deterministic. Design your system so LLMs return structured data alongside prose. The JSON fields (extracted entities, classifications, scores) should come back the same for a given input, so treat them as deterministic. Test them traditionally. The creative text? That needs a different approach.
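Here is a minimal sketch of that split. Everything in it is made up for illustration: the `TriageResult` shape, the field names, and `stub_model`, which stands in for a real model call so the example runs offline. The point is only which fields get exact assertions and which don't.

```python
import json
from dataclasses import dataclass


@dataclass
class TriageResult:
    category: str        # classification: assert exactly
    entities: list[str]  # extracted entities: assert exactly
    urgency: int         # score: assert a range
    reply_text: str      # free-form prose: do NOT assert exact wording


def parse_triage(raw: str) -> TriageResult:
    """Parse the model's JSON reply into typed fields."""
    data = json.loads(raw)
    return TriageResult(
        category=data["category"],
        entities=sorted(data["entities"]),
        urgency=int(data["urgency"]),
        reply_text=data["reply_text"],
    )


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call (assumption, not a real API)."""
    return json.dumps({
        "category": "billing",
        "entities": ["order 1182"],
        "urgency": 4,
        "reply_text": "Sorry about the double charge. A refund is on its way.",
    })


def test_structured_fields():
    raw = stub_model("I was charged twice for order 1182, please refund me.")
    result = parse_triage(raw)
    # Structured fields: traditional assertions.
    assert result.category == "billing"
    assert result.entities == ["order 1182"]
    assert 1 <= result.urgency <= 5
    # Prose: only check that something came back; tone and wording are judged elsewhere.
    assert result.reply_text.strip()
```

Swap `stub_model` for your real call and the assertions stay the same.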

LLM-as-judge for the rest. For truly non-deterministic outputs, use a judge model. “Does this reply match the intended tone?” “Did the summary capture the key points?” The judge returns a boolean or score you can assert against.
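A rough sketch of the judge pattern, under some assumptions: the prompt wording, the PASS/FAIL convention, and the `ask_model` callable are all placeholders I picked for the example, and `stub_judge_model` fakes the judge with a keyword check so the code runs without a network call. In a real suite `ask_model` would wrap whatever model client you use.

```python
from typing import Callable

# Hypothetical judge prompt; the exact wording is an assumption.
JUDGE_PROMPT = (
    "You are grading a reply.\n"
    "Criterion: {criterion}\n"
    "Reply: {reply}\n"
    "Answer with exactly PASS or FAIL."
)


def judge(reply: str, criterion: str, ask_model: Callable[[str], str]) -> bool:
    """Ask a judge model for a verdict and reduce it to a boolean."""
    verdict = ask_model(JUDGE_PROMPT.format(criterion=criterion, reply=reply))
    return verdict.strip().upper().startswith("PASS")


def stub_judge_model(prompt: str) -> str:
    """Fake judge for this example only: passes replies that contain an apology."""
    reply = prompt.split("Reply: ", 1)[1].split("\n", 1)[0]
    return "PASS" if "sorry" in reply.lower() else "FAIL"


def test_reply_tone():
    criterion = "Does this reply match an apologetic tone?"
    assert judge("We're sorry about the double charge.", criterion, stub_judge_model)
    assert not judge("Your ticket is closed.", criterion, stub_judge_model)
```

Forcing the judge into PASS or FAIL is the design choice that matters: the judge's own prose can vary, but the thing you assert on is a boolean.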

Allow multiple valid interpretations. “Schedule a meeting with the London team”: did the user mean London, UK or London, Ontario? Both could be valid depending on context. Your tests should accept either when the input is genuinely ambiguous.
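One way to write that down, as a sketch: each test case carries a set of accepted answers instead of a single expected value. The `Case` type, the location strings, and `stub_resolve_location` are hypothetical; the stub stands in for the model-backed resolver.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Case:
    prompt: str
    accepted: frozenset[str]  # every interpretation the test will allow


CASES = [
    # Genuinely ambiguous: accept either city.
    Case("Schedule a meeting with the London team",
         frozenset({"London, UK", "London, Ontario"})),
    # Unambiguous: accept exactly one.
    Case("Schedule a meeting with the London, Ontario team",
         frozenset({"London, Ontario"})),
]


def stub_resolve_location(prompt: str) -> str:
    """Hypothetical stand-in for the model-backed resolver."""
    return "London, Ontario" if "Ontario" in prompt else "London, UK"


def check(case: Case, resolve: Callable[[str], str]) -> bool:
    """Pass when the resolver's answer is any of the accepted interpretations."""
    return resolve(case.prompt) in case.accepted


def test_locations():
    for case in CASES:
        assert check(case, stub_resolve_location), case.prompt
```

Keep the accepted set tight. An ambiguous prompt gets two answers, not a wildcard, so a resolver that returns something unrelated still fails.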

We built 500+ integration tests this way for a production system. It works.

The Stack Level Up

Code review didn’t disappear. It moved up a level.

Old way: Read code line by line, check for bugs.

New way: Build integration tests with AI. Step through features. Verify the system works.

Same loop, higher abstraction. You operate from a level up now.

Stream of Consciousness Works

Your prompts don’t need to be perfect. Brain dump. Ramble. Misspell things.

Claude understands poor typing and run-on sentences. The iteration fixes problems. Perfect prompting is unnecessary overhead.

Make it fun for yourself. It still works.

Why This Is Actually Fun

Here’s what surprised me: this way of working is genuinely enjoyable.

It’s a dopamine rush. You describe something, it exists minutes later. That feedback loop is fast enough to stay in flow.

I’ve replaced 30 hours a week of video games with building software. Not because I should—because shipping fast hits the same reward centers.

The friction disappeared. When building becomes as responsive as gaming, you don’t want to stop.

380+ commits, 5 products, 2 people. All built by programming in markdown.