We Rebuilt Our Video Pipeline as an AI Agent
From brittle scripts to a self-correcting agent.
Our video generation pipeline was a mess. Dozens of scripts, manual handoffs, constant babysitting. One bad output and you’d restart from scratch.
So we rewrote it as an AI agent.
The Problem With Linear Pipelines
Traditional video automation looks like this: script → voice → video → done. Linear. Brittle. If the script is bad, you get a bad video. No feedback loops.
We kept hitting the same issues:
- Scripts that sounded robotic
- Facts that drifted from the source material
- Transitions that didn’t flow
- Sections that repeated the same points
The fix was always the same: human reviews the output, spots the problems, feeds corrections back in. We were the quality loop.
Making the AI Its Own Critic
The insight: don’t try to get it right the first time. Build in self-correction.
Our agent runs in phases:
- Research - Extract facts and evidence from source material
- Outline - Build narrative structure
- Draft - Write the full script
- Quality Loop - Analyze, critique, rewrite until it’s good
- Audio - Generate narration
- Video - Compose final output
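The phases above can be sketched as a simple state-passing pipeline. This is an illustrative skeleton, not the real implementation: the function names are hypothetical stand-ins for steps that actually call LLMs and media tools.

```python
# Hypothetical sketch of the phased agent: each phase takes the shared
# state dict and returns an updated one. Real phases call models/tools.
from typing import Callable

def research(source: str) -> dict:
    # Phase 1: extract facts/evidence from the source material.
    return {"facts": [line for line in source.splitlines() if line.strip()]}

def outline(state: dict) -> dict:
    # Phase 2: build narrative structure (placeholder structure here).
    state["outline"] = ["intro", "body", "conclusion"]
    return state

def draft(state: dict) -> dict:
    # Phase 3: write the full script from the outline.
    state["script"] = " ".join(state["outline"])
    return state

def quality_loop(state: dict) -> dict:
    # Phase 4: analyze, critique, rewrite until gates pass (stubbed).
    state["approved"] = True
    return state

PHASES: list[Callable[[dict], dict]] = [outline, draft, quality_loop]

def run(source: str) -> dict:
    state = research(source)
    for phase in PHASES:
        state = phase(state)
    return state
```

The point of the shape: every phase reads and writes one state object, which is what makes the checkpointing described later cheap to add.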
The magic is phase 4. Instead of hoping the first draft is good, we assume it won’t be. The agent analyzes its own work, flags issues, rewrites the weak sections, and checks again.
The Quality Loop
The quality phase runs multiple iterations. Each pass:
- Coherence check - Does the argument flow? Are transitions smooth? Is the voice consistent?
- Fact validation - Do claims match the source research?
- Issue flagging - Which sections need work?
- Targeted rewrites - Fix only the flagged parts
- Diminishing returns check - Are we still improving?
It keeps looping until quality gates pass or improvements plateau.
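The stopping logic is the interesting part: pass the gates, or detect that scores have plateaued. A minimal sketch, assuming hypothetical `critique` and `rewrite` callables and illustrative thresholds:

```python
# Sketch of the quality loop. critique(script) returns (score, flagged
# sections); rewrite(script, flagged) fixes only the flagged parts.
# Threshold values are illustrative, not the pipeline's real numbers.
MIN_SCORE = 0.8    # quality gate
MIN_DELTA = 0.02   # diminishing-returns threshold
MAX_PASSES = 5

def quality_loop(script, critique, rewrite):
    prev_score = 0.0
    for _ in range(MAX_PASSES):
        score, flagged = critique(script)
        if score >= MIN_SCORE:
            break  # quality gates pass
        if prev_score > 0 and score - prev_score < MIN_DELTA:
            break  # improvements have plateaued
        script = rewrite(script, flagged)  # targeted rewrites only
        prev_score = score
    return script
```

Note that `rewrite` only receives the flagged sections, which is why the loop stays cheap: most of the script is never regenerated.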
This sounds expensive. It’s not. Targeted rewrites are cheap. You’re not regenerating everything—just the weak sections. And catching problems early saves the real cost: your time reviewing garbage output.
What We Learned
1. Separate the Thinker From the Doer
The agent that writes isn’t the same as the agent that critiques. Different prompts, different roles. The critic is harsh. The writer takes feedback and improves.
This mirrors how good human teams work. Writers need editors. The same brain that created something is bad at finding its flaws.
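In code, the separation is just two system prompts wired to the same model. A hedged sketch, with `llm()` as a stand-in for whatever model API the pipeline actually uses (the prompt text here is invented for illustration):

```python
# Writer and critic are separate calls with separate system prompts.
CRITIC_PROMPT = "You are a harsh editor. List concrete flaws. Do not praise."
WRITER_PROMPT = "You are a script writer. Revise the script to fix the listed flaws."

def llm(system: str, user: str) -> str:
    # Stand-in for a real LLM API call.
    return f"[{system[:12]}...] response to: {user[:20]}"

def critique(script: str) -> str:
    return llm(CRITIC_PROMPT, script)

def revise(script: str, feedback: str) -> str:
    return llm(WRITER_PROMPT, f"SCRIPT:\n{script}\n\nFEEDBACK:\n{feedback}")
```

Keeping the roles in separate calls also means the critic never sees its own earlier praise, which keeps it harsh.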
2. Facts Drift Without Grounding
LLMs hallucinate. Everyone knows this. But they also drift—they’ll start with your source material and slowly wander toward generic takes.
The fix: explicit fact validation against the original research. Not “does this sound true” but “does this match what we extracted in phase 1.”
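A sketch of that grounding check. The real pipeline would presumably use an LLM entailment call per claim; here, crude token overlap stands in for "does this match what we extracted," just to show the shape:

```python
# Validate draft claims against the facts extracted in phase 1.
# Token overlap is a crude stand-in for a real entailment check.
def overlap(claim: str, fact: str) -> float:
    a, b = set(claim.lower().split()), set(fact.lower().split())
    return len(a & b) / max(len(a), 1)

def unsupported_claims(claims, facts, threshold=0.5):
    """Return claims that no extracted fact supports."""
    return [c for c in claims
            if not any(overlap(c, f) >= threshold for f in facts)]
```

The key design point survives the simplification: the reference set is the phase-1 extraction, not the model's general knowledge, so drift toward generic takes gets flagged.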
3. Quality Gates Beat Vibes
We used to eyeball outputs. “Yeah, that’s pretty good.” Now we have explicit criteria: argument flow score, evidence variety score, transition quality. Numbers.
The agent can measure itself. No more subjective “good enough.”
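Gates like these are easy to make explicit in code. A minimal sketch, with illustrative metric names and floors (the post names the metrics; the numbers here are assumptions):

```python
# Explicit quality gates: each score comes from a critic pass.
from dataclasses import dataclass

@dataclass
class QualityReport:
    argument_flow: float       # 0..1
    evidence_variety: float    # 0..1
    transition_quality: float  # 0..1

# Minimum acceptable score per metric (floors are illustrative).
GATES = {"argument_flow": 0.8,
         "evidence_variety": 0.7,
         "transition_quality": 0.75}

def passes(report: QualityReport) -> bool:
    return all(getattr(report, name) >= floor
               for name, floor in GATES.items())
```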
4. State Saves Everything
Long pipelines fail. Network issues, API limits, whatever. Our agent saves state after each phase. Crash at minute 45? Resume from where it stopped.
This was annoying to build. Worth it every time something breaks halfway through.
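The checkpointing pattern itself is simple; the annoying part is making every phase's state serializable. A sketch under those assumptions (file name and phase shape are illustrative):

```python
# Per-phase checkpointing: persist state after every phase and skip
# already-completed phases on resume. Path/name are illustrative.
import json
import os

def run_with_checkpoints(phases, state, path="pipeline_state.json"):
    """phases is a list of (name, fn) pairs; fn(state) -> state."""
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)  # resume from the last checkpoint
    done = set(state.get("done", []))
    for name, phase in phases:
        if name in done:
            continue  # completed before the crash; skip it
        state = phase(state)
        done.add(name)
        state["done"] = sorted(done)
        with open(path, "w") as f:
            json.dump(state, f)  # checkpoint after every phase
    return state
```

Re-running the same command after a crash is then a no-op for finished phases, which is exactly the "resume at minute 45" behavior.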
The Result
What used to take an afternoon of babysitting now runs unattended. Drop in a source document, come back to a finished video. The agent handles the quality loop we used to do manually.
Is every output perfect? No. But the floor is higher. Bad outputs are rare instead of common. And when something is off, it’s usually a judgment call, not an obvious mistake the agent should have caught.
The shift: from hoping AI gets it right to expecting it to self-correct. Build the feedback loop into the system. Let the agent be its own critic.
That’s the pattern. Works for video, works for anything where quality matters and first drafts are usually rough.