Clawdbot With Teeth
Everyone's buying Mac Minis. I'm building a local compute grid.
Everyone’s buying Mac Minis to run Clawdbot. My Twitter feed is full of them—unboxing photos, screenshots of Discord bots running 24/7, people bragging about their “always-on AI assistant.” I get the appeal. Low power, clean setup, sits quietly on a shelf while your agent handles tasks in the background.
But here’s what those setups actually are: expensive API routers. The Mac handles orchestration—message routing, session management, tool execution—while Claude or GPT does the actual thinking. Every inference call goes out to the cloud. The Mac Mini is a middleman with good cable management.
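Strip away the Discord glue and that whole setup reduces to a few lines. Here's a minimal sketch of the pattern in Python; the Anthropic client stands in for whichever cloud API the bot actually calls, and the model id is an assumption:

```python
# The Mac Mini's whole job: hold session state, route messages,
# and forward every actual inference call to the cloud.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
sessions: dict[str, list[dict]] = {}  # per-user conversation state

def handle_message(user_id: str, text: str) -> str:
    history = sessions.setdefault(user_id, [])
    history.append({"role": "user", "content": text})
    # All the thinking happens off-machine. The Mac just waits.
    reply = client.messages.create(
        model="claude-opus-4-5",  # assumption: whatever the bot is configured for
        max_tokens=1024,
        messages=history,
    )
    answer = reply.content[0].text
    history.append({"role": "assistant", "content": answer})
    return answer
```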
I went a different direction.
The 5090 Setup
I’ve got an ASUS ROG Strix SCAR 18 with an RTX 5090 on the way: 24GB of VRAM, running Ubuntu 24.04. This machine isn’t a router. It’s a workhorse. Qwen3-30B-A3B runs locally on this thing: 30 billion total parameters, with only about 3 billion active per token thanks to its mixture-of-experts architecture. It would’ve been frontier-class 18 months ago. Now it screams on a laptop GPU.
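Serving it is the easy part. vLLM, llama.cpp, and Ollama all expose an OpenAI-compatible endpoint, so the client code looks identical to the cloud version, just pointed at localhost. A sketch, assuming a server on port 8000 that registers the model under its Hugging Face name:

```python
from openai import OpenAI

# Same client library as the cloud, but inference never leaves the machine.
local = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = local.chat.completions.create(
    model="Qwen/Qwen3-30B-A3B",  # assumption: how the local server names the model
    messages=[{"role": "user", "content": "Summarize yesterday's client notes."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```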
The plan is to set up Clawdbot on the 5090 and actually teach it the business. Not just “here’s my calendar”—I mean deep context. Wild Pines projects, client histories, our content pipeline, how we build things. I want an agent that understands our workflows well enough to do real work, not just answer questions about work.
Claude Max stays in the mix for the hard stuff. Opus 4.5 is still the best model available, and some tasks genuinely need it. But the local compute handles the daily volume (summarization, code review, research, drafting), so I’m not burning through my Opus budget on routine tasks. The 5090 does the heavy lifting. Opus handles the precision strikes.
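In code, that split is just a routing decision. A sketch of the tiering; the task taxonomy, endpoints, and model ids are all assumptions about how I'll wire it up:

```python
from anthropic import Anthropic
from openai import OpenAI

local = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
cloud = Anthropic()

# Assumption: a crude task taxonomy. Routine volume stays local.
ROUTINE = {"summarize", "code_review", "research", "draft"}

def run_task(kind: str, prompt: str) -> str:
    if kind in ROUTINE:
        # Daily volume runs on the 5090.
        r = local.chat.completions.create(
            model="Qwen/Qwen3-30B-A3B",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=2048,
        )
        return r.choices[0].message.content
    # Precision strikes go to Opus.
    r = cloud.messages.create(
        model="claude-opus-4-5",  # assumption: model id
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return r.content[0].text
```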
The Upgrade Question
I’m considering turning this into a full AI datacenter by adding an NVIDIA DGX Spark. 128GB of unified memory. Enough to run Qwen3-235B-A22B, a 235-billion-parameter model whose weights won’t fit on any consumer GPU. Deep reasoning. Massive context windows. Throw a 500-page contract at it and let it process the whole thing in a single pass.
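The back-of-the-envelope math on why 128GB is the unlock (weights only; KV cache and activations cost more on top):

```python
# Memory footprint of model weights at different quantization levels.
def weights_gb(params_billions: float, bits_per_param: int) -> float:
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"Qwen3-235B at {bits}-bit: ~{weights_gb(235, bits):.0f} GB")
# 16-bit: ~470 GB, 8-bit: ~235 GB, 4-bit: ~118 GB.
# Even a 4-bit quant blows past a 24GB GPU but fits in 128GB unified memory.
```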
But that’s a $4,000 box. I don’t want to buy hardware that sits idle because I overestimated what I’d actually use it for.
So I’m going to make Clawdbot argue for it.
Get the 5090 setup running first. Let Clawdbot learn the business, understand the workflows, run into the walls. Let it hit tasks where it needs to say “I can’t fit this in context” or “this reasoning chain would take too long locally.” Then I’ll ask it to build the case for the Spark. Plan the use cases. Justify the spend. Show me the workflows that would unlock.
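The mechanical version of "run into the walls" is a gap log: every time a task overflows the local context window or blows the latency budget, write it down. A sketch, with both thresholds as assumptions:

```python
import json
import time
from dataclasses import asdict, dataclass

LOCAL_CTX_TOKENS = 32_768  # assumption: the context window served locally
LATENCY_BUDGET_S = 120     # assumption: how long a local task may reasonably take

@dataclass
class CapabilityGap:
    task: str
    reason: str   # "context_overflow", "latency", or "quality_escalation"
    detail: str
    timestamp: float

def log_gap(task: str, reason: str, detail: str, path: str = "gaps.jsonl") -> None:
    gap = CapabilityGap(task, reason, detail, time.time())
    with open(path, "a") as f:
        f.write(json.dumps(asdict(gap)) + "\n")

# Called from the agent loop, e.g.:
# log_gap("contract_review", "context_overflow",
#         f"needed ~180k tokens, local window is {LOCAL_CTX_TOKENS}")
```

When I ask Clawdbot to make the case for the Spark, the argument starts as a summary over gaps.jsonl rather than vibes.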
If Clawdbot can’t make a compelling argument for its own compute upgrade, maybe it doesn’t need it. If it can—if it’s hitting real limits and can articulate exactly what more compute would enable—then I’ll know the investment is worth it.
Freedom and Trajectory
The real value here isn’t cost savings on API calls. It’s freedom.
I can run thinking models with massive chain-of-thought outputs all day. Let an agent reason through a complex problem for twenty minutes without watching a meter tick. Load an entire codebase into context and ask architectural questions. Process sensitive business data without it ever leaving my machines.
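The codebase-in-context trick is mostly a matter of checking that the repo actually fits before you send it. A rough sketch; the repo path is hypothetical, and chars-divided-by-four is a crude token heuristic:

```python
from pathlib import Path

CTX_TOKENS = 32_768  # assumption: whatever window the local model serves

def load_repo(root: str, exts=(".py", ".ts", ".md")) -> str:
    # Concatenate source files into one prompt blob, tagged by path.
    parts = []
    for p in sorted(Path(root).rglob("*")):
        if p.is_file() and p.suffix in exts:
            parts.append(f"# file: {p}\n{p.read_text(errors='ignore')}")
    return "\n\n".join(parts)

blob = load_repo("./my-project")  # hypothetical repo path
est_tokens = len(blob) // 4       # rough chars-per-token estimate
print(f"~{est_tokens:,} tokens against a {CTX_TOKENS:,}-token window")
# On your own hardware, the only cost of maxing out the window is time, not dollars.
```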
No rate limits. No vendor lock-in. No “is this task worth the API call” calculus that makes you second-guess every interaction. The compute is mine. I use it however I want.
And the timing is right. Local models finally crossed the threshold where this makes sense. Open source caught up. Qwen, DeepSeek, and GLM are matching last year’s cloud flagships on coding and reasoning benchmarks. A year ago, running models locally meant real compromise. Now it means running capable models on your own hardware, and the next generation will be better. Every six months, what you can run locally gets more impressive.
This is infrastructure that grows more valuable over time. The compute I invest in now will run whatever drops next.
The Bottom Line
I’m building toward something: a mini AI datacenter run by my own AI assistant. Real compute for coding, research, content, analysis—without asking permission or watching a meter. Infrastructure I own and control.
But first, the 5090 has to prove the concept. And Clawdbot has to earn its upgrade.