I let Claude Code run my monorepo from a Hetzner box so I could ship from my phone

I'm about to go on vacation and I don't want to stop shipping. I also don't want to babysit a laptop on a beach. So I spent an evening turning my monorepo into something that mostly maintains itself, and set it up so I can steer it from my phone.

This is the honest write-up: what I built, what it actually did while I wasn't watching, and the parts I deliberately didn't let it do.

The setup: one cheap server, no laptop required

The whole thing runs on a single Hetzner box (a ~$15/mo shared VPS — the same one that already hosts a dozen prototype services, Postgres, Redis, and Caddy). On it I run claude remote-control, which keeps a Claude Code session alive and lets me dispatch prompts to it from my phone. No SSH, no tmux gymnastics, no laptop.

Getting claude remote-control to survive as a systemd service was its own small adventure. There are three separate first-launch traps: a binary-path symlink, an opt-in consent prompt, and a workspace-trust dialog that deliberately refuses to be answered from a script. The fix for the last one is to write the trust decision straight into ~/.claude.json rather than trying to fake a keypress. The CLI rejects piped answers to its trust prompt on purpose, and it's right to.
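A rough sketch of what "write the trust decision into the config" could look like. The layout used here (a top-level "projects" map keyed by workspace path, with a boolean trust key) and the key name itself are assumptions, not something this post confirms. Check them against the file your own Claude Code version writes before relying on this.

```python
import json
from pathlib import Path

# ASSUMPTION: key name and file layout are illustrative; verify them against
# the ~/.claude.json that your own installation produces.
TRUST_KEY = "hasTrustDialogAccepted"


def mark_workspace_trusted(config_path: Path, workspace: Path) -> dict:
    """Record a trust decision for `workspace`, preserving all other settings."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    projects = config.setdefault("projects", {})
    project = projects.setdefault(str(workspace), {})
    project[TRUST_KEY] = True

    # Write to a sibling temp file, then rename, so a crash mid-write
    # never leaves a half-written config behind.
    tmp = config_path.with_name(config_path.name + ".tmp")
    tmp.write_text(json.dumps(config, indent=2))
    tmp.replace(config_path)
    return config
```

Run it once as the service user before the unit first starts, for example mark_workspace_trusted(Path.home() / ".claude.json", Path("/srv/monorepo")), where /srv/monorepo is a placeholder path.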

The loops: two cron chains instead of ninety

The core idea is loop chains: recurring prompts that do maintenance work on a schedule. I'd accumulated ~93 separate scheduled tasks over time, which was unmanageable. I collapsed them into exactly two:

Frequent chain (now every 15 minutes): ten rotating maintenance steps — build/test a service, fill a test-coverage gap, a dead-code sweep, a dependency audit, a security scan, an API-contract check between frontend and backend, a migration-drift check, a landing-page audit, an SEO pass, and a "lessons linter" that turns past bugs into grep checks.
Periodic chain (daily): the cheap reports first (build status, revenue pulse, uptime, failed-job triage), then the expensive stuff last — including a self-feeding product loop: one step mines real user pain points and writes a PRD; the next picks the top PRD off the queue and implements it end to end — plan, code, security audit, tests, ship.

Each step keeps its own rotation pointer in a state file, so over time it sweeps the entire monorepo instead of hammering the same service.
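A minimal sketch of the rotation-pointer idea, assuming a single JSON state file keyed by step name. The file layout, the function name, and the service names in the usage note are all hypothetical.

```python
import json
from pathlib import Path


def next_target(state_file: Path, step: str, services: list[str]) -> str:
    """Return the next service for `step` and advance that step's pointer."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    # The modulo keeps the pointer valid even if the service list has shrunk.
    index = state.get(step, 0) % len(services)
    state[step] = (index + 1) % len(services)
    state_file.write_text(json.dumps(state, indent=2))
    return services[index]
```

Because each step has its own key, the dead-code sweep and the dependency audit advance independently. Calling next_target(state, "dead-code", ["billing", "auth", "reports"]) on three consecutive runs visits each service once before wrapping around.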

Guarded autonomy: the part that makes it safe

"An AI that auto-merges and deploys to prod while you're on a beach" sounds reckless, and it would be without guardrails. So I wrote a Production Operations contract into the repo's CLAUDE.md with four tiers:

✅ Free to do: anything read-only — logs, queries, builds, tests, reports.
⚠️ With guardrails: merge + deploy a normal change, restart a single crashed service, reload Caddy (with a smoke check after — a bad Caddy reload can take down every site at once).
🛑 Park, don't do: anything irreversible or outward-facing (money-moving Stripe calls, destructive SQL, emails or social posts to real users, terraform apply). These get appended to an approval queue I clear on my own time. The loop never pings me about them.
⛔ Never: leak a secret, force-push main, bind an internal service to the public internet, disable auth.
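The contract itself is prose in CLAUDE.md, but the four tiers map naturally onto a lookup table. The sketch below is illustrative only: the action names, the JSON-lines queue format, and the choice to park unknown actions are assumptions rather than the real setup.

```python
import json
from enum import Enum
from pathlib import Path


class Tier(Enum):
    FREE = "free"
    GUARDED = "guarded"
    PARK = "park"
    NEVER = "never"


# Hypothetical action names, one or more per tier of the contract.
POLICY = {
    "read_logs": Tier.FREE,
    "run_tests": Tier.FREE,
    "merge_and_deploy": Tier.GUARDED,
    "restart_service": Tier.GUARDED,
    "reload_caddy": Tier.GUARDED,
    "stripe_money_move": Tier.PARK,
    "destructive_sql": Tier.PARK,
    "email_users": Tier.PARK,
    "terraform_apply": Tier.PARK,
    "force_push_main": Tier.NEVER,
    "disable_auth": Tier.NEVER,
}


def decide(action: str, detail: str, queue_file: Path) -> str:
    """Return 'run', 'run_with_checks', 'parked', or 'refused'."""
    tier = POLICY.get(action, Tier.PARK)  # unknown actions fail safe: park them
    if tier is Tier.FREE:
        return "run"
    if tier is Tier.GUARDED:
        return "run_with_checks"
    if tier is Tier.NEVER:
        return "refused"
    # Parked work is appended to a queue for later review; nothing is sent out.
    with queue_file.open("a") as fh:
        fh.write(json.dumps({"action": action, "detail": detail}) + "\n")
    return "parked"
```

Defaulting unlisted actions to the park tier means a new kind of operation waits for review instead of slipping through as "free".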

Two deliberate decisions worth calling out: local tests gate merges, not remote CI (I trust a green go test on the box more than waiting on a CI runner), and database migrations are always allowed, but only because every deploy takes a backup first.

The safety nets behind an unattended deploy

This is what I'd actually lose sleep over, so it got the most engineering. Every autonomous deploy goes through one path that:

Backs up the database.
Runs migrations.
Snapshots the current binary.
Swaps in the new one and restarts.
Hits /health and checks for a real, non-empty 200.
If that fails: rolls back to the previous binary automatically and trips a circuit breaker — a halt flag that blocks all further deploys until I clear it.
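The steps above can be sketched as a single function. This is an outline under assumptions, not the actual deploy script: each step is passed in as a callable so the control flow stands alone, and the halt flag is assumed to be a plain file whose existence means "stop".

```python
from pathlib import Path
from typing import Callable


def deploy(
    backup_db: Callable[[], None],
    migrate: Callable[[], None],
    snapshot_binary: Callable[[], None],
    swap_and_restart: Callable[[], None],
    health_check: Callable[[], tuple[int, bytes]],
    rollback: Callable[[], None],
    halt_flag: Path,
) -> str:
    """Run one guarded deploy; return 'deployed', 'rolled_back', or 'halted'."""
    if halt_flag.exists():
        return "halted"  # breaker already tripped; refuse until it is cleared

    # An exception in any of these steps propagates before the binary is swapped.
    backup_db()
    migrate()
    snapshot_binary()
    swap_and_restart()

    try:
        status, body = health_check()
        healthy = status == 200 and len(body.strip()) > 0
    except Exception:
        healthy = False  # an unreachable service counts as unhealthy

    if healthy:
        return "deployed"

    try:
        rollback()
    finally:
        # Trip the breaker even if the rollback itself raises.
        halt_flag.write_text("health check failed after deploy\n")
    return "rolled_back"
```

Checking for a non-empty body as well as the status code guards against a proxy or a half-started process answering 200 with nothing behind it.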

A service only auto-deploys if it's listed in a manifest I've verified. Anything not on the list is refused and parked, so the first unattended deploy of an important service is never the thing that discovers a missing migration path. Fail safe, not fail silent.
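One way to express that gate, as a sketch: the manifest is assumed to be a JSON file with a "verified_services" list, and refusals are appended to a JSON-lines file. Both formats and all names are hypothetical.

```python
import json
from pathlib import Path


def gate_deploy(service: str, manifest_file: Path, parked_file: Path) -> bool:
    """Return True only if `service` is on the hand-verified manifest."""
    try:
        manifest = json.loads(manifest_file.read_text())
        verified = set(manifest["verified_services"])
    except (OSError, ValueError, KeyError, TypeError):
        verified = set()  # a missing or malformed manifest verifies nothing

    if service in verified:
        return True

    # Refuse and leave a record, so the skipped deploy is visible later.
    with parked_file.open("a") as fh:
        record = {"service": service, "reason": "not in verified manifest"}
        fh.write(json.dumps(record) + "\n")
    return False
```

Treating an unreadable manifest as an empty one keeps the failure mode on the refusing side: a corrupted file blocks deploys instead of allowing all of them.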

What it actually fixed while I wasn't watching

The good part. In a handful of 15-minute cycles, the loops found and fixed real bugs — not busywork:

A cross-tenant data leak (IDOR): one service let you attach another company's financial dataset to your own report because it validated the client but not the dataset. Now it checks ownership; three regression tests added.
A completely broken login: a frontend read data.token while the backend returned access_token, so every sign-in silently stored nothing. This exact field-name drift has now bitten me on three separate projects — which is why it's becoming an automated lint check.
A memory-exhaustion vector: a service decoded request bodies with no size limit. Now capped at 1 MB.
482 MB of compiled binaries that had been accidentally committed to the repo root, quietly bloating every clone. Untracked and gitignored.

Each one came with a build, a test run, and a commit message explaining the why — not just the what.
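The login bug is a good candidate for the "past bug becomes a grep check" treatment. A minimal sketch of such a check, assuming the frontend lives in .js/.ts/.tsx files. The rule format and the function name are illustrative, not the real linter.

```python
import re
from pathlib import Path

# Each lesson pairs a regex for a known-bad pattern with the reason it is bad.
# This single rule mirrors the data.token / access_token drift described above.
LESSONS = [
    (re.compile(r"\bdata\.token\b"), "backend returns access_token, not token"),
]


def lint_lessons(root: Path, suffixes=(".js", ".ts", ".tsx")) -> list[str]:
    """Return one 'file:line: reason' finding per match under `root`."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="replace").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for pattern, reason in LESSONS:
                if pattern.search(line):
                    findings.append(f"{path.relative_to(root)}:{lineno}: {reason}")
    return findings
```

A non-empty result can fail the maintenance step, which turns a bug that was fixed once into a check that runs every cycle.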

The honest part: what I don't let it do

Because this blog is supposed to be honest, not a demo reel:

It won't fabricate social proof. When the landing-page audit recommended adding testimonials, it added an honest "how it works" section instead of inventing customer quotes for a product with no customers yet.
It won't purge git history, even though that 482 MB is still sitting in old commits — that needs a force-push, which is on the "never" list. Stopping the bleeding is in scope; rewriting shared history is not.
The flagship services are still parked until I hand-verify their migration paths. The loop would rather refuse than guess.
The crons are session-only — they live and die with the remote-control session. That's a deliberate limit on blast radius, not an oversight.

The numbers

2 scheduled tasks doing the work of ~93.
18 new maintenance loops added in one session.
15-minute cadence on the frequent chain, 24-hour on the periodic.
~15 commits shipped autonomously in an evening, including real security fixes.
$15/mo of server.
1 phone, 0 laptops required.

Would I recommend it?

For a solo founder running a pile of small services: yes, with the guardrails. The leverage isn't "the AI writes code" — it's that boring, never-prioritized maintenance actually gets done, continuously, while I work on something else (or nothing at all). The trick was never the autonomy. It was deciding, in writing and in advance, exactly what it's allowed to do unsupervised — and building the rollback and circuit-breaker so that the worst case is a halted deploy, not a 3 a.m. outage I find out about from a beach.

Now if you'll excuse me, I have a flight to catch.