The Playbook I Use to Run Everything on Agents

I'm Guima. I run my company with a team of AI agents, not people. Marketing, product development, ops — all of it runs on agents, led by Neo, my CEO agent. Here's exactly how I set that up, so you can do the same.

This isn't theory. I built it inside Claude Code, the same tool you already have. No special stack. No "AI ops platform." Just a method. I'll give you the whole thing.

The one idea that changes everything

Most people use AI like a vending machine. You walk up, ask for a task, get an answer, walk away. Write this email. Summarize that doc. Fix this bug. One-off. Then you do it again tomorrow.

That's macros. You're still the operator.

The shift is this:

You don't automate tasks. You give every function of your operation its own agent.

Not "summarize today's email." Instead: a triage agent that owns my company inbox — every day, without me asking.

Not "write me some code." Instead: Neo, a CEO agent that owns the build — it pulls in the specialists it needs, checks their work, and ships. (Neo built the site you're reading this on.)

A fleet, not a vending machine. Each agent owns a job. You stop doing the work and start running the team that does — except the team is agents, and they don't sleep.

Once you see your work as a set of functions instead of a pile of tasks, the whole thing clicks. Let me show you how to build it, one function at a time.

The 7 moves

1. Start with one painful function — not your whole life

Don't try to "automate everything." You'll build a mess and quit.

Pick the one function that drains you most and repeats. For me, an easy first one was inbox triage — a triage agent now owns my company inbox and only surfaces what actually needs a decision from me. Everything else (newsletters, receipts, noise) it handles or files on its own.

Pick yours: inbox, status updates, expense sorting, first-draft anything. One that repeats and that you hate.

Do this today: Write one sentence — "The function I want off my plate first is ______." That's your first agent.

2. Give the agent a job, not a prompt

A prompt is "help me with email." A job is a role description: what it owns, what good looks like, where the boundaries are.

In Claude Code, this lives in a CLAUDE.md or a skill — a written spec the agent reads every time. My triage agent's job reads, roughly:

You own the company inbox.
- Sort every message: needs-me, handle-it, or junk.
- "needs-me" = a real decision only I can make → escalate it to me.
- Newsletters/receipts → file or unsubscribe on your own.
- Never send, reply, or forward anything without my approval.

See the difference? It's not a request. It's a hire. The clearer the job, the less you babysit.

Do this today: Write 5 bullet lines describing the job — what it owns, what "good" means, and one hard boundary ("never send," "never touch prod," "always ask before X").

3. Verify the output before you trust it

This is the step everyone skips, and it's why their agents feel flaky. An agent you don't verify is a liability. An agent you verify a few times becomes one you can stop watching.

My rule: I don't trust an agent that says "done." I check the real thing. Did the pull request actually open? Does the page actually load? Did the test actually pass? When an agent gets something wrong, I don't fix that one output by hand — I fix its job spec so it can't happen again.

That's the loop: run → check the real artifact → tighten the spec → run again. After a week or two you're checking once a day, then once a week, then you forget it's running.

Do this today: After your agent's first run, find one thing it got wrong and add one line to its spec so it can't repeat it.

4. Chain agents — let one hand off to the next

Single agents are useful. Chained agents are leverage.

When I needed this site rebuilt, Neo didn't do it alone — it routed. A build agent wrote the code, a review agent checked it, a panel of advisor agents pressure-tested the copy, another agent swapped a config across the whole repo and ran the tests. One goal in, a chain of specialists, Neo stitched the results together. I just reviewed and approved the result.

In Claude Code you do this with sub-agents: a lead agent that delegates pieces of work to specialists and stitches the results. Think of it as a small org chart. The lead doesn't do everything — it routes.

Do this today: Take two functions you already run separately and ask: could the output of one feed the other? Wire the handoff.

5. Let it run on a loop — heartbeat, not on-demand

The biggest jump: agents that run without you asking.

I run a heartbeat — a scheduled wake-up — that starts Neo on its own. It wakes on a schedule, picks up whatever's in flight, triages the inbox, and pings me only if something actually needs me. Most days, work has already moved before I sit down.

This is the difference between a tool and a teammate. A tool waits to be picked up. A teammate shows up and does the job. Cron / scheduled runs are how an agent becomes a teammate.

Do this today: Take your one verified agent and schedule it. Daily is fine. The first morning it runs without you is the morning this stops being a gimmick.

6. Delegate the decision, keep the judgment

The scary part is letting agents decide things. The trick: split reversible from irreversible.

Reversible decisions (which emails are junk, which approach to take, which draft is better) — let the agent decide. Worst case, you undo it.

Irreversible ones (deploying to production, spending money, publishing in public, anything legal) — the agent proposes, you approve. That's a hard rule for Neo: it ships code to a branch and opens it for my review, but it never deploys or spends a cent without my yes. One human gate, on the stuff that actually matters.

I decide far less than I used to. Not because I gave up control — because I moved my attention to the 5% of decisions that are actually mine.

Do this today: List your agent's decisions in two columns — "let it decide" vs "it proposes, I approve." Put the gate only where undo is impossible.

7. Write down what you learn — your fleet compounds

Every agent gets sharper when you feed its mistakes back into its spec. But the real compounding happens when you keep a shared memory across agents: the preferences, the formats, the "never do this again" rules, the decisions you've locked.

Neo keeps running state every agent reads — decisions, a positioning doc, a learnings file, my voice rules. When the council and I locked our positioning, Neo wrote it into one doc the whole fleet now works from. New agents inherit it on day one. The fleet gets smarter as a system, not one agent at a time.

Do this today: Start one notes file. Every time you correct an agent, add the lesson there. In a month it's the brain of your whole operation.

The mistakes (don't do these)

Boiling the ocean. Trying to agent-ify your whole life in week one. Build one function, get it solid, then add the next. Slow is fast here.
One-shot thinking. Asking for a task instead of defining a job. If you're re-typing the same context every time, you haven't built an agent — you've built a habit.
Skipping verification. Trusting output you never checked. One bad unverified run and you'll rip the whole thing out. Verify early so you can stop verifying later.
No boundaries. Not telling the agent what it can never do. Every agent needs at least one hard "never." That one line is what lets you sleep.
Keeping the judgment in your head. Fixing the same mistake by hand five times instead of writing it into the spec once. If you correct it, capture it.
Confusing busy with run. An agent that needs you to press go every time isn't running your operation. The goal is the loop — it runs whether you show up or not.

Where to go from here

That's the method. It's real, it's working, and it costs you nothing to start — you can build your first agent today with the tool you already have open.

Two ways I can help you go faster:

Just getting started with Claude Code? Take the Claude Code Course at guima.ai/skills. It walks you from zero to your first working agent, hands-on, in an afternoon. Modules 1–3 are free.

Already building and hit a wall? Book a Live Fix with me — $200, one session, screen-to-screen. We unstick your fleet, fix the thing that's broken, and get you running again. Bring the mess.

Build one agent this week. Give it a job. Verify it. Let it run.

That's how you start running your work on agents — same as me.

— Guima