Every agent framework wants you to learn a new language, install a runtime, and buy into an ecosystem. I wanted to orchestrate AI coding agents. So I wrote 1500 lines of bash and called it sage.
## The Problem
I was building with AI coding agents — Claude Code, Cline — and kept running into the same friction. I'd have one agent building a frontend, another on the backend, and I'd be the human bottleneck switching between terminals, copy-pasting results, losing track of what was where.
I looked at what existed. Every "agent framework" I found wanted me to:
- Install Python + pip + virtualenv + a dozen packages
- Learn a custom DSL for defining agent workflows
- Run a daemon or server process
- Store state in a database
- Buy into an opinionated architecture I'd fight in two weeks
I didn't need any of that. I needed to dispatch tasks to CLI agents and watch them work. That's it.
## The Insight
Unix already solved process orchestration fifty years ago. Processes communicate through files. You manage them with signals. You observe them through terminals.
So what if an "agent framework" was just... Unix primitives?
- Agents = processes running in tmux windows
- Messages = JSON files dropped in inbox directories
- Tasks = tracked with IDs, status files, and result files
- Orchestration = agents creating other agents
- Observability = `tmux capture-pane`
No daemon. No database. No Docker. No npm. Everything is a file you can cat.
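The "messages are JSON files" bullet is concrete enough to sketch. Sending a message by hand could look like this; the schema fields (`id`, `task`, `ts`) are illustrative, not necessarily sage's actual format:

```shell
# A minimal sketch of the file-based message bus. Field names are
# my own illustration, not sage's real schema.
SAGE_DIR="${SAGE_DIR:-$HOME/.sage}"
inbox="$SAGE_DIR/agents/worker/inbox"
mkdir -p "$inbox"

# Build the message with jq so the JSON is always well-formed.
msg=$(jq -n --arg task "Build a REST API" \
            --arg ts "$(date -u +%FT%TZ)" \
            '{id: (now | floor | tostring), task: $task, ts: $ts}')

# Write to a hidden temp file first, then mv into place: rename(2)
# is atomic, so the runner never sees a half-written message.
tmp=$(mktemp "$inbox/.msg.XXXXXX")
printf '%s\n' "$msg" > "$tmp"
mv "$tmp" "$inbox/msg-$(date +%s).json"
```

The temp-file-then-rename dance matters: a poller reading the directory mid-write would otherwise see truncated JSON.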
## What sage Looks Like
Create an agent. Give it work. Watch it go:
```shell
sage create worker --runtime claude-code
sage send worker "Build a REST API with auth, tests, and docs"
sage peek worker    # watch it work in real-time
```
That's three commands. Your agent is running in a tmux pane, writing code, calling tools. `sage peek` shows you exactly what's happening, with live tool calls streaming as they happen:
```
⚡ peek: worker

Runner log:
[22:15:28] worker: invoking claude-code...
I'll create a professional REST API for you...
→ ToolSearch
→ TodoWrite
→ Write
→ TodoWrite
→ Write

Workspace: 4 file(s)
22:17  19889  routes.py
22:16  23212  app.py
```
Need multiple agents? Orchestrate them:
```shell
sage create orch --runtime claude-code
sage send orch "Build a full-stack app. Delegate to sub-agents."
sage status
# orch           claude-code  running  45s
# └─ frontend    claude-code  running  30s
# └─ backend     claude-code  running  28s
```
Agent going the wrong direction? Course-correct without starting over:
```shell
# Soft steer — guidance for the next task
sage steer orch "Use PostgreSQL instead of SQLite"

# Hard steer — nuclear option: stop everything, restart with context
sage steer orch "Switch to Go" --restart
```
The `--restart` cascades: it stops all child agents, stops the orchestrator, writes the steering context, re-queues the in-flight task, and restarts. The orchestrator re-creates sub-agents as needed. One command.
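The cascade itself can be pictured as a handful of file operations. This is a sketch with invented helper and file names (`steer.md` aside, which appears in the layout below); the real engine also has to deal with tmux windows and child processes:

```shell
# Hypothetical sketch of a hard steer as pure file operations.
# current-task.json and restart.flag are names I made up for
# illustration; only steer.md matches sage's documented layout.
hard_steer() {
  local dir="$1" guidance="$2"

  # 1. Record the steering context; it gets injected into the next prompt.
  printf '%s\n' "$guidance" >> "$dir/steer.md"

  # 2. Re-queue the in-flight task (if any) by moving it back to the inbox.
  if [ -f "$dir/current-task.json" ]; then
    mv "$dir/current-task.json" "$dir/inbox/requeued-$(date +%s).json"
  fi

  # 3. Mark the agent for restart; the runner notices on its next poll.
  touch "$dir/restart.flag"
}
```

Because every step is a file write, you can audit what a steer did after the fact with nothing but `ls` and `cat`.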
## The Architecture (It's Embarrassingly Simple)

```
~/.sage/
├── agents/worker/
│   ├── inbox/       # drop a JSON file = send a message
│   ├── workspace/   # agent writes files here
│   ├── results/     # task status + output (mechanical)
│   └── steer.md     # steering context (injected into prompts)
├── runtimes/        # bash, cline, claude-code
├── trace.jsonl      # append-only event log
└── runner.sh        # 300ms polling loop, that's the whole engine
```
The runner is a bash loop. Every 300ms, it checks the inbox. If there's a message, it sources the runtime (claude-code, cline, or bash) and calls `runtime_inject()`. That's the entire architecture.
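The loop that paragraph describes can be sketched in a few lines. This is a simplification, assuming the `runtime_inject()` hook named above; the real runner also writes status files and trace events:

```shell
# Simplified sketch of the runner's core loop. Assumes the sourced
# runtime file defines runtime_inject(); error handling, status
# files, and tracing are omitted.
poll_once() {
  local inbox="$1"
  for msg in "$inbox"/*.json; do
    [ -e "$msg" ] || continue   # glob matched nothing
    runtime_inject "$msg"       # hand the task to the runtime
    rm -f "$msg"                # consume the message
  done
}

runner_loop() {
  . "$1"          # source the runtime file: defines runtime_inject()
  while :; do
    poll_once "$2"
    sleep 0.3     # the 300ms heartbeat; this really is the engine
  done
}
```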
Adding a new runtime — say, Gemini CLI or Aider — is one file with two functions: `runtime_start()` and `runtime_inject()`. You can read the whole thing in an afternoon.
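A hypothetical runtime file, under that two-function contract. `my-agent-cli`, its `--print` flag, and the `SAGE_WORKSPACE` variable are all stand-ins I invented for illustration; the real contract's exact arguments are sage's to define:

```shell
# Sketch of a runtime file for an imaginary CLI agent. All names
# here (my-agent-cli, --print, SAGE_WORKSPACE) are illustrative.

runtime_start() {
  # Called once when the agent is created: set up whatever the CLI needs.
  command -v my-agent-cli >/dev/null || echo "warning: my-agent-cli not found"
}

runtime_inject() {
  # Called per task: pull the task text out of the message file and
  # pipe it to the CLI, capturing output into the workspace.
  local msg="$1"
  jq -r '.task' "$msg" | my-agent-cli --print > "$SAGE_WORKSPACE/last-output.txt"
}
```

The appeal of the pattern: the engine never needs to know what the CLI is, only that these two hooks exist after sourcing the file.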
## Mechanical > Behavioral
This is the design principle I keep coming back to. Every agent framework I've seen makes the same mistake: they ask the LLM to manage state. "Remember to report your results." "Track your progress." "Communicate with the orchestrator when done."
LLMs forget. They hallucinate protocols. They skip steps. You can't build reliable systems on behavioral contracts with a stochastic model.
In sage, everything that matters is mechanical:
- Task status? The runner writes `queued → running → done`. Not the LLM.
- Parent-child tracking? The engine records it via the `SAGE_AGENT_NAME` env var. Not the agent.
- Results? The runner captures output and writes result files. Not a prompt instruction.
- Tracing? Append-only JSONL written by runner code. Not agent self-reporting.
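As a sketch of what "mechanical" means in practice, here is how a runner might write status transitions itself. The file names and trace layout are illustrative, not sage's exact internals:

```shell
# Illustrative mechanical status tracking: the runner, never the LLM,
# records every transition. Paths and field names are my invention.
set_status() {
  local dir="$1" task_id="$2" status="$3"
  printf '%s\n' "$status" > "$dir/results/$task_id.status"
  # Append to the trace so every transition is queryable later.
  jq -n --arg id "$task_id" --arg s "$status" \
        --arg ts "$(date -u +%FT%TZ)" \
        '{ts: $ts, task: $id, status: $s}' >> "$dir/../../trace.jsonl"
}

# The lifecycle the runner would drive around each task:
#   set_status "$dir" "$id" queued    # message lands in the inbox
#   set_status "$dir" "$id" running   # runtime_inject begins
#   set_status "$dir" "$id" done      # output captured to results/
```

No prompt ever says "remember to update your status"; the transition happens because code ran, not because a model complied.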
The LLM does what it's good at: reasoning and writing code. The engine handles everything else.
## Live Streaming (The Hardest Part)
The most annoying engineering problem was getting live output. `claude -p` (print mode) suppresses all terminal output during execution. It doesn't matter whether stdout is a TTY, a pipe, or wrapped in `script(1)`: print mode collects everything and prints it once at the end.
I tried everything: `tee`, `tail -f`, `script --flush`, process substitution, `tmux pipe-pane`. All dead ends, because the CLI simply doesn't write to stdout during tool use.
The solution: `--output-format stream-json --verbose`. Claude emits JSONL events in real time, one per tool call and one per text response. A `while read` loop parses each event and prints it to the terminal:
```shell
cat "$prompt" | claude -p --output-format stream-json --verbose \
    --allowedTools "Bash(*)" "Write(*)" "Read(*)" "Edit(*)" \
  | while IFS= read -r line; do
      # parse JSON event, print tool calls + text
    done
```
Same approach for Cline with `--json`. Each CLI has its own event format, but the pattern is identical: structured events piped through a parser to the terminal.
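The "parse JSON event" step can be sketched with jq. The event shape below (assistant messages whose content blocks are `text` or `tool_use`) matches my understanding of Claude's stream-json output, but treat it as an assumption that may drift between CLI versions:

```shell
# Sketch of a per-line stream-json parser. The event shape is an
# assumption based on observed output, not a stable contract.
parse_event() {
  local line="$1"
  case "$(printf '%s' "$line" | jq -r '.type // empty')" in
    assistant)
      # Print text blocks verbatim and tool calls as "→ ToolName".
      printf '%s' "$line" | jq -r '
        .message.content[]? |
        if .type == "text" then .text
        elif .type == "tool_use" then "→ " + .name
        else empty end'
      ;;
    result)
      # Final event carries the collected result text.
      printf '%s' "$line" | jq -r '.result // empty'
      ;;
  esac
}
```

Dropped into the `while read` loop above as `parse_event "$line"`, this is what turns a silent two-minute run into a live feed of tool calls.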
## What I Learned
**Constraints are features.** "Only bash, jq, and tmux" sounds limiting. It's actually liberating: no dependency hell, no version conflicts, no build step. The entire tool is one file you can read, understand, and modify.
**Files are the universal API.** Every tool can read and write files. JSON in a directory is the lowest-common-denominator message bus. It survives restarts, is inspectable with `cat`, debuggable with `ls -la`, and needs zero infrastructure.
**Observability isn't optional.** The first version had no peek, no trace, no streaming. It was like driving blind: you'd submit a task and wait. Maybe it works in 30 seconds. Maybe it's stuck for 5 minutes. Adding live output changed the entire experience, from "fire and pray" to "fire and watch."
**The LLM is the worst part of the system.** Not because it's bad; it's incredible at code generation. But it's unreliable at following protocols. Every time I tried to make agents "remember" to do something (report results, clean up, communicate), they'd forget. Making those things mechanical instead of behavioral was the single biggest reliability improvement.
## Try It
```shell
# Install (pick one)
brew tap youwangd/sage && brew install sage
npm install -g @youwangd/sage
curl -fsSL https://raw.githubusercontent.com/youwangd/SageCLI/main/install.sh | bash

# Use it
sage init
sage create worker --runtime claude-code
sage send worker "Build something cool"
sage peek worker
```
The entire thing is open source: github.com/youwangd/SageCLI
~1500 lines of bash. MIT license. Read it in an afternoon. Modify it by evening.
Because the best agent framework is the one you can actually understand.