Every agent framework wants you to learn a new language, install a runtime, and buy into an ecosystem. I wanted to orchestrate AI coding agents. So I wrote 1500 lines of bash and called it sage.
## The Problem
I was building with AI coding agents — Claude Code, Cline — and kept running into the same friction. I'd have one agent building a frontend, another on the backend, and I'd be the human bottleneck switching between terminals, copy-pasting results, losing track of what was where.
I looked at what existed. Every "agent framework" I found wanted me to:
- Install Python + pip + virtualenv + a dozen packages
- Learn a custom DSL for defining agent workflows
- Run a daemon or server process
- Store state in a database
- Buy into an opinionated architecture I'd fight in two weeks
I didn't need any of that. I needed to dispatch tasks to CLI agents and watch them work. That's it.
## The Insight
Unix already solved process orchestration fifty years ago. Processes communicate through files. You manage them with signals. You observe them through terminals.
So what if an "agent framework" was just... Unix primitives?
- Agents = processes running in tmux windows
- Messages = JSON files dropped in inbox directories
- Tasks = tracked with IDs, status files, and result files
- Orchestration = agents creating other agents
- Observability = `tmux capture-pane`
No daemon. No database. No Docker. No npm. Everything is a file you can cat.
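The "messages are JSON files" bullet is concrete enough to sketch. Sending a message by hand could look like this; the schema fields (`id`, `task`, `ts`) are illustrative, not necessarily sage's actual format:

```shell
# A minimal sketch of the file-based message bus. Field names are
# my own illustration, not sage's real schema.
SAGE_DIR="${SAGE_DIR:-$HOME/.sage}"
inbox="$SAGE_DIR/agents/worker/inbox"
mkdir -p "$inbox"

# Build the message with jq so the JSON is always well-formed.
msg=$(jq -n --arg task "Build a REST API" \
            --arg ts "$(date -u +%FT%TZ)" \
            '{id: (now | floor | tostring), task: $task, ts: $ts}')

# Write to a hidden temp file first, then mv into place: rename(2)
# is atomic, so the runner never sees a half-written message.
tmp=$(mktemp "$inbox/.msg.XXXXXX")
printf '%s\n' "$msg" > "$tmp"
mv "$tmp" "$inbox/msg-$(date +%s).json"
```

The temp-file-then-rename dance matters: a poller reading the directory mid-write would otherwise see truncated JSON.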
## What sage Looks Like
Create an agent. Give it work. Watch it go:
```shell
sage create worker --runtime claude-code
sage send worker "Build a REST API with auth, tests, and docs"
sage peek worker    # watch it work in real-time
```
That's three commands. Your agent is running in a tmux pane, writing code, calling tools. `sage peek` shows you exactly what's happening, with live tool calls streaming as they happen:
```
⚡ peek: worker

Runner log:
[22:15:28] worker: invoking claude-code...
I'll create a professional REST API for you...
→ ToolSearch
→ TodoWrite
→ Write
→ TodoWrite
→ Write

Workspace: 4 file(s)
22:17  19889  routes.py
22:16  23212  app.py
```
Need multiple agents? Orchestrate them:
```shell
sage create orch --runtime claude-code
sage send orch "Build a full-stack app. Delegate to sub-agents."
sage status
# orch           claude-code  running  45s
# └─ frontend    claude-code  running  30s
# └─ backend     claude-code  running  28s
```
Agent going the wrong direction? Course-correct without starting over:
```shell
# Soft steer — guidance for the next task
sage steer orch "Use PostgreSQL instead of SQLite"

# Hard steer — nuclear option: stop everything, restart with context
sage steer orch "Switch to Go" --restart
```
The `--restart` cascades: it stops all child agents, stops the orchestrator, writes the steering context, re-queues the in-flight task, and restarts. The orchestrator re-creates sub-agents as needed. One command.
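The cascade itself can be pictured as a handful of file operations. This is a sketch with invented helper and file names (`steer.md` aside, which appears in the layout below); the real engine also has to deal with tmux windows and child processes:

```shell
# Hypothetical sketch of a hard steer as pure file operations.
# current-task.json and restart.flag are names I made up for
# illustration; only steer.md matches sage's documented layout.
hard_steer() {
  local dir="$1" guidance="$2"

  # 1. Record the steering context; it gets injected into the next prompt.
  printf '%s\n' "$guidance" >> "$dir/steer.md"

  # 2. Re-queue the in-flight task (if any) by moving it back to the inbox.
  if [ -f "$dir/current-task.json" ]; then
    mv "$dir/current-task.json" "$dir/inbox/requeued-$(date +%s).json"
  fi

  # 3. Mark the agent for restart; the runner notices on its next poll.
  touch "$dir/restart.flag"
}
```

Because every step is a file write, you can audit what a steer did after the fact with nothing but `ls` and `cat`.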
## The Architecture (It's Embarrassingly Simple)

```
~/.sage/
├── agents/worker/
│   ├── inbox/       # drop a JSON file = send a message
│   ├── workspace/   # agent writes files here
│   ├── results/     # task status + output (mechanical)
│   └── steer.md     # steering context (injected into prompts)
├── runtimes/        # bash, cline, claude-code
├── trace.jsonl      # append-only event log
└── runner.sh        # 300ms polling loop, that's the whole engine
```
The runner is a bash loop. Every 300ms, it checks the inbox. If there's a message, it sources the runtime (claude-code, cline, or bash) and calls `runtime_inject()`. That's the entire architecture.
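The loop that paragraph describes can be sketched in a few lines. This is a simplification, assuming the `runtime_inject()` hook named above; the real runner also writes status files and trace events:

```shell
# Simplified sketch of the runner's core loop. Assumes the sourced
# runtime file defines runtime_inject(); error handling, status
# files, and tracing are omitted.
poll_once() {
  local inbox="$1"
  for msg in "$inbox"/*.json; do
    [ -e "$msg" ] || continue   # glob matched nothing
    runtime_inject "$msg"       # hand the task to the runtime
    rm -f "$msg"                # consume the message
  done
}

runner_loop() {
  . "$1"          # source the runtime file: defines runtime_inject()
  while :; do
    poll_once "$2"
    sleep 0.3     # the 300ms heartbeat; this really is the engine
  done
}
```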
Adding a new runtime — say, Gemini CLI or Aider — is one file with two functions: `runtime_start()` and `runtime_inject()`. You can read the whole thing in an afternoon.
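A hypothetical runtime file, under that two-function contract. `my-agent-cli`, its `--print` flag, and the `SAGE_WORKSPACE` variable are all stand-ins I invented for illustration; the real contract's exact arguments are sage's to define:

```shell
# Sketch of a runtime file for an imaginary CLI agent. All names
# here (my-agent-cli, --print, SAGE_WORKSPACE) are illustrative.

runtime_start() {
  # Called once when the agent is created: set up whatever the CLI needs.
  command -v my-agent-cli >/dev/null || echo "warning: my-agent-cli not found"
}

runtime_inject() {
  # Called per task: pull the task text out of the message file and
  # pipe it to the CLI, capturing output into the workspace.
  local msg="$1"
  jq -r '.task' "$msg" | my-agent-cli --print > "$SAGE_WORKSPACE/last-output.txt"
}
```

The appeal of the pattern: the engine never needs to know what the CLI is, only that these two hooks exist after sourcing the file.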
## Mechanical > Behavioral
This is the design principle I keep coming back to. Every agent framework I've seen makes the same mistake: they ask the LLM to manage state. "Remember to report your results." "Track your progress." "Communicate with the orchestrator when done."
LLMs forget. They hallucinate protocols. They skip steps. You can't build reliable systems on behavioral contracts with a stochastic model.
In sage, everything that matters is mechanical:
- Task status? The runner writes `queued → running → done`. Not the LLM.
- Parent-child tracking? The engine records it via the `SAGE_AGENT_NAME` env var. Not the agent.
- Results? The runner captures output and writes result files. Not a prompt instruction.
- Tracing? Append-only JSONL written by runner code. Not agent self-reporting.
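As a sketch of what "mechanical" means in practice, here is how a runner might write status transitions itself. The file names and trace layout are illustrative, not sage's exact internals:

```shell
# Illustrative mechanical status tracking: the runner, never the LLM,
# records every transition. Paths and field names are my invention.
set_status() {
  local dir="$1" task_id="$2" status="$3"
  printf '%s\n' "$status" > "$dir/results/$task_id.status"
  # Append to the trace so every transition is queryable later.
  jq -n --arg id "$task_id" --arg s "$status" \
        --arg ts "$(date -u +%FT%TZ)" \
        '{ts: $ts, task: $id, status: $s}' >> "$dir/../../trace.jsonl"
}

# The lifecycle the runner would drive around each task:
#   set_status "$dir" "$id" queued    # message lands in the inbox
#   set_status "$dir" "$id" running   # runtime_inject begins
#   set_status "$dir" "$id" done      # output captured to results/
```

No prompt ever says "remember to update your status"; the transition happens because code ran, not because a model complied.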
The LLM does what it's good at: reasoning and writing code. The engine handles everything else.
## Live Streaming (The Hardest Part)
The most annoying engineering problem was getting live output. `claude -p` (print mode) suppresses all terminal output during execution. It doesn't matter whether stdout is a TTY, a pipe, or wrapped in `script(1)`: print mode collects everything and prints it once at the end.
I tried everything: `tee`, `tail -f`, `script --flush`, process substitution, `tmux pipe-pane`. All dead ends, because the CLI simply doesn't write to stdout during tool use.
The solution: `--output-format stream-json --verbose`. Claude emits JSONL events in real time, one per tool call and one per text response. A `while read` loop parses each event and prints it to the terminal:
```shell
cat "$prompt" | claude -p --output-format stream-json --verbose \
    --allowedTools "Bash(*)" "Write(*)" "Read(*)" "Edit(*)" \
  | while IFS= read -r line; do
      # parse JSON event, print tool calls + text
    done
```
Same approach for Cline with `--json`. Each CLI has its own event format, but the pattern is identical: structured events piped through a parser to the terminal.
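The "parse JSON event" step can be sketched with jq. The event shape below (assistant messages whose content blocks are `text` or `tool_use`) matches my understanding of Claude's stream-json output, but treat it as an assumption that may drift between CLI versions:

```shell
# Sketch of a per-line stream-json parser. The event shape is an
# assumption based on observed output, not a stable contract.
parse_event() {
  local line="$1"
  case "$(printf '%s' "$line" | jq -r '.type // empty')" in
    assistant)
      # Print text blocks verbatim and tool calls as "→ ToolName".
      printf '%s' "$line" | jq -r '
        .message.content[]? |
        if .type == "text" then .text
        elif .type == "tool_use" then "→ " + .name
        else empty end'
      ;;
    result)
      # Final event carries the collected result text.
      printf '%s' "$line" | jq -r '.result // empty'
      ;;
  esac
}
```

Dropped into the `while read` loop above as `parse_event "$line"`, this is what turns a silent two-minute run into a live feed of tool calls.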
## What I Learned
**Constraints are features.** "Only bash, jq, and tmux" sounds limiting. It's actually liberating: no dependency hell, no version conflicts, no build step. The entire tool is one file you can read, understand, and modify.
**Files are the universal API.** Every tool can read and write files. JSON in a directory is the lowest-common-denominator message bus. It survives restarts, is inspectable with `cat`, debuggable with `ls -la`, and needs zero infrastructure.
**Observability isn't optional.** The first version had no peek, no trace, no streaming. It was like driving blind: you'd submit a task and wait. Maybe it works in 30 seconds. Maybe it's stuck for 5 minutes. Adding live output changed the entire experience, from "fire and pray" to "fire and watch."
**The LLM is the worst part of the system.** Not because it's bad; it's incredible at code generation. But it's unreliable at following protocols. Every time I tried to make agents "remember" to do something (report results, clean up, communicate), they'd forget. Making those things mechanical instead of behavioral was the single biggest reliability improvement.
## Try It
```shell
# Install (pick one)
brew tap youwangd/sage && brew install sage
npm install -g @youwangd/sage
curl -fsSL https://raw.githubusercontent.com/youwangd/SageCLI/main/install.sh | bash

# Use it
sage init
sage create worker --runtime claude-code
sage send worker "Build something cool"
sage peek worker
```
The entire thing is open source: github.com/youwangd/SageCLI
~1500 lines of bash. MIT license. Read it in an afternoon. Modify it by evening.
Because the best agent framework is the one you can actually understand.