- FALSEWORK.md: Full audit of API costs per component, migration plan for shifting load from Claude to local models incrementally
- start-dashboard.sh: Launches tmux layout with only zero-cost panes active (status, loopstat). Loop and chat panes held until manual start.
- tower-timmy.sh: No changes (already source-controlled)

Falsework principle: build on cheap/local scaffolding, upgrade to Claude only where quality demands it.
159 lines
6.7 KiB
Markdown
# Falsework Principle: API Cost Management
# Created: 2026-03-18
# Purpose: Document what runs on Claude (expensive), what runs local (free),
# and how to incrementally shift load from cloud to local.

## The Metaphor

Falsework = temporary scaffolding that holds the structure while it cures.
When the permanent structure (local models) can bear the load, remove the
scaffolding (cloud API calls). Don't wait for perfection: use what works
NOW, upgrade incrementally.

---

## Current State (2026-03-18)

### ZERO COST (running now)

| Component           | What it does                    | API calls |
|---------------------|---------------------------------|-----------|
| timmy-status.sh     | Gitea + git dashboard (bash)    | 0         |
| timmy-loopstat.sh   | Queue/perf stats from logs      | 0         |
| timmy-strategy.sh   | Strategic view panel            | 0         |
| timmy-watchdog.sh   | Restarts dead tmux panes        | 0         |
| tower-watchdog.sh   | Restarts dead tower panes       | 0         |
| hermes-startup.sh   | Boot orchestrator               | 0         |
| start-dashboard.sh  | tmux layout creator             | 0         |
| tower-timmy.sh      | Timmy's tower side              | 0 (local) |

### MODERATE COST (running now)

| Component           | What it does                    | API calls                                                               |
|---------------------|---------------------------------|-------------------------------------------------------------------------|
| tower-hermes.sh     | Hermes side of tower chat       | 1 Claude/turn, gated by Timmy's local response time (~1 call/30-60 sec) |

### HEAVY COST (NOT running, held)

| Component           | What it does                                            | API calls                                                |
|---------------------|---------------------------------------------------------|----------------------------------------------------------|
| timmy-loop.sh       | Continuous triage + delegation (+ timmy-loop-prompt.md) | 1 Claude Opus/cycle, runs continuously. BIGGEST COST CENTER |
| kimi-loop.sh        | Per-issue coding agent                                  | 1 Claude Code/issue, bursty, not continuous              |
| hermes (pane 4)     | Interactive Hermes chat                                 | Per-interaction                                          |

---

## Falsework Migration Plan

### Phase 1 (DONE): Separate and hold (today)
- Split the tmux layout so API-heavy panes don't auto-start
- Tower-hermes is the only active Claude consumer
- All monitoring is pure bash, zero API cost

### Phase 2: Tower Hermes → Local (next)
Tower conversation is LOW STAKES. It's two AIs chatting. This does NOT
need Claude Opus.

FALSEWORK APPROACH:
- Create ~/.hermes-tower/ config with local-only backend
- tower-hermes.sh: change `hermes chat` to `HERMES_HOME=~/.hermes-tower hermes chat`
- Backend: hermes3:latest or qwen3:30b via Ollama
- Result: tower becomes ZERO API COST
- Quality: will be dumber but that's fine for conversation
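The switch above can be kept reversible with a small helper. This is a minimal sketch, not the contents of tower-hermes.sh: the `TOWER_BACKEND` variable and the `tower_hermes_home` function are hypothetical names, and it assumes `~/.hermes` is the existing Claude-backed config directory.

```shell
# Sketch only: pick which Hermes config directory the tower uses.
# TOWER_BACKEND and tower_hermes_home are hypothetical names for illustration.

tower_hermes_home() {
  case "${TOWER_BACKEND:-local}" in
    local)  printf '%s\n' "$HOME/.hermes-tower" ;;  # local-only backend (Ollama)
    claude) printf '%s\n' "$HOME/.hermes" ;;        # assumed existing cloud config
    *)      printf 'unknown TOWER_BACKEND: %s\n' "$TOWER_BACKEND" >&2
            return 1 ;;
  esac
}

# Intended use inside tower-hermes.sh (not executed here):
#   HERMES_HOME="$(tower_hermes_home)" hermes chat
```

Defaulting to `local` keeps the tower at zero API cost unless someone deliberately sets `TOWER_BACKEND=claude`, which is one way to keep the Claude path available without editing the script.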
### Phase 3: Loop Triage → Hybrid (requires work)
The loop prompt (timmy-loop-prompt.md) does 6 phases. NOT all need Opus:

WHAT CAN GO LOCAL (phase by phase):
- Phase 0 (check stop file): already bash
- Phase 1 (fix broken PRs): needs code reasoning → KEEP CLAUDE
- Phase 2 (fast triage): read issues, score them → LOCAL POSSIBLE
  A local model can read JSON and assign priorities
- Phase 3 (execute top): depends on task type
- Phase 4 (retro): summarize what happened → LOCAL POSSIBLE
- Phase 5/6 (deep triage/cleanup): periodic → LOCAL POSSIBLE

FALSEWORK APPROACH:
- Split the loop into "triage" (local) and "execute" (Claude)
- Local model handles: reading issues, scoring, assigning labels
- Claude handles: actual code review, complex delegation decisions
- Gate: only call Claude when there's real work, not every cycle
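The split and the gate can be sketched as two small shell functions. This is an illustration under stated assumptions, not the loop's actual code: `backend_for_phase` and `needs_claude` are hypothetical names, and the rule that Phase 3 goes to Claude only for `code` tasks is an assumed reading of "depends on task type".

```shell
# Sketch only: map each loop phase to a backend, following the list above.
# backend_for_phase and needs_claude are hypothetical helper names.

# Usage: backend_for_phase <phase-number> [task-type]
backend_for_phase() {
  case "$1" in
    0)       echo bash ;;     # stop-file check is already plain bash
    1)       echo claude ;;   # fixing broken PRs needs code reasoning
    2|4|5|6) echo local ;;    # fast triage, retro, deep triage/cleanup
    3)
      # "depends on task type": assumed split, code work goes to Claude
      if [ "${2:-}" = "code" ]; then echo claude; else echo local; fi ;;
    *)       echo "unknown phase: $1" >&2; return 1 ;;
  esac
}

# Gate: succeed (exit 0) only if at least one pending item needs Claude.
# Usage: needs_claude "<phase>[:task-type]" ...
needs_claude() {
  for item in "$@"; do
    phase=${item%%:*}
    task=
    case "$item" in *:*) task=${item#*:} ;; esac
    if [ "$(backend_for_phase "$phase" "$task")" = "claude" ]; then
      return 0
    fi
  done
  return 1
}
```

With a gate like this, a cycle where only phases 2 and 4 have pending work (`needs_claude 2 4`) returns non-zero and never touches the Claude API.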
### Phase 4: Kimi → Local Coding Agent (requires model work)
kimi-loop.sh currently runs `kimi`, which is Claude Code ($2/issue budget).

FALSEWORK OPTIONS:
- (a) Use qwen3:30b as coding agent (has tool use, just slower)
- (b) Use Kimi API (Moonshot): cheaper than Claude, decent at code
- (c) Keep Claude Code but increase poll interval to reduce frequency
- (d) Only assign Kimi issues that are scoped/small (1-3 files)

RECOMMENDED: Option (c) for now (same agent, less frequent). Then migrate
to (a) as local model quality improves.
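To see what option (c) buys, the worst-case hourly spend can be estimated from the poll interval and the $2/issue budget. The sketch below assumes at most one issue is picked up per poll and that every issue spends its full budget; both are simplifications, and the function names are hypothetical. It also includes a check for option (d).

```shell
# Sketch only: worst-case hourly spend for kimi-loop.sh at a given poll interval.
# Assumes one issue per poll and full budget use per issue (simplifications).

BUDGET_PER_ISSUE_USD=2   # from the $2/issue budget above

# Usage: max_hourly_spend_usd <poll-interval-seconds>
max_hourly_spend_usd() {
  [ "$1" -gt 0 ] || { echo "interval must be > 0" >&2; return 1; }
  polls_per_hour=$(( 3600 / $1 ))
  echo $(( polls_per_hour * BUDGET_PER_ISSUE_USD ))
}

# Option (d): accept only issues touching 1-3 files.
# Usage: is_small_issue <file-count>
is_small_issue() {
  [ "$1" -ge 1 ] && [ "$1" -le 3 ]
}
```

Under these assumptions, moving from a 60-second poll to a 10-minute poll cuts the ceiling from 60 issues per hour to 6, so the worst case drops by a factor of ten without changing the agent.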
### Phase 5: Smart Routing (permanent structure)
Once local models handle triage reliably:
- Enable smart_model_routing in hermes config
- Simple turns → hermes3:latest (local, free)
- Complex turns → Claude Opus (cloud, paid)
- Tower → always local
- Loop triage → local, execution → Claude
- PR review → always Claude (stakes too high)

---

## Cost Estimation (rough)

| Scenario              | Claude calls/hour | Opus cost/hour* |
|-----------------------|-------------------|-----------------|
| Everything on Claude  | ~120              | ~$12-24         |
| Current (tower only)  | ~60               | ~$6-12          |
| Phase 2 (tower local) | ~0                | ~$0             |
| Phase 3 (loop hybrid) | ~10-20            | ~$1-4           |
| Phase 5 (smart route) | ~5-10             | ~$0.50-2        |

*Very rough. Depends on prompt size, response length, Opus pricing.
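The table's figures are consistent with roughly $0.10 to $0.20 per Opus call (120 calls/hour gives $12-24). That per-call range is inferred from the table itself, not from published pricing, so treat it as an assumption. The sketch below redoes the arithmetic in integer cents so the rows can be recomputed when call rates change; the function names are hypothetical.

```shell
# Sketch only: reproduce the table's hourly figures from calls/hour.
# The 10-20 cents per call range is inferred from the table itself
# (120 calls/hour -> $12-24), not taken from published Opus pricing.

LOW_CENTS_PER_CALL=10
HIGH_CENTS_PER_CALL=20

# Print a cent amount as dollars with two decimals.
cents_to_usd() {
  printf '%d.%02d\n' $(( $1 / 100 )) $(( $1 % 100 ))
}

# Usage: hourly_cost_range <min-calls-per-hour> <max-calls-per-hour>
hourly_cost_range() {
  low=$(( $1 * LOW_CENTS_PER_CALL ))
  high=$(( $2 * HIGH_CENTS_PER_CALL ))
  printf '$%s - $%s per hour\n' "$(cents_to_usd "$low")" "$(cents_to_usd "$high")"
}

hourly_cost_range 60 60    # current, tower only
hourly_cost_range 10 20    # Phase 3, loop hybrid
hourly_cost_range 5 10     # Phase 5, smart routing
```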

---

## Rules for Falsework

1. NEVER sacrifice quality gates for cost. If local model can't do PR
   review reliably, keep it on Claude.
2. Start with the LOWEST STAKES component. Tower chat → loop triage →
   PR review. Never the reverse.
3. Test locally BEFORE removing the scaffolding. Run both paths, compare
   results, then switch.
4. Keep the Claude path AVAILABLE. Don't delete configs, comment them
   out. If local breaks, flip back in 30 seconds.
5. Monitor degradation. If local triage starts miscategorizing issues,
   that's the signal to keep Claude for that phase.
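Rules 3 and 5 both come down to comparing what the two paths produce. One way to do that is to log the label each path assigns per issue and compute an agreement rate. This is a sketch under stated assumptions: the one-pair-per-line file format (`<issue-id> <label>`) and the `agreement_pct` name are hypothetical, and what counts as an acceptable rate is left to the operator.

```shell
# Sketch only: measure how often local triage agrees with Claude triage.
# Each input file holds one "<issue-id> <label>" pair per line; the file
# format and the agreement_pct name are assumptions for illustration.

# Usage: agreement_pct <claude-labels-file> <local-labels-file>
agreement_pct() {
  awk '
    NR == FNR { claude[$1] = $2; next }     # first file: remember Claude labels
    $1 in claude {                          # second file: compare shared issues
      total++
      if (claude[$1] == $2) same++
    }
    END {
      if (total == 0) { print "no overlapping issues" > "/dev/stderr"; exit 1 }
      printf "%d\n", (same * 100) / total   # whole-percent agreement, truncated
    }
  ' "$1" "$2"
}
```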

---

## Quick Reference: How to Start Each Component

```bash
# Zero cost: start freely
~/.hermes/bin/start-dashboard.sh          # tmux layout + status panels
~/.hermes/bin/tower-timmy.sh              # Timmy side (local)
~/.hermes/bin/timmy-watchdog.sh           # cron: */8 * * * *
~/.hermes/bin/tower-watchdog.sh           # cron: */5 * * * *

# Moderate cost: start with awareness
~/.hermes/bin/tower-hermes.sh             # ~1 Claude call per Timmy response

# Heavy cost: start deliberately
~/.hermes/bin/timmy-loop.sh               # Continuous Claude Opus calls
~/.hermes/bin/kimi-loop.sh                # Claude Code per issue
hermes                                    # Interactive Hermes (per-interaction)

# Stop everything
touch ~/Timmy-Time-dashboard/.loop/STOP   # stops the loop
tmux kill-session -t timmy-loop           # kills dashboard
tmux kill-session -t tower                # kills tower
```
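The STOP file works because the loop checks for it between cycles. The sketch below shows one way such a check might look. It is not the actual timmy-loop.sh: `run_cycle`, `should_stop`, `run_loop`, and `LOOP_SLEEP_SECONDS` are hypothetical names, and the 60-second default is an arbitrary placeholder.

```shell
# Sketch only: how a loop can honor the STOP file between cycles.
# All names below are hypothetical; the real timmy-loop.sh may differ.

STOP_FILE="${STOP_FILE:-$HOME/Timmy-Time-dashboard/.loop/STOP}"
LOOP_SLEEP_SECONDS="${LOOP_SLEEP_SECONDS:-60}"

# Placeholder for one triage/delegation cycle.
run_cycle() {
  :
}

should_stop() {
  [ -e "$STOP_FILE" ]
}

run_loop() {
  # Check before every cycle so a STOP file halts the loop before the next API call.
  until should_stop; do
    run_cycle
    sleep "$LOOP_SLEEP_SECONDS"
  done
  echo "STOP file found at $STOP_FILE, exiting loop"
}
```

Remember to remove the STOP file before restarting, otherwise a loop written this way exits immediately.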