Add falsework docs + start-dashboard.sh (API-aware launcher)

- FALSEWORK.md: Full audit of API costs per component, migration plan for
  shifting load from Claude to local models incrementally
- start-dashboard.sh: Launches tmux layout with only zero-cost panes active
  (status, loopstat). Loop and chat panes held until manual start.
- tower-timmy.sh: No changes (already source-controlled)

Falsework principle: build on cheap/local scaffolding, upgrade to Claude
only where quality demands it.
158
FALSEWORK.md
Normal file
@@ -0,0 +1,158 @@
# Falsework Principle: API Cost Management
# Created: 2026-03-18
# Purpose: Document what runs on Claude (expensive), what runs local (free),
# and how to incrementally shift load from cloud to local.

## The Metaphor

Falsework = temporary scaffolding that holds the structure while it cures.
When the permanent structure (local models) can bear the load, remove the
scaffolding (cloud API calls). Don't wait for perfection: use what works
NOW, upgrade incrementally.

---

## Current State (2026-03-18)

### ZERO COST (running now)
| Component           | What it does                    | API calls |
|---------------------|---------------------------------|-----------|
| timmy-status.sh     | Gitea + git dashboard (bash)    | 0         |
| timmy-loopstat.sh   | Queue/perf stats from logs      | 0         |
| timmy-strategy.sh   | Strategic view panel            | 0         |
| timmy-watchdog.sh   | Restarts dead tmux panes        | 0         |
| tower-watchdog.sh   | Restarts dead tower panes       | 0         |
| hermes-startup.sh   | Boot orchestrator               | 0         |
| start-dashboard.sh  | tmux layout creator             | 0         |
| tower-timmy.sh      | Timmy's tower side              | 0 (local) |

### MODERATE COST (running now)
| Component           | What it does                    | API calls |
|---------------------|---------------------------------|-----------|
| tower-hermes.sh     | Hermes side of tower chat       | 1 Claude call per turn, gated by Timmy's local response time (~1 call per 30-60 sec) |

### HEAVY COST (NOT running, held)
| Component           | What it does                    | API calls |
|---------------------|---------------------------------|-----------|
| timmy-loop.sh       | Continuous triage + delegation (with timmy-loop-prompt.md) | 1 Claude Opus call per cycle; runs continuously; BIGGEST COST CENTER |
| kimi-loop.sh        | Per-issue coding agent          | 1 Claude Code run per issue; bursty, not continuous |
| hermes (pane 4)     | Interactive Hermes chat         | Per interaction |

---

## Falsework Migration Plan

### Phase 1: DONE - Separate and hold (today)
- Split the tmux layout so API-heavy panes don't auto-start
- Tower-hermes is the only active Claude consumer
- All monitoring is pure bash, zero API cost

### Phase 2: Tower Hermes → Local (next)
Tower conversation is LOW STAKES. It's two AIs chatting. This does NOT
need Claude Opus.

FALSEWORK APPROACH:
- Create ~/.hermes-tower/ config with local-only backend
- tower-hermes.sh: change `hermes chat` to `HERMES_HOME=~/.hermes-tower hermes chat`
- Backend: hermes3:latest or qwen3:30b via Ollama
- Result: tower becomes ZERO API COST
- Quality: responses will be weaker than Claude's, but that's fine for conversation
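
The switch described above can be sketched as a small wrapper that keeps both profiles selectable, so the Claude path stays one variable away. This is a minimal sketch, not the actual tower-hermes.sh: `TOWER_BACKEND`, `hermes_home_for`, and `launch_tower_hermes` are hypothetical names, and treating `~/.hermes` as the existing Claude-backed config directory is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: pick the Hermes config directory for the tower pane.
# TOWER_BACKEND, hermes_home_for and launch_tower_hermes are hypothetical names.

hermes_home_for() {
  # $1 = "local" or "claude"; prints the config directory to use.
  case "$1" in
    local)  printf '%s\n' "$HOME/.hermes-tower" ;;  # local-only backend (Phase 2)
    claude) printf '%s\n' "$HOME/.hermes" ;;        # assumed existing Claude-backed config
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

launch_tower_hermes() {
  local home
  home="$(hermes_home_for "${TOWER_BACKEND:-local}")" || return 1
  # Same command the plan names, with HERMES_HOME set for this one invocation.
  HERMES_HOME="$home" hermes chat
}

# launch_tower_hermes   # call this from tower-hermes.sh
```

With this shape, `TOWER_BACKEND=claude` would restore the cloud path without editing any config.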

### Phase 3: Loop Triage → Hybrid (requires work)
The loop prompt (timmy-loop-prompt.md) runs through phases 0-6. NOT all need Opus:

WHAT CAN GO LOCAL (phase by phase):
- Phase 0 (check stop file): already bash
- Phase 1 (fix broken PRs): needs code reasoning → KEEP CLAUDE
- Phase 2 (fast triage): read issues, score them → LOCAL POSSIBLE.
  A local model can read JSON and assign priorities
- Phase 3 (execute top): depends on task type
- Phase 4 (retro): summarize what happened → LOCAL POSSIBLE
- Phase 5/6 (deep triage/cleanup): periodic → LOCAL POSSIBLE

FALSEWORK APPROACH:
- Split the loop into "triage" (local) and "execute" (Claude)
- Local model handles: reading issues, scoring, assigning labels
- Claude handles: actual code review, complex delegation decisions
- Gate: only call Claude when there's real work, not every cycle
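
The gate in the last bullet could look something like the following. It is a sketch under assumptions: the two counts are hypothetical outputs of the local triage step, and `needs_claude` and `run_cycle` are illustrative names rather than functions in timmy-loop.sh.

```bash
#!/usr/bin/env bash
# Sketch only: decide whether this cycle needs a Claude call.
# Both inputs are hypothetical counts produced by the local triage step.

needs_claude() {
  local broken_prs="$1"   # PRs that need code reasoning (loop Phase 1)
  local exec_tasks="$2"   # triaged issues ready to execute (loop Phase 3)
  if (( broken_prs > 0 || exec_tasks > 0 )); then
    return 0   # real work: escalate to Claude
  fi
  return 1     # idle cycle: stay local, no API call
}

run_cycle() {
  # Prints which path the cycle would take; a real loop would call the model here.
  if needs_claude "$1" "$2"; then
    echo "claude"
  else
    echo "local-only"
  fi
}
```

An idle queue (`run_cycle 0 0`) stays on the local path, so continuous polling no longer implies continuous Opus calls.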

### Phase 4: Kimi → Local Coding Agent (requires model work)
kimi-loop.sh currently runs `kimi`, which is Claude Code ($2/issue budget).

FALSEWORK OPTIONS:
- (a) Use qwen3:30b as coding agent (has tool use, just slower)
- (b) Use Kimi API (Moonshot): cheaper than Claude, decent at code
- (c) Keep Claude Code but increase poll interval to reduce frequency
- (d) Only assign Kimi issues that are scoped/small (1-3 files)

RECOMMENDED: Option (c) for now (same agent, run less often). Then migrate
to (a) as local model quality improves.

### Phase 5: Smart Routing (permanent structure)
Once local models handle triage reliably:
- Enable smart_model_routing in hermes config
- Simple turns → hermes3:latest (local, free)
- Complex turns → Claude Opus (cloud, paid)
- Tower → always local
- Loop triage → local, execution → Claude
- PR review → always Claude (stakes too high)
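
The routing rules above amount to a small lookup table. The sketch below only illustrates that table; `route_model` and the task names are hypothetical, and it does not show how `smart_model_routing` itself behaves inside Hermes.

```bash
#!/usr/bin/env bash
# Sketch only: static routing table for the Phase 5 rules.
# route_model and the task names are hypothetical, not Hermes options.

route_model() {
  case "$1" in
    tower|triage|simple)        echo "local"  ;;  # e.g. hermes3:latest via Ollama
    execute|pr-review|complex)  echo "claude" ;;  # Claude Opus
    *)                          echo "claude" ;;  # unknown work: fail toward quality
  esac
}
```

Defaulting unknown task types to Claude is a design choice in this sketch: it costs more, but it never silently lowers a quality gate.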

---

## Cost Estimation (rough)

| Scenario              | Claude calls/hour | Opus cost/hour* |
|-----------------------|-------------------|-----------------|
| Everything on Claude  | ~120              | ~$12-24         |
| Current (tower only)  | ~60               | ~$6-12          |
| Phase 2 (tower local) | ~0                | ~$0             |
| Phase 3 (loop hybrid) | ~10-20            | ~$1-4           |
| Phase 5 (smart route) | ~5-10             | ~$0.50-2        |

*Very rough. Depends on prompt size, response length, Opus pricing.
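
The table's arithmetic is calls per hour times cost per call. The rows imply roughly $0.10 to $0.20 per call (120 calls for $12-24); that range is inferred from the table, not a published price, and `hourly_cost` is a hypothetical helper for re-running the estimate with other numbers.

```bash
#!/usr/bin/env bash
# Sketch only: hourly cost = calls/hour x cost per call.
# The 0.10-0.20 USD per-call defaults are inferred from the table above.

hourly_cost() {
  local calls="$1" low="${2:-0.10}" high="${3:-0.20}"
  awk -v c="$calls" -v lo="$low" -v hi="$high" \
    'BEGIN { printf "$%.2f-$%.2f\n", c * lo, c * hi }'
}

hourly_cost 120   # everything on Claude
hourly_cost 60    # current (tower only)
hourly_cost 10    # low end of Phase 3
```

Passing a different per-call range as the second and third arguments shows how sensitive the estimate is to prompt size and pricing.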

---

## Rules for Falsework

1. NEVER sacrifice quality gates for cost. If a local model can't do PR
   review reliably, keep it on Claude.
2. Start with the LOWEST STAKES component. Tower chat → loop triage →
   PR review. Never the reverse.
3. Test locally BEFORE removing the scaffolding. Run both paths, compare
   results, then switch.
4. Keep the Claude path AVAILABLE. Don't delete configs; comment them
   out. If local breaks, flip back in 30 seconds.
5. Monitor degradation. If local triage starts miscategorizing issues,
   that's the signal to keep Claude for that phase.
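
Rules 3 and 5 both come down to comparing what the two paths produce. One way to put a number on that is an agreement rate between Claude's labels and the local model's labels. This is a sketch: the one-pair-per-line `issue_id label` file format and the `agreement_pct` helper are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: compare local vs Claude triage output before switching.
# Assumes each file holds one "issue_id label" pair per line (hypothetical format).

agreement_pct() {
  local claude_file="$1" local_file="$2"
  awk '
    NR == FNR { want[$1] = $2; next }     # first file: Claude labels by issue id
    ($1 in want) { total++; if (want[$1] == $2) same++ }
    END {
      if (total == 0) print 0             # no shared issues: report 0
      else printf "%d\n", (same * 100) / total
    }
  ' "$claude_file" "$local_file"
}
```

What counts as good enough is a judgment call; the point is to record the number while both paths still run, then watch it after the switch.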

---

## Quick Reference: How to Start Each Component

```bash
# Zero cost: start freely
~/.hermes/bin/start-dashboard.sh    # tmux layout + status panels
~/.hermes/bin/tower-timmy.sh        # Timmy side (local)
~/.hermes/bin/timmy-watchdog.sh     # cron: */8 * * * *
~/.hermes/bin/tower-watchdog.sh     # cron: */5 * * * *

# Moderate cost: start with awareness
~/.hermes/bin/tower-hermes.sh       # ~1 Claude call per Timmy response

# Heavy cost: start deliberately
~/.hermes/bin/timmy-loop.sh         # Continuous Claude Opus calls
~/.hermes/bin/kimi-loop.sh          # Claude Code per issue
hermes                              # Interactive Hermes (per-interaction)

# Stop everything
touch ~/Timmy-Time-dashboard/.loop/STOP   # stops the loop
tmux kill-session -t timmy-loop           # kills dashboard
tmux kill-session -t tower                # kills tower
```
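
The `touch .../STOP` command works because the loop checks for that file at the start of a cycle. A minimal sketch of such a check, assuming the path from the quick reference; `should_stop` and `LOOP_DIR` are hypothetical names, not taken from timmy-loop.sh.

```bash
#!/usr/bin/env bash
# Sketch only: how a loop can honor the STOP file.
# LOOP_DIR defaults to the path used in the quick reference; should_stop is hypothetical.

LOOP_DIR="${LOOP_DIR:-$HOME/Timmy-Time-dashboard/.loop}"

should_stop() {
  # Succeeds (exit 0) when the STOP file exists.
  [ -e "$LOOP_DIR/STOP" ]
}

# Typical use at the top of each cycle (run_one_cycle is a placeholder):
# while ! should_stop; do run_one_cycle; sleep 60; done
```

Because the check is a plain file test, stopping the loop costs no API calls, and removing the file re-arms it.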