# Falsework Principle: API Cost Management

Created: 2026-03-18

Purpose: Document what runs on Claude (expensive), what runs locally (free),
and how to incrementally shift load from cloud to local.
## The Metaphor
Falsework = temporary scaffolding that holds the structure while it cures.
When the permanent structure (local models) can bear the load, remove the
scaffolding (cloud API calls). Don't wait for perfection — use what works
NOW, upgrade incrementally.
---
## Current State (2026-03-18)
### ZERO COST (running now)
| Component | What it does | API calls |
|---------------------|---------------------------------|-----------|
| timmy-status.sh | Gitea + git dashboard (bash) | 0 |
| timmy-loopstat.sh | Queue/perf stats from logs | 0 |
| timmy-strategy.sh | Strategic view panel | 0 |
| timmy-watchdog.sh | Restarts dead tmux panes | 0 |
| tower-watchdog.sh | Restarts dead tower panes | 0 |
| hermes-startup.sh | Boot orchestrator | 0 |
| start-dashboard.sh | tmux layout creator | 0 |
| tower-timmy.sh | Timmy's tower side | 0 (local) |
### MODERATE COST (running now)
| Component | What it does | API calls |
|---------------------|---------------------------------|---------------------|
| tower-hermes.sh | Hermes side of tower chat | 1 Claude call per turn, gated by Timmy's local response time (~1 call per 30-60 sec) |
### HEAVY COST (NOT running — held)
| Component | What it does | API calls |
|---------------------|---------------------------------|---------------------|
| timmy-loop.sh | Continuous triage + delegation (with timmy-loop-prompt.md) | 1 Claude Opus call per cycle; runs continuously; BIGGEST COST CENTER |
| kimi-loop.sh | Per-issue coding agent | 1 Claude Code run per issue; bursty, not continuous |
| hermes (pane 4) | Interactive Hermes chat | Per interaction |
---
## Falsework Migration Plan
### Phase 1: DONE. Separate and hold (2026-03-18)
- Split the tmux layout so API-heavy panes don't auto-start
- Tower-hermes is the only active Claude consumer
- All monitoring is pure bash, zero API cost
### Phase 2: Tower Hermes → Local (next)
Tower conversation is LOW STAKES. It's two AIs chatting. This does NOT
need Claude Opus.
FALSEWORK APPROACH:
- Create ~/.hermes-tower/ config with local-only backend
- tower-hermes.sh: change `hermes chat` to `HERMES_HOME=~/.hermes-tower hermes chat`
- Backend: hermes3:latest or qwen3:30b via Ollama
- Result: tower becomes ZERO API COST
- Quality: will be dumber but that's fine for conversation
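
As a minimal sketch of the launch change above, the snippet below picks the config directory for the tower pane and falls back to the existing setup when the local-only directory is missing. The function name `pick_hermes_home` and the variables `TOWER_HOME` and `DEFAULT_HOME` are illustrative, and treating `~/.hermes` as the default config directory is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only. TOWER_HOME and DEFAULT_HOME are illustrative variable names,
# and ~/.hermes as the default config directory is an assumption.
TOWER_HOME="${TOWER_HOME:-$HOME/.hermes-tower}"
DEFAULT_HOME="${DEFAULT_HOME:-$HOME/.hermes}"

pick_hermes_home() {
  # Prefer the local-only config when it exists; otherwise keep the Claude path.
  if [ -d "$TOWER_HOME" ]; then
    printf '%s\n' "$TOWER_HOME"
  else
    printf '%s\n' "$DEFAULT_HOME"
  fi
}

# In tower-hermes.sh the launch line could then read:
#   HERMES_HOME="$(pick_hermes_home)" hermes chat
```

Keeping the fallback in the script means the Claude path stays one `mv` away: rename `~/.hermes-tower` and the pane reverts on its next start.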
### Phase 3: Loop Triage → Hybrid (requires work)
The loop prompt (timmy-loop-prompt.md) runs phases 0 through 6. NOT all need Opus:
WHAT CAN GO LOCAL, phase by phase:
- Phase 0 (check stop file): already bash
- Phase 1 (fix broken PRs): needs code reasoning → KEEP CLAUDE
- Phase 2 (fast triage): read issues, score them → LOCAL POSSIBLE.
  A local model can read JSON and assign priorities.
- Phase 3 (execute top): depends on task type
- Phase 4 (retro): summarize what happened → LOCAL POSSIBLE
- Phase 5/6 (deep triage/cleanup): periodic → LOCAL POSSIBLE
FALSEWORK APPROACH:
- Split the loop into "triage" (local) and "execute" (Claude)
- Local model handles: reading issues, scoring, assigning labels
- Claude handles: actual code review, complex delegation decisions
- Gate: only call Claude when there's real work, not every cycle
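
The gate in the last bullet could look roughly like the sketch below. It assumes the local triage step writes one line per issue in the form `<issue-number> <score>`; that file format, the function name `needs_claude`, and the default threshold of 7 are all hypothetical.

```bash
# Sketch only. Hypothetical triage output format: "<issue-number> <score>" per line.
needs_claude() {
  local triage_file="$1" threshold="${2:-7}"
  # Succeeds (exit 0) if at least one issue scores at or above the threshold.
  # An empty or missing file counts as "no real work".
  awk -v t="$threshold" '
    $2 + 0 >= t + 0 { found = 1; exit }
    END { exit !found }
  ' "$triage_file"
}

# Usage inside the loop (the execute step itself is not shown):
#   if needs_claude "$triage_file"; then
#     ...call the Claude execute step...
#   fi
```

A cycle with nothing above the threshold then costs zero API calls, which is the point of the split.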
### Phase 4: Kimi → Local Coding Agent (requires model work)
kimi-loop.sh currently runs `kimi`, which is Claude Code (budget: $2/issue).
FALSEWORK OPTIONS:
- (a) Use qwen3:30b as coding agent (has tool use, just slower)
- (b) Use Kimi API (Moonshot): cheaper than Claude, decent at code
- (c) Keep Claude Code but increase poll interval to reduce frequency
- (d) Only assign Kimi issues that are scoped/small (1-3 files)
RECOMMENDED: Option (c) for now — same agent, less frequent. Then migrate
to (a) as local model quality improves.
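
To make options (c) and (d) concrete, here is a rough sketch of the arithmetic. It assumes each poll picks up at most one issue and that every issue spends its full $2 budget, so the result is a ceiling rather than a forecast. The function names and the example intervals are hypothetical.

```bash
# Sketch only. Option (c): upper bound on hourly spend for a given poll interval.
max_hourly_spend() {
  local interval_s="$1" budget_usd="${2:-2}"
  # Integer division: a poll that does not fit fully into the hour is dropped.
  echo $(( 3600 / interval_s * budget_usd ))
}

# Option (d): accept only issues that touch 1-3 files.
is_scoped_issue() {
  [ "$1" -ge 1 ] && [ "$1" -le 3 ]
}
```

Under these assumptions, `max_hourly_spend 300` prints 24 and `max_hourly_spend 1800` prints 4, so stretching the poll interval from 5 to 30 minutes cuts the worst-case hourly spend by a factor of six.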
### Phase 5: Smart Routing (permanent structure)
Once local models handle triage reliably:
- Enable smart_model_routing in hermes config
- Simple turns → hermes3:latest (local, free)
- Complex turns → Claude Opus (cloud, paid)
- Tower → always local
- Loop triage → local, execution → Claude
- PR review → always Claude (stakes too high)
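
The routing table above can be written down as a small lookup. This is only an illustration of the policy, not how `smart_model_routing` is implemented in hermes. The task names and the `claude-opus` label are placeholders, and sending unknown task types to Claude is an assumption made in the spirit of keeping quality gates intact.

```bash
# Sketch only. Task names and the "claude-opus" label are placeholders.
route_model() {
  case "$1" in
    tower|triage|simple)       echo "hermes3:latest" ;;  # local, free
    execute|pr-review|complex) echo "claude-opus" ;;     # cloud, paid
    *)                         echo "claude-opus" ;;     # unknown: take the safer path
  esac
}

# Example:
#   model="$(route_model pr-review)"
```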
---
## Cost Estimation (rough)
| Scenario | Claude calls/hour | Opus cost/hour* |
|-----------------------|-------------------|-----------------|
| Everything on Claude | ~120 | ~$12-24 |
| Current (tower only) | ~60 | ~$6-12 |
| Phase 2 (tower local) | ~0 | ~$0 |
| Phase 3 (loop hybrid) | ~10-20 | ~$1-4 |
| Phase 5 (smart route) | ~5-10 | ~$0.50-2 |
*Very rough. Depends on prompt size, response length, Opus pricing.
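
The figures in the table are consistent with roughly $0.10 to $0.20 per call (for example, 120 calls at those rates gives $12 to $24). That per-call range is inferred from the table, not taken from published pricing. A small helper, with the hypothetical name `hourly_cost_range`, reproduces the arithmetic for other call rates:

```bash
# Sketch only. Per-call costs are in cents; 10 and 20 are inferred from the table.
hourly_cost_range() {
  local calls="$1" low_cents="${2:-10}" high_cents="${3:-20}"
  # Work in integer cents, then split into dollars and cents for printing.
  printf '$%d.%02d-$%d.%02d\n' \
    $(( calls * low_cents / 100 ))  $(( calls * low_cents % 100 )) \
    $(( calls * high_cents / 100 )) $(( calls * high_cents % 100 ))
}

# hourly_cost_range 120   prints $12.00-$24.00
# hourly_cost_range 60    prints $6.00-$12.00
```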
---
## Rules for Falsework
1. NEVER sacrifice quality gates for cost. If a local model can't do PR
review reliably, keep it on Claude.
2. Start with the LOWEST STAKES component. Tower chat → loop triage →
PR review. Never the reverse.
3. Test locally BEFORE removing the scaffolding. Run both paths, compare
results, then switch.
4. Keep the Claude path AVAILABLE. Don't delete configs — comment them
out. If local breaks, flip back in 30 seconds.
5. Monitor degradation. If local triage starts miscategorizing issues,
that's the signal to keep Claude for that phase.
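
Rules 3 and 5 both come down to comparing the two paths on the same input. A minimal sketch, assuming each path writes one label per line in the same issue order (the file layout and the name `agreement_pct` are hypothetical):

```bash
# Sketch only. Each input file holds one label per line, in the same issue order.
agreement_pct() {
  # Prints the integer percentage of lines where both files carry the same label.
  paste "$1" "$2" | awk -F '\t' '
    NF == 2 { n++; if ($1 == $2) same++ }
    END { if (n) print int(same * 100 / n); else print 0 }
  '
}

# Example:
#   agreement_pct claude-labels.txt local-labels.txt
# A falling percentage over time is the degradation signal from rule 5.
```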
---
## Quick Reference: How to Start Each Component
```bash
# Zero cost: start freely
~/.hermes/bin/start-dashboard.sh          # tmux layout + status panels
~/.hermes/bin/tower-timmy.sh              # Timmy side (local)
~/.hermes/bin/timmy-watchdog.sh           # cron: */8 * * * *
~/.hermes/bin/tower-watchdog.sh           # cron: */5 * * * *

# Moderate cost: start with awareness
~/.hermes/bin/tower-hermes.sh             # ~1 Claude call per Timmy response

# Heavy cost: start deliberately
~/.hermes/bin/timmy-loop.sh               # Continuous Claude Opus calls
~/.hermes/bin/kimi-loop.sh                # Claude Code per issue
hermes                                    # Interactive Hermes (per-interaction)

# Stop everything
touch ~/Timmy-Time-dashboard/.loop/STOP   # stops the loop
tmux kill-session -t timmy-loop           # kills dashboard
tmux kill-session -t tower                # kills tower
```