timmy-home/specs/2026-03-29-local-only-harness-cutover-plan.md

# Local-Only Timmy Harness Cutover Plan

> For Hermes: treat this as a phased migration plan. Do not restore deprecated cloud loops.

Goal: make this harness boot and operate Timmy locally with no required cloud-model dependency.

Architecture: keep the boundary already written down in `~/.timmy/decisions.md` — `~/.hermes` is the harness, `~/.timmy` is Timmy's workspace. The cutover is not "move Timmy into the cloudless void." It is: strip cloud inference out of the active harness, quarantine legacy remote ops loops, and prove Timmy can think and operate against localhost-only model endpoints.

Tech stack: Hermes harness, local llama.cpp/OpenAI-compatible endpoint, Ollama, Timmy workspace in `~/.timmy`, local scripts, local cron jobs.

---

## Ground truth found on 2026-03-29

1. `~/.hermes/config.yaml` still has a cloud default:
   - `model.default: gpt-5.4`
   - `model.provider: openai-codex`
   - `model.base_url: https://chatgpt.com/backend-api/codex`

2. `~/.hermes/config.yaml` still contains cloud escape hatches:
   - custom provider `Google Gemini`
   - `fallback_model` pointing at Gemini

3. Built-in cron still has an accidental cloud path:
   - job `a77a87392582` (`Health Monitor`) is enabled every 5m
   - `model = null`, `provider = null`
   - per `~/.timmy/OPERATIONS.md`, all crons must specify model explicitly
   - with current live config, null/null means inherit the cloud default

4. `~/.timmy/OPERATIONS.md` says the active system is Hermes + timmy-config sidecar and explicitly says these old loops are deprecated and must not be restored:
   - `claude-loop.sh`
   - `gemini-loop.sh`
   - `timmy-orchestrator.sh`
   - `nexus-merge-bot.sh`
   - `agent-loop.sh`

5. Legacy ops scripts in `~/.hermes/bin/` still point at VPS/Gitea and the archived dashboard world:
   - `timmy-status.sh`
   - `ops-panel.sh`
   - `ops-gitea.sh`
   - `ops-helpers.sh`
   - `claude-loop.sh`
   - `gemini-loop.sh`
   - `timmy-orchestrator.sh`
   - `agent-loop.sh`
   - `agent-dispatch.sh`
   - `claudemax-watchdog.sh`
   These still reference `143.198.27.163` or cloud agent CLIs.

6. Timmy's local Nexus path is almost right but not fully clean:
   - `~/.timmy/nexus-localhost/nexus/nexus_think.py` defaults to local Ollama
   - but it still supports optional Groq offload through `groq_worker.py`
   - `~/.timmy/nexus-localhost/nexus/groq_worker.py` still calls `https://api.groq.com/openai/v1/chat/completions`

7. There is already a local proof tool available:
   - `~/.hermes/bin/local-model-smoke-test.sh`
   - this should become one of the main acceptance checks

8. Important honesty note:
   - `~/.timmy/reports/production/2026-03-28-morning-production-report.md` says `timmy-config` was moved to a local default
   - live world state in `~/.hermes/config.yaml` still shows cloud default
   - trust world state, not yesterday's report

---

## North-star definition

A local-only Timmy harness means:

1. No active default model points to a remote API.
2. No enabled cron can silently inherit a remote model.
3. No required runtime path depends on OpenAI, Gemini, Groq, Anthropic, or other hosted inference.
4. Timmy can boot, answer, use tools, and run the local health loop with only localhost model endpoints.
5. Cloud-specific scripts may exist only if they are quarantined outside the active path and clearly marked legacy.

---

## Recommended migration shape

Do this in three waves:

### Wave 1 — Stop accidental cloud usage
Cut the hidden spend and inheritance bugs first.

### Wave 2 — Make local the only active path
Set a single canonical local default. Remove cloud fallback and auxiliary escape hatches.

### Wave 3 — Prove and lock it
Run local smoke tests, scrub optional cloud hooks, and add regression checks so the harness cannot quietly drift back.

---

## Task 1: Freeze the current state

**Objective:** capture the live state before editing anything.

**Files:**
- Read: `~/.hermes/config.yaml`
- Read: `~/.timmy/OPERATIONS.md`
- Capture: current `cronjob list --include-disabled`

**Steps:**
1. Copy `~/.hermes/config.yaml` to a dated backup.
2. Save the current cron inventory to a dated note.
3. Save a short note with the active risks found above.

**Verification:**
- backup file exists
- cron inventory is saved
- you can diff before/after safely

---

## Task 2: Kill the inheritance bug in cron

**Objective:** no enabled cron may run with `model=null` and `provider=null`.

**Files:**
- Runtime cron registry
- `~/.timmy/OPERATIONS.md`

**Steps:**
1. Pause `Health Monitor` immediately or update it to explicit local model/provider.
2. Audit every remaining enabled cron for explicit local-only routing.
3. If a cron just runs a shell health check, replace the LLM wrapper with a pure bash/python script.

**Verification:**
- `cronjob list --include-disabled` shows no enabled job with null model/provider
- `Health Monitor` is either paused or explicitly local

---

## Task 3: Pick one canonical local default for the harness

**Objective:** the active harness must have one obvious local mind path.

**Files:**
- Modify: `~/.hermes/config.yaml`

**Recommendation:**
Use the already-present local llama.cpp/OpenAI-compatible path as the day-1 cutover default because it is already represented in config and matches the recent report:
- provider: `custom`
- model: `hermes4:14b`
- base URL: `http://localhost:8081/v1`

Keep Ollama available for explicit overrides and evaluation (`qwen3:30b`, `timmy:v0.1-q4`, etc.), but do not leave the harness split-brained with a cloud default and local side entries.

**Steps:**
1. Change `model.default`, `model.provider`, and `model.base_url` to the chosen local endpoint.
2. Keep only local endpoints in the active default path.
3. Document the reason in a nearby comment or note: cut cloud first, tune model second.

**Verification:**
- opening `~/.hermes/config.yaml` shows a localhost default
- a fresh Hermes session comes up on the local endpoint without extra flags

---

## Task 4: Remove cloud fallback from the active config

**Objective:** there must be no silent remote escape hatch.

**Files:**
- Modify: `~/.hermes/config.yaml`

**Steps:**
1. Remove the `Google Gemini` custom provider from the active config.
2. Remove or localize `fallback_model`.
3. Review `auxiliary.*` and `delegation` sections:
   - if they can run local, pin them local
   - if they cannot run local, disable them rather than leaving `provider: auto`
4. Confirm no active base URL points to a remote inference API.

**Verification:**
Search the active config for these strings and get zero hits:
- `chatgpt.com/backend-api/codex`
- `generativelanguage.googleapis.com`
- `api.groq.com`
- `anthropic`
- `openai-codex`

---

## Task 5: Quarantine legacy cloud and VPS ops scripts

**Objective:** old remote ops machinery must not remain on the active path where it can be restarted by accident.

**Files:**
- Review/move from `~/.hermes/bin/`:
  - `claude-loop.sh`
  - `gemini-loop.sh`
  - `timmy-orchestrator.sh`
  - `nexus-merge-bot.sh`
  - `agent-loop.sh`
  - `agent-dispatch.sh`
  - `ops-panel.sh`
  - `ops-gitea.sh`
  - `ops-helpers.sh`
  - `timmy-status.sh`
  - `claudemax-watchdog.sh`

**Steps:**
1. Create a quarantine location such as `~/.hermes/bin/legacy-cloud/`.
2. Move or rename remote-only scripts out of the active PATH.
3. Leave a short README there explaining why they were retired.
4. Do not delete until the local harness is stable.

**Verification:**
- the active `~/.hermes/bin/` no longer contains scripts that point at `143.198.27.163` unless they are explicitly retained for non-Timmy infra
- `OPERATIONS.md` and the actual bin directory agree

---

## Task 6: Replace stale status/ops surfaces with local truth

**Objective:** dashboards and status panels should describe the real local system, not the old VPS/dashboard world.

**Files:**
- Replace or retire:
  - `~/.hermes/bin/timmy-status.sh`
  - `~/.hermes/bin/ops-panel.sh`
  - `~/.hermes/bin/ops-gitea.sh`

**Steps:**
1. Remove references to `~/Timmy-Time-dashboard` and `rockachopa/Timmy-time-dashboard`.
2. Build one small local status script that checks:
   - local model endpoint alive
   - local session landing directory alive
   - local Timmy workspace health
   - optional local Nexus process health
3. Keep it brutally simple.

**Verification:**
- the status panel works with Wi-Fi off
- it reports localhost services, not VPS state

---

## Task 7: Scrub optional cloud offload paths from Timmy's runtime

**Objective:** Timmy's lived runtime should not keep a hidden cloud brain in reserve.

**Files:**
- Modify: `~/.timmy/nexus-localhost/nexus/nexus_think.py`
- Modify or quarantine: `~/.timmy/nexus-localhost/nexus/groq_worker.py`

**Steps:**
1. Remove the `groq_model` runtime path from `nexus_think.py`, or gate it behind a separate legacy plugin that is not imported by default.
2. If Groq support is kept for archeology, move it to a `legacy` or `experimental` namespace.
3. Make local Ollama the only supported default thinker in the active Nexus runtime.

**Verification:**
- `nexus_think.py` can run without `GROQ_API_KEY`
- no active import path requires `groq_worker.py`

---

## Task 8: Clean the environment and credentials

**Objective:** the harness shell should not carry cloud keys it no longer needs.

**Files:**
- shell startup files
- launch wrappers for Hermes/Timmy
- local env files used by the harness

**Steps:**
1. Inventory active cloud inference keys used by this harness.
2. Remove or quarantine the keys for providers you are cutting.
3. Keep local-only env vars and local endpoint URLs.
4. If other unrelated houses still need cloud keys, move those to house-specific wrappers rather than global shell scope.

**Verification:**
- a fresh shell used for Timmy starts without OpenAI/Gemini/Groq inference keys
- local Timmy still boots and responds

---

## Task 9: Prove local operation end-to-end

**Objective:** prove the harness still works after the cut.

**Files / commands:**
- `~/.hermes/bin/local-model-smoke-test.sh`
- a fresh Hermes session
- optional local Nexus startup command

**Steps:**
1. Run `local-model-smoke-test.sh`.
2. Start a fresh Hermes session with no provider overrides and confirm it uses the local default.
3. Run one small tool-use task.
4. Start the local Nexus thinker and confirm it thinks locally.
5. Re-run the health monitor in its new local form.

**Verification:**
- smoke test passes
- fresh Hermes session uses localhost
- one real tool call succeeds
- Nexus thinker runs without cloud keys

---

## Task 10: Add regression guards

**Objective:** make it hard to silently drift back to cloud defaults.

**Files:**
- New guard script, for example: `~/.timmy/scripts/assert_local_only_harness.sh`
- Optional tests for the active config

**Steps:**
1. Add a script that fails if the active harness config contains remote inference URLs.
2. Add a check that fails if any enabled cron has null model/provider.
3. Add a check that scans the active bin directory for forbidden strings like:
   - `chatgpt.com/backend-api/codex`
   - `generativelanguage.googleapis.com`
   - `api.groq.com`
4. Exclude vendored benchmark/test fixtures under `~/.hermes/hermes-agent/environments/` so the guard only checks active runtime paths.

**Verification:**
- the guard script passes on the cleaned harness
- reintroducing a cloud URL makes it fail immediately

---

## Definition of done

The cutover is done when all of these are true:

1. `~/.hermes/config.yaml` has a localhost-only default model path.
2. `fallback_model` is local or absent.
3. No enabled cron inherits a cloud default.
4. Old remote loops are quarantined out of active `~/.hermes/bin/`.
5. Timmy status/ops scripts no longer point at the archived dashboard or VPS by default.
6. `nexus_think.py` has no active cloud offload path.
7. `local-model-smoke-test.sh` passes.
8. A fresh Timmy/Hermes session works with only local endpoints available.
9. Turning off cloud API keys does not break the harness.

---

## Recommended order for the next working session

1. Fix the cron inheritance bug.
2. Flip the harness default to local.
3. Remove cloud fallback entries.
4. Quarantine legacy scripts.
5. Scrub Groq from `nexus-localhost` active runtime.
6. Run smoke tests.
7. Add regression guard.

That order gets you the biggest sovereignty win first and reduces the chance of half-cut state.

---

## Notes on scope

- This plan is about Timmy's harness, not every possible experimental repo on disk.
- Vendor benchmark configs inside `~/.hermes/hermes-agent/` can stay for research as long as they are not part of the active runtime path.
- The first victory is not perfection. The first victory is: no active cloud default, no hidden fallback, no accidental remote cron burn.

---

*Sovereignty and service always.*