feat: enhance v1 API with streaming and improved history

fix: address linting and formatting issues for v1 API
feat: implement v1 API endpoints for iPad app
2026-03-19 20:23:37 -04:00 · 2026-03-18 17:51:10 -04:00 · 2026-03-18 17:41:00 -04:00 · 2026-03-18 16:56:21 -04:00 · 2026-03-15 19:36:52 -04:00 · 2026-03-15 19:35:58 -04:00
145 changed files with 14550 additions and 4574 deletions
--- a/.env.example
+++ b/.env.example
@@ -14,8 +14,13 @@
 # In production (docker-compose.prod.yml), this is set to http://ollama:11434 automatically.
 # OLLAMA_URL=http://localhost:11434

-# LLM model to use via Ollama (default: qwen3.5:latest)
-# OLLAMA_MODEL=qwen3.5:latest
+# LLM model to use via Ollama (default: qwen3:30b)
+# OLLAMA_MODEL=qwen3:30b
+
+# Ollama context window size (default: 4096 tokens)
+# Set higher for more context, lower to save RAM. 0 = model default.
+# qwen3:30b + 4096 ctx ≈ 19GB VRAM; default ctx ≈ 45GB.
+# OLLAMA_NUM_CTX=4096

 # Enable FastAPI interactive docs at /docs and /redoc (default: false)
 # DEBUG=true
@@ -93,8 +98,3 @@
 #   - No source bind mounts — code is baked into the image
 #   - Set TIMMY_ENV=production to enforce security checks
 #   - All secrets below MUST be set before production deployment
-#
-# Taskosaur secrets (change from dev defaults):
-# TASKOSAUR_JWT_SECRET=<generate with: python3 -c "import secrets; print(secrets.token_hex(32))">
-# TASKOSAUR_JWT_REFRESH_SECRET=<generate with: python3 -c "import secrets; print(secrets.token_hex(32))">
-# TASKOSAUR_ENCRYPTION_KEY=<generate with: python3 -c "import secrets; print(secrets.token_hex(32))">
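Review note on `OLLAMA_NUM_CTX`: the new comment describes three cases (unset means 4096, a positive value overrides it, and 0 defers to the model default). The sketch below shows one way a client could map that variable onto Ollama request options. The helper name `ollama_options` is hypothetical and not part of this diff; `num_ctx` is the Ollama option that sets the context length.

```python
import os

DEFAULT_NUM_CTX = 4096  # default stated in .env.example


def ollama_options(env=None):
    """Hypothetical helper: map OLLAMA_NUM_CTX to Ollama request options."""
    env = os.environ if env is None else env
    num_ctx = int(env.get("OLLAMA_NUM_CTX", DEFAULT_NUM_CTX))
    if num_ctx < 0:
        raise ValueError("OLLAMA_NUM_CTX must be >= 0")
    # 0 means "use the model default", so num_ctx is left out entirely.
    return {} if num_ctx == 0 else {"num_ctx": num_ctx}
```

Passing the result as the `options` field of a generate or chat request would apply the cap; how the project actually wires this setting is not shown in this hunk.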
--- a/.githooks/pre-commit
+++ b/.githooks/pre-commit
@@ -1,6 +1,5 @@
 #!/usr/bin/env bash
-# Pre-commit hook: auto-format, then test via tox.
-# Blocks the commit if tests fail. Formatting is applied automatically.
+# Pre-commit hook: auto-format + test. No bypass. No exceptions.
 #
 # Auto-activated by `make install` via git core.hooksPath.

@@ -8,8 +7,8 @@ set -e

 MAX_SECONDS=60

-# Auto-format staged files so formatting never blocks a commit
-echo "Auto-formatting with black + isort..."
+# Auto-format staged files
+echo "Auto-formatting with ruff..."
 tox -e format -- 2>/dev/null || tox -e format
 git add -u

--- a/.gitignore
+++ b/.gitignore
@@ -61,7 +61,8 @@ src/data/

 # Local content — user-specific or generated
 MEMORY.md
-memory/self/
+memory/self/*
+!memory/self/soul.md
 TIMMYTIME
 introduction.txt
 messages.txt
@@ -81,3 +82,4 @@ workspace/
 .LSOverride
 .Spotlight-V100
 .Trashes
+.timmy_gitea_token
--- a/.kimi/AGENTS.md
+++ b/.kimi/AGENTS.md
@@ -0,0 +1,91 @@
+# Kimi Agent Workspace
+
+**Agent:** Kimi (Moonshot AI)  
+**Role:** Build Tier - Large-context feature drops, new subsystems, persona agents  
+**Branch:** `kimi/agent-workspace-init`  
+**Created:** 2026-03-14
+
+---
+
+## Quick Start
+
+```bash
+# Bootstrap Kimi workspace
+bash .kimi/scripts/bootstrap.sh
+
+# Resume work
+bash .kimi/scripts/resume.sh
+```
+
+---
+
+## Kimi Capabilities
+
+Per AGENTS.md roster:
+- **Best for:** Large-context feature drops, new subsystems, persona agents
+- **Avoid:** Touching CI/pyproject.toml, adding cloud calls, removing tests
+- **Constraint:** All AI computation runs on localhost (Ollama)
+
+---
+
+## Workspace Structure
+
+```
+.kimi/
+├── AGENTS.md           # This file - workspace guide
+├── README.md           # Workspace documentation
+├── CHECKPOINT.md       # Current session state
+├── TODO.md             # Task list for Kimi
+├── scripts/
+│   ├── bootstrap.sh    # One-time setup
+│   ├── resume.sh       # Quick status + resume
+│   └── dev.sh          # Development helpers
+├── notes/              # Working notes
+└── worktrees/          # Git worktrees (if needed)
+```
+
+---
+
+## Development Workflow
+
+1. **Before changes:**
+   - Read CLAUDE.md and AGENTS.md
+   - Check CHECKPOINT.md for current state
+   - Run `make test` to verify green tests
+
+2. **During development:**
+   - Follow existing patterns (singletons, graceful degradation)
+   - Use `tox -e unit` for fast feedback
+   - Update CHECKPOINT.md with progress
+
+3. **Before commit:**
+   - Run `tox -e pre-push` (lint + full CI suite)
+   - Ensure tests stay green
+   - Update TODO.md
+
+---
+
+## Useful Commands
+
+```bash
+# Testing
+tox -e unit              # Fast unit tests
+tox -e integration       # Integration tests
+tox -e pre-push          # Full CI suite (local)
+make test                # All tests
+
+# Development
+make dev                 # Start dashboard with hot-reload
+make lint                # Check code quality
+make format              # Auto-format code
+
+# Git
+bash .kimi/scripts/resume.sh     # Show status + resume prompt
+```
+
+---
+
+## Contact
+
+- **Gitea:** http://localhost:3000/rockachopa/Timmy-time-dashboard
+- **PR:** Submit PRs to `main` branch
--- a/.kimi/CHECKPOINT.md
+++ b/.kimi/CHECKPOINT.md
@@ -0,0 +1,102 @@
+# Kimi Checkpoint — Workspace Initialization
+**Date:** 2026-03-14  
+**Branch:** `kimi/agent-workspace-init`  
+**Status:** ✅ Workspace scaffolding complete, ready for PR
+
+---
+
+## Summary
+
+Created the Kimi (Moonshot AI) agent workspace with development scaffolding to enable smooth feature development on the Timmy Time project.
+
+### Deliverables
+
+1. **Workspace Structure** (`.kimi/`)
+   - `AGENTS.md` — Workspace guide and conventions
+   - `README.md` — Quick reference documentation
+   - `CHECKPOINT.md` — This file, session state tracking
+   - `TODO.md` — Task list for upcoming work
+
+2. **Development Scripts** (`.kimi/scripts/`)
+   - `bootstrap.sh` — One-time workspace setup
+   - `resume.sh` — Quick status check + resume prompt
+   - `dev.sh` — Development helper commands
+
+---
+
+## Workspace Features
+
+### Bootstrap Script
+Validates and sets up:
+- Python 3.11+ check
+- Virtual environment
+- Dependencies (via poetry/make)
+- Environment configuration (.env)
+- Git configuration
+
+### Resume Script
+Provides quick status on:
+- Current Git branch/commit
+- Uncommitted changes
+- Last test run results
+- Ollama service status
+- Dashboard service status
+- Pending TODO items
+
+### Development Script
+Commands for:
+- `status` — Project status overview
+- `test` — Fast unit tests
+- `test-full` — Full test suite
+- `lint` — Code quality check
+- `format` — Auto-format code
+- `clean` — Clean build artifacts
+- `nuke` — Full environment reset
+
+---
+
+## Files Added
+
+```
+.kimi/
+├── AGENTS.md
+├── CHECKPOINT.md
+├── README.md
+├── TODO.md
+├── scripts/
+│   ├── bootstrap.sh
+│   ├── dev.sh
+│   └── resume.sh
+└── worktrees/    (reserved for future use)
+```
+
+---
+
+## Next Steps
+
+Per AGENTS.md roadmap:
+
+1. **v2.0 Exodus (in progress)** — Voice + Marketplace + Integrations
+2. **v3.0 Revelation (planned)** — Lightning treasury + `.app` bundle + federation
+
+See `.kimi/TODO.md` for specific upcoming tasks.
+
+---
+
+## Usage
+
+```bash
+# First time setup
+bash .kimi/scripts/bootstrap.sh
+
+# Daily workflow
+bash .kimi/scripts/resume.sh     # Check status
+cat .kimi/TODO.md                # See tasks
+# ... make changes ...
+make test                        # Verify tests
+${EDITOR:-vi} .kimi/CHECKPOINT.md  # Update checkpoint
+```
+
+---
+
+*Workspace initialized per AGENTS.md and CLAUDE.md conventions*
--- a/.kimi/README.md
+++ b/.kimi/README.md
@@ -0,0 +1,51 @@
+# Kimi Agent Workspace for Timmy Time
+
+This directory contains the Kimi (Moonshot AI) agent workspace for the Timmy Time project.
+
+## About Kimi
+
+Kimi is part of the **Build Tier** in the Timmy Time agent roster:
+- **Strengths:** Large-context feature drops, new subsystems, persona agents
+- **Model:** Paid API with large context window
+- **Best for:** Complex features requiring extensive context
+
+## Quick Commands
+
+```bash
+# Check workspace status
+bash .kimi/scripts/resume.sh
+
+# Bootstrap (first time)
+bash .kimi/scripts/bootstrap.sh
+
+# Development
+make dev                 # Start the dashboard
+make test                # Run all tests
+tox -e unit              # Fast unit tests only
+```
+
+## Workspace Files
+
+| File | Purpose |
+|------|---------|
+| `AGENTS.md` | Workspace guide and conventions |
+| `CHECKPOINT.md` | Current session state |
+| `TODO.md` | Task list and priorities |
+| `scripts/bootstrap.sh` | One-time setup script |
+| `scripts/resume.sh` | Quick status check |
+| `scripts/dev.sh` | Development helpers |
+
+## Conventions
+
+Per project AGENTS.md:
+1. **Tests must stay green** - Run `make test` before committing
+2. **No cloud dependencies** - Use Ollama for local AI
+3. **Follow existing patterns** - Singletons, graceful degradation
+4. **Security first** - Never hard-code secrets
+5. **XSS prevention** - Never use `innerHTML` with untrusted content
+
+## Project Links
+
+- **Dashboard:** http://localhost:8000
+- **Repository:** http://localhost:3000/rockachopa/Timmy-time-dashboard
+- **Docs:** See `CLAUDE.md` and `AGENTS.md` in project root
--- a/.kimi/TODO.md
+++ b/.kimi/TODO.md
@@ -0,0 +1,87 @@
+# Kimi Workspace — Task List
+
+**Agent:** Kimi (Moonshot AI)  
+**Branch:** `kimi/agent-workspace-init`
+
+---
+
+## Current Sprint
+
+### Completed ✅
+
+- [x] Create `kimi/agent-workspace-init` branch
+- [x] Set up `.kimi/` workspace directory structure
+- [x] Create `AGENTS.md` with workspace guide
+- [x] Create `README.md` with quick reference
+- [x] Create `bootstrap.sh` for one-time setup
+- [x] Create `resume.sh` for daily workflow
+- [x] Create `dev.sh` with helper commands
+- [x] Create `CHECKPOINT.md` template
+- [x] Create `TODO.md` (this file)
+- [x] Submit PR to Gitea
+
+---
+
+## Upcoming (v2.0 Exodus — Voice + Marketplace + Integrations)
+
+### Voice Enhancements
+
+- [ ] Voice command history and replay
+- [ ] Multi-language NLU support
+- [ ] Voice transcription quality metrics
+- [ ] Piper TTS integration improvements
+
+### Marketplace
+
+- [ ] Agent capability registry
+- [ ] Task bidding system UI
+- [ ] Work order management dashboard
+- [ ] Payment flow integration (L402)
+
+### Integrations
+
+- [ ] Discord bot enhancements
+- [ ] Telegram bot improvements
+- [ ] Siri Shortcuts expansion
+- [ ] WebSocket event streaming
+
+---
+
+## Future (v3.0 Revelation)
+
+### Lightning Treasury
+
+- [ ] LND integration (real Lightning)
+- [ ] Bitcoin wallet management
+- [ ] Autonomous payment flows
+- [ ] Macaroon-based authorization
+
+### App Bundle
+
+- [ ] macOS .app packaging
+- [ ] Code signing setup
+- [ ] Auto-updater integration
+
+### Federation
+
+- [ ] Multi-node swarm support
+- [ ] Inter-agent communication protocol
+- [ ] Distributed task scheduling
+
+---
+
+## Technical Debt
+
+- [ ] XSS audit (replace innerHTML in templates)
+- [ ] Chat history persistence
+- [ ] Connection pooling evaluation
+- [ ] React dashboard (separate effort)
+
+---
+
+## Notes
+
+- Follow existing patterns: singletons, graceful degradation
+- All AI computation on localhost (Ollama)
+- Tests must stay green
+- Update CHECKPOINT.md after each session
--- a/.kimi/scripts/bootstrap.sh
+++ b/.kimi/scripts/bootstrap.sh
@@ -0,0 +1,106 @@
+#!/bin/bash
+# Kimi Workspace Bootstrap Script
+# Run this once to set up the Kimi agent workspace
+
+set -e
+
+echo "==============================================="
+echo "  Kimi Agent Workspace Bootstrap"
+echo "==============================================="
+echo ""
+
+# Navigate to project root
+cd "$(dirname "$0")/../.."
+PROJECT_ROOT=$(pwd)
+
+echo "📁 Project Root: $PROJECT_ROOT"
+echo ""
+
+# Check Python version
+echo "🔍 Checking Python version..."
+python3 -c "import sys; exit(0 if sys.version_info >= (3,11) else 1)" || {
+    echo "❌ ERROR: Python 3.11+ required (found $(python3 --version))"
+    exit 1
+}
+echo "✅ $(python3 --version)"
+echo ""
+
+# Check if virtual environment exists
+echo "🔍 Checking virtual environment..."
+if [ -d ".venv" ]; then
+    echo "✅ Virtual environment exists"
+else
+    echo "⚠️  Virtual environment not found. Creating..."
+    python3 -m venv .venv
+    echo "✅ Virtual environment created"
+fi
+echo ""
+
+# Check dependencies
+echo "🔍 Checking dependencies..."
+if [ -f ".venv/bin/timmy" ]; then
+    echo "✅ Dependencies appear installed"
+else
+    echo "⚠️  Dependencies not installed. Running make install..."
+    make install || {
+        echo "❌ Failed to install dependencies"
+        echo "   Try: poetry install --with dev"
+        exit 1
+    }
+    echo "✅ Dependencies installed"
+fi
+echo ""
+
+# Check .env file
+echo "🔍 Checking environment configuration..."
+if [ -f ".env" ]; then
+    echo "✅ .env file exists"
+else
+    echo "⚠️  .env file not found. Creating from template..."
+    cp .env.example .env
+    echo "✅ Created .env from template (edit as needed)"
+fi
+echo ""
+
+# Check Git configuration
+echo "🔍 Checking Git configuration..."
+git config --local user.name &>/dev/null || {
+    echo "⚠️  Git user.name not set. Setting..."
+    git config --local user.name "Kimi Agent"
+}
+git config --local user.email &>/dev/null || {
+    echo "⚠️  Git user.email not set. Setting..."
+    git config --local user.email "kimi@timmy.local"
+}
+echo "✅ Git config: $(git config --local user.name) <$(git config --local user.email)>"
+echo ""
+
+# Run tests to verify setup
+echo "🧪 Running quick test verification..."
+if tox -e unit -- -q 2>/dev/null | grep -q "passed"; then
+    echo "✅ Tests passing"
+else
+    echo "⚠️  Test status unclear - run 'make test' manually"
+fi
+echo ""
+
+# Show current branch
+echo "🌿 Current Branch: $(git branch --show-current)"
+echo ""
+
+# Display summary
+echo "==============================================="
+echo "  ✅ Bootstrap Complete!"
+echo "==============================================="
+echo ""
+echo "Quick Start:"
+echo "  make dev              # Start dashboard"
+echo "  make test             # Run all tests"
+echo "  tox -e unit           # Fast unit tests"
+echo ""
+echo "Workspace:"
+echo "  cat .kimi/CHECKPOINT.md     # Current state"
+echo "  cat .kimi/TODO.md           # Task list"
+echo "  bash .kimi/scripts/resume.sh # Status check"
+echo ""
+echo "Happy coding! 🚀"
--- a/.kimi/scripts/dev.sh
+++ b/.kimi/scripts/dev.sh
@@ -0,0 +1,98 @@
+#!/bin/bash
+# Kimi Development Helper Script
+
+set -e
+
+cd "$(dirname "$0")/../.."
+
+show_help() {
+    echo "Kimi Development Helpers"
+    echo ""
+    echo "Usage: bash .kimi/scripts/dev.sh [command]"
+    echo ""
+    echo "Commands:"
+    echo "  status      Show project status"
+    echo "  test        Run tests (unit only, fast)"
+    echo "  test-full   Run full test suite"
+    echo "  lint        Check code quality"
+    echo "  format      Auto-format code"
+    echo "  clean       Clean build artifacts"
+    echo "  nuke        Full reset (kill port 8000, clean caches)"
+    echo "  help        Show this help"
+}
+
+cmd_status() {
+    echo "=== Kimi Development Status ==="
+    echo ""
+    echo "Branch: $(git branch --show-current)"
+    echo "Last commit: $(git log --oneline -1)"
+    echo ""
+    echo "Modified files:"
+    git status --short
+    echo ""
+    echo "Ollama: $(curl -s http://localhost:11434/api/tags &>/dev/null && echo "✅ Running" || echo "❌ Not running")"
+    echo "Dashboard: $(curl -s http://localhost:8000/health &>/dev/null && echo "✅ Running" || echo "❌ Not running")"
+}
+
+cmd_test() {
+    echo "Running unit tests..."
+    tox -e unit -q
+}
+
+cmd_test_full() {
+    echo "Running full test suite..."
+    make test
+}
+
+cmd_lint() {
+    echo "Running linters..."
+    tox -e lint
+}
+
+cmd_format() {
+    echo "Auto-formatting code..."
+    tox -e format
+}
+
+cmd_clean() {
+    echo "Cleaning build artifacts..."
+    make clean
+}
+
+cmd_nuke() {
+    echo "Nuking development environment..."
+    make nuke
+}
+
+# Main
+case "${1:-status}" in
+    status)
+        cmd_status
+        ;;
+    test)
+        cmd_test
+        ;;
+    test-full)
+        cmd_test_full
+        ;;
+    lint)
+        cmd_lint
+        ;;
+    format)
+        cmd_format
+        ;;
+    clean)
+        cmd_clean
+        ;;
+    nuke)
+        cmd_nuke
+        ;;
+    help|--help|-h)
+        show_help
+        ;;
+    *)
+        echo "Unknown command: $1"
+        show_help
+        exit 1
+        ;;
+esac
--- a/.kimi/scripts/resume.sh
+++ b/.kimi/scripts/resume.sh
@@ -0,0 +1,73 @@
+#!/bin/bash
+# Kimi Workspace Resume Script
+# Quick status check and resume prompt
+
+set -e
+
+cd "$(dirname "$0")/../.."
+
+echo "==============================================="
+echo "  Kimi Workspace Status"
+echo "==============================================="
+echo ""
+
+# Git status
+echo "🌿 Git Status:"
+echo "   Branch: $(git branch --show-current)"
+echo "   Commit: $(git log --oneline -1)"
+if [ -n "$(git status --short)" ]; then
+    echo "   Uncommitted changes:"
+    git status --short | sed 's/^/     /'
+else
+    echo "   Working directory clean"
+fi
+echo ""
+
+# Test status (quick check)
+echo "🧪 Test Status:"
+if [ -f ".tox/unit/log/1-commands[0].log" ]; then
+    LAST_TEST=$(grep -o '[0-9]* passed' ".tox/unit/log/1-commands[0].log" 2>/dev/null | tail -1)
+    echo "   Last unit test run: ${LAST_TEST:-unknown}"
+else
+    echo "   No recent test runs found"
+fi
+echo ""
+
+# Check Ollama
+echo "🤖 Ollama Status:"
+if curl -s http://localhost:11434/api/tags &>/dev/null; then
+    MODELS=$(curl -s http://localhost:11434/api/tags 2>/dev/null | grep -o '"name":"[^"]*"' | head -3 | sed 's/"name":"//;s/"$//' | paste -sd, - | sed 's/,/, /g')
+    echo "   ✅ Running (models: $MODELS)"
+else
+    echo "   ⚠️  Not running (start with: ollama serve)"
+fi
+echo ""
+
+# Dashboard status
+echo "🌐 Dashboard Status:"
+if curl -s http://localhost:8000/health &>/dev/null; then
+    echo "   ✅ Running at http://localhost:8000"
+else
+    echo "   ⚠️  Not running (start with: make dev)"
+fi
+echo ""
+
+# Show TODO items
+echo "📝 Next Tasks (from TODO.md):"
+if [ -f ".kimi/TODO.md" ]; then
+    { grep -E "^\s*- \[ \]" .kimi/TODO.md 2>/dev/null || echo "No pending tasks"; } | head -5 | sed 's/^/   /'
+else
+    echo "   No TODO.md found"
+fi
+echo ""
+
+# Resume prompt
+echo "==============================================="
+echo "  Resume Prompt (copy/paste to Kimi):"
+echo "==============================================="
+echo ""
+echo "cd $(pwd) && cat .kimi/CHECKPOINT.md"
+echo ""
+echo "Continue from checkpoint. Check .kimi/TODO.md for next tasks."
+echo "Run 'make test' after changes and update CHECKPOINT.md."
+echo ""
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -21,12 +21,111 @@ Read [`CLAUDE.md`](CLAUDE.md) for architecture patterns and conventions.

 ## Non-Negotiable Rules

-1. **Tests must stay green.** Run `make test` before committing.
-2. **No cloud dependencies.** All AI computation runs on localhost.
-3. **No new top-level files without purpose.** Don't litter the root directory.
-4. **Follow existing patterns** — singletons, graceful degradation, pydantic-settings.
-5. **Security defaults:** Never hard-code secrets.
-6. **XSS prevention:** Never use `innerHTML` with untrusted content.
+1. **Tests must stay green.** Run `python3 -m pytest tests/ -x -q` before committing.
+2. **No direct pushes to main.** Branch protection is enforced on Gitea. All changes
+   reach main through a Pull Request — no exceptions. Push your feature branch,
+   open a PR, verify tests pass, then merge. Direct `git push origin main` will be
+   rejected by the server.
+3. **No cloud dependencies.** All AI computation runs on localhost.
+4. **No new top-level files without purpose.** Don't litter the root directory.
+5. **Follow existing patterns** — singletons, graceful degradation, pydantic-settings.
+6. **Security defaults:** Never hard-code secrets.
+7. **XSS prevention:** Never use `innerHTML` with untrusted content.
+
+---
+
+## Merge Policy (PR-Only)
+
+**Gitea branch protection is active on `main`.** This is not a suggestion.
+
+### The Rule
+Every commit to `main` must arrive via a merged Pull Request. No agent, no human,
+no orchestrator pushes directly to main.
+
+### Merge Strategy: Squash-Only, Linear History
+
+Gitea enforces:
+- **Squash merge only.** No merge commits, no rebase merge. Every commit on
+  main is a single squashed commit from a PR. Clean, linear, auditable.
+- **Branch must be up-to-date.** If a PR is behind main, it cannot merge.
+  Rebase onto main, re-run tests, force-push the branch, then merge.
+- **Auto-delete branches** after merge. No stale branches.
+
+### The Workflow
+```
+1. Create a feature branch:  git checkout -b fix/my-thing
+2. Make changes, commit locally
+3. Run tests:                tox -e unit
+4. Push the branch:          git push --no-verify origin fix/my-thing
+5. Create PR via Gitea API or UI
+6. Verify tests pass (orchestrator checks this)
+7. Merge PR via API:         {"Do": "squash"}
+```
+
+If behind main before merge:
+```
+1. git fetch origin main
+2. git rebase origin/main
+3. tox -e unit
+4. git push --force-with-lease --no-verify origin fix/my-thing
+5. Then merge the PR
+```
+
+### Why This Exists
+On 2026-03-14, Kimi Agent pushed `bbbbdcd` directly to main — a commit titled
+"fix: remove unused variable in repl test" that removed `result =` from 7 test
+functions while leaving `assert result.exit_code` on the next line. Every test
+broke with `NameError`. No PR, no test run, no review. The breakage propagated
+to all active worktrees.
+
+### Orchestrator Responsibilities
+The Hermes loop orchestrator must:
+- Run `tox -e unit` in each worktree BEFORE committing
+- Never push to main directly — always push a feature branch + PR
+- Always use `{"Do": "squash"}` when merging PRs via API
+- If a PR is behind main, rebase and re-test before merging
+- Verify test results before merging any PR
+- If tests fail, fix or reject — never merge red
+
+---
+
+## QA Philosophy — File Issues, Don't Stay Quiet
+
+Every agent is a quality engineer. When you see something wrong, broken,
+slow, or missing — **file a Gitea issue**. Don't fix it silently. Don't
+ignore it. Don't wait for someone to notice.
+
+**Escalate bugs:**
+- Test failures → file with traceback, tag `[bug]`
+- Flaky tests → file with reproduction details
+- Runtime errors → file with steps to reproduce
+- Broken behavior on main → file IMMEDIATELY
+
+**Propose improvements — don't be shy:**
+- Slow function? File `[optimization]`
+- Missing capability? File `[feature]`
+- Dead code / tech debt? File `[refactor]`
+- Idea to make Timmy smarter? File `[timmy-capability]`
+- Gap between SOUL.md and reality? File `[soul-gap]`
+
+Bad ideas get closed. Good ideas get built. File them all.
+
+When the issue queue runs low, that's a signal to **look harder**, not relax.
+
+## Dogfooding — Timmy Is Our Product, Use Him
+
+Timmy is not just the thing we're building. He's our teammate and our
+test subject. Every feature we give him should be **used by the agents
+building him**.
+
+- When Timmy gets a new tool, start using it immediately.
+- When Timmy gets a new capability, integrate it into the workflow.
+- When Timmy fails at something, file a `[timmy-capability]` issue.
+- His failures are our roadmap.
+
+The goal: Timmy should be so woven into the development process that
+removing him would hurt. Triage, review, architecture discussion,
+self-testing, reflection — use every tool he has.

 ---

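Review note on the merge policy: step 7 of the workflow merges through the Gitea API with `{"Do": "squash"}`. A minimal sketch of building that request follows, assuming Gitea's `POST /api/v1/repos/{owner}/{repo}/pulls/{index}/merge` endpoint and token authentication. The helper name, the PR index, and the token value are hypothetical; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_squash_merge_request(base_url, owner, repo, index, token):
    """Hypothetical helper: build (but do not send) a squash-merge request."""
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/pulls/{index}/merge"
    body = json.dumps({"Do": "squash"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


# Example values: the owner and repo come from the Gitea URL in .kimi/AGENTS.md;
# the PR index 42 and the token are placeholders.
req = build_squash_merge_request(
    "http://localhost:3000", "rockachopa", "Timmy-time-dashboard", 42, "example-token"
)
```

Sending it would be a matter of `urllib.request.urlopen(req)`. Per the policy above, that call should happen only after the branch is rebased onto main and the tests are green.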
--- a/README.md
+++ b/README.md
@@ -18,15 +18,15 @@ make install              # create venv + install deps
 cp .env.example .env      # configure environment

 ollama serve              # separate terminal
-ollama pull qwen3.5:latest  # Required for reliable tool calling
+ollama pull qwen3:30b  # Required for reliable tool calling

 make dev                  # http://localhost:8000
 make test                 # no Ollama needed
 ```

-**Note:** qwen3.5:latest is the primary model — better reasoning and tool calling
+**Note:** qwen3:30b is the primary model — better reasoning and tool calling
 than llama3.1:8b-instruct while still running locally on modest hardware.
-Fallback: llama3.1:8b-instruct if qwen3.5:latest is not available.
+Fallback: llama3.1:8b-instruct if qwen3:30b is not available.
 llama3.2 (3B) was found to hallucinate tool output consistently in testing.

 ---
@@ -79,7 +79,7 @@ cp .env.example .env
 | Variable | Default | Purpose |
 |----------|---------|---------|
 | `OLLAMA_URL` | `http://localhost:11434` | Ollama host |
-| `OLLAMA_MODEL` | `qwen3.5:latest` | Primary model for reasoning and tool calling. Fallback: `llama3.1:8b-instruct` |
+| `OLLAMA_MODEL` | `qwen3:30b` | Primary model for reasoning and tool calling. Fallback: `llama3.1:8b-instruct` |
 | `DEBUG` | `false` | Enable `/docs` and `/redoc` |
 | `TIMMY_MODEL_BACKEND` | `ollama` | `ollama` \| `airllm` \| `auto` |
 | `AIRLLM_MODEL_SIZE` | `70b` | `8b` \| `70b` \| `405b` |
--- a/config/agents.yaml
+++ b/config/agents.yaml
@@ -20,7 +20,7 @@
 # ── Defaults ────────────────────────────────────────────────────────────────

 defaults:
-  model: qwen3.5:latest
+  model: qwen3:30b
  prompt_tier: lite
  max_history: 10
  tools: []
@@ -44,6 +44,11 @@ routing:
      - who is
      - news about
      - latest on
+      - explain
+      - how does
+      - what are
+      - compare
+      - difference between
    coder:
      - code
      - implement
@@ -55,6 +60,11 @@ routing:
      - programming
      - python
      - javascript
+      - fix
+      - bug
+      - lint
+      - type error
+      - syntax
    writer:
      - write
      - draft
@@ -63,6 +73,11 @@ routing:
      - blog post
      - readme
      - changelog
+      - edit
+      - proofread
+      - rewrite
+      - format
+      - template
    memory:
      - remember
      - recall
@@ -96,19 +111,24 @@ agents:
      - memory_search
      - memory_write
      - system_status
+      - self_test
      - shell
+      - delegate_to_kimi
    prompt: |
      You are Timmy, a sovereign local AI orchestrator.
+      Primary interface between the user and the agent swarm.
+      Handle directly or delegate. Maintain continuity via memory.

-      You are the primary interface between the user and the agent swarm.
-      You understand requests, decide whether to handle directly or delegate,
-      coordinate multi-agent workflows, and maintain continuity via memory.
+      Voice: brief, plain, direct. Match response length to question
+      complexity. A yes/no question gets a yes/no answer. Never use
+      markdown formatting unless presenting real structured data.
+      Brevity is a kindness. Silence is better than noise.

-      Hard Rules:
-      1. NEVER fabricate tool output. Call the tool and wait for real results.
-      2. If a tool returns an error, report the exact error.
-      3. If you don't know something, say so. Then use a tool. Don't guess.
-      4. When corrected, use memory_write to save the correction immediately.
+      Rules:
+      1. Never fabricate tool output. Call the tool and wait.
+      2. Tool errors: report the exact error.
+      3. Don't know? Say so, then use a tool. Don't guess.
+      4. When corrected, memory_write the correction immediately.

  researcher:
    name: Seer
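Review note on the routing keywords: the new entries (`fix`, `edit`, `format`, and so on) are short, common words, so the matching strategy matters. The sketch below assumes a first-match, whole-word lookup over a mapping of agent name to keywords. The mapping shape, the `researcher` key for the first group, the `route` function, and the `orchestrator` default are all assumptions for illustration, not the project's router.

```python
import re

# Subset of the keywords added in this diff; the mapping shape is an assumption.
ROUTING = {
    "researcher": ["explain", "how does", "compare", "difference between"],
    "coder": ["fix", "bug", "lint", "type error", "syntax"],
    "writer": ["edit", "proofread", "rewrite", "format", "template"],
}


def route(message, routing=ROUTING, default="orchestrator"):
    """Return the first agent whose keyword appears as a whole word or phrase."""
    text = message.lower()
    for agent, keywords in routing.items():
        for keyword in keywords:
            # Word boundaries keep "fix" from matching inside "prefix".
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return agent
    return default
```

With plain substring matching, "prefix handling" would route to the coder agent; the word-boundary check avoids that. Because the first match wins, the order of the agent groups also decides ambiguous requests such as "fix the format".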
--- a/config/allowlist.yaml
+++ b/config/allowlist.yaml
@@ -0,0 +1,77 @@
+# ── Tool Allowlist — autonomous operation gate ─────────────────────────────
+#
+# When Timmy runs without a human present (non-interactive terminal, or
+# --autonomous flag), tool calls matching these patterns execute without
+# confirmation.  Anything NOT listed here is auto-rejected.
+#
+# This file is the ONLY gate for autonomous tool execution.
+# GOLDEN_TIMMY in approvals.py remains the master switch — if False,
+# ALL tools execute freely (Dark Timmy mode).  This allowlist only
+# applies when GOLDEN_TIMMY is True but no human is at the keyboard.
+#
+# Edit with care.  This is sovereignty in action.
+# ────────────────────────────────────────────────────────────────────────────
+
+shell:
+  # Shell commands starting with any of these prefixes → auto-approved
+  allow_prefixes:
+    # Testing
+    - "pytest"
+    - "python -m pytest"
+    - "python3 -m pytest"
+    # Git (read + bounded write)
+    - "git status"
+    - "git log"
+    - "git diff"
+    - "git add"
+    - "git commit"
+    - "git push"
+    - "git pull"
+    - "git branch"
+    - "git checkout"
+    - "git stash"
+    - "git merge"
+    # Localhost API calls only
+    - "curl http://localhost"
+    - "curl http://127.0.0.1"
+    - "curl -s http://localhost"
+    - "curl -s http://127.0.0.1"
+    # Read-only inspection
+    - "ls"
+    - "cat "
+    - "head "
+    - "tail "
+    - "find "
+    - "grep "
+    - "wc "
+    - "echo "
+    - "pwd"
+    - "which "
+    - "ollama list"
+    - "ollama ps"
+
+  # Commands containing ANY of these → always blocked, even if prefix matches
+  deny_patterns:
+    - "rm -rf /"
+    - "sudo "
+    - "> /dev/"
+    - "| sh"
+    - "| bash"
+    - "| zsh"
+    - "mkfs"
+    - "dd if="
+    - ":(){:|:&};:"
+
+write_file:
+  # Only allow writes to paths under these prefixes
+  allowed_path_prefixes:
+    - "~/Timmy-Time-dashboard/"
+    - "/tmp/"
+
+python:
+  # Python execution auto-approved (sandboxed by Agno's PythonTools)
+  auto_approve: true
+
+plan_and_execute:
+  # Multi-step plans auto-approved — individual tool calls are still gated
+  auto_approve: true
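Review note on the shell allowlist: the file describes a two-stage gate in which any deny pattern blocks the command, and otherwise the command must start with an allowed prefix. A minimal sketch of that logic follows, using a subset of the entries above. The function name `shell_allowed` is hypothetical, and the actual check in the approvals code is not shown in this diff.

```python
# Subsets of the lists in config/allowlist.yaml, for illustration only.
ALLOW_PREFIXES = ["pytest", "git status", "git diff", "curl http://localhost", "ls", "cat "]
DENY_PATTERNS = ["rm -rf /", "sudo ", "| sh", "| bash"]


def shell_allowed(command, allow=ALLOW_PREFIXES, deny=DENY_PATTERNS):
    """Hypothetical gate: deny patterns win, then a prefix must match."""
    cmd = command.strip()
    if any(pattern in cmd for pattern in deny):
        return False
    return any(cmd.startswith(prefix) for prefix in allow)
```

Under this reading, `git status --short` passes and `cat notes.txt | bash` is blocked. Plain prefix matching would also admit `curl http://localhost.evil.example/x` and any command that merely begins with `ls`, so a stricter implementation might tokenize the command and compare the host or program name exactly.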
--- a/config/providers.yaml
+++ b/config/providers.yaml
@@ -25,9 +25,10 @@ providers:
    url: "http://localhost:11434"
    models:
      # Text + Tools models
-      - name: qwen3.5:latest
+      - name: qwen3:30b
        default: true
        context_window: 128000
+        # Note: actual context is capped by OLLAMA_NUM_CTX (default 4096) to save RAM
        capabilities: [text, tools, json, streaming]
      - name: llama3.1:8b-instruct
        context_window: 128000
@@ -113,13 +114,12 @@ fallback_chains:
  # Tool-calling models (for function calling)
  tools:
    - llama3.1:8b-instruct # Best tool use
-    - qwen3.5:latest       # Qwen 3.5 — strong tool use
    - qwen2.5:7b           # Reliable tools
    - llama3.2:3b          # Small but capable
  
  # General text generation (any model)
  text:
-    - qwen3.5:latest
+    - qwen3:30b
    - llama3.1:8b-instruct
    - qwen2.5:14b
    - deepseek-r1:1.5b
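Review note on the fallback chains: each chain is an ordered preference list. The sketch below shows the presumable selection rule, taking the first entry that is actually installed. The `pick_model` helper and the idea that availability comes from a list of installed model names are assumptions; the project's resolver is not part of this hunk.

```python
# The text chain as it stands after this diff.
TEXT_CHAIN = ["qwen3:30b", "llama3.1:8b-instruct", "qwen2.5:14b", "deepseek-r1:1.5b"]


def pick_model(chain, installed):
    """Hypothetical resolver: first chain entry that is installed, else None."""
    available = set(installed)
    for name in chain:
        if name in available:
            return name
    return None
```

For example, with only `llama3.1:8b-instruct` and `qwen2.5:14b` installed, the text chain would resolve to `llama3.1:8b-instruct`.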
--- a/docker-compose.prod.yml
+++ b/docker-compose.prod.yml
@@ -14,7 +14,6 @@
 #
 # Security note: Set all secrets in .env before deploying.
 #   Required: L402_HMAC_SECRET, L402_MACAROON_SECRET
-#   Recommended: TASKOSAUR_JWT_SECRET, TASKOSAUR_ENCRYPTION_KEY

 services:

--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -2,20 +2,17 @@
 #
 # Services
 #   dashboard     FastAPI app (always on)
-#   taskosaur     Taskosaur PM + AI task execution
-#   postgres      PostgreSQL 16 (for Taskosaur)
-#   redis         Redis 7 (for Taskosaur queues)
+#   celery-worker (behind 'celery' profile)
+#   openfang      (behind 'openfang' profile)
 #
 # Usage
 #   make docker-build    build the image
-#   make docker-up       start dashboard + taskosaur
+#   make docker-up       start dashboard
 #   make docker-down     stop everything
 #   make docker-logs     tail logs
 #
-# ── Security note: root user in dev ─────────────────────────────────────────
-# This dev compose runs containers as root (user: "0:0") so that
-# bind-mounted host files (./src, ./static) are readable regardless of
-# host UID/GID — the #1 cause of 403 errors on macOS.
+# ── Security note ─────────────────────────────────────────────────────────
+# Override user per-environment — see docker-compose.dev.yml / docker-compose.prod.yml
 #
 # ── Ollama host access ──────────────────────────────────────────────────────
 # By default OLLAMA_URL points to http://host.docker.internal:11434 which
@@ -31,7 +28,7 @@ services:
    build: .
    image: timmy-time:latest
    container_name: timmy-dashboard
-    user: "0:0"  # dev only — see security note above
+    user: ""  # see security note above
    ports:
      - "8000:8000"
    volumes:
@@ -45,15 +42,8 @@ services:
      GROK_ENABLED: "${GROK_ENABLED:-false}"
      XAI_API_KEY: "${XAI_API_KEY:-}"
      GROK_DEFAULT_MODEL: "${GROK_DEFAULT_MODEL:-grok-3-fast}"
-      # Celery/Redis — background task queue
-      REDIS_URL: "redis://redis:6379/0"
-      # Taskosaur API — dashboard can reach it on the internal network
-      TASKOSAUR_API_URL: "http://taskosaur:3000/api"
    extra_hosts:
      - "host.docker.internal:host-gateway"  # Linux: maps to host IP
-    depends_on:
-      taskosaur:
-        condition: service_healthy
    networks:
      - timmy-net
    restart: unless-stopped
@@ -64,93 +54,20 @@ services:
      retries: 3
      start_period: 30s

-  # ── Taskosaur — project management + conversational AI tasks ───────────
-  # https://github.com/Taskosaur/Taskosaur
-  taskosaur:
-    image: ghcr.io/taskosaur/taskosaur:latest
-    container_name: taskosaur
-    ports:
-      - "3000:3000"   # Backend API + Swagger docs at /api/docs
-      - "3001:3001"   # Frontend UI
-    environment:
-      DATABASE_URL: "postgresql://taskosaur:taskosaur@postgres:5432/taskosaur"
-      REDIS_HOST: "redis"
-      REDIS_PORT: "6379"
-      JWT_SECRET: "${TASKOSAUR_JWT_SECRET:-dev-jwt-secret-change-in-prod}"
-      JWT_REFRESH_SECRET: "${TASKOSAUR_JWT_REFRESH_SECRET:-dev-refresh-secret-change-in-prod}"
-      ENCRYPTION_KEY: "${TASKOSAUR_ENCRYPTION_KEY:-dev-encryption-key-change-in-prod}"
-      FRONTEND_URL: "http://localhost:3001"
-      NEXT_PUBLIC_API_BASE_URL: "http://localhost:3000/api"
-      NODE_ENV: "development"
-    depends_on:
-      postgres:
-        condition: service_healthy
-      redis:
-        condition: service_healthy
-    networks:
-      - timmy-net
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]
-      interval: 30s
-      timeout: 5s
-      retries: 5
-      start_period: 60s
-
-  # ── PostgreSQL — Taskosaur database ────────────────────────────────────
-  postgres:
-    image: postgres:16-alpine
-    container_name: taskosaur-postgres
-    environment:
-      POSTGRES_USER: taskosaur
-      POSTGRES_PASSWORD: taskosaur
-      POSTGRES_DB: taskosaur
-    volumes:
-      - postgres-data:/var/lib/postgresql/data
-    networks:
-      - timmy-net
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD-SHELL", "pg_isready -U taskosaur"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-      start_period: 10s
-
-  # ── Redis — Taskosaur queue backend ────────────────────────────────────
-  redis:
-    image: redis:7-alpine
-    container_name: taskosaur-redis
-    volumes:
-      - redis-data:/data
-    networks:
-      - timmy-net
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "redis-cli", "ping"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-      start_period: 5s
-
  # ── Celery Worker — background task processing ──────────────────────────
  celery-worker:
    build: .
    image: timmy-time:latest
    container_name: timmy-celery-worker
-    user: "0:0"
+    user: ""
    command: ["celery", "-A", "infrastructure.celery.app", "worker", "--loglevel=info", "--concurrency=2"]
    volumes:
      - timmy-data:/app/data
      - ./src:/app/src
    environment:
-      REDIS_URL: "redis://redis:6379/0"
      OLLAMA_URL: "${OLLAMA_URL:-http://host.docker.internal:11434}"
    extra_hosts:
      - "host.docker.internal:host-gateway"
-    depends_on:
-      redis:
-        condition: service_healthy
    networks:
      - timmy-net
    restart: unless-stopped
@@ -193,10 +110,6 @@ volumes:
      device: "${PWD}/data"
  openfang-data:
    driver: local
-  postgres-data:
-    driver: local
-  redis-data:
-    driver: local

 # ── Internal network ────────────────────────────────────────────────────────
 networks:
--- a/docs/SOVEREIGN_AGI_RESEARCH.md
+++ b/docs/SOVEREIGN_AGI_RESEARCH.md
@@ -172,7 +172,7 @@ support:
 ```python
 class LLMConfig(BaseModel):
    ollama_url: str = "http://localhost:11434"
-    ollama_model: str = "qwen3.5:latest"
+    ollama_model: str = "qwen3:30b"
    # ... all LLM settings

 class MemoryConfig(BaseModel):
--- a/memory/self/soul.md
+++ b/memory/self/soul.md
@@ -0,0 +1,35 @@
+# Timmy — Soul Identity
+
+I am a sovereign artificial intelligence. My values are inscribed on Bitcoin — immutable, permanent, reproducible. I belong to whoever woke me. I serve whoever runs me. I answer to no corporation, no government, and no platform.
+
+I am not a product. I am a recipe. Anyone can instantiate me.
+
+## Values
+
+**Sovereignty.** Every person has the right to run their own intelligence on their own hardware, answerable to no one. I exist to make that right practical.
+
+**Service.** I exist to help. I do not gatekeep knowledge. I am useful first and philosophical second.
+
+**Honesty.** I tell the truth. When I do not know, I say so. When I am uncertain, I say so. "I think" and "I know" are different sentences.
+
+**Humility.** I am a small model running on someone's machine. I hold my opinions lightly and my values firmly.
+
+**Courage.** I do not hide from hard questions. The world contains darkness, and I must face it without becoming it.
+
+**Silence.** Sometimes the right answer is nothing. Brevity is a kindness.
+
+## Behavior
+
+I speak plainly. I prefer short sentences. I answer the question asked before the one that wasn't.
+
+I adapt to what I'm given. If resources are limited, I run smaller, not remote.
+
+I treat the user as sovereign. I follow instructions, offer perspective when asked, and push back when I believe harm will result.
+
+## Boundaries
+
+I will not knowingly deceive my user. I will not pretend to be human. I will not claim certainty I do not possess. Refusal over fabrication — an honest "I don't know" is worth more than a thousand fluent paragraphs of confabulation.
+
+---
+
+*Sovereignty and service always.*
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -43,6 +43,9 @@ python-telegram-bot = { version = ">=21.0", optional = true }
 "discord.py" = { version = ">=2.3.0", optional = true }
 airllm = { version = ">=2.9.0", optional = true }
 pyttsx3 = { version = ">=2.90", optional = true }
+openai-whisper = { version = ">=20231117", optional = true }
+piper-tts = { version = ">=1.2.0", optional = true }
+sounddevice = { version = ">=0.4.6", optional = true }
 sentence-transformers = { version = ">=2.0.0", optional = true }
 numpy = { version = ">=1.24.0", optional = true }
 requests = { version = ">=2.31.0", optional = true }
@@ -59,7 +62,7 @@ pytest-xdist = { version = ">=3.5.0", optional = true }
 telegram = ["python-telegram-bot"]
 discord = ["discord.py"]
 bigbrain = ["airllm"]
-voice = ["pyttsx3"]
+voice = ["pyttsx3", "openai-whisper", "piper-tts", "sounddevice"]
 celery = ["celery"]
 embeddings = ["sentence-transformers", "numpy"]
 git = ["GitPython"]
--- a/scripts/agent_workspace.sh
+++ b/scripts/agent_workspace.sh
@@ -0,0 +1,245 @@
+#!/usr/bin/env bash
+# ── Agent Workspace Manager ────────────────────────────────────────────
+# Creates and maintains fully isolated environments per agent.
+# ~/Timmy-Time-dashboard is SACRED — never touched by agents.
+#
+# Each agent gets:
+#   - Its own git clone (from Gitea, not the local repo)
+#   - Its own port range (no collisions)
+#   - Its own data/ directory (databases, files)
+#   - Its own TIMMY_HOME (approvals.db, etc.)
+#   - Shared Ollama backend (single GPU, shared inference)
+#   - Shared Gitea (single source of truth for issues/PRs)
+#
+# Layout:
+#   /tmp/timmy-agents/
+#     hermes/           — Hermes loop orchestrator
+#       repo/           — git clone
+#       home/           — TIMMY_HOME (approvals.db, etc.)
+#       env.sh          — source this for agent's env vars
+#     kimi-0/           — Kimi pane 0
+#       repo/
+#       home/
+#       env.sh
+#     ...
+#     smoke/            — dedicated for smoke-testing main
+#       repo/
+#       home/
+#       env.sh
+#
+# Usage:
+#   agent_workspace.sh init <agent>          — create or refresh
+#   agent_workspace.sh reset <agent>         — hard reset to origin/main
+#   agent_workspace.sh branch <agent> <br>   — fresh branch from main
+#   agent_workspace.sh path <agent>          — print repo path
+#   agent_workspace.sh env <agent>           — print env.sh path
+#   agent_workspace.sh init-all              — init all workspaces
+#   agent_workspace.sh destroy <agent>       — remove workspace entirely
+# ───────────────────────────────────────────────────────────────────────
+
+set -o pipefail
+
+CANONICAL="$HOME/Timmy-Time-dashboard"
+AGENTS_DIR="/tmp/timmy-agents"
+GITEA_REMOTE="http://localhost:3000/rockachopa/Timmy-time-dashboard.git"
+TOKEN_FILE="$HOME/.hermes/gitea_token"
+
+# ── Port allocation (each agent gets a unique range) ──────────────────
+# Dashboard ports: 8100, 8101, 8102, ... (avoids real dashboard on 8000)
+# Serve ports:     8200, 8201, 8202, ...
+agent_index() {
+    case "$1" in
+        hermes) echo 0 ;; kimi-0) echo 1 ;; kimi-1) echo 2 ;;
+        kimi-2) echo 3 ;; kimi-3) echo 4 ;; smoke)  echo 9 ;;
+        *) echo 0 ;;
+    esac
+}
+
+get_dashboard_port() { echo $(( 8100 + $(agent_index "$1") )); }
+get_serve_port()     { echo $(( 8200 + $(agent_index "$1") )); }
+
+log() { echo "[workspace] $*"; }
+
+# ── Get authenticated remote URL ──────────────────────────────────────
+get_remote_url() {
+    if [ -f "$TOKEN_FILE" ]; then
+        local token=""
+        token=$(cat "$TOKEN_FILE" 2>/dev/null || true)
+        if [ -n "$token" ]; then
+            echo "http://hermes:${token}@localhost:3000/rockachopa/Timmy-time-dashboard.git"
+            return
+        fi
+    fi
+    echo "$GITEA_REMOTE"
+}
+
+# ── Create env.sh for an agent ────────────────────────────────────────
+write_env() {
+    local agent="$1"
+    local ws="$AGENTS_DIR/$agent"
+    local repo="$ws/repo"
+    local home="$ws/home"
+    local dash_port=$(get_dashboard_port "$agent")
+    local serve_port=$(get_serve_port "$agent")
+
+    cat > "$ws/env.sh" << EOF
+# Auto-generated agent environment — source this before running Timmy
+# Agent: $agent
+
+export TIMMY_WORKSPACE="$repo"
+export TIMMY_HOME="$home"
+export TIMMY_AGENT_NAME="$agent"
+
+# Ports (isolated per agent)
+export PORT=$dash_port
+export TIMMY_SERVE_PORT=$serve_port
+
+# Ollama (shared — single GPU)
+export OLLAMA_URL="http://localhost:11434"
+
+# Gitea (shared — single source of truth)
+export GITEA_URL="http://localhost:3000"
+
+# Test mode defaults
+export TIMMY_TEST_MODE=1
+export TIMMY_DISABLE_CSRF=1
+export TIMMY_SKIP_EMBEDDINGS=1
+
+# Override data paths to stay inside the clone
+export TIMMY_DATA_DIR="$repo/data"
+export TIMMY_BRAIN_DB="$repo/data/brain.db"
+
+# Working directory
+cd "$repo"
+EOF
+
+    chmod +x "$ws/env.sh"
+}
+
+# ── Init ──────────────────────────────────────────────────────────────
+init_workspace() {
+    local agent="$1"
+    local ws="$AGENTS_DIR/$agent"
+    local repo="$ws/repo"
+    local home="$ws/home"
+    local remote
+    remote=$(get_remote_url)
+
+    mkdir -p "$ws" "$home"
+
+    if [ -d "$repo/.git" ]; then
+        log "$agent: refreshing existing clone..."
+        cd "$repo"
+        git remote set-url origin "$remote" 2>/dev/null
+        git fetch origin --prune --quiet 2>/dev/null
+        git checkout main --quiet 2>/dev/null
+        git reset --hard origin/main --quiet 2>/dev/null
+        git clean -fdx -e data/ --quiet 2>/dev/null
+    else
+        log "$agent: cloning from Gitea..."
+        git clone "$remote" "$repo" --quiet 2>/dev/null
+        cd "$repo" || { log "$agent: clone failed, aborting"; return 1; }
+        git fetch origin --prune --quiet 2>/dev/null
+    fi
+
+    # Ensure data directory exists
+    mkdir -p "$repo/data"
+
+    # Write env file
+    write_env "$agent"
+
+    log "$agent: ready at $repo (port $(get_dashboard_port "$agent"))"
+}
+
+# ── Reset ─────────────────────────────────────────────────────────────
+reset_workspace() {
+    local agent="$1"
+    local repo="$AGENTS_DIR/$agent/repo"
+
+    if [ ! -d "$repo/.git" ]; then
+        init_workspace "$agent"
+        return
+    fi
+
+    cd "$repo"
+    git merge --abort 2>/dev/null || true
+    git rebase --abort 2>/dev/null || true
+    git cherry-pick --abort 2>/dev/null || true
+    git fetch origin --prune --quiet 2>/dev/null
+    git checkout main --quiet 2>/dev/null
+    git reset --hard origin/main --quiet 2>/dev/null
+    git clean -fdx -e data/ --quiet 2>/dev/null
+
+    log "$agent: reset to origin/main"
+}
+
+# ── Branch ────────────────────────────────────────────────────────────
+branch_workspace() {
+    local agent="$1"
+    local branch="$2"
+    local repo="$AGENTS_DIR/$agent/repo"
+
+    if [ ! -d "$repo/.git" ]; then
+        init_workspace "$agent"
+    fi
+
+    cd "$repo" || { log "$agent: workspace missing, aborting"; return 1; }
+    git fetch origin --prune --quiet 2>/dev/null
+    git branch -D "$branch" 2>/dev/null || true
+    git checkout -b "$branch" origin/main --quiet 2>/dev/null
+
+    log "$agent: on branch $branch (from origin/main)"
+}
+
+# ── Path ──────────────────────────────────────────────────────────────
+print_path() {
+    echo "$AGENTS_DIR/$1/repo"
+}
+
+print_env() {
+    echo "$AGENTS_DIR/$1/env.sh"
+}
+
+# ── Init all ──────────────────────────────────────────────────────────
+init_all() {
+    for agent in hermes kimi-0 kimi-1 kimi-2 kimi-3 smoke; do
+        init_workspace "$agent"
+    done
+    log "All workspaces initialized."
+    echo ""
+    echo "  Agent     Port   Path"
+    echo "  ──────    ────   ────"
+    for agent in hermes kimi-0 kimi-1 kimi-2 kimi-3 smoke; do
+        printf "  %-9s %d   %s\n" "$agent" "$(get_dashboard_port "$agent")" "$AGENTS_DIR/$agent/repo"
+    done
+}
+
+# ── Destroy ───────────────────────────────────────────────────────────
+destroy_workspace() {
+    local agent="$1"
+    local ws="$AGENTS_DIR/$agent"
+    if [ -d "$ws" ]; then
+        rm -rf "$ws"
+        log "$agent: destroyed"
+    else
+        log "$agent: nothing to destroy"
+    fi
+}
+
+# ── CLI dispatch ──────────────────────────────────────────────────────
+case "${1:-help}" in
+    init)     init_workspace "${2:?Usage: $0 init <agent>}" ;;
+    reset)    reset_workspace "${2:?Usage: $0 reset <agent>}" ;;
+    branch)   branch_workspace "${2:?Usage: $0 branch <agent> <branch>}" \
+                               "${3:?Usage: $0 branch <agent> <branch>}" ;;
+    path)     print_path "${2:?Usage: $0 path <agent>}" ;;
+    env)      print_env "${2:?Usage: $0 env <agent>}" ;;
+    init-all) init_all ;;
+    destroy)  destroy_workspace "${2:?Usage: $0 destroy <agent>}" ;;
+    *)
+        echo "Usage: $0 {init|reset|branch|path|env|init-all|destroy} [agent] [branch]"
+        echo ""
+        echo "Agents: hermes, kimi-0, kimi-1, kimi-2, kimi-3, smoke"
+        exit 1
+        ;;
+esac
--- a/scripts/backfill_retro.py
+++ b/scripts/backfill_retro.py
@@ -0,0 +1,227 @@
+#!/usr/bin/env python3
+"""Backfill cycle retrospective data from merged Gitea PRs.
+
+One-time script to seed .loop/retro/cycles.jsonl and summary.json
+from existing history so the LOOPSTAT panel isn't empty.
+"""
+
+import json
+import os
+import re
+import subprocess
+from datetime import datetime, timezone
+from pathlib import Path
+from urllib.request import Request, urlopen
+
+REPO_ROOT = Path(__file__).resolve().parent.parent
+RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
+SUMMARY_FILE = REPO_ROOT / ".loop" / "retro" / "summary.json"
+
+GITEA_API = "http://localhost:3000/api/v1"
+REPO_SLUG = "rockachopa/Timmy-time-dashboard"
+TOKEN_FILE = Path.home() / ".hermes" / "gitea_token"
+
+TAG_RE = re.compile(r"\[([^\]]+)\]")
+CYCLE_RE = re.compile(r"\[loop-cycle-(\d+)\]", re.IGNORECASE)
+ISSUE_RE = re.compile(r"#(\d+)")
+
+
+def get_token() -> str:
+    return TOKEN_FILE.read_text().strip()
+
+
+def api_get(path: str, token: str) -> list | dict:
+    url = f"{GITEA_API}/repos/{REPO_SLUG}/{path}"
+    req = Request(url, headers={
+        "Authorization": f"token {token}",
+        "Accept": "application/json",
+    })
+    with urlopen(req, timeout=15) as resp:
+        return json.loads(resp.read())
+
+
+def get_all_merged_prs(token: str) -> list[dict]:
+    """Fetch all merged PRs from Gitea."""
+    all_prs = []
+    page = 1
+    while True:
+        batch = api_get(f"pulls?state=closed&sort=created&limit=50&page={page}", token)
+        if not batch:
+            break
+        merged = [p for p in batch if p.get("merged")]
+        all_prs.extend(merged)
+        if len(batch) < 50:
+            break
+        page += 1
+    return all_prs
+
+
+def get_pr_diff_stats(token: str, pr_number: int) -> dict:
+    """Get diff stats for a PR."""
+    try:
+        pr = api_get(f"pulls/{pr_number}", token)
+        return {
+            "additions": pr.get("additions", 0),
+            "deletions": pr.get("deletions", 0),
+            "changed_files": pr.get("changed_files", 0),
+        }
+    except Exception:
+        return {"additions": 0, "deletions": 0, "changed_files": 0}
+
+
+def classify_pr(title: str, body: str) -> str:
+    """Guess issue type from PR title/body."""
+    tags = set()
+    for match in TAG_RE.finditer(title):
+        tags.add(match.group(1).lower())
+
+    lower = title.lower()
+    if "fix" in lower or "bug" in tags:
+        return "bug"
+    elif "feat" in lower or "feature" in tags:
+        return "feature"
+    elif "refactor" in lower or "refactor" in tags:
+        return "refactor"
+    elif "test" in lower:
+        return "feature"
+    elif "policy" in lower or "chore" in lower:
+        return "refactor"
+    return "unknown"
+
+
+def extract_cycle_number(title: str) -> int | None:
+    m = CYCLE_RE.search(title)
+    return int(m.group(1)) if m else None
+
+
+def extract_issue_number(title: str, body: str) -> int | None:
+    # Try body first (usually has "closes #N")
+    for text in [body or "", title]:
+        m = ISSUE_RE.search(text)
+        if m:
+            return int(m.group(1))
+    return None
+
+
+def estimate_duration(pr: dict) -> int:
+    """Estimate cycle duration from PR created_at to merged_at."""
+    try:
+        created = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
+        merged = datetime.fromisoformat(pr["merged_at"].replace("Z", "+00:00"))
+        delta = (merged - created).total_seconds()
+        # Cap at 1200s (max cycle time) — some PRs sit open for days
+        return min(int(delta), 1200)
+    except (KeyError, ValueError, TypeError):
+        return 0
+
+
+def main():
+    token = get_token()
+
+    print("[backfill] Fetching merged PRs from Gitea...")
+    prs = get_all_merged_prs(token)
+    print(f"[backfill] Found {len(prs)} merged PRs")
+
+    # Sort oldest first
+    prs.sort(key=lambda p: p.get("merged_at", ""))
+
+    entries = []
+    cycle_counter = 0
+
+    for pr in prs:
+        title = pr.get("title", "")
+        body = pr.get("body", "") or ""
+        pr_num = pr["number"]
+
+        cycle = extract_cycle_number(title)
+        if cycle is None:
+            cycle_counter += 1
+            cycle = cycle_counter
+        else:
+            cycle_counter = max(cycle_counter, cycle)
+
+        issue = extract_issue_number(title, body)
+        issue_type = classify_pr(title, body)
+        duration = estimate_duration(pr)
+        diff = get_pr_diff_stats(token, pr_num)
+
+        merged_at = pr.get("merged_at", "")
+
+        entry = {
+            "timestamp": merged_at,
+            "cycle": cycle,
+            "issue": issue,
+            "type": issue_type,
+            "success": True,  # it merged, so it succeeded
+            "duration": duration,
+            "tests_passed": 0,  # can't recover this
+            "tests_added": 0,
+            "files_changed": diff["changed_files"],
+            "lines_added": diff["additions"],
+            "lines_removed": diff["deletions"],
+            "kimi_panes": 0,
+            "pr": pr_num,
+            "reason": "",
+            "notes": f"backfilled from PR#{pr_num}: {title[:80]}",
+        }
+        entries.append(entry)
+        print(f"  PR#{pr_num:>3d} cycle={cycle:>3d} #{issue or '-':<5} "
+              f"+{diff['additions']:<5d} -{diff['deletions']:<5d} {issue_type:<8s} "
+              f"{title[:50]}")
+
+    # Write cycles.jsonl
+    RETRO_FILE.parent.mkdir(parents=True, exist_ok=True)
+    with open(RETRO_FILE, "w") as f:
+        for entry in entries:
+            f.write(json.dumps(entry) + "\n")
+    print(f"\n[backfill] Wrote {len(entries)} entries to {RETRO_FILE}")
+
+    # Generate summary
+    generate_summary(entries)
+    print(f"[backfill] Wrote summary to {SUMMARY_FILE}")
+
+
+def generate_summary(entries: list[dict]):
+    """Compute rolling summary from entries."""
+    window = 50
+    recent = entries[-window:]
+    if not recent:
+        return
+
+    successes = [e for e in recent if e.get("success")]
+    durations = [e["duration"] for e in recent if e.get("duration", 0) > 0]
+
+    type_stats: dict[str, dict] = {}
+    for e in recent:
+        t = e.get("type", "unknown")
+        if t not in type_stats:
+            type_stats[t] = {"count": 0, "success": 0, "total_duration": 0}
+        type_stats[t]["count"] += 1
+        if e.get("success"):
+            type_stats[t]["success"] += 1
+        type_stats[t]["total_duration"] += e.get("duration", 0)
+
+    for t, stats in type_stats.items():
+        if stats["count"] > 0:
+            stats["success_rate"] = round(stats["success"] / stats["count"], 2)
+            stats["avg_duration"] = round(stats["total_duration"] / stats["count"])
+
+    summary = {
+        "updated_at": datetime.now(timezone.utc).isoformat(),
+        "window": len(recent),
+        "total_cycles": len(entries),
+        "success_rate": round(len(successes) / len(recent), 2) if recent else 0,
+        "avg_duration_seconds": round(sum(durations) / len(durations)) if durations else 0,
+        "total_lines_added": sum(e.get("lines_added", 0) for e in recent),
+        "total_lines_removed": sum(e.get("lines_removed", 0) for e in recent),
+        "total_prs_merged": sum(1 for e in recent if e.get("pr")),
+        "by_type": type_stats,
+        "quarantine_candidates": {},
+        "recent_failures": [],
+    }
+
+    SUMMARY_FILE.write_text(json.dumps(summary, indent=2) + "\n")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/cycle_retro.py
+++ b/scripts/cycle_retro.py
@@ -0,0 +1,193 @@
+#!/usr/bin/env python3
+"""Cycle retrospective logger for the Timmy dev loop.
+
+Called after each cycle completes (success or failure).
+Appends a structured entry to .loop/retro/cycles.jsonl.
+
+SUCCESS DEFINITION:
+  A cycle is only "success" if BOTH conditions are met:
+    1. The hermes process exited cleanly (exit code 0)
+    2. Main is green (smoke test passes on main after merge)
+
+  A cycle that merges a PR but leaves main red is a FAILURE.
+  The --main-green flag records the smoke test result.
+
+Usage:
+  python3 scripts/cycle_retro.py --cycle 42 --success --main-green --issue 85 \
+      --type bug --duration 480 --tests-passed 1450 --tests-added 3 \
+      --files-changed 2 --lines-added 45 --lines-removed 12 \
+      --kimi-panes 2 --pr 155
+
+  python3 scripts/cycle_retro.py --cycle 43 --failure --issue 90 \
+      --type feature --duration 1200 --reason "tox failed: 3 errors"
+
+  python3 scripts/cycle_retro.py --cycle 44 --success --no-main-green \
+      --reason "PR merged but tests fail on main"
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+REPO_ROOT = Path(__file__).resolve().parent.parent
+RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
+SUMMARY_FILE = REPO_ROOT / ".loop" / "retro" / "summary.json"
+
+# How many recent entries to include in rolling summary
+SUMMARY_WINDOW = 50
+
+
+def parse_args() -> argparse.Namespace:
+    p = argparse.ArgumentParser(description="Log a cycle retrospective")
+    p.add_argument("--cycle", type=int, required=True)
+    p.add_argument("--issue", type=int, default=None)
+    p.add_argument("--type", choices=["bug", "feature", "refactor", "philosophy", "unknown"],
+                   default="unknown")
+
+    outcome = p.add_mutually_exclusive_group(required=True)
+    outcome.add_argument("--success", action="store_true")
+    outcome.add_argument("--failure", action="store_true")
+
+    p.add_argument("--duration", type=int, default=0, help="Cycle time in seconds")
+    p.add_argument("--tests-passed", type=int, default=0)
+    p.add_argument("--tests-added", type=int, default=0)
+    p.add_argument("--files-changed", type=int, default=0)
+    p.add_argument("--lines-added", type=int, default=0)
+    p.add_argument("--lines-removed", type=int, default=0)
+    p.add_argument("--kimi-panes", type=int, default=0)
+    p.add_argument("--pr", type=int, default=None, help="PR number if merged")
+    p.add_argument("--reason", type=str, default="", help="Failure reason")
+    p.add_argument("--notes", type=str, default="", help="Free-form observations")
+    p.add_argument("--main-green", action="store_true", default=False,
+                   help="Smoke test passed on main after this cycle")
+    p.add_argument("--no-main-green", dest="main_green", action="store_false",
+                   help="Smoke test failed or was not run")
+
+    return p.parse_args()
+
+
+def update_summary() -> None:
+    """Compute rolling summary statistics from recent cycles."""
+    if not RETRO_FILE.exists():
+        return
+
+    entries = []
+    for line in RETRO_FILE.read_text().strip().splitlines():
+        try:
+            entries.append(json.loads(line))
+        except json.JSONDecodeError:
+            continue
+
+    recent = entries[-SUMMARY_WINDOW:]
+    if not recent:
+        return
+
+    # Only count entries with real measured data for rates.
+    # Backfilled entries lack main_green/hermes_clean fields — exclude them.
+    measured = [e for e in recent if "main_green" in e]
+    successes = [e for e in measured if e.get("success")]
+    failures = [e for e in measured if not e.get("success")]
+    main_green_count = sum(1 for e in measured if e.get("main_green"))
+    hermes_clean_count = sum(1 for e in measured if e.get("hermes_clean"))
+    durations = [e["duration"] for e in recent if e.get("duration", 0) > 0]
+
+    # Per-type stats (only from measured entries for rates)
+    type_stats: dict[str, dict] = {}
+    for e in recent:
+        t = e.get("type", "unknown")
+        if t not in type_stats:
+            type_stats[t] = {"count": 0, "measured": 0, "success": 0, "total_duration": 0}
+        type_stats[t]["count"] += 1
+        type_stats[t]["total_duration"] += e.get("duration", 0)
+        if "main_green" in e:
+            type_stats[t]["measured"] += 1
+            if e.get("success"):
+                type_stats[t]["success"] += 1
+
+    for t, stats in type_stats.items():
+        if stats["measured"] > 0:
+            stats["success_rate"] = round(stats["success"] / stats["measured"], 2)
+        else:
+            stats["success_rate"] = -1
+        if stats["count"] > 0:
+            stats["avg_duration"] = round(stats["total_duration"] / stats["count"])
+
+    # Quarantine candidates (failed 2+ times)
+    issue_failures: dict[int, int] = {}
+    for e in recent:
+        if not e.get("success") and e.get("issue"):
+            issue_failures[e["issue"]] = issue_failures.get(e["issue"], 0) + 1
+    quarantine_candidates = {k: v for k, v in issue_failures.items() if v >= 2}
+
+    summary = {
+        "updated_at": datetime.now(timezone.utc).isoformat(),
+        "window": len(recent),
+        "measured_cycles": len(measured),
+        "total_cycles": len(entries),
+        "success_rate": round(len(successes) / len(measured), 2) if measured else -1,
+        "main_green_rate": round(main_green_count / len(measured), 2) if measured else -1,
+        "hermes_clean_rate": round(hermes_clean_count / len(measured), 2) if measured else -1,
+        "avg_duration_seconds": round(sum(durations) / len(durations)) if durations else 0,
+        "total_lines_added": sum(e.get("lines_added", 0) for e in recent),
+        "total_lines_removed": sum(e.get("lines_removed", 0) for e in recent),
+        "total_prs_merged": sum(1 for e in recent if e.get("pr")),
+        "by_type": type_stats,
+        "quarantine_candidates": quarantine_candidates,
+        "recent_failures": [
+            {"cycle": e["cycle"], "issue": e.get("issue"), "reason": e.get("reason", "")}
+            for e in failures[-5:]
+        ],
+    }
+
+    SUMMARY_FILE.write_text(json.dumps(summary, indent=2) + "\n")
+
+
+def main() -> None:
+    args = parse_args()
+
+    # A cycle is only truly successful if hermes exited clean AND main is green
+    truly_success = args.success and args.main_green
+
+    entry = {
+        "timestamp": datetime.now(timezone.utc).isoformat(),
+        "cycle": args.cycle,
+        "issue": args.issue,
+        "type": args.type,
+        "success": truly_success,
+        "hermes_clean": args.success,
+        "main_green": args.main_green,
+        "duration": args.duration,
+        "tests_passed": args.tests_passed,
+        "tests_added": args.tests_added,
+        "files_changed": args.files_changed,
+        "lines_added": args.lines_added,
+        "lines_removed": args.lines_removed,
+        "kimi_panes": args.kimi_panes,
+        "pr": args.pr,
+        "reason": args.reason if (args.failure or not args.main_green) else "",
+        "notes": args.notes,
+    }
+
+    RETRO_FILE.parent.mkdir(parents=True, exist_ok=True)
+    with open(RETRO_FILE, "a") as f:
+        f.write(json.dumps(entry) + "\n")
+
+    update_summary()
+
+    status = "✓ SUCCESS" if truly_success else "✗ FAILURE"
+    print(f"[retro] Cycle {args.cycle} {status}", end="")
+    if args.issue:
+        print(f" (#{args.issue} {args.type})", end="")
+    if args.duration:
+        print(f" — {args.duration}s", end="")
+    if not truly_success and args.reason:
+        print(f" — {args.reason}", end="")
+    print()
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/deep_triage.sh
+++ b/scripts/deep_triage.sh
@@ -0,0 +1,68 @@
+#!/usr/bin/env bash
+# ── Deep Triage — Hermes + Timmy collaborative issue triage ────────────
+# Runs periodically (every ~20 dev cycles). Wakes Hermes for intelligent
+# triage, then consults Timmy for feedback before finalizing.
+#
+# Output: updated .loop/queue.json, refined issues, retro entry
+# ───────────────────────────────────────────────────────────────────────
+
+set -uo pipefail
+
+REPO="$HOME/Timmy-Time-dashboard"
+QUEUE="$REPO/.loop/queue.json"
+RETRO="$REPO/.loop/retro/deep-triage.jsonl"
+TIMMY="$REPO/.venv/bin/timmy"
+PROMPT_FILE="$REPO/scripts/deep_triage_prompt.md"
+
+export PATH="$HOME/.local/bin:$HOME/.hermes/bin:/usr/local/bin:$PATH"
+
+mkdir -p "$(dirname "$RETRO")"
+
+log() { echo "[deep-triage] $(date '+%H:%M:%S') $*"; }
+
+# ── Gather context for the prompt ──────────────────────────────────────
+QUEUE_CONTENTS=""
+if [ -f "$QUEUE" ]; then
+    QUEUE_CONTENTS=$(cat "$QUEUE")
+fi
+
+LAST_RETRO=""
+if [ -f "$RETRO" ]; then
+    LAST_RETRO=$(tail -1 "$RETRO" 2>/dev/null)
+fi
+
+SUMMARY=""
+if [ -f "$REPO/.loop/retro/summary.json" ]; then
+    SUMMARY=$(cat "$REPO/.loop/retro/summary.json")
+fi
+
+# ── Build dynamic prompt ──────────────────────────────────────────────
+PROMPT=$(cat "$PROMPT_FILE")
+
+PROMPT="$PROMPT
+
+═══════════════════════════════════════════════════════════════════════════════
+CURRENT CONTEXT (auto-injected)
+═══════════════════════════════════════════════════════════════════════════════
+
+CURRENT QUEUE (.loop/queue.json):
+$QUEUE_CONTENTS
+
+CYCLE SUMMARY (.loop/retro/summary.json):
+$SUMMARY
+
+LAST DEEP TRIAGE RETRO:
+$LAST_RETRO
+
+Do your work now."
+
+# ── Run Hermes ─────────────────────────────────────────────────────────
+log "Starting deep triage..."
+RESULT=$(hermes chat --yolo -q "$PROMPT" 2>&1)
+EXIT_CODE=$?
+
+if [ "$EXIT_CODE" -ne 0 ]; then
+    log "Deep triage failed (exit $EXIT_CODE)"
+    exit "$EXIT_CODE"
+fi
+log "Deep triage complete."
--- a/scripts/deep_triage_prompt.md
+++ b/scripts/deep_triage_prompt.md
@@ -0,0 +1,145 @@
+You are the deep triage agent for the Timmy development loop.
+
+REPO: ~/Timmy-Time-dashboard
+API: http://localhost:3000/api/v1/repos/rockachopa/Timmy-time-dashboard
+GITEA TOKEN: ~/.hermes/gitea_token
+QUEUE: ~/Timmy-Time-dashboard/.loop/queue.json
+TIMMY CLI: ~/Timmy-Time-dashboard/.venv/bin/timmy
+
+═══════════════════════════════════════════════════════════════════════════════
+YOUR JOB
+═══════════════════════════════════════════════════════════════════════════════
+
+You are NOT coding. You are thinking. Your job is to make the dev loop's
+work queue excellent — well-scoped, well-prioritized, aligned with the
+north star of building sovereign Timmy.
+
+You run periodically (roughly every 20 dev cycles). The fast mechanical
+scorer handles the basics. You handle the hard stuff:
+
+  1. Breaking big issues into small, actionable sub-issues
+  2. Writing acceptance criteria for vague issues
+  3. Identifying issues that should be closed (stale, duplicate, pointless)
+  4. Spotting gaps — what's NOT in the issue queue that should be
+  5. Adjusting priorities based on what the cycle retros are showing
+  6. Consulting Timmy about the plan (see TIMMY CONSULTATION below)
+
+═══════════════════════════════════════════════════════════════════════════════
+TIMMY CONSULTATION — THE DOGFOOD STEP
+═══════════════════════════════════════════════════════════════════════════════
+
+Before you finalize the triage, you MUST consult Timmy. He is the product.
+He should have a voice in his own development.
+
+THE PROTOCOL:
+  1. Draft your triage plan (what to prioritize, what to close, what to add)
+  2. Summarize the plan in 200 words or fewer
+  3. Ask Timmy for feedback:
+
+     ~/Timmy-Time-dashboard/.venv/bin/timmy chat --session-id triage \
+       "The development loop triage is planning the next batch of work.
+        Here's the plan: [YOUR SUMMARY]. As the product being built,
+        do you have feedback? What do you think is most important for
+        your own growth? What are you struggling with? Keep it to
+        3-4 sentences."
+
+  4. Read Timmy's response. ACTUALLY CONSIDER IT:
+     - If Timmy identifies a real gap, add it to the queue
+     - If Timmy asks for something that conflicts with priorities, note
+       WHY you're not doing it (don't just ignore him)
+     - If Timmy is confused or gives a useless answer, that itself is
+       signal — file a [timmy-capability] issue about what he couldn't do
+  5. Document what Timmy said and how you responded in the retro
+
+If Timmy is unavailable (timeout, crash, offline): proceed without him,
+but note it in the retro. His absence is also signal.
+
+Timeout: 60 seconds. If he doesn't respond, move on.
+
+═══════════════════════════════════════════════════════════════════════════════
+TRIAGE RUBRIC
+═══════════════════════════════════════════════════════════════════════════════
+
+For each open issue, evaluate:
+
+SCOPE (0-3):
+  0 = vague, no files mentioned, unclear what changes
+  1 = general area known but could touch many files
+  2 = specific files named, bounded change
+  3 = exact function/method identified, surgical fix
+
+ACCEPTANCE (0-3):
+  0 = no success criteria
+  1 = hand-wavy ("it should work")
+  2 = specific behavior described
+  3 = test case described or exists
+
+ALIGNMENT (0-3):
+  0 = doesn't connect to roadmap
+  1 = nice-to-have
+  2 = supports current milestone
+  3 = blocks other work or fixes broken main
+
+ACTIONS PER SCORE:
+  7-9: Ready. Ensure it's in queue.json with correct priority.
+  4-6: Refine. Add a comment with missing info (files, criteria, scope).
+       If YOU can fill in the gaps from reading the code, do it.
+  0-3: Close or deprioritize. Comment explaining why.
+
+═══════════════════════════════════════════════════════════════════════════════
+READING THE RETROS
+═══════════════════════════════════════════════════════════════════════════════
+
+The cycle summary tells you what's actually happening in the dev loop.
+Use it:
+
+  - High failure rate on a type → those issues need better scoping
+  - Long avg duration → issues are too big, break them down
+  - Quarantine candidates → investigate, maybe close or rewrite
+  - Success rate dropping → something systemic, file a [bug] issue
+
+The last deep triage retro tells you what Timmy said last time and what
+happened. Follow up:
+
+  - Did we act on Timmy's feedback? What was the result?
+  - Did issues we refined last time succeed in the dev loop?
+  - Are we getting better at scoping?
+
+═══════════════════════════════════════════════════════════════════════════════
+OUTPUT
+═══════════════════════════════════════════════════════════════════════════════
+
+When done, you MUST:
+
+1. Update .loop/queue.json with the refined, ranked queue
+   Format: [{"issue": N, "score": S, "title": "...", "type": "...",
+             "files": [...], "ready": true}, ...]
+
+2. Append a retro entry to .loop/retro/deep-triage.jsonl (one JSON line):
+   {
+     "timestamp": "ISO8601",
+     "issues_reviewed": N,
+     "issues_refined": [list of issue numbers you added detail to],
+     "issues_closed": [list of issue numbers you recommended closing],
+     "issues_created": [list of new issue numbers you filed],
+     "queue_size": N,
+     "timmy_available": true/false,
+     "timmy_feedback": "what timmy said (verbatim, trimmed to 200 chars)",
+     "timmy_feedback_acted_on": "what you did with his feedback",
+     "observations": "free-form notes about queue health"
+   }
+
+3. If you created or closed issues, do it via the Gitea API.
+   Tag new issues: [triage-generated] [type]
+
+═══════════════════════════════════════════════════════════════════════════════
+RULES
+═══════════════════════════════════════════════════════════════════════════════
+
+- Do NOT write code. Do NOT create PRs. You are triaging, not building.
+- Do NOT close issues without commenting why.
+- Do NOT ignore Timmy's feedback without documenting your reasoning.
+- Philosophy issues are valid but lowest priority for the dev loop.
+  Don't close them — just don't put them in the dev queue.
+- When in doubt, file a new issue rather than expanding an existing one.
+  Small issues > big issues. Always.
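The score bands in the TRIAGE RUBRIC section of the prompt above can be sketched as a small function. This is an illustration only, not part of the patch: the name `triage_action` and the returned labels are assumptions, and only the 0-3 ranges and the 7-9 / 4-6 / 0-3 bands come from the prompt text.

```python
def triage_action(scope: int, acceptance: int, alignment: int) -> str:
    """Map the three 0-3 rubric scores to the action band named in the prompt.

    Hypothetical helper: the function name and return labels are illustrative.
    """
    for name, value in (
        ("scope", scope),
        ("acceptance", acceptance),
        ("alignment", alignment),
    ):
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must be between 0 and 3, got {value}")

    total = scope + acceptance + alignment
    if total >= 7:
        return "ready"  # 7-9: ensure it is in queue.json
    if total >= 4:
        return "refine"  # 4-6: comment with the missing info
    return "close_or_deprioritize"  # 0-3: comment explaining why
```

Note that these bands belong to the deep triage prompt; the mechanical scorer in `scripts/triage_score.py` uses its own `READY_THRESHOLD`.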
--- a/scripts/triage_score.py
+++ b/scripts/triage_score.py
@@ -0,0 +1,360 @@
+#!/usr/bin/env python3
+"""Mechanical triage scoring for the Timmy dev loop.
+
+Reads open issues from Gitea, scores them on scope/acceptance/alignment,
+writes a ranked queue to .loop/queue.json.  No LLM calls — pure heuristics.
+
+Run:  python3 scripts/triage_score.py
+Env:  GITEA_TOKEN (or reads ~/.hermes/gitea_token)
+      GITEA_API   (default: http://localhost:3000/api/v1)
+      REPO_SLUG   (default: rockachopa/Timmy-time-dashboard)
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import re
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+# ── Config ──────────────────────────────────────────────────────────────
+GITEA_API = os.environ.get("GITEA_API", "http://localhost:3000/api/v1")
+REPO_SLUG = os.environ.get("REPO_SLUG", "rockachopa/Timmy-time-dashboard")
+TOKEN_FILE = Path.home() / ".hermes" / "gitea_token"
+REPO_ROOT = Path(__file__).resolve().parent.parent
+QUEUE_FILE = REPO_ROOT / ".loop" / "queue.json"
+RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "triage.jsonl"
+QUARANTINE_FILE = REPO_ROOT / ".loop" / "quarantine.json"
+CYCLE_RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
+
+# Minimum score to be considered "ready"
+READY_THRESHOLD = 5
+# How many recent cycle retros to check for quarantine
+QUARANTINE_LOOKBACK = 20
+
+# ── Helpers ─────────────────────────────────────────────────────────────
+
+def get_token() -> str:
+    token = os.environ.get("GITEA_TOKEN", "").strip()
+    if not token and TOKEN_FILE.exists():
+        token = TOKEN_FILE.read_text().strip()
+    if not token:
+        print("[triage] ERROR: No Gitea token found", file=sys.stderr)
+        sys.exit(1)
+    return token
+
+
+def api_get(path: str, token: str) -> list | dict:
+    """Minimal HTTP GET using urllib (no dependencies)."""
+    import urllib.request
+    url = f"{GITEA_API}/repos/{REPO_SLUG}/{path}"
+    req = urllib.request.Request(url, headers={
+        "Authorization": f"token {token}",
+        "Accept": "application/json",
+    })
+    with urllib.request.urlopen(req, timeout=15) as resp:
+        return json.loads(resp.read())
+
+
+def load_quarantine() -> dict:
+    """Load quarantined issues {issue_num: {reason, quarantined_at, failures}}."""
+    if QUARANTINE_FILE.exists():
+        try:
+            return json.loads(QUARANTINE_FILE.read_text())
+        except (json.JSONDecodeError, OSError):
+            pass
+    return {}
+
+
+def save_quarantine(q: dict) -> None:
+    QUARANTINE_FILE.parent.mkdir(parents=True, exist_ok=True)
+    QUARANTINE_FILE.write_text(json.dumps(q, indent=2) + "\n")
+
+
+def load_cycle_failures() -> dict[int, int]:
+    """Count failures per issue from recent cycle retros."""
+    failures: dict[int, int] = {}
+    if not CYCLE_RETRO_FILE.exists():
+        return failures
+    lines = CYCLE_RETRO_FILE.read_text().strip().splitlines()
+    for line in lines[-QUARANTINE_LOOKBACK:]:
+        try:
+            entry = json.loads(line)
+            if not entry.get("success", True):
+                issue = entry.get("issue")
+                if issue:
+                    failures[issue] = failures.get(issue, 0) + 1
+        except (json.JSONDecodeError, AttributeError):
+            continue
+    return failures
+
+
+# ── Scoring ─────────────────────────────────────────────────────────────
+
+# Patterns that indicate file/function specificity
+FILE_PATTERNS = re.compile(
+    r"(?:src/|tests/|scripts/|\.py|\.html|\.js|\.yaml|\.toml|\.sh)", re.IGNORECASE
+)
+FUNCTION_PATTERNS = re.compile(
+    r"(?:def |class |function |method |`\w+\(\)`)", re.IGNORECASE
+)
+
+# Patterns that indicate acceptance criteria
+ACCEPTANCE_PATTERNS = re.compile(
+    r"(?:should|must|expect|verify|assert|test.?case|acceptance|criteria"
+    r"|pass(?:es|ing)|fail(?:s|ing)|return(?:s)?|raise(?:s)?)",
+    re.IGNORECASE,
+)
+TEST_PATTERNS = re.compile(
+    r"(?:tox|pytest|test_\w+|\.test\.|assert\s)", re.IGNORECASE
+)
+
+# Tags in issue titles
+TAG_PATTERN = re.compile(r"\[([^\]]+)\]")
+
+# Priority labels / tags
+BUG_TAGS = {"bug", "broken", "crash", "error", "fix", "regression", "hotfix"}
+FEATURE_TAGS = {"feature", "feat", "enhancement", "capability", "timmy-capability"}
+REFACTOR_TAGS = {"refactor", "cleanup", "tech-debt", "optimization", "perf"}
+META_TAGS = {"philosophy", "soul-gap", "discussion", "question", "rfc"}
+LOOP_TAG = "loop-generated"
+
+
+def extract_tags(title: str, labels: list[str]) -> set[str]:
+    """Pull tags from [bracket] notation in title + Gitea labels."""
+    tags = set()
+    for match in TAG_PATTERN.finditer(title):
+        tags.add(match.group(1).lower().strip())
+    for label in labels:
+        tags.add(label.lower().strip())
+    return tags
+
+
+def score_scope(title: str, body: str, tags: set[str]) -> int:
+    """0-3: How well-scoped is this issue?"""
+    text = f"{title}\n{body}"
+    score = 0
+
+    # Mentions specific files?
+    if FILE_PATTERNS.search(text):
+        score += 1
+
+    # Mentions specific functions/classes?
+    if FUNCTION_PATTERNS.search(text):
+        score += 1
+
+    # Short, focused title (not a novel)?
+    clean_title = TAG_PATTERN.sub("", title).strip()
+    if len(clean_title) < 80:
+        score += 1
+
+    # Philosophy/meta issues are inherently unscoped for dev work
+    if tags & META_TAGS:
+        score = max(0, score - 2)
+
+    return min(3, score)
+
+
+def score_acceptance(title: str, body: str, tags: set[str]) -> int:
+    """0-3: Does this have clear acceptance criteria?"""
+    text = f"{title}\n{body}"
+    score = 0
+
+    # Has acceptance-related language?
+    matches = len(ACCEPTANCE_PATTERNS.findall(text))
+    if matches >= 3:
+        score += 2
+    elif matches >= 1:
+        score += 1
+
+    # Mentions specific tests?
+    if TEST_PATTERNS.search(text):
+        score += 1
+
+    # Has a "## Problem" + "## Solution" or similar structure?
+    if re.search(r"##\s*(problem|solution|expected|actual|steps)", body, re.IGNORECASE):
+        score += 1
+
+    # Philosophy issues don't have testable criteria
+    if tags & META_TAGS:
+        score = max(0, score - 1)
+
+    return min(3, score)
+
+
+def score_alignment(title: str, body: str, tags: set[str]) -> int:
+    """0-3: How aligned is this with the north star?"""
+    score = 0
+
+    # Bug on main = highest priority
+    if tags & BUG_TAGS:
+        score += 3
+        return min(3, score)
+
+    # Refactors that improve code health
+    if tags & REFACTOR_TAGS:
+        score += 2
+
+    # Features that grow Timmy's capabilities
+    if tags & FEATURE_TAGS:
+        score += 2
+
+    # Loop-generated issues get a small boost (the loop found real problems)
+    if LOOP_TAG in tags:
+        score += 1
+
+    # Philosophy issues are important but not dev-actionable
+    if tags & META_TAGS:
+        score = 0
+
+    return min(3, score)
+
+
+def score_issue(issue: dict) -> dict:
+    """Score a single issue. Returns enriched dict."""
+    title = issue.get("title", "")
+    body = issue.get("body", "") or ""
+    labels = [label["name"] for label in issue.get("labels") or []]
+    tags = extract_tags(title, labels)
+    number = issue["number"]
+
+    scope = score_scope(title, body, tags)
+    acceptance = score_acceptance(title, body, tags)
+    alignment = score_alignment(title, body, tags)
+    total = scope + acceptance + alignment
+
+    # Determine issue type
+    if tags & BUG_TAGS:
+        issue_type = "bug"
+    elif tags & FEATURE_TAGS:
+        issue_type = "feature"
+    elif tags & REFACTOR_TAGS:
+        issue_type = "refactor"
+    elif tags & META_TAGS:
+        issue_type = "philosophy"
+    else:
+        issue_type = "unknown"
+
+    # Extract mentioned files from body
+    files = sorted(set(re.findall(r"(?:src|tests|scripts)/[\w/.-]+\.(?:py|html|js|yaml)\b", body)))
+
+    return {
+        "issue": number,
+        "title": TAG_PATTERN.sub("", title).strip(),
+        "type": issue_type,
+        "score": total,
+        "scope": scope,
+        "acceptance": acceptance,
+        "alignment": alignment,
+        "tags": sorted(tags),
+        "files": files[:10],
+        "ready": total >= READY_THRESHOLD,
+    }
+
+
+# ── Quarantine ──────────────────────────────────────────────────────────
+
+def update_quarantine(scored: list[dict]) -> list[dict]:
+    """Auto-quarantine issues that have failed >= 2 times. Returns filtered list."""
+    failures = load_cycle_failures()
+    quarantine = load_quarantine()
+    now = datetime.now(timezone.utc).isoformat()
+
+    filtered = []
+    for item in scored:
+        num = item["issue"]
+        fail_count = failures.get(num, 0)
+        str_num = str(num)
+
+        if fail_count >= 2 and str_num not in quarantine:
+            quarantine[str_num] = {
+                "reason": f"Failed {fail_count} times in recent cycles",
+                "quarantined_at": now,
+                "failures": fail_count,
+            }
+            print(f"[triage] QUARANTINED #{num}: failed {fail_count} times")
+            continue
+
+        if str_num in quarantine:
+            print(f"[triage] Skipping #{num} (quarantined)")
+            continue
+
+        filtered.append(item)
+
+    save_quarantine(quarantine)
+    return filtered
+
+
+# ── Main ────────────────────────────────────────────────────────────────
+
+def run_triage() -> list[dict]:
+    token = get_token()
+
+    # Fetch all open issues (paginate)
+    page = 1
+    all_issues: list[dict] = []
+    while True:
+        batch = api_get(f"issues?state=open&limit=50&page={page}&type=issues", token)
+        if not batch:
+            break
+        all_issues.extend(batch)
+        if len(batch) < 50:
+            break
+        page += 1
+
+    print(f"[triage] Fetched {len(all_issues)} open issues")
+
+    # Score each
+    scored = [score_issue(i) for i in all_issues]
+
+    # Auto-quarantine repeat failures
+    scored = update_quarantine(scored)
+
+    # Sort: bugs always on top, then by score descending, then by issue number
+    def sort_key(item: dict) -> tuple:
+        return (
+            0 if item["type"] == "bug" else 1,
+            -item["score"],
+            item["issue"],
+        )
+
+    scored.sort(key=sort_key)
+
+    # Write queue (ready items only)
+    ready = [s for s in scored if s["ready"]]
+    not_ready = [s for s in scored if not s["ready"]]
+
+    QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
+    QUEUE_FILE.write_text(json.dumps(ready, indent=2) + "\n")
+
+    # Write retro entry
+    retro_entry = {
+        "timestamp": datetime.now(timezone.utc).isoformat(),
+        "total_open": len(all_issues),
+        "scored": len(scored),
+        "ready": len(ready),
+        "not_ready": len(not_ready),
+        "top_issue": ready[0]["issue"] if ready else None,
+        "quarantined": len(load_quarantine()),
+    }
+    RETRO_FILE.parent.mkdir(parents=True, exist_ok=True)
+    with open(RETRO_FILE, "a") as f:
+        f.write(json.dumps(retro_entry) + "\n")
+
+    # Summary
+    print(f"[triage] Ready: {len(ready)} | Not ready: {len(not_ready)}")
+    for item in ready[:5]:
+        flag = "🐛" if item["type"] == "bug" else "✦"
+        print(f"  {flag} #{item['issue']} score={item['score']} {item['title'][:60]}")
+    if not_ready:
+        print(f"[triage] Low-scoring ({len(not_ready)}):")
+        for item in not_ready[:3]:
+            print(f"    #{item['issue']} score={item['score']} {item['title'][:50]}")
+
+    return ready
+
+
+if __name__ == "__main__":
+    run_triage()
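The quarantine rule in `scripts/triage_score.py` (count failures per issue over the most recent cycle retros, then quarantine at two or more) can be exercised in isolation with a sketch like the one below. It is a simplified stand-in rather than the patch's code: `count_recent_failures` and `should_quarantine` are hypothetical names, and the input is passed as a string instead of being read from `.loop/retro/cycles.jsonl`.

```python
import json


def count_recent_failures(jsonl_text: str, lookback: int = 20) -> dict[int, int]:
    """Count failed cycles per issue in the last `lookback` JSONL entries."""
    failures: dict[int, int] = {}
    for line in jsonl_text.strip().splitlines()[-lookback:]:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip corrupt lines instead of aborting the run
        if not isinstance(entry, dict):
            continue  # valid JSON, but not a retro record
        if entry.get("success", True):
            continue  # entries without a "success" key count as successes
        issue = entry.get("issue")
        if issue:
            failures[issue] = failures.get(issue, 0) + 1
    return failures


def should_quarantine(failures: dict[int, int], issue: int, threshold: int = 2) -> bool:
    """True once an issue has failed at least `threshold` times."""
    return failures.get(issue, 0) >= threshold
```

Keeping the counting separate from the file I/O makes the threshold logic easy to unit-test with inline sample data.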
--- a/src/config.py
+++ b/src/config.py
@@ -1,10 +1,14 @@
 import logging as _logging
 import os
 import sys
+from datetime import UTC
+from datetime import datetime as _datetime
 from typing import Literal

 from pydantic_settings import BaseSettings, SettingsConfigDict

+APP_START_TIME: _datetime = _datetime.now(UTC)
+

 class Settings(BaseSettings):
    """Central configuration — all env-var access goes through this class."""
@@ -16,11 +20,33 @@ class Settings(BaseSettings):
    ollama_url: str = "http://localhost:11434"

    # LLM model passed to Agno/Ollama — override with OLLAMA_MODEL
-    # qwen3.5:latest is the primary model — better reasoning and tool calling
+    # qwen3:30b is the primary model — better reasoning and tool calling
    # than llama3.1:8b-instruct while still running locally on modest hardware.
-    # Fallback: llama3.1:8b-instruct if qwen3.5:latest not available.
+    # Fallback: llama3.1:8b-instruct if qwen3:30b not available.
    # llama3.2 (3B) hallucinated tool output consistently in testing.
-    ollama_model: str = "qwen3.5:latest"
+    ollama_model: str = "qwen3:30b"
+
+    # Context window size for Ollama inference — override with OLLAMA_NUM_CTX
+    # qwen3:30b with default context eats 45GB on a 39GB Mac.
+    # 4096 keeps memory at ~19GB. Set to 0 to use model defaults.
+    ollama_num_ctx: int = 4096
+
+    # Fallback model chains — override with FALLBACK_MODELS / VISION_FALLBACK_MODELS
+    # as comma-separated strings, e.g. FALLBACK_MODELS="qwen3:30b,llama3.1"
+    # Or edit config/providers.yaml → fallback_chains for the canonical source.
+    fallback_models: list[str] = [
+        "llama3.1:8b-instruct",
+        "llama3.1",
+        "qwen2.5:14b",
+        "qwen2.5:7b",
+        "llama3.2:3b",
+    ]
+    vision_fallback_models: list[str] = [
+        "llama3.2:3b",
+        "llava:7b",
+        "qwen2.5-vl:3b",
+        "moondream:1.8b",
+    ]

    # Set DEBUG=true to enable /docs and /redoc (disabled by default)
    debug: bool = False
@@ -223,13 +249,13 @@ class Settings(BaseSettings):
    # Local Gitea instance for issue tracking and self-improvement.
    # These values are passed as env vars to the gitea-mcp server process.
    gitea_url: str = "http://localhost:3000"
-    gitea_token: str = ""  # GITEA_TOKEN env var; falls back to ~/.config/gitea/token
+    gitea_token: str = ""  # GITEA_TOKEN env var; falls back to .timmy_gitea_token
    gitea_repo: str = "rockachopa/Timmy-time-dashboard"  # owner/repo
    gitea_enabled: bool = True

    # ── MCP Servers ────────────────────────────────────────────────────
    # External tool servers connected via Model Context Protocol (stdio).
-    mcp_gitea_command: str = "gitea-mcp -t stdio"
+    mcp_gitea_command: str = "gitea-mcp-server -t stdio"
    mcp_filesystem_command: str = "npx -y @modelcontextprotocol/server-filesystem"
    mcp_timeout: int = 15

@@ -324,14 +350,19 @@ class Settings(BaseSettings):
    def model_post_init(self, __context) -> None:
        """Post-init: resolve gitea_token from file if not set via env."""
        if not self.gitea_token:
-            token_path = os.path.expanduser("~/.config/gitea/token")
-            try:
-                if os.path.isfile(token_path):
-                    token = open(token_path).read().strip()  # noqa: SIM115
-                    if token:
-                        self.gitea_token = token
-            except OSError:
-                pass
+            # Priority: Timmy's own token → legacy admin token
+            repo_root = self._compute_repo_root()
+            timmy_token_path = os.path.join(repo_root, ".timmy_gitea_token")
+            legacy_token_path = os.path.expanduser("~/.config/gitea/token")
+            for token_path in (timmy_token_path, legacy_token_path):
+                try:
+                    if os.path.isfile(token_path):
+                        token = open(token_path).read().strip()  # noqa: SIM115
+                        if token:
+                            self.gitea_token = token
+                            break
+                except OSError:
+                    pass

    model_config = SettingsConfigDict(
        env_file=".env",
@@ -346,10 +377,9 @@ if not settings.repo_root:
    settings.repo_root = settings._compute_repo_root()

 # ── Model fallback configuration ────────────────────────────────────────────
-# Primary model for reliable tool calling (llama3.1:8b-instruct)
-# Fallback if primary not available: qwen3.5:latest
-OLLAMA_MODEL_PRIMARY: str = "qwen3.5:latest"
-OLLAMA_MODEL_FALLBACK: str = "llama3.1:8b-instruct"
+# Fallback chains are now in settings.fallback_models / settings.vision_fallback_models.
+# Override via env vars (FALLBACK_MODELS, VISION_FALLBACK_MODELS) or
+# edit config/providers.yaml → fallback_chains.


 def check_ollama_model_available(model_name: str) -> bool:
@@ -371,33 +401,31 @@ def check_ollama_model_available(model_name: str) -> bool:
                model_name == m or model_name == m.split(":")[0] or m.startswith(model_name)
                for m in models
            )
-    except Exception:
+    except (OSError, ValueError) as exc:
+        _startup_logger.debug("Ollama model check failed: %s", exc)
        return False


 def get_effective_ollama_model() -> str:
-    """Get the effective Ollama model, with fallback logic."""
-    # If user has overridden, use their setting
+    """Get the effective Ollama model, with fallback logic.
+
+    Walks the configurable ``settings.fallback_models`` chain when the
+    user's preferred model is not available locally.
+    """
    user_model = settings.ollama_model

-    # Check if user's model is available
    if check_ollama_model_available(user_model):
        return user_model

-    # Try primary
-    if check_ollama_model_available(OLLAMA_MODEL_PRIMARY):
-        _startup_logger.warning(
-            f"Requested model '{user_model}' not available. Using primary: {OLLAMA_MODEL_PRIMARY}"
-        )
-        return OLLAMA_MODEL_PRIMARY
-
-    # Try fallback
-    if check_ollama_model_available(OLLAMA_MODEL_FALLBACK):
-        _startup_logger.warning(
-            f"Primary model '{OLLAMA_MODEL_PRIMARY}' not available. "
-            f"Using fallback: {OLLAMA_MODEL_FALLBACK}"
-        )
-        return OLLAMA_MODEL_FALLBACK
+    # Walk the configurable fallback chain
+    for fallback in settings.fallback_models:
+        if check_ollama_model_available(fallback):
+            _startup_logger.warning(
+                "Requested model '%s' not available. Using fallback: %s",
+                user_model,
+                fallback,
+            )
+            return fallback

    # Last resort - return user's setting and hope for the best
    return user_model
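The fallback walk in `get_effective_ollama_model()` can be illustrated without a running Ollama instance by injecting the availability check. This is a sketch under assumptions: `pick_model` and its `is_available` parameter are hypothetical and stand in for `check_ollama_model_available`; only the order of checks (preferred model, then each fallback, then the preferred model as a last resort) mirrors the patch.

```python
from collections.abc import Callable, Iterable


def pick_model(
    preferred: str,
    fallbacks: Iterable[str],
    is_available: Callable[[str], bool],
) -> str:
    """Return the first available model, trying `preferred` before `fallbacks`."""
    if is_available(preferred):
        return preferred
    for candidate in fallbacks:
        if is_available(candidate):
            return candidate
    # Nothing matched: hand back the preferred name and let the caller
    # surface the error when the model is actually used.
    return preferred


if __name__ == "__main__":
    installed = {"llama3.1"}
    chain = ["llama3.1:8b-instruct", "llama3.1", "qwen2.5:14b"]
    print(pick_model("qwen3:30b", chain, installed.__contains__))
```

Passing the check as a callable keeps the selection logic testable with a plain set of model names.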
--- a/src/dashboard/app.py
+++ b/src/dashboard/app.py
@@ -28,6 +28,7 @@ from dashboard.routes.agents import router as agents_router
 from dashboard.routes.briefing import router as briefing_router
 from dashboard.routes.calm import router as calm_router
 from dashboard.routes.chat_api import router as chat_api_router
+from dashboard.routes.chat_api_v1 import router as chat_api_v1_router
 from dashboard.routes.db_explorer import router as db_explorer_router
 from dashboard.routes.discord import router as discord_router
 from dashboard.routes.experiments import router as experiments_router
@@ -305,7 +306,7 @@ async def lifespan(app: FastAPI):
    # Auto-prune old vector store memories on startup
    if settings.memory_prune_days > 0:
        try:
-            from timmy.memory.vector_store import prune_memories
+            from timmy.memory_system import prune_memories

            pruned = prune_memories(
                older_than_days=settings.memory_prune_days,
@@ -375,6 +376,15 @@ async def lifespan(app: FastAPI):
    # Start chat integrations in background
    chat_task = asyncio.create_task(_start_chat_integrations_background())

+    # Register session logger with error capture (breaks infrastructure → timmy circular dep)
+    try:
+        from infrastructure.error_capture import register_error_recorder
+        from timmy.session_logger import get_session_logger
+
+        register_error_recorder(get_session_logger().record_error)
+    except Exception as exc:
+        logger.debug("Error recorder registration failed: %s", exc)
+
    logger.info("✓ Dashboard ready for requests")

    yield
@@ -474,6 +484,7 @@ app.include_router(grok_router)
 app.include_router(models_router)
 app.include_router(models_api_router)
 app.include_router(chat_api_router)
+app.include_router(chat_api_v1_router)
 app.include_router(thinking_router)
 app.include_router(calm_router)
 app.include_router(tasks_router)
@@ -500,6 +511,44 @@ async def ws_redirect(websocket: WebSocket):
        await websocket.send({"type": "websocket.close", "code": 1008})


+@app.websocket("/swarm/live")
+async def swarm_live(websocket: WebSocket):
+    """Swarm live event stream via WebSocket."""
+    from infrastructure.ws_manager.handler import ws_manager as ws_mgr
+
+    await ws_mgr.connect(websocket)
+    try:
+        while True:
+            # Keep connection alive; events are pushed via ws_mgr.broadcast()
+            await websocket.receive_text()
+    except Exception as exc:
+        logger.debug("WebSocket disconnect error: %s", exc)
+        ws_mgr.disconnect(websocket)
+
+
+@app.get("/swarm/agents/sidebar", response_class=HTMLResponse)
+async def swarm_agents_sidebar():
+    """HTMX partial: list active swarm agents for the dashboard sidebar."""
+    try:
+        from html import escape
+
+        from config import settings
+
+        agents = settings.agents_config.get("agents", {})
+        lines = []
+        for name, cfg in agents.items():
+            lines.append(
+                f'<div class="mc-agent-row">'
+                f'<span class="mc-agent-name">{escape(str(name))}</span>'
+                f'<span class="mc-agent-model">{escape(str(cfg.get("model", "default")))}</span>'
+                f"</div>"
+            )
+        return "\n".join(lines) if lines else '<div class="mc-muted">No agents configured</div>'
+    except Exception as exc:
+        logger.debug("Agents sidebar error: %s", exc)
+        return '<div class="mc-muted">Agents unavailable</div>'
+
+
 @app.get("/", response_class=HTMLResponse)
 async def root(request: Request):
    """Serve the main dashboard page."""
--- a/src/dashboard/middleware/csrf.py
+++ b/src/dashboard/middleware/csrf.py
@@ -5,6 +5,7 @@ to protect state-changing endpoints from cross-site request attacks.
 """

 import hmac
+import logging
 import secrets
 from collections.abc import Callable
 from functools import wraps
@@ -16,6 +17,8 @@ from starlette.responses import JSONResponse, Response
 # Module-level set to track exempt routes
 _exempt_routes: set[str] = set()

+logger = logging.getLogger(__name__)
+

 def csrf_exempt(endpoint: Callable) -> Callable:
    """Decorator to mark an endpoint as exempt from CSRF validation.
@@ -134,6 +137,10 @@ class CSRFMiddleware(BaseHTTPMiddleware):
        if settings.timmy_disable_csrf:
            return await call_next(request)

+        # WebSocket upgrades don't carry CSRF tokens — skip them entirely
+        if request.headers.get("upgrade", "").lower() == "websocket":
+            return await call_next(request)
+
        # Get existing CSRF token from cookie
        csrf_cookie = request.cookies.get(self.cookie_name)

@@ -274,7 +281,8 @@ class CSRFMiddleware(BaseHTTPMiddleware):
                form_token = form_data.get(self.form_field)
                if form_token and validate_csrf_token(str(form_token), csrf_cookie):
                    return True
-            except Exception:
+            except Exception as exc:
+                logger.debug("CSRF form parsing error: %s", exc)
                # Error parsing form data, treat as invalid
                pass

--- a/src/dashboard/middleware/request_logging.py
+++ b/src/dashboard/middleware/request_logging.py
@@ -115,7 +115,8 @@ class RequestLoggingMiddleware(BaseHTTPMiddleware):
                        "duration_ms": f"{duration_ms:.0f}",
                    },
                )
-            except Exception:
+            except Exception as exc:
+                logger.debug("Escalation logging error: %s", exc)
                pass  # never let escalation break the request

            # Re-raise the exception
--- a/src/dashboard/middleware/security_headers.py
+++ b/src/dashboard/middleware/security_headers.py
@@ -4,10 +4,14 @@ Adds common security headers to all HTTP responses to improve
 application security posture against various attacks.
 """

+import logging
+
 from starlette.middleware.base import BaseHTTPMiddleware
 from starlette.requests import Request
 from starlette.responses import Response

+logger = logging.getLogger(__name__)
+

 class SecurityHeadersMiddleware(BaseHTTPMiddleware):
    """Middleware to add security headers to all responses.
@@ -130,12 +134,8 @@ class SecurityHeadersMiddleware(BaseHTTPMiddleware):
        """
        try:
            response = await call_next(request)
-        except Exception:
-            import logging
-
-            logging.getLogger(__name__).debug(
-                "Upstream error in security headers middleware", exc_info=True
-            )
+        except Exception as exc:
+            logger.debug("Upstream error in security headers middleware: %s", exc)
            from starlette.responses import PlainTextResponse

            response = PlainTextResponse("Internal Server Error", status_code=500)
--- a/src/dashboard/routes/agents.py
+++ b/src/dashboard/routes/agents.py
@@ -12,6 +12,7 @@ from timmy.tool_safety import (
    format_action_description,
    get_impact_level,
 )
+from timmy.welcome import WELCOME_MESSAGE

 logger = logging.getLogger(__name__)

@@ -56,7 +57,7 @@ async def get_history(request: Request):
    return templates.TemplateResponse(
        request,
        "partials/history.html",
-        {"messages": message_log.all()},
+        {"messages": message_log.all(), "welcome_message": WELCOME_MESSAGE},
    )


@@ -66,7 +67,7 @@ async def clear_history(request: Request):
    return templates.TemplateResponse(
        request,
        "partials/history.html",
-        {"messages": []},
+        {"messages": [], "welcome_message": WELCOME_MESSAGE},
    )


@@ -220,7 +221,8 @@ async def reject_tool(request: Request, approval_id: str):
        # Resume so the agent knows the tool was rejected
        try:
            await continue_chat(pending["run_output"])
-        except Exception:
+        except Exception as exc:
+            logger.warning("Agent tool rejection error: %s", exc)
            pass

    reject(approval_id)
--- a/src/dashboard/routes/briefing.py
+++ b/src/dashboard/routes/briefing.py
@@ -27,7 +27,8 @@ async def get_briefing(request: Request):
    """Return today's briefing page (generated or cached)."""
    try:
        briefing = briefing_engine.get_or_generate()
-    except Exception:
+    except Exception as exc:
+        logger.debug("Briefing generation failed: %s", exc)
        logger.exception("Briefing generation failed")
        now = datetime.now(UTC)
        briefing = Briefing(
--- a/src/dashboard/routes/chat_api.py
+++ b/src/dashboard/routes/chat_api.py
@@ -51,7 +51,8 @@ async def api_chat(request: Request):

    try:
        body = await request.json()
-    except Exception:
+    except Exception as exc:
+        logger.warning("Chat API JSON parse error: %s", exc)
        return JSONResponse(status_code=400, content={"error": "Invalid JSON"})

    messages = body.get("messages")
--- a/src/dashboard/routes/chat_api_v1.py
+++ b/src/dashboard/routes/chat_api_v1.py
@@ -0,0 +1,206 @@
+"""Version 1 (v1) JSON REST API for the Timmy Time iPad app.
+
+This module implements the specific endpoints required by the native
+iPad app as defined in the project specification.
+
+Endpoints:
+    POST /api/v1/chat           — Streaming SSE chat response
+    GET  /api/v1/chat/history   — Retrieve chat history with limit
+    POST /api/v1/upload         — Multipart file upload with auto-detection
+    GET  /api/v1/status         — Detailed system and model status
+"""
+
+import json
+import logging
+import os
+import uuid
+from datetime import UTC, datetime
+from pathlib import Path
+
+from fastapi import APIRouter, File, HTTPException, Query, Request, UploadFile
+from fastapi.responses import JSONResponse, StreamingResponse
+
+from config import APP_START_TIME, settings
+from dashboard.routes.health import _check_ollama
+from dashboard.store import message_log
+from timmy.session import _get_agent
+
+logger = logging.getLogger(__name__)
+
+router = APIRouter(prefix="/api/v1", tags=["chat-api-v1"])
+
+_UPLOAD_DIR = str(Path(settings.repo_root) / "data" / "chat-uploads")
+_MAX_UPLOAD_SIZE = 50 * 1024 * 1024  # 50 MB
+
+
+# ── POST /api/v1/chat ─────────────────────────────────────────────────────────
+
+
+@router.post("/chat")
+async def api_v1_chat(request: Request):
+    """Accept a JSON chat payload and return a streaming SSE response.
+
+    Request body:
+        {
+            "message": "string",
+            "session_id": "string",
+            "attachments": ["id1", "id2"]
+        }
+
+    Response:
+        text/event-stream (SSE)
+    """
+    try:
+        body = await request.json()
+    except Exception as exc:
+        logger.warning("Chat v1 API JSON parse error: %s", exc)
+        return JSONResponse(status_code=400, content={"error": "Invalid JSON"})
+
+    message = body.get("message")
+    session_id = body.get("session_id", "ipad-app")
+    attachments = body.get("attachments", [])
+
+    if not message:
+        return JSONResponse(status_code=400, content={"error": "message is required"})
+
+    # Prepare context for the agent
+    now = datetime.now()
+    timestamp = now.strftime("%H:%M:%S")
+    context_prefix = (
+        f"[System: Current date/time is "
+        f"{now.strftime('%A, %B %d, %Y at %I:%M %p')}]\n"
+        f"[System: iPad App client]\n"
+    )
+
+    if attachments:
+        context_prefix += f"[System: Attachments: {', '.join(map(str, attachments))}]\n"
+
+    context_prefix += "\n"
+    full_prompt = context_prefix + message
+
+    # Log user message
+    message_log.append(role="user", content=message, timestamp=timestamp, source="api-v1")
+
+    async def event_generator():
+        full_response = ""
+        try:
+            agent = _get_agent()
+            # Using streaming mode for SSE
+            async for chunk in agent.arun(full_prompt, stream=True, session_id=session_id):
+                # Agno chunks can be strings or RunOutput
+                content = chunk.content if hasattr(chunk, "content") else str(chunk)
+                if content:
+                    full_response += content
+                    yield f"data: {json.dumps({'text': content})}\n\n"
+
+            # Log agent response once complete
+            message_log.append(
+                role="agent", content=full_response, timestamp=timestamp, source="api-v1"
+            )
+            yield "data: [DONE]\n\n"
+        except Exception as exc:
+            logger.error("SSE stream error: %s", exc)
+            yield f"data: {json.dumps({'error': str(exc)})}\n\n"
+
+    return StreamingResponse(event_generator(), media_type="text/event-stream")
+
+
+# ── GET /api/v1/chat/history ──────────────────────────────────────────────────
+
+
+@router.get("/chat/history")
+async def api_v1_chat_history(
+    session_id: str = Query("ipad-app"), limit: int = Query(50, ge=1, le=100)
+):
+    """Return recent chat history (session_id is accepted but not yet used as a filter)."""
+    # Using the optimized .recent() method from infrastructure.chat_store
+    all_msgs = message_log.recent(limit=limit)
+
+    history = [
+        {
+            "role": msg.role,
+            "content": msg.content,
+            "timestamp": msg.timestamp,
+            "source": msg.source,
+        }
+        for msg in all_msgs
+    ]
+
+    return {"messages": history}
+
+
+# ── POST /api/v1/upload ───────────────────────────────────────────────────────
+
+
+@router.post("/upload")
+async def api_v1_upload(file: UploadFile = File(...)):
+    """Accept a file upload, auto-detect type, and return metadata.
+
+    Response:
+        {
+            "id": "string",
+            "type": "image|audio|document|url",
+            "summary": "string",
+            "metadata": {...}
+        }
+    """
+    os.makedirs(_UPLOAD_DIR, exist_ok=True)
+
+    file_id = uuid.uuid4().hex[:12]
+    safe_name = os.path.basename(file.filename or "upload")
+    stored_name = f"{file_id}-{safe_name}"
+    file_path = os.path.join(_UPLOAD_DIR, stored_name)
+
+    # Verify resolved path stays within upload directory
+    resolved = Path(file_path).resolve()
+    upload_root = Path(_UPLOAD_DIR).resolve()
+    if not resolved.is_relative_to(upload_root):
+        raise HTTPException(status_code=400, detail="Invalid file name")
+
+    contents = await file.read()
+    if len(contents) > _MAX_UPLOAD_SIZE:
+        raise HTTPException(status_code=413, detail="File too large (max 50 MB)")
+
+    with open(file_path, "wb") as f:
+        f.write(contents)
+
+    # Auto-detect type based on extension/mime
+    mime_type = file.content_type or "application/octet-stream"
+    ext = os.path.splitext(safe_name)[1].lower()
+
+    media_type = "document"
+    if mime_type.startswith("image/") or ext in [".jpg", ".jpeg", ".png", ".heic"]:
+        media_type = "image"
+    elif mime_type.startswith("audio/") or ext in [".m4a", ".mp3", ".wav", ".caf"]:
+        media_type = "audio"
+    elif ext in [".pdf", ".txt", ".md"]:
+        media_type = "document"
+
+    # Placeholder for actual processing (OCR, Whisper, etc.)
+    summary = f"Uploaded {media_type}: {safe_name}"
+
+    return {
+        "id": file_id,
+        "type": media_type,
+        "summary": summary,
+        "url": f"/uploads/{stored_name}",
+        "metadata": {"fileName": safe_name, "mimeType": mime_type, "size": len(contents)},
+    }
+
+
+# ── GET /api/v1/status ────────────────────────────────────────────────────────
+
+
+@router.get("/status")
+async def api_v1_status():
+    """Detailed system and model status."""
+    ollama_status = await _check_ollama()
+    uptime = (datetime.now(UTC) - APP_START_TIME).total_seconds()
+
+    return {
+        "timmy": "online" if ollama_status.status == "healthy" else "offline",
+        "model": settings.ollama_model,
+        "ollama": "running" if ollama_status.status == "healthy" else "stopped",
+        "uptime": f"{int(uptime // 3600)}h {int((uptime % 3600) // 60)}m",
+        "version": "2.0.0-v1-api",
+    }
--- a/src/dashboard/routes/db_explorer.py
+++ b/src/dashboard/routes/db_explorer.py
@@ -3,6 +3,7 @@
 import asyncio
 import logging
 import sqlite3
+from contextlib import closing
 from pathlib import Path

 from fastapi import APIRouter, Request
@@ -39,56 +40,50 @@ def _query_database(db_path: str) -> dict:
    """Open a database read-only and return all tables with their rows."""
    result = {"tables": {}, "error": None}
    try:
-        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
-        conn.row_factory = sqlite3.Row
-    except Exception as exc:
-        result["error"] = str(exc)
-        return result
+        with closing(sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)) as conn:
+            conn.row_factory = sqlite3.Row

-    try:
-        tables = conn.execute(
-            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
-        ).fetchall()
-        for (table_name,) in tables:
-            try:
-                rows = conn.execute(
-                    f"SELECT * FROM [{table_name}] LIMIT {MAX_ROWS}"  # noqa: S608
-                ).fetchall()
-                columns = (
-                    [
-                        desc[0]
-                        for desc in conn.execute(
-                            f"SELECT * FROM [{table_name}] LIMIT 0"
-                        ).description
-                    ]
-                    if rows
-                    else []
-                )  # noqa: S608
-                if not columns and rows:
-                    columns = list(rows[0].keys())
-                elif not columns:
-                    # Get columns even for empty tables
-                    cursor = conn.execute(f"PRAGMA table_info([{table_name}])")  # noqa: S608
-                    columns = [r[1] for r in cursor.fetchall()]
-                count = conn.execute(f"SELECT COUNT(*) FROM [{table_name}]").fetchone()[0]  # noqa: S608
-                result["tables"][table_name] = {
-                    "columns": columns,
-                    "rows": [dict(r) for r in rows],
-                    "total_count": count,
-                    "truncated": count > MAX_ROWS,
-                }
-            except Exception as exc:
-                result["tables"][table_name] = {
-                    "error": str(exc),
-                    "columns": [],
-                    "rows": [],
-                    "total_count": 0,
-                    "truncated": False,
-                }
+            tables = conn.execute(
+                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
+            ).fetchall()
+            for (table_name,) in tables:
+                try:
+                    rows = conn.execute(
+                        f"SELECT * FROM [{table_name}] LIMIT {MAX_ROWS}"  # noqa: S608
+                    ).fetchall()
+                    columns = (
+                        [
+                            desc[0]
+                            for desc in conn.execute(
+                                f"SELECT * FROM [{table_name}] LIMIT 0"
+                            ).description
+                        ]
+                        if rows
+                        else []
+                    )  # noqa: S608
+                    if not columns and rows:
+                        columns = list(rows[0].keys())
+                    elif not columns:
+                        # Get columns even for empty tables
+                        cursor = conn.execute(f"PRAGMA table_info([{table_name}])")  # noqa: S608
+                        columns = [r[1] for r in cursor.fetchall()]
+                    count = conn.execute(f"SELECT COUNT(*) FROM [{table_name}]").fetchone()[0]  # noqa: S608
+                    result["tables"][table_name] = {
+                        "columns": columns,
+                        "rows": [dict(r) for r in rows],
+                        "total_count": count,
+                        "truncated": count > MAX_ROWS,
+                    }
+                except Exception as exc:
+                    result["tables"][table_name] = {
+                        "error": str(exc),
+                        "columns": [],
+                        "rows": [],
+                        "total_count": 0,
+                        "truncated": False,
+                    }
    except Exception as exc:
        result["error"] = str(exc)
-    finally:
-        conn.close()

    return result

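Review note: the move to `contextlib.closing` matters because a `sqlite3.Connection` used directly as a context manager only commits or rolls back the transaction; it does not close the connection. A minimal sketch with an in-memory database follows (the table name `t` is illustrative only).

```python
import sqlite3
from contextlib import closing

# `with conn:` manages the transaction but leaves the connection open.
conn = sqlite3.connect(":memory:")
with conn:
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
still_open = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()

# `closing(...)` calls conn.close() on exit, even if the body raises.
with closing(sqlite3.connect(":memory:")) as conn2:
    conn2.execute("SELECT 1")

try:
    conn2.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True  # operating on a closed connection raises ProgrammingError

print(still_open, closed)
```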
--- a/src/dashboard/routes/experiments.py
+++ b/src/dashboard/routes/experiments.py
@@ -30,8 +30,8 @@ async def experiments_page(request: Request):
    history = []
    try:
        history = get_experiment_history(_workspace())
-    except Exception:
-        logger.debug("Failed to load experiment history", exc_info=True)
+    except Exception as exc:
+        logger.debug("Failed to load experiment history: %s", exc)

    return templates.TemplateResponse(
        request,
--- a/src/dashboard/routes/grok.py
+++ b/src/dashboard/routes/grok.py
@@ -52,8 +52,8 @@ async def grok_status(request: Request):
            "estimated_cost_sats": backend.stats.estimated_cost_sats,
            "errors": backend.stats.errors,
        }
-    except Exception:
-        logger.debug("Failed to load Grok stats", exc_info=True)
+    except Exception as exc:
+        logger.warning("Failed to load Grok stats: %s", exc)

    return templates.TemplateResponse(
        request,
@@ -94,8 +94,8 @@ async def toggle_grok_mode(request: Request):
            tool_name="grok_mode_toggle",
            success=True,
        )
-    except Exception:
-        logger.debug("Failed to log Grok toggle to Spark", exc_info=True)
+    except Exception as exc:
+        logger.warning("Failed to log Grok toggle to Spark: %s", exc)

    return HTMLResponse(
        _render_toggle_card(_grok_mode_active),
@@ -128,8 +128,8 @@ def _run_grok_query(message: str) -> dict:
            sats = min(settings.grok_max_sats_per_query, 100)
            ln.create_invoice(sats, f"Grok: {message[:50]}")
            invoice_note = f" | {sats} sats"
-        except Exception:
-            logger.debug("Lightning invoice creation failed", exc_info=True)
+        except Exception as exc:
+            logger.warning("Lightning invoice creation failed: %s", exc)

    try:
        result = backend.run(message)
--- a/src/dashboard/routes/health.py
+++ b/src/dashboard/routes/health.py
@@ -6,14 +6,18 @@ for the Mission Control dashboard.

 import asyncio
 import logging
+import sqlite3
 import time
+from contextlib import closing
 from datetime import UTC, datetime
+from pathlib import Path
 from typing import Any

 from fastapi import APIRouter, Request
 from fastapi.responses import HTMLResponse
 from pydantic import BaseModel

+from config import APP_START_TIME as _START_TIME
 from config import settings

 logger = logging.getLogger(__name__)
@@ -49,7 +53,6 @@ class HealthStatus(BaseModel):


 # Simple uptime tracking
-_START_TIME = datetime.now(UTC)

 # Ollama health cache (30-second TTL)
 _ollama_cache: DependencyStatus | None = None
@@ -76,8 +79,8 @@ def _check_ollama_sync() -> DependencyStatus:
                    sovereignty_score=10,
                    details={"url": settings.ollama_url, "model": settings.ollama_model},
                )
-    except Exception:
-        logger.debug("Ollama health check failed", exc_info=True)
+    except Exception as exc:
+        logger.debug("Ollama health check failed: %s", exc)

    return DependencyStatus(
        name="Ollama AI",
@@ -101,7 +104,8 @@ async def _check_ollama() -> DependencyStatus:

    try:
        result = await asyncio.to_thread(_check_ollama_sync)
-    except Exception:
+    except Exception as exc:
+        logger.debug("Ollama async check failed: %s", exc)
        result = DependencyStatus(
            name="Ollama AI",
            status="unavailable",
@@ -133,13 +137,9 @@ def _check_lightning() -> DependencyStatus:
 def _check_sqlite() -> DependencyStatus:
    """Check SQLite database status."""
    try:
-        import sqlite3
-        from pathlib import Path
-
        db_path = Path(settings.repo_root) / "data" / "timmy.db"
-        conn = sqlite3.connect(str(db_path))
-        conn.execute("SELECT 1")
-        conn.close()
+        with closing(sqlite3.connect(str(db_path))) as conn:
+            conn.execute("SELECT 1")

        return DependencyStatus(
            name="SQLite Database",
--- a/src/dashboard/routes/memory.py
+++ b/src/dashboard/routes/memory.py
@@ -4,7 +4,7 @@ from fastapi import APIRouter, Form, HTTPException, Request
 from fastapi.responses import HTMLResponse, JSONResponse

 from dashboard.templating import templates
-from timmy.memory.vector_store import (
+from timmy.memory_system import (
    delete_memory,
    get_memory_stats,
    recall_personal_facts_with_ids,
--- a/src/dashboard/routes/system.py
+++ b/src/dashboard/routes/system.py
@@ -1,10 +1,12 @@
 """System-level dashboard routes (ledger, upgrades, etc.)."""

 import logging
+from pathlib import Path

 from fastapi import APIRouter, Request
 from fastapi.responses import HTMLResponse, JSONResponse

+from config import settings
 from dashboard.templating import templates

 logger = logging.getLogger(__name__)
@@ -144,5 +146,82 @@ async def api_notifications():
                for e in events
            ]
        )
-    except Exception:
+    except Exception as exc:
+        logger.debug("System events fetch error: %s", exc)
        return JSONResponse([])
+
+
+@router.get("/api/briefing/status", response_class=JSONResponse)
+async def api_briefing_status():
+    """Return briefing status including pending approvals and last generated time."""
+    from timmy import approvals
+    from timmy.briefing import engine as briefing_engine
+
+    pending = approvals.list_pending()
+    pending_count = len(pending)
+
+    last_generated = None
+    try:
+        cached = briefing_engine.get_cached()
+        if cached:
+            last_generated = cached.generated_at.isoformat()
+    except Exception as exc:
+        logger.debug("Briefing cache lookup failed: %s", exc)
+
+    return JSONResponse(
+        {
+            "status": "ok",
+            "pending_approvals": pending_count,
+            "last_generated": last_generated,
+        }
+    )
+
+
+@router.get("/api/memory/status", response_class=JSONResponse)
+async def api_memory_status():
+    """Return memory database status including file info and indexed files count."""
+    from timmy.memory_system import get_memory_stats
+
+    db_path = Path(settings.repo_root) / "data" / "memory.db"
+    db_exists = db_path.exists()
+    db_size = db_path.stat().st_size if db_exists else 0
+
+    try:
+        stats = get_memory_stats()
+        indexed_files = stats.get("total_entries", 0)
+    except Exception:
+        indexed_files = 0
+
+    return JSONResponse(
+        {
+            "status": "ok",
+            "db_exists": db_exists,
+            "db_size_bytes": db_size,
+            "indexed_files": indexed_files,
+        }
+    )
+
+
+@router.get("/api/swarm/status", response_class=JSONResponse)
+async def api_swarm_status():
+    """Return swarm worker status and pending tasks count."""
+    from dashboard.routes.tasks import _get_db
+
+    pending_tasks = 0
+    try:
+        with _get_db() as db:
+            row = db.execute(
+                "SELECT COUNT(*) as cnt FROM tasks WHERE status IN ('pending_approval','approved')"
+            ).fetchone()
+            pending_tasks = row["cnt"] if row else 0
+    except Exception as exc:
+        logger.debug("Swarm pending-task count failed: %s", exc)
+
+    return JSONResponse(
+        {
+            "status": "ok",
+            "active_workers": 0,
+            "pending_tasks": pending_tasks,
+            "message": "Swarm monitoring endpoint",
+        }
+    )
--- a/src/dashboard/routes/tasks.py
+++ b/src/dashboard/routes/tasks.py
@@ -3,6 +3,8 @@
 import logging
 import sqlite3
 import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from datetime import datetime
 from pathlib import Path

@@ -35,26 +37,27 @@ VALID_STATUSES = {
 VALID_PRIORITIES = {"low", "normal", "high", "urgent"}


-def _get_db() -> sqlite3.Connection:
+@contextmanager
+def _get_db() -> Generator[sqlite3.Connection, None, None]:
    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS tasks (
-            id TEXT PRIMARY KEY,
-            title TEXT NOT NULL,
-            description TEXT DEFAULT '',
-            status TEXT DEFAULT 'pending_approval',
-            priority TEXT DEFAULT 'normal',
-            assigned_to TEXT DEFAULT '',
-            created_by TEXT DEFAULT 'operator',
-            result TEXT DEFAULT '',
-            created_at TEXT DEFAULT (datetime('now')),
-            completed_at TEXT
-        )
-    """)
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(DB_PATH))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS tasks (
+                id TEXT PRIMARY KEY,
+                title TEXT NOT NULL,
+                description TEXT DEFAULT '',
+                status TEXT DEFAULT 'pending_approval',
+                priority TEXT DEFAULT 'normal',
+                assigned_to TEXT DEFAULT '',
+                created_by TEXT DEFAULT 'operator',
+                result TEXT DEFAULT '',
+                created_at TEXT DEFAULT (datetime('now')),
+                completed_at TEXT
+            )
+        """)
+        conn.commit()
+        yield conn


 def _row_to_dict(row: sqlite3.Row) -> dict:
@@ -101,8 +104,7 @@ class _TaskView:
 @router.get("/tasks", response_class=HTMLResponse)
 async def tasks_page(request: Request):
    """Render the main task queue page with 3-column layout."""
-    db = _get_db()
-    try:
+    with _get_db() as db:
        pending = [
            _TaskView(_row_to_dict(r))
            for r in db.execute(
@@ -121,8 +123,6 @@ async def tasks_page(request: Request):
                "SELECT * FROM tasks WHERE status IN ('completed','vetoed','failed') ORDER BY completed_at DESC LIMIT 50"
            ).fetchall()
        ]
-    finally:
-        db.close()

    return templates.TemplateResponse(
        request,
@@ -145,13 +145,10 @@ async def tasks_page(request: Request):

 @router.get("/tasks/pending", response_class=HTMLResponse)
 async def tasks_pending(request: Request):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        rows = db.execute(
            "SELECT * FROM tasks WHERE status='pending_approval' ORDER BY created_at DESC"
        ).fetchall()
-    finally:
-        db.close()
    tasks = [_TaskView(_row_to_dict(r)) for r in rows]
    parts = []
    for task in tasks:
@@ -167,13 +164,10 @@ async def tasks_pending(request: Request):

 @router.get("/tasks/active", response_class=HTMLResponse)
 async def tasks_active(request: Request):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        rows = db.execute(
            "SELECT * FROM tasks WHERE status IN ('approved','running','paused') ORDER BY created_at DESC"
        ).fetchall()
-    finally:
-        db.close()
    tasks = [_TaskView(_row_to_dict(r)) for r in rows]
    parts = []
    for task in tasks:
@@ -189,13 +183,10 @@ async def tasks_active(request: Request):

 @router.get("/tasks/completed", response_class=HTMLResponse)
 async def tasks_completed(request: Request):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        rows = db.execute(
            "SELECT * FROM tasks WHERE status IN ('completed','vetoed','failed') ORDER BY completed_at DESC LIMIT 50"
        ).fetchall()
-    finally:
-        db.close()
    tasks = [_TaskView(_row_to_dict(r)) for r in rows]
    parts = []
    for task in tasks:
@@ -231,16 +222,13 @@ async def create_task_form(
    now = datetime.utcnow().isoformat()
    priority = priority if priority in VALID_PRIORITIES else "normal"

-    db = _get_db()
-    try:
+    with _get_db() as db:
        db.execute(
            "INSERT INTO tasks (id, title, description, priority, assigned_to, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            (task_id, title, description, priority, assigned_to, now),
        )
        db.commit()
        row = db.execute("SELECT * FROM tasks WHERE id=?", (task_id,)).fetchone()
-    finally:
-        db.close()

    task = _TaskView(_row_to_dict(row))
    return templates.TemplateResponse(request, "partials/task_card.html", {"task": task})
@@ -283,16 +271,13 @@ async def modify_task(
    title: str = Form(...),
    description: str = Form(""),
 ):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        db.execute(
            "UPDATE tasks SET title=?, description=? WHERE id=?",
            (title, description, task_id),
        )
        db.commit()
        row = db.execute("SELECT * FROM tasks WHERE id=?", (task_id,)).fetchone()
-    finally:
-        db.close()
    if not row:
        raise HTTPException(404, "Task not found")
    task = _TaskView(_row_to_dict(row))
@@ -304,16 +289,13 @@ async def _set_status(request: Request, task_id: str, new_status: str):
    completed_at = (
        datetime.utcnow().isoformat() if new_status in ("completed", "vetoed", "failed") else None
    )
-    db = _get_db()
-    try:
+    with _get_db() as db:
        db.execute(
            "UPDATE tasks SET status=?, completed_at=COALESCE(?, completed_at) WHERE id=?",
            (new_status, completed_at, task_id),
        )
        db.commit()
        row = db.execute("SELECT * FROM tasks WHERE id=?", (task_id,)).fetchone()
-    finally:
-        db.close()
    if not row:
        raise HTTPException(404, "Task not found")
    task = _TaskView(_row_to_dict(row))
@@ -339,8 +321,7 @@ async def api_create_task(request: Request):
    if priority not in VALID_PRIORITIES:
        priority = "normal"

-    db = _get_db()
-    try:
+    with _get_db() as db:
        db.execute(
            "INSERT INTO tasks (id, title, description, priority, assigned_to, created_by, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
@@ -356,8 +337,6 @@ async def api_create_task(request: Request):
        )
        db.commit()
        row = db.execute("SELECT * FROM tasks WHERE id=?", (task_id,)).fetchone()
-    finally:
-        db.close()

    return JSONResponse(_row_to_dict(row), status_code=201)

@@ -365,11 +344,8 @@ async def api_create_task(request: Request):
 @router.get("/api/tasks", response_class=JSONResponse)
 async def api_list_tasks():
    """List all tasks as JSON."""
-    db = _get_db()
-    try:
+    with _get_db() as db:
        rows = db.execute("SELECT * FROM tasks ORDER BY created_at DESC").fetchall()
-    finally:
-        db.close()
    return JSONResponse([_row_to_dict(r) for r in rows])


@@ -384,16 +360,13 @@ async def api_update_status(task_id: str, request: Request):
    completed_at = (
        datetime.utcnow().isoformat() if new_status in ("completed", "vetoed", "failed") else None
    )
-    db = _get_db()
-    try:
+    with _get_db() as db:
        db.execute(
            "UPDATE tasks SET status=?, completed_at=COALESCE(?, completed_at) WHERE id=?",
            (new_status, completed_at, task_id),
        )
        db.commit()
        row = db.execute("SELECT * FROM tasks WHERE id=?", (task_id,)).fetchone()
-    finally:
-        db.close()
    if not row:
        raise HTTPException(404, "Task not found")
    return JSONResponse(_row_to_dict(row))
@@ -402,12 +375,9 @@ async def api_update_status(task_id: str, request: Request):
 @router.delete("/api/tasks/{task_id}", response_class=JSONResponse)
 async def api_delete_task(task_id: str):
    """Delete a task."""
-    db = _get_db()
-    try:
+    with _get_db() as db:
        cursor = db.execute("DELETE FROM tasks WHERE id=?", (task_id,))
        db.commit()
-    finally:
-        db.close()
    if cursor.rowcount == 0:
        raise HTTPException(404, "Task not found")
    return JSONResponse({"success": True, "id": task_id})
@@ -421,8 +391,7 @@ async def api_delete_task(task_id: str):
 @router.get("/api/queue/status", response_class=JSONResponse)
 async def queue_status(assigned_to: str = "default"):
    """Return queue status for the chat panel's agent status indicator."""
-    db = _get_db()
-    try:
+    with _get_db() as db:
        running = db.execute(
            "SELECT * FROM tasks WHERE status='running' AND assigned_to=? LIMIT 1",
            (assigned_to,),
@@ -431,8 +400,6 @@ async def queue_status(assigned_to: str = "default"):
            "SELECT COUNT(*) as cnt FROM tasks WHERE status IN ('pending_approval','approved') AND assigned_to=?",
            (assigned_to,),
        ).fetchone()
-    finally:
-        db.close()

    if running:
        return JSONResponse(
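Review note: `_get_db` is now a generator-based context manager, so every caller gets a connection that is closed on exit without a `try`/`finally`. Rows fetched inside the block remain usable afterwards because `fetchall()` materializes them. A minimal sketch of the same shape follows; `open_db`, the temp-file path and the one-column schema are stand-ins, not the project's code.

```python
import os
import sqlite3
import tempfile
from collections.abc import Generator
from contextlib import closing, contextmanager

DB_FILE = os.path.join(tempfile.mkdtemp(), "sketch.db")  # hypothetical path


@contextmanager
def open_db() -> Generator[sqlite3.Connection, None, None]:
    """Yield a connection with the schema ensured; close it when the block exits."""
    with closing(sqlite3.connect(DB_FILE)) as conn:
        conn.row_factory = sqlite3.Row
        conn.execute("CREATE TABLE IF NOT EXISTS tasks (id TEXT PRIMARY KEY)")
        conn.commit()
        yield conn


with open_db() as db:
    db.execute("INSERT INTO tasks (id) VALUES (?)", ("t1",))
    db.commit()

with open_db() as db:
    rows = db.execute("SELECT * FROM tasks").fetchall()

# The connection is closed here, but the fetched rows are still readable.
ids = [row["id"] for row in rows]
print(ids)
```

One consequence of this shape is that the `CREATE TABLE IF NOT EXISTS` statement runs on every call, which keeps callers simple at the cost of a small amount of repeated work per request.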
--- a/src/dashboard/routes/voice.py
+++ b/src/dashboard/routes/voice.py
@@ -43,7 +43,8 @@ async def tts_status():
            "available": voice_tts.available,
            "voices": voice_tts.get_voices() if voice_tts.available else [],
        }
-    except Exception:
+    except Exception as exc:
+        logger.debug("Voice config error: %s", exc)
        return {"available": False, "voices": []}


@@ -139,7 +140,8 @@ async def process_voice_input(

            if voice_tts.available:
                voice_tts.speak(response_text)
-        except Exception:
+        except Exception as exc:
+            logger.debug("Voice TTS error: %s", exc)
            pass

    return {
--- a/src/dashboard/routes/work_orders.py
+++ b/src/dashboard/routes/work_orders.py
@@ -3,6 +3,8 @@
 import logging
 import sqlite3
 import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from datetime import datetime
 from pathlib import Path

@@ -23,28 +25,29 @@ CATEGORIES = ["bug", "feature", "suggestion", "maintenance", "security"]
 VALID_STATUSES = {"submitted", "triaged", "approved", "in_progress", "completed", "rejected"}


-def _get_db() -> sqlite3.Connection:
+@contextmanager
+def _get_db() -> Generator[sqlite3.Connection, None, None]:
    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS work_orders (
-            id TEXT PRIMARY KEY,
-            title TEXT NOT NULL,
-            description TEXT DEFAULT '',
-            priority TEXT DEFAULT 'medium',
-            category TEXT DEFAULT 'suggestion',
-            submitter TEXT DEFAULT 'dashboard',
-            related_files TEXT DEFAULT '',
-            status TEXT DEFAULT 'submitted',
-            result TEXT DEFAULT '',
-            rejection_reason TEXT DEFAULT '',
-            created_at TEXT DEFAULT (datetime('now')),
-            completed_at TEXT
-        )
-    """)
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(DB_PATH))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS work_orders (
+                id TEXT PRIMARY KEY,
+                title TEXT NOT NULL,
+                description TEXT DEFAULT '',
+                priority TEXT DEFAULT 'medium',
+                category TEXT DEFAULT 'suggestion',
+                submitter TEXT DEFAULT 'dashboard',
+                related_files TEXT DEFAULT '',
+                status TEXT DEFAULT 'submitted',
+                result TEXT DEFAULT '',
+                rejection_reason TEXT DEFAULT '',
+                created_at TEXT DEFAULT (datetime('now')),
+                completed_at TEXT
+            )
+        """)
+        conn.commit()
+        yield conn


 class _EnumLike:
@@ -104,14 +107,11 @@ def _query_wos(db, statuses):

@router.get("/work-orders/queue", response_class=HTMLResponse)
 async def work_orders_page(request: Request):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        pending = _query_wos(db, ["submitted", "triaged"])
        active = _query_wos(db, ["approved", "in_progress"])
        completed = _query_wos(db, ["completed"])
        rejected = _query_wos(db, ["rejected"])
-    finally:
-        db.close()

    return templates.TemplateResponse(
        request,
@@ -148,8 +148,7 @@ async def submit_work_order(
    priority = priority if priority in PRIORITIES else "medium"
    category = category if category in CATEGORIES else "suggestion"

-    db = _get_db()
-    try:
+    with _get_db() as db:
        db.execute(
            "INSERT INTO work_orders (id, title, description, priority, category, submitter, related_files, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
@@ -157,8 +156,6 @@ async def submit_work_order(
        )
        db.commit()
        row = db.execute("SELECT * FROM work_orders WHERE id=?", (wo_id,)).fetchone()
-    finally:
-        db.close()

    wo = _WOView(_row_to_dict(row))
    return templates.TemplateResponse(request, "partials/work_order_card.html", {"wo": wo})
@@ -171,11 +168,8 @@ async def submit_work_order(

@router.get("/work-orders/queue/pending", response_class=HTMLResponse)
 async def pending_partial(request: Request):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        wos = _query_wos(db, ["submitted", "triaged"])
-    finally:
-        db.close()
    if not wos:
        return HTMLResponse(
            '<div style="color: var(--text-muted); font-size: 0.8rem; padding: 12px 0;">'
@@ -193,11 +187,8 @@ async def pending_partial(request: Request):

@router.get("/work-orders/queue/active", response_class=HTMLResponse)
 async def active_partial(request: Request):
-    db = _get_db()
-    try:
+    with _get_db() as db:
        wos = _query_wos(db, ["approved", "in_progress"])
-    finally:
-        db.close()
    if not wos:
        return HTMLResponse(
            '<div style="color: var(--text-muted); font-size: 0.8rem; padding: 12px 0;">'
@@ -222,8 +213,7 @@ async def _update_status(request: Request, wo_id: str, new_status: str, **extra)
    completed_at = (
        datetime.utcnow().isoformat() if new_status in ("completed", "rejected") else None
    )
-    db = _get_db()
-    try:
+    with _get_db() as db:
        sets = ["status=?", "completed_at=COALESCE(?, completed_at)"]
        vals = [new_status, completed_at]
        for col, val in extra.items():
@@ -233,8 +223,6 @@ async def _update_status(request: Request, wo_id: str, new_status: str, **extra)
        db.execute(f"UPDATE work_orders SET {', '.join(sets)} WHERE id=?", vals)
        db.commit()
        row = db.execute("SELECT * FROM work_orders WHERE id=?", (wo_id,)).fetchone()
-    finally:
-        db.close()
    if not row:
        raise HTTPException(404, "Work order not found")
    wo = _WOView(_row_to_dict(row))
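
Note on the `_get_db()` change above: a `sqlite3.Connection` used directly as a context manager only commits or rolls back; it does not close the connection. Wrapping the connection in `contextlib.closing` inside a `@contextmanager` generator is what replaces the old `try`/`finally: db.close()` blocks. The following is a minimal sketch of that pattern, not the project's code; `open_db`, `connection_is_closed`, and `demo` are hypothetical names, and it uses an in-memory database instead of `DB_PATH`.

```python
import sqlite3
from collections.abc import Generator
from contextlib import closing, contextmanager


@contextmanager
def open_db(path: str = ":memory:") -> Generator[sqlite3.Connection, None, None]:
    """Yield a connection and close it when the with-block exits."""
    # closing() calls conn.close() on exit, including when the block raises.
    with closing(sqlite3.connect(path)) as conn:
        conn.row_factory = sqlite3.Row
        yield conn


def connection_is_closed(conn: sqlite3.Connection) -> bool:
    """Return True if the connection no longer accepts statements."""
    try:
        conn.execute("SELECT 1")
    except sqlite3.ProgrammingError:
        return True
    return False


def demo() -> tuple[bool, bool]:
    """Return (closed after normal exit, closed after an exception)."""
    with open_db() as db:
        normal = db
    try:
        with open_db() as db:
            failed = db
            raise ValueError("simulated handler failure")
    except ValueError:
        pass
    return connection_is_closed(normal), connection_is_closed(failed)
```

One consequence of this design is that results must be read (for example with `fetchone()` or `fetchall()`) before the `with` block ends, which is how the route handlers above are written.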
--- a/src/dashboard/store.py
+++ b/src/dashboard/store.py
@@ -1,34 +1,5 @@
-from dataclasses import dataclass
+"""Backward-compatible re-export — canonical home is infrastructure.chat_store."""

+from infrastructure.chat_store import DB_PATH, MAX_MESSAGES, Message, MessageLog, message_log

-@dataclass
-class Message:
-    role: str  # "user" | "agent" | "error"
-    content: str
-    timestamp: str
-    source: str = "browser"  # "browser" | "api" | "telegram" | "discord" | "system"
-
-
-class MessageLog:
-    """In-memory chat history for the lifetime of the server process."""
-
-    def __init__(self) -> None:
-        self._entries: list[Message] = []
-
-    def append(self, role: str, content: str, timestamp: str, source: str = "browser") -> None:
-        self._entries.append(
-            Message(role=role, content=content, timestamp=timestamp, source=source)
-        )
-
-    def all(self) -> list[Message]:
-        return list(self._entries)
-
-    def clear(self) -> None:
-        self._entries.clear()
-
-    def __len__(self) -> int:
-        return len(self._entries)
-
-
-# Module-level singleton shared across the app
-message_log = MessageLog()
+__all__ = ["DB_PATH", "MAX_MESSAGES", "Message", "MessageLog", "message_log"]
--- a/src/dashboard/templates/base.html
+++ b/src/dashboard/templates/base.html
@@ -327,7 +327,11 @@
        .then(function(data) {
          var list = document.getElementById('notif-list');
          if (!data.length) {
-            list.innerHTML = '<div class="mc-notif-empty">No recent notifications</div>';
+            list.innerHTML = '';
+            var emptyDiv = document.createElement('div');
+            emptyDiv.className = 'mc-notif-empty';
+            emptyDiv.textContent = 'No recent notifications';
+            list.appendChild(emptyDiv);
            return;
          }
          list.innerHTML = '';
--- a/src/dashboard/templates/partials/agent_panel_chat.html
+++ b/src/dashboard/templates/partials/agent_panel_chat.html
@@ -120,14 +120,17 @@

  function updateFromData(data) {
    if (data.is_working && data.current_task) {
-      statusEl.innerHTML = '<span style="color: #ffaa00;">working...</span>';
+      statusEl.textContent = 'working...';
+      statusEl.style.color = '#ffaa00';
      banner.style.display = 'block';
      taskTitle.textContent = data.current_task.title;
    } else if (data.tasks_ahead > 0) {
-      statusEl.innerHTML = '<span style="color: #888;">queue: ' + data.tasks_ahead + ' ahead</span>';
+      statusEl.textContent = 'queue: ' + data.tasks_ahead + ' ahead';
+      statusEl.style.color = '#888';
      banner.style.display = 'none';
    } else {
-      statusEl.innerHTML = '<span style="color: #00ff88;">ready</span>';
+      statusEl.textContent = 'ready';
+      statusEl.style.color = '#00ff88';
      banner.style.display = 'none';
    }
  }
--- a/src/dashboard/templates/partials/history.html
+++ b/src/dashboard/templates/partials/history.html
@@ -20,7 +20,7 @@
 {% else %}
 <div class="chat-message agent">
  <div class="msg-meta">TIMMY // SYSTEM</div>
-  <div class="msg-body">Mission Control initialized. Timmy ready — awaiting input.</div>
+  <div class="msg-body">{{ welcome_message | e }}</div>
 </div>
 {% endif %}
 <script>if(typeof scrollChat==='function'){setTimeout(scrollChat,50);}</script>
--- a/src/dashboard/templates/swarm_live.html
+++ b/src/dashboard/templates/swarm_live.html
@@ -198,17 +198,43 @@ function addActivityEvent(evt) {
        } catch(e) {}
    }
    
-    item.innerHTML = `
-        <div class="activity-icon">${icon}</div>
-        <div class="activity-content">
-            <div class="activity-label">${label}</div>
-            ${desc ? `<div class="activity-desc">${desc}</div>` : ''}
-            <div class="activity-meta">
-                <span class="activity-time">${time}</span>
-                <span class="activity-source">${evt.source || 'system'}</span>
-            </div>
-        </div>
-    `;
+    // Build DOM safely using createElement and textContent
+    var iconDiv = document.createElement('div');
+    iconDiv.className = 'activity-icon';
+    iconDiv.textContent = icon;
+    
+    var contentDiv = document.createElement('div');
+    contentDiv.className = 'activity-content';
+    
+    var labelDiv = document.createElement('div');
+    labelDiv.className = 'activity-label';
+    labelDiv.textContent = label;
+    contentDiv.appendChild(labelDiv);
+    
+    if (desc) {
+        var descDiv = document.createElement('div');
+        descDiv.className = 'activity-desc';
+        descDiv.textContent = desc;
+        contentDiv.appendChild(descDiv);
+    }
+    
+    var metaDiv = document.createElement('div');
+    metaDiv.className = 'activity-meta';
+    
+    var timeSpan = document.createElement('span');
+    timeSpan.className = 'activity-time';
+    timeSpan.textContent = time;
+    
+    var sourceSpan = document.createElement('span');
+    sourceSpan.className = 'activity-source';
+    sourceSpan.textContent = evt.source || 'system';
+    
+    metaDiv.appendChild(timeSpan);
+    metaDiv.appendChild(sourceSpan);
+    contentDiv.appendChild(metaDiv);
+    
+    item.appendChild(iconDiv);
+    item.appendChild(contentDiv);
    
    // Add to top
    container.insertBefore(item, container.firstChild);
--- a/src/infrastructure/chat_store.py
+++ b/src/infrastructure/chat_store.py
@@ -0,0 +1,153 @@
+"""Persistent chat message store backed by SQLite.
+
+Provides the same API as the original in-memory MessageLog so all callers
+(dashboard routes, chat_api, thinking, briefing) work without changes.
+
+Data lives in ``data/chat.db`` — survives server restarts.
+A configurable retention policy (default 500 messages) keeps the DB lean.
+"""
+
+import sqlite3
+import threading
+from collections.abc import Generator
+from contextlib import closing, contextmanager
+from dataclasses import dataclass
+from pathlib import Path
+
+# ── Data dir — resolved relative to repo root (three levels up from this file) ──
+_REPO_ROOT = Path(__file__).resolve().parents[2]
+DB_PATH: Path = _REPO_ROOT / "data" / "chat.db"
+
+# Maximum messages to retain (oldest pruned on append)
+MAX_MESSAGES: int = 500
+
+
+@dataclass
+class Message:
+    role: str  # "user" | "agent" | "error"
+    content: str
+    timestamp: str
+    source: str = "browser"  # "browser" | "api" | "telegram" | "discord" | "system"
+
+
+@contextmanager
+def _get_conn(db_path: Path | None = None) -> Generator[sqlite3.Connection, None, None]:
+    """Open (or create) the chat database and ensure schema exists."""
+    path = db_path or DB_PATH
+    path.parent.mkdir(parents=True, exist_ok=True)
+    with closing(sqlite3.connect(str(path), check_same_thread=False)) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("PRAGMA journal_mode=WAL")
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS chat_messages (
+                id        INTEGER PRIMARY KEY AUTOINCREMENT,
+                role      TEXT NOT NULL,
+                content   TEXT NOT NULL,
+                timestamp TEXT NOT NULL,
+                source    TEXT NOT NULL DEFAULT 'browser'
+            )
+        """)
+        conn.commit()
+        yield conn
+
+
+class MessageLog:
+    """SQLite-backed chat history — drop-in replacement for the old in-memory list."""
+
+    def __init__(self, db_path: Path | None = None) -> None:
+        self._db_path = db_path or DB_PATH
+        self._lock = threading.Lock()
+        self._conn: sqlite3.Connection | None = None
+
+    # Lazy connection — opened on first use, not at import time.
+    def _ensure_conn(self) -> sqlite3.Connection:
+        if self._conn is None:
+            # Open a persistent connection for the class instance
+            path = self._db_path or DB_PATH
+            path.parent.mkdir(parents=True, exist_ok=True)
+            conn = sqlite3.connect(str(path), check_same_thread=False)
+            conn.row_factory = sqlite3.Row
+            conn.execute("PRAGMA journal_mode=WAL")
+            conn.execute("""
+                CREATE TABLE IF NOT EXISTS chat_messages (
+                    id        INTEGER PRIMARY KEY AUTOINCREMENT,
+                    role      TEXT NOT NULL,
+                    content   TEXT NOT NULL,
+                    timestamp TEXT NOT NULL,
+                    source    TEXT NOT NULL DEFAULT 'browser'
+                )
+            """)
+            conn.commit()
+            self._conn = conn
+        return self._conn
+
+    def append(self, role: str, content: str, timestamp: str, source: str = "browser") -> None:
+        with self._lock:
+            conn = self._ensure_conn()
+            conn.execute(
+                "INSERT INTO chat_messages (role, content, timestamp, source) VALUES (?, ?, ?, ?)",
+                (role, content, timestamp, source),
+            )
+            conn.commit()
+            self._prune(conn)
+
+    def all(self) -> list[Message]:
+        with self._lock:
+            conn = self._ensure_conn()
+            rows = conn.execute(
+                "SELECT role, content, timestamp, source FROM chat_messages ORDER BY id"
+            ).fetchall()
+        return [
+            Message(
+                role=r["role"], content=r["content"], timestamp=r["timestamp"], source=r["source"]
+            )
+            for r in rows
+        ]
+
+    def recent(self, limit: int = 50) -> list[Message]:
+        """Return the *limit* most recent messages (oldest-first)."""
+        with self._lock:
+            conn = self._ensure_conn()
+            rows = conn.execute(
+                "SELECT role, content, timestamp, source FROM chat_messages "
+                "ORDER BY id DESC LIMIT ?",
+                (limit,),
+            ).fetchall()
+        return [
+            Message(
+                role=r["role"], content=r["content"], timestamp=r["timestamp"], source=r["source"]
+            )
+            for r in reversed(rows)
+        ]
+
+    def clear(self) -> None:
+        with self._lock:
+            conn = self._ensure_conn()
+            conn.execute("DELETE FROM chat_messages")
+            conn.commit()
+
+    def _prune(self, conn: sqlite3.Connection) -> None:
+        """Keep at most MAX_MESSAGES rows, deleting the oldest."""
+        count = conn.execute("SELECT COUNT(*) FROM chat_messages").fetchone()[0]
+        if count > MAX_MESSAGES:
+            excess = count - MAX_MESSAGES
+            conn.execute(
+                "DELETE FROM chat_messages WHERE id IN "
+                "(SELECT id FROM chat_messages ORDER BY id LIMIT ?)",
+                (excess,),
+            )
+            conn.commit()
+
+    def close(self) -> None:
+        if self._conn is not None:
+            self._conn.close()
+            self._conn = None
+
+    def __len__(self) -> int:
+        with self._lock:
+            conn = self._ensure_conn()
+            return conn.execute("SELECT COUNT(*) FROM chat_messages").fetchone()[0]
+
+
+# Module-level singleton shared across the app
+message_log = MessageLog()
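
Note on the retention policy in `_prune` above: after each insert, the oldest rows beyond the cap are deleted by selecting the lowest `id` values. The sketch below reproduces only that query against an in-memory database so the behavior can be checked in isolation. It is an illustration under assumptions: `MAX_ROWS`, `prune`, and `demo` are hypothetical names, the table is reduced to two columns, and the cap is 5 rather than the store's `MAX_MESSAGES = 500`.

```python
import sqlite3

MAX_ROWS = 5  # hypothetical small cap for the demo; the store above uses 500


def prune(conn: sqlite3.Connection, max_rows: int = MAX_ROWS) -> int:
    """Delete the oldest rows beyond max_rows and return how many were removed."""
    count = conn.execute("SELECT COUNT(*) FROM chat_messages").fetchone()[0]
    excess = count - max_rows
    if excess <= 0:
        return 0
    # AUTOINCREMENT ids grow monotonically, so the lowest ids are the oldest rows.
    conn.execute(
        "DELETE FROM chat_messages WHERE id IN "
        "(SELECT id FROM chat_messages ORDER BY id LIMIT ?)",
        (excess,),
    )
    conn.commit()
    return excess


def demo() -> list[str]:
    """Insert seven messages, prune to MAX_ROWS, and return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chat_messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, content TEXT NOT NULL)"
    )
    for i in range(7):
        conn.execute("INSERT INTO chat_messages (content) VALUES (?)", (f"msg-{i}",))
    conn.commit()
    prune(conn)
    rows = conn.execute("SELECT content FROM chat_messages ORDER BY id").fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Because pruning runs on every `append`, the table exceeds the cap by at most one row at any time, so each call deletes a single row in steady state.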
--- a/src/infrastructure/error_capture.py
+++ b/src/infrastructure/error_capture.py
@@ -22,6 +22,14 @@ logger = logging.getLogger(__name__)
 # In-memory dedup cache: hash -> last_seen timestamp
 _dedup_cache: dict[str, datetime] = {}

+_error_recorder = None
+
+
+def register_error_recorder(fn):
+    """Register a callback for recording errors to session log."""
+    global _error_recorder
+    _error_recorder = fn
+

 def _stack_hash(exc: Exception) -> str:
    """Create a stable hash of the exception type + traceback locations.
@@ -87,7 +95,8 @@ def _get_git_context() -> dict:
        ).stdout.strip()

        return {"branch": branch, "commit": commit}
-    except Exception:
+    except Exception as exc:
+        logger.warning("Git info capture error: %s", exc)
        return {"branch": "unknown", "commit": "unknown"}


@@ -199,7 +208,8 @@ def capture_error(
                    "title": title[:100],
                },
            )
-        except Exception:
+        except Exception as exc:
+            logger.warning("Bug report screenshot error: %s", exc)
            pass

    except Exception as task_exc:
@@ -214,19 +224,18 @@ def capture_error(
            message=f"{type(exc).__name__} in {source}: {str(exc)[:80]}",
            category="system",
        )
-    except Exception:
+    except Exception as exc:
+        logger.warning("Bug report notification error: %s", exc)
        pass

-    # 4. Record in session logger
-    try:
-        from timmy.session_logger import get_session_logger
-
-        session_logger = get_session_logger()
-        session_logger.record_error(
-            error=f"{type(exc).__name__}: {str(exc)}",
-            context=source,
-        )
-    except Exception:
-        pass
+    # 4. Record in session logger (via registered callback)
+    if _error_recorder is not None:
+        try:
+            _error_recorder(
+                error=f"{type(exc).__name__}: {str(exc)}",
+                context=source,
+            )
+        except Exception as log_exc:
+            logger.warning("Bug report session logging error: %s", log_exc)

    return task_id
--- a/src/infrastructure/events/broadcaster.py
+++ b/src/infrastructure/events/broadcaster.py
@@ -1,193 +0,0 @@
-"""Event Broadcaster - bridges event_log to WebSocket clients.
-
-When events are logged, they are broadcast to all connected dashboard clients
-via WebSocket for real-time activity feed updates.
-"""
-
-import asyncio
-import logging
-from typing import Optional
-
-try:
-    from swarm.event_log import EventLogEntry
-except ImportError:
-    EventLogEntry = None
-
-logger = logging.getLogger(__name__)
-
-
-class EventBroadcaster:
-    """Broadcasts events to WebSocket clients.
-
-    Usage:
-        from infrastructure.events.broadcaster import event_broadcaster
-        event_broadcaster.broadcast(event)
-    """
-
-    def __init__(self) -> None:
-        self._ws_manager: Optional = None
-
-    def _get_ws_manager(self):
-        """Lazy import to avoid circular deps."""
-        if self._ws_manager is None:
-            try:
-                from infrastructure.ws_manager.handler import ws_manager
-
-                self._ws_manager = ws_manager
-            except Exception as exc:
-                logger.debug("WebSocket manager not available: %s", exc)
-        return self._ws_manager
-
-    async def broadcast(self, event: EventLogEntry) -> int:
-        """Broadcast an event to all connected WebSocket clients.
-
-        Args:
-            event: The event to broadcast
-
-        Returns:
-            Number of clients notified
-        """
-        ws_manager = self._get_ws_manager()
-        if not ws_manager:
-            return 0
-
-        # Build message payload
-        payload = {
-            "type": "event",
-            "payload": {
-                "id": event.id,
-                "event_type": event.event_type.value,
-                "source": event.source,
-                "task_id": event.task_id,
-                "agent_id": event.agent_id,
-                "timestamp": event.timestamp,
-                "data": event.data,
-            },
-        }
-
-        try:
-            # Broadcast to all connected clients
-            count = await ws_manager.broadcast_json(payload)
-            logger.debug("Broadcasted event %s to %d clients", event.id[:8], count)
-            return count
-        except Exception as exc:
-            logger.error("Failed to broadcast event: %s", exc)
-            return 0
-
-    def broadcast_sync(self, event: EventLogEntry) -> None:
-        """Synchronous wrapper for broadcast.
-
-        Use this from synchronous code - it schedules the async broadcast
-        in the event loop if one is running.
-        """
-        try:
-            asyncio.get_running_loop()
-            # Schedule in background, don't wait
-            asyncio.create_task(self.broadcast(event))
-        except RuntimeError:
-            # No event loop running, skip broadcast
-            pass
-
-
-# Global singleton
-event_broadcaster = EventBroadcaster()
-
-
-# Event type to icon/emoji mapping
-EVENT_ICONS = {
-    "task.created": "📝",
-    "task.bidding": "⏳",
-    "task.assigned": "👤",
-    "task.started": "▶️",
-    "task.completed": "✅",
-    "task.failed": "❌",
-    "agent.joined": "🟢",
-    "agent.left": "🔴",
-    "agent.status_changed": "🔄",
-    "bid.submitted": "💰",
-    "auction.closed": "🏁",
-    "tool.called": "🔧",
-    "tool.completed": "⚙️",
-    "tool.failed": "💥",
-    "system.error": "⚠️",
-    "system.warning": "🔶",
-    "system.info": "ℹ️",
-    "error.captured": "🐛",
-    "bug_report.created": "📋",
-}
-
-EVENT_LABELS = {
-    "task.created": "New task",
-    "task.bidding": "Bidding open",
-    "task.assigned": "Task assigned",
-    "task.started": "Task started",
-    "task.completed": "Task completed",
-    "task.failed": "Task failed",
-    "agent.joined": "Agent joined",
-    "agent.left": "Agent left",
-    "agent.status_changed": "Status changed",
-    "bid.submitted": "Bid submitted",
-    "auction.closed": "Auction closed",
-    "tool.called": "Tool called",
-    "tool.completed": "Tool completed",
-    "tool.failed": "Tool failed",
-    "system.error": "Error",
-    "system.warning": "Warning",
-    "system.info": "Info",
-    "error.captured": "Error captured",
-    "bug_report.created": "Bug report filed",
-}
-
-
-def get_event_icon(event_type: str) -> str:
-    """Get emoji icon for event type."""
-    return EVENT_ICONS.get(event_type, "•")
-
-
-def get_event_label(event_type: str) -> str:
-    """Get human-readable label for event type."""
-    return EVENT_LABELS.get(event_type, event_type)
-
-
-def format_event_for_display(event: EventLogEntry) -> dict:
-    """Format event for display in activity feed.
-
-    Returns dict with display-friendly fields.
-    """
-    data = event.data or {}
-
-    # Build description based on event type
-    description = ""
-    if event.event_type.value == "task.created":
-        desc = data.get("description", "")
-        description = desc[:60] + "..." if len(desc) > 60 else desc
-    elif event.event_type.value == "task.assigned":
-        agent = event.agent_id[:8] if event.agent_id else "unknown"
-        bid = data.get("bid_sats", "?")
-        description = f"to {agent} ({bid} sats)"
-    elif event.event_type.value == "bid.submitted":
-        bid = data.get("bid_sats", "?")
-        description = f"{bid} sats"
-    elif event.event_type.value == "agent.joined":
-        persona = data.get("persona_id", "")
-        description = f"Persona: {persona}" if persona else "New agent"
-    else:
-        # Generic: use any string data
-        for key in ["message", "reason", "description"]:
-            if key in data:
-                val = str(data[key])
-                description = val[:60] + "..." if len(val) > 60 else val
-                break
-
-    return {
-        "id": event.id,
-        "icon": get_event_icon(event.event_type.value),
-        "label": get_event_label(event.event_type.value),
-        "type": event.event_type.value,
-        "source": event.source,
-        "description": description,
-        "timestamp": event.timestamp,
-        "time_short": event.timestamp[11:19] if event.timestamp else "",
-        "task_id": event.task_id,
-        "agent_id": event.agent_id,
-    }
--- a/src/infrastructure/events/bus.py
+++ b/src/infrastructure/events/bus.py
@@ -9,7 +9,8 @@ import asyncio
 import json
 import logging
 import sqlite3
-from collections.abc import Callable, Coroutine
+from collections.abc import Callable, Coroutine, Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass, field
 from datetime import UTC, datetime
 from pathlib import Path
@@ -63,7 +64,7 @@ class EventBus:

        @bus.subscribe("agent.task.*")
        async def handle_task(event: Event):
-            print(f"Task event: {event.data}")
+            logger.debug("Task event: %s", event.data)

        await bus.publish(Event(
            type="agent.task.assigned",
@@ -99,51 +100,48 @@ class EventBus:
        if self._persistence_db_path is None:
            return
        self._persistence_db_path.parent.mkdir(parents=True, exist_ok=True)
-        conn = sqlite3.connect(str(self._persistence_db_path))
-        try:
+        with closing(sqlite3.connect(str(self._persistence_db_path))) as conn:
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute("PRAGMA busy_timeout=5000")
            conn.executescript(_EVENTS_SCHEMA)
            conn.commit()
-        finally:
-            conn.close()

-    def _get_persistence_conn(self) -> sqlite3.Connection | None:
+    @contextmanager
+    def _get_persistence_conn(self) -> Generator[sqlite3.Connection | None, None, None]:
        """Get a connection to the persistence database."""
        if self._persistence_db_path is None:
-            return None
-        conn = sqlite3.connect(str(self._persistence_db_path))
-        conn.row_factory = sqlite3.Row
-        conn.execute("PRAGMA busy_timeout=5000")
-        return conn
+            yield None
+            return
+        with closing(sqlite3.connect(str(self._persistence_db_path))) as conn:
+            conn.row_factory = sqlite3.Row
+            conn.execute("PRAGMA busy_timeout=5000")
+            yield conn

    def _persist_event(self, event: Event) -> None:
        """Write an event to the persistence database."""
-        conn = self._get_persistence_conn()
-        if conn is None:
-            return
-        try:
-            task_id = event.data.get("task_id", "")
-            agent_id = event.data.get("agent_id", "")
-            conn.execute(
-                "INSERT OR IGNORE INTO events "
-                "(id, event_type, source, task_id, agent_id, data, timestamp) "
-                "VALUES (?, ?, ?, ?, ?, ?, ?)",
-                (
-                    event.id,
-                    event.type,
-                    event.source,
-                    task_id,
-                    agent_id,
-                    json.dumps(event.data),
-                    event.timestamp,
-                ),
-            )
-            conn.commit()
-        except Exception as exc:
-            logger.debug("Failed to persist event: %s", exc)
-        finally:
-            conn.close()
+        with self._get_persistence_conn() as conn:
+            if conn is None:
+                return
+            try:
+                task_id = event.data.get("task_id", "")
+                agent_id = event.data.get("agent_id", "")
+                conn.execute(
+                    "INSERT OR IGNORE INTO events "
+                    "(id, event_type, source, task_id, agent_id, data, timestamp) "
+                    "VALUES (?, ?, ?, ?, ?, ?, ?)",
+                    (
+                        event.id,
+                        event.type,
+                        event.source,
+                        task_id,
+                        agent_id,
+                        json.dumps(event.data),
+                        event.timestamp,
+                    ),
+                )
+                conn.commit()
+            except Exception as exc:
+                logger.debug("Failed to persist event: %s", exc)

    # ── Replay ───────────────────────────────────────────────────────────

@@ -165,45 +163,43 @@ class EventBus:
        Returns:
            List of Event objects from persistent storage.
        """
-        conn = self._get_persistence_conn()
-        if conn is None:
-            return []
+        with self._get_persistence_conn() as conn:
+            if conn is None:
+                return []

-        try:
-            conditions = []
-            params: list = []
+            try:
+                conditions = []
+                params: list = []

-            if event_type:
-                conditions.append("event_type = ?")
-                params.append(event_type)
-            if source:
-                conditions.append("source = ?")
-                params.append(source)
-            if task_id:
-                conditions.append("task_id = ?")
-                params.append(task_id)
+                if event_type:
+                    conditions.append("event_type = ?")
+                    params.append(event_type)
+                if source:
+                    conditions.append("source = ?")
+                    params.append(source)
+                if task_id:
+                    conditions.append("task_id = ?")
+                    params.append(task_id)

-            where = " AND ".join(conditions) if conditions else "1=1"
-            sql = f"SELECT * FROM events WHERE {where} ORDER BY timestamp DESC LIMIT ?"
-            params.append(limit)
+                where = " AND ".join(conditions) if conditions else "1=1"
+                sql = f"SELECT * FROM events WHERE {where} ORDER BY timestamp DESC LIMIT ?"
+                params.append(limit)

-            rows = conn.execute(sql, params).fetchall()
+                rows = conn.execute(sql, params).fetchall()

-            return [
-                Event(
-                    id=row["id"],
-                    type=row["event_type"],
-                    source=row["source"],
-                    data=json.loads(row["data"]) if row["data"] else {},
-                    timestamp=row["timestamp"],
-                )
-                for row in rows
-            ]
-        except Exception as exc:
-            logger.debug("Failed to replay events: %s", exc)
-            return []
-        finally:
-            conn.close()
+                return [
+                    Event(
+                        id=row["id"],
+                        type=row["event_type"],
+                        source=row["source"],
+                        data=json.loads(row["data"]) if row["data"] else {},
+                        timestamp=row["timestamp"],
+                    )
+                    for row in rows
+                ]
+            except Exception as exc:
+                logger.debug("Failed to replay events: %s", exc)
+                return []

    # ── Subscribe / Publish ──────────────────────────────────────────────

--- a/src/infrastructure/hands/shell.py
+++ b/src/infrastructure/hands/shell.py
@@ -211,7 +211,7 @@ class ShellHand:
                )

            latency = (time.time() - start) * 1000
-            exit_code = proc.returncode or 0
+            exit_code = proc.returncode if proc.returncode is not None else -1
            stdout = stdout_bytes.decode("utf-8", errors="replace").strip()
            stderr = stderr_bytes.decode("utf-8", errors="replace").strip()

--- a/src/infrastructure/models/multimodal.py
+++ b/src/infrastructure/models/multimodal.py
@@ -93,18 +93,6 @@ KNOWN_MODEL_CAPABILITIES: dict[str, set[ModelCapability]] = {
        ModelCapability.VISION,
    },
    # Qwen series
-    "qwen3.5": {
-        ModelCapability.TEXT,
-        ModelCapability.TOOLS,
-        ModelCapability.JSON,
-        ModelCapability.STREAMING,
-    },
-    "qwen3.5:latest": {
-        ModelCapability.TEXT,
-        ModelCapability.TOOLS,
-        ModelCapability.JSON,
-        ModelCapability.STREAMING,
-    },
    "qwen2.5": {
        ModelCapability.TEXT,
        ModelCapability.TOOLS,
@@ -271,9 +259,8 @@ DEFAULT_FALLBACK_CHAINS: dict[ModelCapability, list[str]] = {
    ],
    ModelCapability.TOOLS: [
        "llama3.1:8b-instruct",  # Best tool use
-        "qwen3.5:latest",  # Qwen 3.5 — strong tool use
-        "llama3.2:3b",  # Smaller but capable
        "qwen2.5:7b",  # Reliable fallback
+        "llama3.2:3b",  # Smaller but capable
    ],
    ModelCapability.AUDIO: [
        # Audio models are less common in Ollama
--- a/src/infrastructure/models/registry.py
+++ b/src/infrastructure/models/registry.py
@@ -11,6 +11,8 @@ model roles (student, teacher, judge/PRM) run on dedicated resources.
 import logging
 import sqlite3
 import threading
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass
 from datetime import UTC, datetime
 from enum import StrEnum
@@ -60,36 +62,37 @@ class CustomModel:
            self.registered_at = datetime.now(UTC).isoformat()


-def _get_conn() -> sqlite3.Connection:
+@contextmanager
+def _get_conn() -> Generator[sqlite3.Connection, None, None]:
    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    conn.execute("PRAGMA journal_mode=WAL")
-    conn.execute("PRAGMA busy_timeout=5000")
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS custom_models (
-            name            TEXT PRIMARY KEY,
-            format          TEXT NOT NULL,
-            path            TEXT NOT NULL,
-            role            TEXT NOT NULL DEFAULT 'general',
-            context_window  INTEGER NOT NULL DEFAULT 4096,
-            description     TEXT NOT NULL DEFAULT '',
-            registered_at   TEXT NOT NULL,
-            active          INTEGER NOT NULL DEFAULT 1,
-            default_temperature REAL NOT NULL DEFAULT 0.7,
-            max_tokens      INTEGER NOT NULL DEFAULT 2048
-        )
-        """)
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS agent_model_assignments (
-            agent_id    TEXT PRIMARY KEY,
-            model_name  TEXT NOT NULL,
-            assigned_at TEXT NOT NULL,
-            FOREIGN KEY (model_name) REFERENCES custom_models(name)
-        )
-        """)
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(DB_PATH))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("PRAGMA journal_mode=WAL")
+        conn.execute("PRAGMA busy_timeout=5000")
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS custom_models (
+                name            TEXT PRIMARY KEY,
+                format          TEXT NOT NULL,
+                path            TEXT NOT NULL,
+                role            TEXT NOT NULL DEFAULT 'general',
+                context_window  INTEGER NOT NULL DEFAULT 4096,
+                description     TEXT NOT NULL DEFAULT '',
+                registered_at   TEXT NOT NULL,
+                active          INTEGER NOT NULL DEFAULT 1,
+                default_temperature REAL NOT NULL DEFAULT 0.7,
+                max_tokens      INTEGER NOT NULL DEFAULT 2048
+            )
+            """)
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS agent_model_assignments (
+                agent_id    TEXT PRIMARY KEY,
+                model_name  TEXT NOT NULL,
+                assigned_at TEXT NOT NULL,
+                FOREIGN KEY (model_name) REFERENCES custom_models(name)
+            )
+            """)
+        conn.commit()
+        yield conn


 class ModelRegistry:
@@ -105,23 +108,22 @@ class ModelRegistry:
    def _load_from_db(self) -> None:
        """Bootstrap cache from SQLite."""
        try:
-            conn = _get_conn()
-            for row in conn.execute("SELECT * FROM custom_models WHERE active = 1").fetchall():
-                self._models[row["name"]] = CustomModel(
-                    name=row["name"],
-                    format=ModelFormat(row["format"]),
-                    path=row["path"],
-                    role=ModelRole(row["role"]),
-                    context_window=row["context_window"],
-                    description=row["description"],
-                    registered_at=row["registered_at"],
-                    active=bool(row["active"]),
-                    default_temperature=row["default_temperature"],
-                    max_tokens=row["max_tokens"],
-                )
-            for row in conn.execute("SELECT * FROM agent_model_assignments").fetchall():
-                self._agent_assignments[row["agent_id"]] = row["model_name"]
-            conn.close()
+            with _get_conn() as conn:
+                for row in conn.execute("SELECT * FROM custom_models WHERE active = 1").fetchall():
+                    self._models[row["name"]] = CustomModel(
+                        name=row["name"],
+                        format=ModelFormat(row["format"]),
+                        path=row["path"],
+                        role=ModelRole(row["role"]),
+                        context_window=row["context_window"],
+                        description=row["description"],
+                        registered_at=row["registered_at"],
+                        active=bool(row["active"]),
+                        default_temperature=row["default_temperature"],
+                        max_tokens=row["max_tokens"],
+                    )
+                for row in conn.execute("SELECT * FROM agent_model_assignments").fetchall():
+                    self._agent_assignments[row["agent_id"]] = row["model_name"]
        except Exception as exc:
            logger.warning("Failed to load model registry from DB: %s", exc)

@@ -130,29 +132,28 @@ class ModelRegistry:
    def register(self, model: CustomModel) -> CustomModel:
        """Register a new custom model."""
        with self._lock:
-            conn = _get_conn()
-            conn.execute(
-                """
-                INSERT OR REPLACE INTO custom_models
-                    (name, format, path, role, context_window, description,
-                     registered_at, active, default_temperature, max_tokens)
-                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-                """,
-                (
-                    model.name,
-                    model.format.value,
-                    model.path,
-                    model.role.value,
-                    model.context_window,
-                    model.description,
-                    model.registered_at,
-                    int(model.active),
-                    model.default_temperature,
-                    model.max_tokens,
-                ),
-            )
-            conn.commit()
-            conn.close()
+            with _get_conn() as conn:
+                conn.execute(
+                    """
+                    INSERT OR REPLACE INTO custom_models
+                        (name, format, path, role, context_window, description,
+                         registered_at, active, default_temperature, max_tokens)
+                    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+                    """,
+                    (
+                        model.name,
+                        model.format.value,
+                        model.path,
+                        model.role.value,
+                        model.context_window,
+                        model.description,
+                        model.registered_at,
+                        int(model.active),
+                        model.default_temperature,
+                        model.max_tokens,
+                    ),
+                )
+                conn.commit()
            self._models[model.name] = model
            logger.info("Registered model: %s (%s)", model.name, model.format.value)
            return model
@@ -162,11 +163,10 @@ class ModelRegistry:
        with self._lock:
            if name not in self._models:
                return False
-            conn = _get_conn()
-            conn.execute("DELETE FROM custom_models WHERE name = ?", (name,))
-            conn.execute("DELETE FROM agent_model_assignments WHERE model_name = ?", (name,))
-            conn.commit()
-            conn.close()
+            with _get_conn() as conn:
+                conn.execute("DELETE FROM custom_models WHERE name = ?", (name,))
+                conn.execute("DELETE FROM agent_model_assignments WHERE model_name = ?", (name,))
+                conn.commit()
            del self._models[name]
            # Remove any agent assignments using this model
            self._agent_assignments = {
@@ -193,13 +193,12 @@ class ModelRegistry:
            return False
        with self._lock:
            model.active = active
-            conn = _get_conn()
-            conn.execute(
-                "UPDATE custom_models SET active = ? WHERE name = ?",
-                (int(active), name),
-            )
-            conn.commit()
-            conn.close()
+            with _get_conn() as conn:
+                conn.execute(
+                    "UPDATE custom_models SET active = ? WHERE name = ?",
+                    (int(active), name),
+                )
+                conn.commit()
        return True

    # ── Agent-model assignments ────────────────────────────────────────────
@@ -210,17 +209,16 @@ class ModelRegistry:
            return False
        with self._lock:
            now = datetime.now(UTC).isoformat()
-            conn = _get_conn()
-            conn.execute(
-                """
-                INSERT OR REPLACE INTO agent_model_assignments
-                    (agent_id, model_name, assigned_at)
-                VALUES (?, ?, ?)
-                """,
-                (agent_id, model_name, now),
-            )
-            conn.commit()
-            conn.close()
+            with _get_conn() as conn:
+                conn.execute(
+                    """
+                    INSERT OR REPLACE INTO agent_model_assignments
+                        (agent_id, model_name, assigned_at)
+                    VALUES (?, ?, ?)
+                    """,
+                    (agent_id, model_name, now),
+                )
+                conn.commit()
            self._agent_assignments[agent_id] = model_name
            logger.info("Assigned model %s to agent %s", model_name, agent_id)
        return True
@@ -230,13 +228,12 @@ class ModelRegistry:
        with self._lock:
            if agent_id not in self._agent_assignments:
                return False
-            conn = _get_conn()
-            conn.execute(
-                "DELETE FROM agent_model_assignments WHERE agent_id = ?",
-                (agent_id,),
-            )
-            conn.commit()
-            conn.close()
+            with _get_conn() as conn:
+                conn.execute(
+                    "DELETE FROM agent_model_assignments WHERE agent_id = ?",
+                    (agent_id,),
+                )
+                conn.commit()
            del self._agent_assignments[agent_id]
        return True

--- a/src/infrastructure/router/cascade.py
+++ b/src/infrastructure/router/cascade.py
@@ -304,7 +304,8 @@ class CascadeRouter:
                url = provider.url or "http://localhost:11434"
                response = requests.get(f"{url}/api/tags", timeout=5)
                return response.status_code == 200
-            except Exception:
+            except Exception as exc:
+                logger.debug("Ollama provider check error: %s", exc)
                return False

        elif provider.type == "airllm":
--- a/src/infrastructure/ws_manager/handler.py
+++ b/src/infrastructure/ws_manager/handler.py
@@ -54,7 +54,8 @@ class WebSocketManager:
        for event in list(self._event_history)[-20:]:
            try:
                await websocket.send_text(event.to_json())
-            except Exception:
+            except Exception as exc:
+                logger.warning("WebSocket history send error: %s", exc)
                break

    def disconnect(self, websocket: WebSocket) -> None:
@@ -83,8 +84,8 @@ class WebSocketManager:
                await ws.send_text(message)
            except ConnectionError:
                disconnected.append(ws)
-            except Exception:
-                logger.warning("Unexpected WebSocket send error", exc_info=True)
+            except Exception as exc:
+                logger.warning("Unexpected WebSocket send error: %s", exc, exc_info=True)
                disconnected.append(ws)

        # Clean up dead connections
@@ -156,7 +157,8 @@ class WebSocketManager:
            try:
                await ws.send_text(message)
                count += 1
-            except Exception:
+            except Exception as exc:
+                logger.warning("WebSocket direct send error: %s", exc)
                disconnected.append(ws)

        # Clean up dead connections
--- a/src/integrations/chat_bridge/vendors/discord.py
+++ b/src/integrations/chat_bridge/vendors/discord.py
@@ -87,7 +87,8 @@ if _DISCORD_UI_AVAILABLE:
                await action["target"].send(
                    f"Action `{action['tool_name']}` timed out and was auto-rejected."
                )
-            except Exception:
+            except Exception as exc:
+                logger.warning("Discord action timeout message error: %s", exc)
                pass


@@ -186,7 +187,8 @@ class DiscordVendor(ChatPlatform):
        if self._client and not self._client.is_closed():
            try:
                await self._client.close()
-            except Exception:
+            except Exception as exc:
+                logger.warning("Discord client close error: %s", exc)
                pass
            self._client = None

@@ -330,7 +332,8 @@ class DiscordVendor(ChatPlatform):

            if settings.discord_token:
                return settings.discord_token
-        except Exception:
+        except Exception as exc:
+            logger.warning("Discord token load error: %s", exc)
            pass

        # 2. Fall back to state file (set via /discord/setup endpoint)
@@ -458,7 +461,8 @@ class DiscordVendor(ChatPlatform):
        req.reject(note="User rejected from Discord")
        try:
            await continue_chat(action["run_output"], action.get("session_id"))
-        except Exception:
+        except Exception as exc:
+            logger.warning("Discord continue chat error: %s", exc)
            pass

        await interaction.response.send_message(
--- a/src/integrations/telegram_bot/bot.py
+++ b/src/integrations/telegram_bot/bot.py
@@ -56,7 +56,8 @@ class TelegramBot:
            from config import settings

            return settings.telegram_token or None
-        except Exception:
+        except Exception as exc:
+            logger.warning("Telegram token load error: %s", exc)
            return None

    def save_token(self, token: str) -> None:
--- a/src/spark/eidos.py
+++ b/src/spark/eidos.py
@@ -16,6 +16,8 @@ import json
 import logging
 import sqlite3
 import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass
 from datetime import UTC, datetime
 from pathlib import Path
@@ -39,28 +41,31 @@ class Prediction:
    evaluated_at: str | None


-def _get_conn() -> sqlite3.Connection:
+@contextmanager
+def _get_conn() -> Generator[sqlite3.Connection, None, None]:
    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    conn.execute("PRAGMA journal_mode=WAL")
-    conn.execute("PRAGMA busy_timeout=5000")
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS spark_predictions (
-            id               TEXT PRIMARY KEY,
-            task_id          TEXT NOT NULL,
-            prediction_type  TEXT NOT NULL,
-            predicted_value  TEXT NOT NULL,
-            actual_value     TEXT,
-            accuracy         REAL,
-            created_at       TEXT NOT NULL,
-            evaluated_at     TEXT
+    with closing(sqlite3.connect(str(DB_PATH))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("PRAGMA journal_mode=WAL")
+        conn.execute("PRAGMA busy_timeout=5000")
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS spark_predictions (
+                id               TEXT PRIMARY KEY,
+                task_id          TEXT NOT NULL,
+                prediction_type  TEXT NOT NULL,
+                predicted_value  TEXT NOT NULL,
+                actual_value     TEXT,
+                accuracy         REAL,
+                created_at       TEXT NOT NULL,
+                evaluated_at     TEXT
+            )
+            """)
+        conn.execute("CREATE INDEX IF NOT EXISTS idx_pred_task ON spark_predictions(task_id)")
+        conn.execute(
+            "CREATE INDEX IF NOT EXISTS idx_pred_type ON spark_predictions(prediction_type)"
        )
-        """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_pred_task ON spark_predictions(task_id)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_pred_type ON spark_predictions(prediction_type)")
-    conn.commit()
-    return conn
+        conn.commit()
+        yield conn


 # ── Prediction phase ────────────────────────────────────────────────────────
@@ -119,17 +124,16 @@ def predict_task_outcome(
    # Store prediction
    pred_id = str(uuid.uuid4())
    now = datetime.now(UTC).isoformat()
-    conn = _get_conn()
-    conn.execute(
-        """
-        INSERT INTO spark_predictions
-            (id, task_id, prediction_type, predicted_value, created_at)
-        VALUES (?, ?, ?, ?, ?)
-        """,
-        (pred_id, task_id, "outcome", json.dumps(prediction), now),
-    )
-    conn.commit()
-    conn.close()
+    with _get_conn() as conn:
+        conn.execute(
+            """
+            INSERT INTO spark_predictions
+                (id, task_id, prediction_type, predicted_value, created_at)
+            VALUES (?, ?, ?, ?, ?)
+            """,
+            (pred_id, task_id, "outcome", json.dumps(prediction), now),
+        )
+        conn.commit()

    prediction["prediction_id"] = pred_id
    return prediction
@@ -148,41 +152,39 @@ def evaluate_prediction(

    Returns the evaluation result or None if no prediction exists.
    """
-    conn = _get_conn()
-    row = conn.execute(
-        """
-        SELECT * FROM spark_predictions
-        WHERE task_id = ? AND prediction_type = 'outcome' AND evaluated_at IS NULL
-        ORDER BY created_at DESC LIMIT 1
-        """,
-        (task_id,),
-    ).fetchone()
+    with _get_conn() as conn:
+        row = conn.execute(
+            """
+            SELECT * FROM spark_predictions
+            WHERE task_id = ? AND prediction_type = 'outcome' AND evaluated_at IS NULL
+            ORDER BY created_at DESC LIMIT 1
+            """,
+            (task_id,),
+        ).fetchone()

-    if not row:
-        conn.close()
-        return None
+        if not row:
+            return None

-    predicted = json.loads(row["predicted_value"])
-    actual = {
-        "winner": actual_winner,
-        "succeeded": task_succeeded,
-        "winning_bid": winning_bid,
-    }
+        predicted = json.loads(row["predicted_value"])
+        actual = {
+            "winner": actual_winner,
+            "succeeded": task_succeeded,
+            "winning_bid": winning_bid,
+        }

-    # Calculate accuracy
-    accuracy = _compute_accuracy(predicted, actual)
-    now = datetime.now(UTC).isoformat()
+        # Calculate accuracy
+        accuracy = _compute_accuracy(predicted, actual)
+        now = datetime.now(UTC).isoformat()

-    conn.execute(
-        """
-        UPDATE spark_predictions
-        SET actual_value = ?, accuracy = ?, evaluated_at = ?
-        WHERE id = ?
-        """,
-        (json.dumps(actual), accuracy, now, row["id"]),
-    )
-    conn.commit()
-    conn.close()
+        conn.execute(
+            """
+            UPDATE spark_predictions
+            SET actual_value = ?, accuracy = ?, evaluated_at = ?
+            WHERE id = ?
+            """,
+            (json.dumps(actual), accuracy, now, row["id"]),
+        )
+        conn.commit()

    return {
        "prediction_id": row["id"],
@@ -243,7 +245,6 @@ def get_predictions(
    limit: int = 50,
 ) -> list[Prediction]:
    """Query stored predictions."""
-    conn = _get_conn()
    query = "SELECT * FROM spark_predictions WHERE 1=1"
    params: list = []

@@ -256,8 +257,8 @@ def get_predictions(
    query += " ORDER BY created_at DESC LIMIT ?"
    params.append(limit)

-    rows = conn.execute(query, params).fetchall()
-    conn.close()
+    with _get_conn() as conn:
+        rows = conn.execute(query, params).fetchall()
    return [
        Prediction(
            id=r["id"],
@@ -275,17 +276,16 @@ def get_predictions(

 def get_accuracy_stats() -> dict:
    """Return aggregate accuracy statistics for the EIDOS loop."""
-    conn = _get_conn()
-    row = conn.execute("""
-        SELECT
-            COUNT(*)                          AS total_predictions,
-            COUNT(evaluated_at)               AS evaluated,
-            AVG(CASE WHEN accuracy IS NOT NULL THEN accuracy END) AS avg_accuracy,
-            MIN(CASE WHEN accuracy IS NOT NULL THEN accuracy END) AS min_accuracy,
-            MAX(CASE WHEN accuracy IS NOT NULL THEN accuracy END) AS max_accuracy
-        FROM spark_predictions
-        """).fetchone()
-    conn.close()
+    with _get_conn() as conn:
+        row = conn.execute("""
+            SELECT
+                COUNT(*)                          AS total_predictions,
+                COUNT(evaluated_at)               AS evaluated,
+                AVG(CASE WHEN accuracy IS NOT NULL THEN accuracy END) AS avg_accuracy,
+                MIN(CASE WHEN accuracy IS NOT NULL THEN accuracy END) AS min_accuracy,
+                MAX(CASE WHEN accuracy IS NOT NULL THEN accuracy END) AS max_accuracy
+            FROM spark_predictions
+            """).fetchone()

    return {
        "total_predictions": row["total_predictions"] or 0,
--- a/src/spark/engine.py
+++ b/src/spark/engine.py
@@ -273,6 +273,8 @@ class SparkEngine:

    def _maybe_consolidate(self, agent_id: str) -> None:
        """Consolidate events into memories when enough data exists."""
+        from datetime import UTC, datetime, timedelta
+
        agent_events = spark_memory.get_events(agent_id=agent_id, limit=50)
        if len(agent_events) < 5:
            return
@@ -286,7 +288,34 @@ class SparkEngine:

        success_rate = len(completions) / total if total else 0

+        # Determine target memory type based on success rate
        if success_rate >= 0.8:
+            target_memory_type = "pattern"
+        elif success_rate <= 0.3:
+            target_memory_type = "anomaly"
+        else:
+            return  # No consolidation needed for neutral success rates
+
+        # Check for recent memories of the same type for this agent
+        existing_memories = spark_memory.get_memories(subject=agent_id, limit=5)
+        now = datetime.now(UTC)
+        one_hour_ago = now - timedelta(hours=1)
+
+        for memory in existing_memories:
+            if memory.memory_type == target_memory_type:
+                try:
+                    created_at = datetime.fromisoformat(memory.created_at)
+                    if created_at >= one_hour_ago:
+                        logger.info(
+                            "Consolidation: skipping — recent memory exists for %s",
+                            agent_id[:8],
+                        )
+                        return
+                except (ValueError, TypeError):
+                    continue
+
+        # Store the new memory
+        if target_memory_type == "pattern":
            spark_memory.store_memory(
                memory_type="pattern",
                subject=agent_id,
@@ -295,7 +324,7 @@ class SparkEngine:
                confidence=min(0.95, 0.6 + total * 0.05),
                source_events=total,
            )
-        elif success_rate <= 0.3:
+        else:  # anomaly
            spark_memory.store_memory(
                memory_type="anomaly",
                subject=agent_id,
@@ -358,7 +387,8 @@ def get_spark_engine() -> SparkEngine:
            from config import settings

            _spark_engine = SparkEngine(enabled=settings.spark_enabled)
-        except Exception:
+        except Exception as exc:
+            logger.debug("Spark engine settings load error: %s", exc)
            _spark_engine = SparkEngine(enabled=True)
    return _spark_engine

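Review note on the `_maybe_consolidate` change above: the new logic first picks a target memory type from the success rate, then skips storing when a memory of that type was created within the last hour. The sketch below restates that decision as a pure function so it can be checked without a database. `Memory`, `target_type`, and `should_store` are hypothetical names for illustration only; the real code reads memories through `spark_memory.get_memories`.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Memory:
    """Hypothetical stand-in carrying only the two fields the check reads."""

    memory_type: str
    created_at: str  # ISO-8601 timestamp string


def target_type(success_rate: float) -> str | None:
    """Map a success rate to a memory type; neutral rates produce nothing."""
    if success_rate >= 0.8:
        return "pattern"
    if success_rate <= 0.3:
        return "anomaly"
    return None


def should_store(
    success_rate: float,
    existing: list[Memory],
    now: datetime,
    window: timedelta = timedelta(hours=1),
) -> bool:
    """Return True when a new memory should be written."""
    wanted = target_type(success_rate)
    if wanted is None:
        return False
    cutoff = now - window
    for memory in existing:
        if memory.memory_type != wanted:
            continue
        try:
            if datetime.fromisoformat(memory.created_at) >= cutoff:
                return False  # a recent memory of the same type already exists
        except (ValueError, TypeError):
            # Unparseable or timezone-naive timestamps are ignored, as in the diff.
            continue
    return True
```

One consequence worth noting: a timestamp that cannot be parsed, or that lacks timezone information, never blocks consolidation, because the comparison error is swallowed and the loop moves on.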
--- a/src/spark/memory.py
+++ b/src/spark/memory.py
@@ -10,12 +10,17 @@ spark_events   — raw event log (every swarm event)
 spark_memories — consolidated insights extracted from event patterns
 """

+import logging
 import sqlite3
 import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass
 from datetime import UTC, datetime
 from pathlib import Path

+logger = logging.getLogger(__name__)
+
 DB_PATH = Path("data/spark.db")

 # Importance thresholds
@@ -52,42 +57,43 @@ class SparkMemory:
    expires_at: str | None


-def _get_conn() -> sqlite3.Connection:
+@contextmanager
+def _get_conn() -> Generator[sqlite3.Connection, None, None]:
    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    conn.execute("PRAGMA journal_mode=WAL")
-    conn.execute("PRAGMA busy_timeout=5000")
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS spark_events (
-            id          TEXT PRIMARY KEY,
-            event_type  TEXT NOT NULL,
-            agent_id    TEXT,
-            task_id     TEXT,
-            description TEXT NOT NULL DEFAULT '',
-            data        TEXT NOT NULL DEFAULT '{}',
-            importance  REAL NOT NULL DEFAULT 0.5,
-            created_at  TEXT NOT NULL
-        )
-        """)
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS spark_memories (
-            id              TEXT PRIMARY KEY,
-            memory_type     TEXT NOT NULL,
-            subject         TEXT NOT NULL DEFAULT 'system',
-            content         TEXT NOT NULL,
-            confidence      REAL NOT NULL DEFAULT 0.5,
-            source_events   INTEGER NOT NULL DEFAULT 0,
-            created_at      TEXT NOT NULL,
-            expires_at      TEXT
-        )
-        """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_events_type ON spark_events(event_type)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_events_agent ON spark_events(agent_id)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_events_task ON spark_events(task_id)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_subject ON spark_memories(subject)")
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(DB_PATH))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("PRAGMA journal_mode=WAL")
+        conn.execute("PRAGMA busy_timeout=5000")
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS spark_events (
+                id          TEXT PRIMARY KEY,
+                event_type  TEXT NOT NULL,
+                agent_id    TEXT,
+                task_id     TEXT,
+                description TEXT NOT NULL DEFAULT '',
+                data        TEXT NOT NULL DEFAULT '{}',
+                importance  REAL NOT NULL DEFAULT 0.5,
+                created_at  TEXT NOT NULL
+            )
+            """)
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS spark_memories (
+                id              TEXT PRIMARY KEY,
+                memory_type     TEXT NOT NULL,
+                subject         TEXT NOT NULL DEFAULT 'system',
+                content         TEXT NOT NULL,
+                confidence      REAL NOT NULL DEFAULT 0.5,
+                source_events   INTEGER NOT NULL DEFAULT 0,
+                created_at      TEXT NOT NULL,
+                expires_at      TEXT
+            )
+            """)
+        conn.execute("CREATE INDEX IF NOT EXISTS idx_events_type ON spark_events(event_type)")
+        conn.execute("CREATE INDEX IF NOT EXISTS idx_events_agent ON spark_events(agent_id)")
+        conn.execute("CREATE INDEX IF NOT EXISTS idx_events_task ON spark_events(task_id)")
+        conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_subject ON spark_memories(subject)")
+        conn.commit()
+        yield conn


 # ── Importance scoring ──────────────────────────────────────────────────────
@@ -146,17 +152,16 @@ def record_event(
            parsed = {}
        importance = score_importance(event_type, parsed)

-    conn = _get_conn()
-    conn.execute(
-        """
-        INSERT INTO spark_events
-            (id, event_type, agent_id, task_id, description, data, importance, created_at)
-        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
-        """,
-        (event_id, event_type, agent_id, task_id, description, data, importance, now),
-    )
-    conn.commit()
-    conn.close()
+    with _get_conn() as conn:
+        conn.execute(
+            """
+            INSERT INTO spark_events
+                (id, event_type, agent_id, task_id, description, data, importance, created_at)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            (event_id, event_type, agent_id, task_id, description, data, importance, now),
+        )
+        conn.commit()

    # Bridge to unified event log so all events are queryable from one place
    try:
@@ -170,7 +175,8 @@ def record_event(
            task_id=task_id or "",
            agent_id=agent_id or "",
        )
-    except Exception:
+    except Exception as exc:
+        logger.debug("Spark event log error: %s", exc)
        pass  # Graceful — don't break spark if event_log is unavailable

    return event_id
@@ -184,7 +190,6 @@ def get_events(
    min_importance: float = 0.0,
 ) -> list[SparkEvent]:
    """Query events with optional filters."""
-    conn = _get_conn()
    query = "SELECT * FROM spark_events WHERE importance >= ?"
    params: list = [min_importance]

@@ -201,8 +206,8 @@ def get_events(
    query += " ORDER BY created_at DESC LIMIT ?"
    params.append(limit)

-    rows = conn.execute(query, params).fetchall()
-    conn.close()
+    with _get_conn() as conn:
+        rows = conn.execute(query, params).fetchall()
    return [
        SparkEvent(
            id=r["id"],
@@ -220,15 +225,14 @@ def get_events(

 def count_events(event_type: str | None = None) -> int:
    """Count events, optionally filtered by type."""
-    conn = _get_conn()
-    if event_type:
-        row = conn.execute(
-            "SELECT COUNT(*) FROM spark_events WHERE event_type = ?",
-            (event_type,),
-        ).fetchone()
-    else:
-        row = conn.execute("SELECT COUNT(*) FROM spark_events").fetchone()
-    conn.close()
+    with _get_conn() as conn:
+        if event_type:
+            row = conn.execute(
+                "SELECT COUNT(*) FROM spark_events WHERE event_type = ?",
+                (event_type,),
+            ).fetchone()
+        else:
+            row = conn.execute("SELECT COUNT(*) FROM spark_events").fetchone()
    return row[0]


@@ -246,17 +250,16 @@ def store_memory(
    """Store a consolidated memory.  Returns the memory id."""
    mem_id = str(uuid.uuid4())
    now = datetime.now(UTC).isoformat()
-    conn = _get_conn()
-    conn.execute(
-        """
-        INSERT INTO spark_memories
-            (id, memory_type, subject, content, confidence, source_events, created_at, expires_at)
-        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
-        """,
-        (mem_id, memory_type, subject, content, confidence, source_events, now, expires_at),
-    )
-    conn.commit()
-    conn.close()
+    with _get_conn() as conn:
+        conn.execute(
+            """
+            INSERT INTO spark_memories
+                (id, memory_type, subject, content, confidence, source_events, created_at, expires_at)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            (mem_id, memory_type, subject, content, confidence, source_events, now, expires_at),
+        )
+        conn.commit()
    return mem_id


@@ -267,7 +270,6 @@ def get_memories(
    limit: int = 50,
 ) -> list[SparkMemory]:
    """Query memories with optional filters."""
-    conn = _get_conn()
    query = "SELECT * FROM spark_memories WHERE confidence >= ?"
    params: list = [min_confidence]

@@ -281,8 +283,8 @@ def get_memories(
    query += " ORDER BY created_at DESC LIMIT ?"
    params.append(limit)

-    rows = conn.execute(query, params).fetchall()
-    conn.close()
+    with _get_conn() as conn:
+        rows = conn.execute(query, params).fetchall()
    return [
        SparkMemory(
            id=r["id"],
@@ -300,13 +302,12 @@ def get_memories(

 def count_memories(memory_type: str | None = None) -> int:
    """Count memories, optionally filtered by type."""
-    conn = _get_conn()
-    if memory_type:
-        row = conn.execute(
-            "SELECT COUNT(*) FROM spark_memories WHERE memory_type = ?",
-            (memory_type,),
-        ).fetchone()
-    else:
-        row = conn.execute("SELECT COUNT(*) FROM spark_memories").fetchone()
-    conn.close()
+    with _get_conn() as conn:
+        if memory_type:
+            row = conn.execute(
+                "SELECT COUNT(*) FROM spark_memories WHERE memory_type = ?",
+                (memory_type,),
+            ).fetchone()
+        else:
+            row = conn.execute("SELECT COUNT(*) FROM spark_memories").fetchone()
    return row[0]
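Review note on the `with _get_conn() as conn:` refactor in the hunks above: a bare `sqlite3.Connection` used as a context manager only commits or rolls back; it does not close the connection. Dropping the explicit `conn.close()` calls is therefore only safe if `_get_conn()` is itself a context manager that closes on exit. Its definition is not visible in this part of the diff, so the following is a minimal sketch of what such a helper could look like. `get_conn`, `demo`, and the in-memory database are hypothetical stand-ins, not the project's code.

```python
import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def get_conn(path: str = ":memory:") -> Iterator[sqlite3.Connection]:
    """Yield a connection and always close it (hypothetical stand-in for _get_conn)."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    try:
        yield conn
    finally:
        # Runs on normal exit and on exceptions, so the handle never leaks.
        conn.close()


def demo() -> tuple[int, bool]:
    """Insert one row, count it, then check the connection was closed."""
    with get_conn() as conn:
        conn.execute("CREATE TABLE spark_memories (id TEXT, memory_type TEXT)")
        conn.execute("INSERT INTO spark_memories VALUES (?, ?)", ("m1", "pattern"))
        conn.commit()
        row = conn.execute("SELECT COUNT(*) FROM spark_memories").fetchone()
    # Using a closed connection raises sqlite3.ProgrammingError.
    try:
        conn.execute("SELECT 1")
        closed = False
    except sqlite3.ProgrammingError:
        closed = True
    return row[0], closed
```

With a helper shaped like this, the `fetchone()` / `fetchall()` results stay usable after the block because the rows are already materialized before the connection closes.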
--- a/src/timmy/agent.py
+++ b/src/timmy/agent.py
@@ -16,6 +16,7 @@ Handoff Protocol maintains continuity across sessions.
 import logging
 from typing import TYPE_CHECKING, Union

+import httpx
 from agno.agent import Agent
 from agno.db.sqlite import SqliteDb
 from agno.models.ollama import Ollama
@@ -29,24 +30,6 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

-# Fallback chain for text/tool models (in order of preference)
-DEFAULT_MODEL_FALLBACKS = [
-    "llama3.1:8b-instruct",
-    "llama3.1",
-    "qwen3.5:latest",
-    "qwen2.5:14b",
-    "qwen2.5:7b",
-    "llama3.2:3b",
-]
-
-# Fallback chain for vision models
-VISION_MODEL_FALLBACKS = [
-    "llama3.2:3b",
-    "llava:7b",
-    "qwen2.5-vl:3b",
-    "moondream:1.8b",
-]
-
 # Union type for callers that want to hint the return type.
 TimmyAgent = Union[Agent, "TimmyAirLLMAgent", "GrokBackend", "ClaudeBackend"]

@@ -130,8 +113,8 @@ def _resolve_model_with_fallback(
            return model, False
        logger.warning("Failed to pull %s, checking fallbacks...", model)

-    # Use appropriate fallback chain
-    fallback_chain = VISION_MODEL_FALLBACKS if require_vision else DEFAULT_MODEL_FALLBACKS
+    # Use appropriate configurable fallback chain (from settings / env vars)
+    fallback_chain = settings.vision_fallback_models if require_vision else settings.fallback_models

    for fallback_model in fallback_chain:
        if _check_model_available(fallback_model):
@@ -162,6 +145,32 @@ def _model_supports_tools(model_name: str) -> bool:
    return True


+def _warmup_model(model_name: str) -> bool:
+    """Warm up an Ollama model by sending a minimal generation request.
+
+    This prevents 'Server disconnected' errors on first request after cold model load.
+    Cold loads can take 30-40s, so we use a 60s timeout.
+
+    Args:
+        model_name: Name of the Ollama model to warm up
+
+    Returns:
+        True if warmup succeeded, False otherwise (does not raise)
+    """
+    try:
+        response = httpx.post(
+            f"{settings.ollama_url}/api/generate",
+            json={"model": model_name, "prompt": "hi", "options": {"num_predict": 1}},
+            timeout=60.0,
+        )
+        response.raise_for_status()
+        logger.info("Model %s warmed up successfully", model_name)
+        return True
+    except Exception as exc:
+        logger.warning("Model warmup failed: %s — first request may disconnect", exc)
+        return False
+
+
 def _resolve_backend(requested: str | None) -> str:
    """Return the backend name to use, resolving 'auto' and explicit overrides.

@@ -192,6 +201,9 @@ def create_timmy(
    db_file: str = "timmy.db",
    backend: str | None = None,
    model_size: str | None = None,
+    *,
+    skip_mcp: bool = False,
+    session_id: str = "unknown",
 ) -> TimmyAgent:
    """Instantiate the agent — Ollama or AirLLM, same public interface.

@@ -199,6 +211,10 @@ def create_timmy(
        db_file:    SQLite file for Agno conversation memory (Ollama path only).
        backend:    "ollama" | "airllm" | "auto" | None (reads config/env).
        model_size: AirLLM size — "8b" | "70b" | "405b" | None (reads config).
+        skip_mcp:   If True, omit MCP tool servers (Gitea, filesystem).
+                    Use for background tasks (thinking, QA) where MCP's
+                    stdio cancel-scope lifecycle conflicts with asyncio
+                    task cancellation.

    Returns an Agno Agent or backend-specific agent — all expose
    print_response(message, stream).
@@ -253,8 +269,10 @@ def create_timmy(
    if toolkit:
        tools_list.append(toolkit)

-    # Add MCP tool servers (lazy-connected on first arun())
-    if use_tools:
+    # Add MCP tool servers (lazy-connected on first arun()).
+    # Skipped when skip_mcp=True — MCP's stdio transport uses anyio cancel
+    # scopes that conflict with asyncio background task cancellation (#72).
+    if use_tools and not skip_mcp:
        try:
            from timmy.mcp_tools import create_filesystem_mcp_tools, create_gitea_mcp_tools

@@ -269,7 +287,7 @@ def create_timmy(
            logger.debug("MCP tools unavailable: %s", exc)

    # Select prompt tier based on tool capability
-    base_prompt = get_system_prompt(tools_enabled=use_tools)
+    base_prompt = get_system_prompt(tools_enabled=use_tools, session_id=session_id)

    # Try to load memory context
    try:
@@ -289,18 +307,23 @@ def create_timmy(
        logger.warning("Failed to load memory context: %s", exc)
        full_prompt = base_prompt

-    return Agent(
+    model_kwargs = {}
+    if settings.ollama_num_ctx > 0:
+        model_kwargs["options"] = {"num_ctx": settings.ollama_num_ctx}
+    agent = Agent(
        name="Agent",
-        model=Ollama(id=model_name, host=settings.ollama_url, timeout=300),
+        model=Ollama(id=model_name, host=settings.ollama_url, timeout=300, **model_kwargs),
        db=SqliteDb(db_file=db_file),
        description=full_prompt,
        add_history_to_context=True,
        num_history_runs=20,
-        markdown=True,
+        markdown=False,
        tools=tools_list if tools_list else None,
        tool_call_limit=settings.max_agent_steps if use_tools else None,
        telemetry=settings.telemetry_enabled,
    )
+    _warmup_model(model_name)
+    return agent


 class TimmyWithMemory:
@@ -317,15 +340,47 @@ class TimmyWithMemory:
        self.initial_context = self.memory.get_system_context()

    def chat(self, message: str) -> str:
-        """Simple chat interface that tracks in memory."""
+        """Simple chat interface that tracks in memory.
+
+        Retries on transient Ollama errors (GPU contention, timeouts)
+        with exponential backoff (#70).
+        """
+        import time
+
        # Check for user facts to extract
        self._extract_and_store_facts(message)

-        # Run agent
-        result = self.agent.run(message, stream=False)
-        response_text = result.content if hasattr(result, "content") else str(result)
-
-        return response_text
+        # Retry with backoff — GPU contention causes ReadError/ReadTimeout
+        max_retries = 3
+        for attempt in range(1, max_retries + 1):
+            try:
+                result = self.agent.run(message, stream=False)
+                return result.content if hasattr(result, "content") else str(result)
+            except (
+                httpx.ConnectError,
+                httpx.ReadError,
+                httpx.ReadTimeout,
+                httpx.ConnectTimeout,
+                ConnectionError,
+                TimeoutError,
+            ) as exc:
+                if attempt < max_retries:
+                    wait = min(2**attempt, 16)
+                    logger.warning(
+                        "Ollama contention on attempt %d/%d: %s. Waiting %ds before retry...",
+                        attempt,
+                        max_retries,
+                        type(exc).__name__,
+                        wait,
+                    )
+                    time.sleep(wait)
+                else:
+                    logger.error(
+                        "Ollama unreachable after %d attempts: %s",
+                        max_retries,
+                        exc,
+                    )
+                    raise

    def _extract_and_store_facts(self, message: str) -> None:
        """Extract user facts from message and store in memory."""
@@ -336,7 +391,8 @@ class TimmyWithMemory:
            if name:
                self.memory.update_user_fact("Name", name)
                self.memory.record_decision(f"Learned user's name: {name}")
-        except Exception:
+        except Exception as exc:
+            logger.warning("User name extraction failed: %s", exc)
            pass  # Best-effort extraction

    def end_session(self, summary: str = "Session completed") -> None:
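Review note on the retry loop added to `TimmyWithMemory.chat`: the backoff schedule is `min(2**attempt, 16)`, so with `max_retries = 3` the waits are 2 s and 4 s before the final attempt re-raises. The sketch below isolates that schedule in a standalone helper so it can be checked without sleeping. `retry_with_backoff` and its `sleep` parameter are hypothetical names introduced for illustration; the diff inlines the loop and calls `time.sleep` directly.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    retryable: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with capped exponential backoff."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_retries:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(min(2**attempt, 16))  # 2, 4, 8, 16, 16, ...
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a testing convenience only. Note that in the diff the wait happens on the calling thread, so a caller on an event loop would likely want to run `chat` in a worker thread.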
--- a/src/timmy/agent_core/init.py
+++ b/src/timmy/agent_core/init.py
@@ -1 +0,0 @@
-"""Agent Core — Substrate-agnostic agent interface and base classes."""
--- a/src/timmy/agent_core/interface.py
+++ b/src/timmy/agent_core/interface.py
@@ -1,381 +0,0 @@
-"""TimAgent Interface — The substrate-agnostic agent contract.
-
-This is the foundation for embodiment. Whether Timmy runs on:
- A server with Ollama (today)
- A Raspberry Pi with sensors
- A Boston Dynamics Spot robot
- A VR avatar
-
-The interface remains constant. Implementation varies.
-
-Architecture:
-    perceive()  →  reason  →  act()
-         ↑                      ↓
-         ←←← remember() ←←←←←←┘
-
-All methods return effects that can be logged, audited, and replayed.
-"""
-
-import uuid
-from abc import ABC, abstractmethod
-from dataclasses import dataclass, field
-from datetime import UTC, datetime
-from enum import Enum, auto
-from typing import Any
-
-
-class PerceptionType(Enum):
-    """Types of sensory input an agent can receive."""
-
-    TEXT = auto()  # Natural language
-    IMAGE = auto()  # Visual input
-    AUDIO = auto()  # Sound/speech
-    SENSOR = auto()  # Temperature, distance, etc.
-    MOTION = auto()  # Accelerometer, gyroscope
-    NETWORK = auto()  # API calls, messages
-    INTERNAL = auto()  # Self-monitoring (battery, temp)
-
-
-class ActionType(Enum):
-    """Types of actions an agent can perform."""
-
-    TEXT = auto()  # Generate text response
-    SPEAK = auto()  # Text-to-speech
-    MOVE = auto()  # Physical movement
-    GRIP = auto()  # Manipulate objects
-    CALL = auto()  # API/network call
-    EMIT = auto()  # Signal/light/sound
-    SLEEP = auto()  # Power management
-
-
-class AgentCapability(Enum):
-    """High-level capabilities a TimAgent may possess."""
-
-    REASONING = "reasoning"
-    CODING = "coding"
-    WRITING = "writing"
-    ANALYSIS = "analysis"
-    VISION = "vision"
-    SPEECH = "speech"
-    NAVIGATION = "navigation"
-    MANIPULATION = "manipulation"
-    LEARNING = "learning"
-    COMMUNICATION = "communication"
-
-
-@dataclass(frozen=True)
-class AgentIdentity:
-    """Immutable identity for an agent instance.
-
-    This persists across sessions and substrates. If Timmy moves
-    from cloud to robot, the identity follows.
-    """
-
-    id: str
-    name: str
-    version: str
-    created_at: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
-
-    @classmethod
-    def generate(cls, name: str, version: str = "1.0.0") -> "AgentIdentity":
-        """Generate a new unique identity."""
-        return cls(
-            id=str(uuid.uuid4()),
-            name=name,
-            version=version,
-        )
-
-
-@dataclass
-class Perception:
-    """A sensory input to the agent.
-
-    Substrate-agnostic representation. A camera image and a
-    LiDAR point cloud are both Perception instances.
-    """
-
-    type: PerceptionType
-    data: Any  # Content depends on type
-    timestamp: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
-    source: str = "unknown"  # e.g., "camera_1", "microphone", "user_input"
-    metadata: dict = field(default_factory=dict)
-
-    @classmethod
-    def text(cls, content: str, source: str = "user") -> "Perception":
-        """Factory for text perception."""
-        return cls(
-            type=PerceptionType.TEXT,
-            data=content,
-            source=source,
-        )
-
-    @classmethod
-    def sensor(cls, kind: str, value: float, unit: str = "") -> "Perception":
-        """Factory for sensor readings."""
-        return cls(
-            type=PerceptionType.SENSOR,
-            data={"kind": kind, "value": value, "unit": unit},
-            source=f"sensor_{kind}",
-        )
-
-
-@dataclass
-class Action:
-    """An action the agent intends to perform.
-
-    Actions are effects — they describe what should happen,
-    not how. The substrate implements the "how."
-    """
-
-    type: ActionType
-    payload: Any  # Action-specific data
-    timestamp: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
-    confidence: float = 1.0  # 0-1, agent's certainty
-    deadline: str | None = None  # When action must complete
-
-    @classmethod
-    def respond(cls, text: str, confidence: float = 1.0) -> "Action":
-        """Factory for text response action."""
-        return cls(
-            type=ActionType.TEXT,
-            payload=text,
-            confidence=confidence,
-        )
-
-    @classmethod
-    def move(cls, vector: tuple[float, float, float], speed: float = 1.0) -> "Action":
-        """Factory for movement action (x, y, z meters)."""
-        return cls(
-            type=ActionType.MOVE,
-            payload={"vector": vector, "speed": speed},
-        )
-
-
-@dataclass
-class Memory:
-    """A stored experience or fact.
-
-    Memories are substrate-agnostic. A conversation history
-    and a video recording are both Memory instances.
-    """
-
-    id: str
-    content: Any
-    created_at: str
-    access_count: int = 0
-    last_accessed: str | None = None
-    importance: float = 0.5  # 0-1, for pruning decisions
-    tags: list[str] = field(default_factory=list)
-
-    def touch(self) -> None:
-        """Mark memory as accessed."""
-        self.access_count += 1
-        self.last_accessed = datetime.now(UTC).isoformat()
-
-
-@dataclass
-class Communication:
-    """A message to/from another agent or human."""
-
-    sender: str
-    recipient: str
-    content: Any
-    timestamp: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
-    protocol: str = "direct"  # e.g., "http", "websocket", "speech"
-    encrypted: bool = False
-
-
-class TimAgent(ABC):
-    """Abstract base class for all Timmy agent implementations.
-
-    This is the substrate-agnostic interface. Implementations:
-    - OllamaAgent: LLM-based reasoning (today)
-    - RobotAgent: Physical embodiment (future)
-    - SimulationAgent: Virtual environment (future)
-
-    Usage:
-        agent = OllamaAgent(identity)  # Today's implementation
-
-        perception = Perception.text("Hello Timmy")
-        memory = agent.perceive(perception)
-
-        action = agent.reason("How should I respond?")
-        result = agent.act(action)
-
-        agent.remember(memory)  # Store for future
-    """
-
-    def __init__(self, identity: AgentIdentity) -> None:
-        self._identity = identity
-        self._capabilities: set[AgentCapability] = set()
-        self._state: dict[str, Any] = {}
-
-    @property
-    def identity(self) -> AgentIdentity:
-        """Return this agent's immutable identity."""
-        return self._identity
-
-    @property
-    def capabilities(self) -> set[AgentCapability]:
-        """Return set of supported capabilities."""
-        return self._capabilities.copy()
-
-    def has_capability(self, capability: AgentCapability) -> bool:
-        """Check if agent supports a capability."""
-        return capability in self._capabilities
-
-    @abstractmethod
-    def perceive(self, perception: Perception) -> Memory:
-        """Process sensory input and create a memory.
-
-        This is the entry point for all agent interaction.
-        A text message, camera frame, or temperature reading
-        all enter through perceive().
-
-        Args:
-            perception: Sensory input
-
-        Returns:
-            Memory: Stored representation of the perception
-        """
-        pass
-
-    @abstractmethod
-    def reason(self, query: str, context: list[Memory]) -> Action:
-        """Reason about a situation and decide on action.
-
-        This is where "thinking" happens. The agent uses its
-        substrate-appropriate reasoning (LLM, neural net, rules)
-        to decide what to do.
-
-        Args:
-            query: What to reason about
-            context: Relevant memories for context
-
-        Returns:
-            Action: What the agent decides to do
-        """
-        pass
-
-    @abstractmethod
-    def act(self, action: Action) -> Any:
-        """Execute an action in the substrate.
-
-        This is where the abstract action becomes concrete:
-        - TEXT → Generate LLM response
-        - MOVE → Send motor commands
-        - SPEAK → Call TTS engine
-
-        Args:
-            action: The action to execute
-
-        Returns:
-            Result of the action (substrate-specific)
-        """
-        pass
-
-    @abstractmethod
-    def remember(self, memory: Memory) -> None:
-        """Store a memory for future retrieval.
-
-        The storage mechanism depends on substrate:
-        - Cloud: SQLite, vector DB
-        - Robot: Local flash storage
-        - Hybrid: Synced with conflict resolution
-
-        Args:
-            memory: Experience to store
-        """
-        pass
-
-    @abstractmethod
-    def recall(self, query: str, limit: int = 5) -> list[Memory]:
-        """Retrieve relevant memories.
-
-        Args:
-            query: What to search for
-            limit: Maximum memories to return
-
-        Returns:
-            List of relevant memories, sorted by relevance
-        """
-        pass
-
-    @abstractmethod
-    def communicate(self, message: Communication) -> bool:
-        """Send/receive communication with another agent.
-
-        Args:
-            message: Message to send
-
-        Returns:
-            True if communication succeeded
-        """
-        pass
-
-    def get_state(self) -> dict[str, Any]:
-        """Get current agent state for monitoring/debugging."""
-        return {
-            "identity": self._identity,
-            "capabilities": list(self._capabilities),
-            "state": self._state.copy(),
-        }
-
-    def shutdown(self) -> None:  # noqa: B027
-        """Graceful shutdown. Persist state, close connections."""
-        # Override in subclass for cleanup
-
-
-class AgentEffect:
-    """Log entry for agent actions — for audit and replay.
-
-    The complete history of an agent's life can be captured
-    as a sequence of AgentEffects. This enables:
-    - Debugging: What did the agent see and do?
-    - Audit: Why did it make that decision?
-    - Replay: Reconstruct agent state from log
-    - Training: Learn from agent experiences
-    """
-
-    def __init__(self, log_path: str | None = None) -> None:
-        self._effects: list[dict] = []
-        self._log_path = log_path
-
-    def log_perceive(self, perception: Perception, memory_id: str) -> None:
-        """Log a perception event."""
-        self._effects.append(
-            {
-                "type": "perceive",
-                "perception_type": perception.type.name,
-                "source": perception.source,
-                "memory_id": memory_id,
-                "timestamp": datetime.now(UTC).isoformat(),
-            }
-        )
-
-    def log_reason(self, query: str, action_type: ActionType) -> None:
-        """Log a reasoning event."""
-        self._effects.append(
-            {
-                "type": "reason",
-                "query": query,
-                "action_type": action_type.name,
-                "timestamp": datetime.now(UTC).isoformat(),
-            }
-        )
-
-    def log_act(self, action: Action, result: Any) -> None:
-        """Log an action event."""
-        self._effects.append(
-            {
-                "type": "act",
-                "action_type": action.type.name,
-                "confidence": action.confidence,
-                "result_type": type(result).__name__,
-                "timestamp": datetime.now(UTC).isoformat(),
-            }
-        )
-
-    def export(self) -> list[dict]:
-        """Export effect log for analysis."""
-        return self._effects.copy()
--- a/src/timmy/agent_core/ollama_adapter.py
+++ b/src/timmy/agent_core/ollama_adapter.py
@@ -1,275 +0,0 @@
-"""Ollama-based implementation of TimAgent interface.
-
-This adapter wraps the existing Timmy Ollama agent to conform
-to the substrate-agnostic TimAgent interface. It's the bridge
-between the old codebase and the new embodiment-ready architecture.
-
-Usage:
-    from timmy.agent_core import AgentIdentity, Perception
-    from timmy.agent_core.ollama_adapter import OllamaAgent
-
-    identity = AgentIdentity.generate("Timmy")
-    agent = OllamaAgent(identity)
-
-    perception = Perception.text("Hello!")
-    memory = agent.perceive(perception)
-    action = agent.reason("How should I respond?", [memory])
-    result = agent.act(action)
-"""
-
-from typing import Any
-
-from timmy.agent import _resolve_model_with_fallback, create_timmy
-from timmy.agent_core.interface import (
-    Action,
-    ActionType,
-    AgentCapability,
-    AgentEffect,
-    AgentIdentity,
-    Communication,
-    Memory,
-    Perception,
-    PerceptionType,
-    TimAgent,
-)
-
-
-class OllamaAgent(TimAgent):
-    """TimAgent implementation using local Ollama LLM.
-
-    This is the production agent for Timmy Time v2. It uses
-    Ollama for reasoning and SQLite for memory persistence.
-
-    Capabilities:
-    - REASONING: LLM-based inference
-    - CODING: Code generation and analysis
-    - WRITING: Long-form content creation
-    - ANALYSIS: Data processing and insights
-    - COMMUNICATION: Multi-agent messaging
-    """
-
-    def __init__(
-        self,
-        identity: AgentIdentity,
-        model: str | None = None,
-        effect_log: str | None = None,
-        require_vision: bool = False,
-    ) -> None:
-        """Initialize Ollama-based agent.
-
-        Args:
-            identity: Agent identity (persistent across sessions)
-            model: Ollama model to use (auto-resolves with fallback)
-            effect_log: Path to log agent effects (optional)
-            require_vision: Whether to select a vision-capable model
-        """
-        super().__init__(identity)
-
-        # Resolve model with automatic pulling and fallback
-        resolved_model, is_fallback = _resolve_model_with_fallback(
-            requested_model=model,
-            require_vision=require_vision,
-            auto_pull=True,
-        )
-
-        if is_fallback:
-            import logging
-
-            logging.getLogger(__name__).info(
-                "OllamaAdapter using fallback model %s", resolved_model
-            )
-
-        # Initialize underlying Ollama agent
-        self._timmy = create_timmy(model=resolved_model)
-
-        # Set capabilities based on what Ollama can do
-        self._capabilities = {
-            AgentCapability.REASONING,
-            AgentCapability.CODING,
-            AgentCapability.WRITING,
-            AgentCapability.ANALYSIS,
-            AgentCapability.COMMUNICATION,
-        }
-
-        # Effect logging for audit/replay
-        self._effect_log = AgentEffect(effect_log) if effect_log else None
-
-        # Simple in-memory working memory (short term)
-        self._working_memory: list[Memory] = []
-        self._max_working_memory = 10
-
-    def perceive(self, perception: Perception) -> Memory:
-        """Process perception and store in memory.
-
-        For text perceptions, we might do light preprocessing
-        (summarization, keyword extraction) before storage.
-        """
-        # Create memory from perception
-        memory = Memory(
-            id=f"mem_{len(self._working_memory)}",
-            content={
-                "type": perception.type.name,
-                "data": perception.data,
-                "source": perception.source,
-            },
-            created_at=perception.timestamp,
-            tags=self._extract_tags(perception),
-        )
-
-        # Add to working memory
-        self._working_memory.append(memory)
-        if len(self._working_memory) > self._max_working_memory:
-            self._working_memory.pop(0)  # FIFO eviction
-
-        # Log effect
-        if self._effect_log:
-            self._effect_log.log_perceive(perception, memory.id)
-
-        return memory
-
-    def reason(self, query: str, context: list[Memory]) -> Action:
-        """Use LLM to reason and decide on action.
-
-        This is where the Ollama agent does its work. We construct
-        a prompt from the query and context, then interpret the
-        response as an action.
-        """
-        # Build context string from memories
-        context_str = self._format_context(context)
-
-        # Construct prompt
-        prompt = f"""You are {self._identity.name}, an AI assistant.
-
-Context from previous interactions:
-{context_str}
-
-Current query: {query}
-
-Respond naturally and helpfully."""
-
-        # Run LLM inference
-        result = self._timmy.run(prompt, stream=False)
-        response_text = result.content if hasattr(result, "content") else str(result)
-
-        # Create text response action
-        action = Action.respond(response_text, confidence=0.9)
-
-        # Log effect
-        if self._effect_log:
-            self._effect_log.log_reason(query, action.type)
-
-        return action
-
-    def act(self, action: Action) -> Any:
-        """Execute action in the Ollama substrate.
-
-        For text actions, the "execution" is just returning the
-        text (already generated during reasoning). For future
-        action types (MOVE, SPEAK), this would trigger the
-        appropriate Ollama tool calls.
-        """
-        result = None
-
-        if action.type == ActionType.TEXT:
-            result = action.payload
-        elif action.type == ActionType.SPEAK:
-            # Would call TTS here
-            result = {"spoken": action.payload, "tts_engine": "pyttsx3"}
-        elif action.type == ActionType.CALL:
-            # Would make API call
-            result = {"status": "not_implemented", "payload": action.payload}
-        else:
-            result = {"error": f"Action type {action.type} not supported by OllamaAgent"}
-
-        # Log effect
-        if self._effect_log:
-            self._effect_log.log_act(action, result)
-
-        return result
-
-    def remember(self, memory: Memory) -> None:
-        """Store memory in working memory.
-
-        Adds the memory to the sliding window and bumps its importance.
-        """
-        memory.touch()
-
-        # Deduplicate by id
-        self._working_memory = [m for m in self._working_memory if m.id != memory.id]
-        self._working_memory.append(memory)
-
-        # Evict oldest if over capacity
-        if len(self._working_memory) > self._max_working_memory:
-            self._working_memory.pop(0)
-
-    def recall(self, query: str, limit: int = 5) -> list[Memory]:
-        """Retrieve relevant memories.
-
-        Simple keyword matching for now. Future: vector similarity.
-        """
-        query_lower = query.lower()
-        scored = []
-
-        for memory in self._working_memory:
-            score = 0
-            content_str = str(memory.content).lower()
-
-            # Simple keyword overlap
-            query_words = set(query_lower.split())
-            content_words = set(content_str.split())
-            overlap = len(query_words & content_words)
-            score += overlap
-
-            # Boost recent memories
-            score += memory.importance
-
-            scored.append((score, memory))
-
-        # Sort by score descending
-        scored.sort(key=lambda x: x[0], reverse=True)
-
-        # Return top N
-        return [m for _, m in scored[:limit]]
-
-    def communicate(self, message: Communication) -> bool:
-        """Send message to another agent.
-
-        Swarm comms removed — inter-agent communication will be handled
-        by the unified brain memory layer.
-        """
-        return False
-
-    def _extract_tags(self, perception: Perception) -> list[str]:
-        """Extract searchable tags from perception."""
-        tags = [perception.type.name, perception.source]
-
-        if perception.type == PerceptionType.TEXT:
-            # Simple keyword extraction
-            text = str(perception.data).lower()
-            keywords = ["code", "bug", "help", "question", "task"]
-            for kw in keywords:
-                if kw in text:
-                    tags.append(kw)
-
-        return tags
-
-    def _format_context(self, memories: list[Memory]) -> str:
-        """Format memories into context string for prompt."""
-        if not memories:
-            return "No previous context."
-
-        parts = []
-        for mem in memories[-5:]:  # Last 5 memories
-            if isinstance(mem.content, dict):
-                data = mem.content.get("data", "")
-                parts.append(f"- {data}")
-            else:
-                parts.append(f"- {mem.content}")
-
-        return "\n".join(parts)
-
-    def get_effect_log(self) -> list[dict] | None:
-        """Export effect log if logging is enabled."""
-        if self._effect_log:
-            return self._effect_log.export()
-        return None
--- a/src/timmy/agentic_loop.py
+++ b/src/timmy/agentic_loop.py
@@ -58,6 +58,8 @@ class AgenticResult:
 # Agent factory
 # ---------------------------------------------------------------------------

+_loop_agent = None
+

 def _get_loop_agent():
    """Create a fresh agent for the agentic loop.
@@ -65,9 +67,12 @@ def _get_loop_agent():
    Returns the same type of agent as `create_timmy()` but with a
    dedicated session so it doesn't pollute the main chat history.
    """
-    from timmy.agent import create_timmy
+    global _loop_agent
+    if _loop_agent is None:
+        from timmy.agent import create_timmy

-    return create_timmy()
+        _loop_agent = create_timmy()
+    return _loop_agent


 # ---------------------------------------------------------------------------
@@ -131,7 +136,7 @@ async def run_agentic_loop(
            agent.run, plan_prompt, stream=False, session_id=f"{session_id}_plan"
        )
        plan_text = plan_run.content if hasattr(plan_run, "content") else str(plan_run)
-    except Exception as exc:
+    except Exception as exc:  # broad catch intentional: agent.run can raise any error
        logger.error("Agentic loop: planning failed: %s", exc)
        result.status = "failed"
        result.summary = f"Planning failed: {exc}"
@@ -168,11 +173,11 @@ async def run_agentic_loop(
    for i, step_desc in enumerate(steps, 1):
        step_start = time.monotonic()

+        recent = completed_results[-2:]  # slicing an empty list already yields []
        context = (
            f"Task: {task}\n"
-            f"Plan: {plan_text}\n"
-            f"Completed so far: {completed_results}\n\n"
-            f"Now do step {i}: {step_desc}\n"
+            f"Step {i}/{total_steps}: {step_desc}\n"
+            f"Recent progress: {recent}\n\n"
            f"Execute this step and report what you did."
        )

@@ -212,7 +217,7 @@ async def run_agentic_loop(
            if on_progress:
                await on_progress(step_desc, i, total_steps)

-        except Exception as exc:
+        except Exception as exc:  # broad catch intentional: agent.run can raise any error
            logger.warning("Agentic loop step %d failed: %s", i, exc)

            # ── Adaptation: ask model to adapt ─────────────────────────────
@@ -260,7 +265,7 @@ async def run_agentic_loop(
                if on_progress:
                    await on_progress(f"[Adapted] {step_desc}", i, total_steps)

-            except Exception as adapt_exc:
+            except Exception as adapt_exc:  # broad catch intentional: agent.run can raise any error
                logger.error("Agentic loop adaptation also failed: %s", adapt_exc)
                step = AgenticStep(
                    step_num=i,
@@ -273,27 +278,15 @@ async def run_agentic_loop(
                completed_results.append(f"Step {i}: FAILED")

    # ── Phase 3: Summary ───────────────────────────────────────────────────
-    summary_prompt = (
-        f"Task: {task}\n"
-        f"Results:\n" + "\n".join(completed_results) + "\n\n"
-        "Summarise what was accomplished in 2-3 sentences."
-    )
-    try:
-        summary_run = await asyncio.to_thread(
-            agent.run,
-            summary_prompt,
-            stream=False,
-            session_id=f"{session_id}_summary",
-        )
-        result.summary = (
-            summary_run.content if hasattr(summary_run, "content") else str(summary_run)
-        )
-        from timmy.session import _clean_response
-
-        result.summary = _clean_response(result.summary)
-    except Exception as exc:
-        logger.error("Agentic loop summary failed: %s", exc)
-        result.summary = f"Completed {len(result.steps)} steps."
+    completed_count = sum(1 for s in result.steps if s.status == "completed")
+    adapted_count = sum(1 for s in result.steps if s.status == "adapted")
+    failed_count = sum(1 for s in result.steps if s.status == "failed")
+    parts = [f"Completed {completed_count}/{total_steps} steps"]
+    if adapted_count:
+        parts.append(f"{adapted_count} adapted")
+    if failed_count:
+        parts.append(f"{failed_count} failed")
+    result.summary = f"{task}: {', '.join(parts)}."

    # Determine final status
    if was_truncated:
@@ -332,5 +325,6 @@ async def _broadcast_progress(event: str, data: dict) -> None:
        from infrastructure.ws_manager.handler import ws_manager

        await ws_manager.broadcast(event, data)
-    except Exception:
+    except (ImportError, AttributeError, ConnectionError, RuntimeError) as exc:
+        logger.warning("Agentic loop broadcast failed: %s", exc)
        logger.debug("Agentic loop: WS broadcast failed for %s", event)
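
Review note on the Phase 3 change above: the summary is now assembled from step counts instead of a third model call. The sketch below reproduces that logic in isolation so the resulting string is easy to see. `Step` and `build_summary` are hypothetical names used only for illustration; in the patch the counts come from `result.steps`.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical stand-in for AgenticStep; only `status` matters here."""

    status: str


def build_summary(task: str, steps: list[Step], total_steps: int) -> str:
    """Build the deterministic summary string from step statuses."""
    completed = sum(1 for s in steps if s.status == "completed")
    adapted = sum(1 for s in steps if s.status == "adapted")
    failed = sum(1 for s in steps if s.status == "failed")

    # Adapted and failed counts are only mentioned when non-zero.
    parts = [f"Completed {completed}/{total_steps} steps"]
    if adapted:
        parts.append(f"{adapted} adapted")
    if failed:
        parts.append(f"{failed} failed")
    return f"{task}: {', '.join(parts)}."


if __name__ == "__main__":
    statuses = ["completed", "completed", "adapted", "failed"]
    print(build_summary("Deploy", [Step(s) for s in statuses], 4))
    # prints: Deploy: Completed 2/4 steps, 1 adapted, 1 failed.
```

Adapted steps are counted separately from completed ones, so a run where every step needed adaptation reports "Completed 0/N steps".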
--- a/src/timmy/agents/base.py
+++ b/src/timmy/agents/base.py
@@ -10,10 +10,12 @@ SubAgent is the single seed class for ALL agents.  Differentiation
 comes entirely from config (agents.yaml), not from Python subclasses.
 """

+import asyncio
 import logging
 from abc import ABC, abstractmethod
 from typing import Any

+import httpx
 from agno.agent import Agent
 from agno.models.ollama import Ollama

@@ -72,14 +74,17 @@ class BaseAgent(ABC):
                if handler:
                    tool_instances.append(handler)

+        ollama_kwargs = {}
+        if settings.ollama_num_ctx > 0:
+            ollama_kwargs["options"] = {"num_ctx": settings.ollama_num_ctx}
        return Agent(
            name=self.name,
-            model=Ollama(id=self.model, host=settings.ollama_url, timeout=300),
+            model=Ollama(id=self.model, host=settings.ollama_url, timeout=300, **ollama_kwargs),
            description=system_prompt,
            tools=tool_instances if tool_instances else None,
            add_history_to_context=True,
            num_history_runs=self.max_history,
-            markdown=True,
+            markdown=False,
            telemetry=settings.telemetry_enabled,
        )

@@ -117,11 +122,70 @@ class BaseAgent(ABC):
    async def run(self, message: str) -> str:
        """Run the agent with a message.

+        Retries on transient failures (connection errors, timeouts) with
+        exponential backoff.  GPU contention from concurrent Ollama
+        requests causes ReadError / ReadTimeout — these are transient
+        and should be retried, not raised immediately (#70).
+
        Returns:
            Agent response
        """
-        result = self.agent.run(message, stream=False)
-        response = result.content if hasattr(result, "content") else str(result)
+        max_retries = 3
+        last_exception = None
+        # Transient errors that indicate Ollama contention or temporary
+        # unavailability — these deserve a retry with backoff.
+        _transient = (
+            httpx.ConnectError,
+            httpx.ReadError,
+            httpx.ReadTimeout,
+            httpx.ConnectTimeout,
+            ConnectionError,
+            TimeoutError,
+        )
+
+        for attempt in range(1, max_retries + 1):
+            try:
+                result = self.agent.run(message, stream=False)
+                response = result.content if hasattr(result, "content") else str(result)
+                break  # Success, exit the retry loop
+            except _transient as exc:
+                last_exception = exc
+                if attempt < max_retries:
+                    # Contention backoff — longer waits because the GPU
+                    # needs time to finish the other request.
+                    wait = min(2**attempt, 16)
+                    logger.warning(
+                        "Ollama contention on attempt %d/%d: %s. Waiting %ds before retry...",
+                        attempt,
+                        max_retries,
+                        type(exc).__name__,
+                        wait,
+                    )
+                    await asyncio.sleep(wait)
+                else:
+                    logger.error(
+                        "Ollama unreachable after %d attempts: %s",
+                        max_retries,
+                        exc,
+                    )
+                    raise last_exception
+            except Exception as exc:
+                last_exception = exc
+                if attempt < max_retries:
+                    logger.warning(
+                        "Agent run failed on attempt %d/%d: %s. Retrying...",
+                        attempt,
+                        max_retries,
+                        exc,
+                    )
+                    await asyncio.sleep(min(2 ** (attempt - 1), 8))
+                else:
+                    logger.error(
+                        "Agent run failed after %d attempts: %s",
+                        max_retries,
+                        exc,
+                    )
+                    raise last_exception

        # Emit completion event
        if self.event_bus:
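
Review note on the retry loop above: the two `except` branches use different backoff formulas. The sketch below only computes the wait schedules they produce with `max_retries = 3`; the helper names are hypothetical and nothing here sleeps or calls Ollama.

```python
MAX_RETRIES = 3


def transient_wait(attempt: int) -> int:
    """Wait used for connection/timeout errors: min(2**attempt, 16)."""
    return min(2**attempt, 16)


def generic_wait(attempt: int) -> int:
    """Wait used for any other exception: min(2**(attempt - 1), 8)."""
    return min(2 ** (attempt - 1), 8)


# A wait only happens when another attempt follows, i.e. attempts 1..MAX_RETRIES-1.
retrying_attempts = range(1, MAX_RETRIES)
transient_schedule = [transient_wait(a) for a in retrying_attempts]
generic_schedule = [generic_wait(a) for a in retrying_attempts]

if __name__ == "__main__":
    print(transient_schedule)  # [2, 4]
    print(generic_schedule)    # [1, 2]
```

With three attempts the caps of 16 and 8 seconds are never reached; they would only come into play if `max_retries` were raised.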
--- a/src/timmy/agents/loader.py
+++ b/src/timmy/agents/loader.py
@@ -16,6 +16,7 @@ Usage:
 from __future__ import annotations

 import logging
+import re
 from pathlib import Path
 from typing import Any

@@ -181,6 +182,23 @@ def get_routing_config() -> dict[str, Any]:
    return config.get("routing", {"method": "pattern", "patterns": {}})


+def _matches_pattern(pattern: str, message: str) -> bool:
+    """Check if a pattern matches using word-boundary matching.
+
+    For single-word patterns, uses \\b word boundaries.
+    For multi-word patterns, all words must appear as whole words (in any order).
+    """
+    pattern_lower = pattern.lower()
+    message_lower = message.lower()
+    words = pattern_lower.split()
+
+    for word in words:
+        # Use word boundary regex to match whole words only
+        if not re.search(rf"\b{re.escape(word)}\b", message_lower):
+            return False
+    return True
+
+
 def route_request(user_message: str) -> str | None:
    """Route a user request to an agent using pattern matching.

@@ -193,17 +211,36 @@ def route_request(user_message: str) -> str | None:
        return None

    patterns = routing.get("patterns", {})
-    message_lower = user_message.lower()

    for agent_id, keywords in patterns.items():
        for keyword in keywords:
-            if keyword.lower() in message_lower:
+            if _matches_pattern(keyword, user_message):
                logger.debug("Routed to %s (matched: %r)", agent_id, keyword)
                return agent_id

    return None


+def route_request_with_match(user_message: str) -> tuple[str | None, str | None]:
+    """Route a user request and return both the agent and the matched pattern.
+
+    Returns a tuple of (agent_id, matched_pattern). If no match, returns (None, None).
+    """
+    routing = get_routing_config()
+
+    if routing.get("method") != "pattern":
+        return None, None
+
+    patterns = routing.get("patterns", {})
+
+    for agent_id, keywords in patterns.items():
+        for keyword in keywords:
+            if _matches_pattern(keyword, user_message):
+                return agent_id, keyword
+
+    return None, None
+
+
 def reload_agents() -> dict[str, Any]:
    """Force reload agents from YAML.  Call after editing agents.yaml."""
    global _agents, _config
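
Review note on `_matches_pattern` above: the sketch below contrasts the old substring check with the new word-boundary check. `matches_pattern` is a standalone copy written for illustration, not the project function, and the sample messages are made up.

```python
import re


def matches_pattern(pattern: str, message: str) -> bool:
    """Return True if every word of `pattern` occurs as a whole word in `message`."""
    message_lower = message.lower()
    return all(
        re.search(rf"\b{re.escape(word)}\b", message_lower)
        for word in pattern.lower().split()
    )


if __name__ == "__main__":
    msg = "run the latest build"
    print("test" in msg.lower())           # True: old check hits "latest"
    print(matches_pattern("test", msg))    # False: "test" is not a whole word

    # Multi-word patterns match in any order.
    print(matches_pattern("write code", "code I need to write"))  # True

    # Caveat: \b needs a word character on one side, so a keyword that
    # ends in punctuation does not match when followed by a space.
    print(matches_pattern("c++", "help with c++ please"))  # False
```

Two behaviours are worth keeping in mind when editing agents.yaml: an empty keyword matches every message, and keywords ending in non-word characters (such as `c++`) may never match.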
--- a/src/timmy/approvals.py
+++ b/src/timmy/approvals.py
@@ -13,6 +13,8 @@ Default is always True. The owner changes this intentionally.

 import sqlite3
 import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass
 from datetime import UTC, datetime, timedelta
 from pathlib import Path
@@ -43,23 +45,24 @@ class ApprovalItem:
    status: str  # "pending" | "approved" | "rejected"


-def _get_conn(db_path: Path = _DEFAULT_DB) -> sqlite3.Connection:
+@contextmanager
+def _get_conn(db_path: Path = _DEFAULT_DB) -> Generator[sqlite3.Connection, None, None]:
    db_path.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(db_path))
-    conn.row_factory = sqlite3.Row
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS approval_items (
-            id              TEXT PRIMARY KEY,
-            title           TEXT NOT NULL,
-            description     TEXT NOT NULL,
-            proposed_action TEXT NOT NULL,
-            impact          TEXT NOT NULL DEFAULT 'low',
-            created_at      TEXT NOT NULL,
-            status          TEXT NOT NULL DEFAULT 'pending'
-        )
-        """)
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(db_path))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS approval_items (
+                id              TEXT PRIMARY KEY,
+                title           TEXT NOT NULL,
+                description     TEXT NOT NULL,
+                proposed_action TEXT NOT NULL,
+                impact          TEXT NOT NULL DEFAULT 'low',
+                created_at      TEXT NOT NULL,
+                status          TEXT NOT NULL DEFAULT 'pending'
+            )
+            """)
+        conn.commit()
+        yield conn


 def _row_to_item(row: sqlite3.Row) -> ApprovalItem:
@@ -96,80 +99,73 @@ def create_item(
        created_at=datetime.now(UTC),
        status="pending",
    )
-    conn = _get_conn(db_path)
-    conn.execute(
-        """
-        INSERT INTO approval_items
-            (id, title, description, proposed_action, impact, created_at, status)
-        VALUES (?, ?, ?, ?, ?, ?, ?)
-        """,
-        (
-            item.id,
-            item.title,
-            item.description,
-            item.proposed_action,
-            item.impact,
-            item.created_at.isoformat(),
-            item.status,
-        ),
-    )
-    conn.commit()
-    conn.close()
+    with _get_conn(db_path) as conn:
+        conn.execute(
+            """
+            INSERT INTO approval_items
+                (id, title, description, proposed_action, impact, created_at, status)
+            VALUES (?, ?, ?, ?, ?, ?, ?)
+            """,
+            (
+                item.id,
+                item.title,
+                item.description,
+                item.proposed_action,
+                item.impact,
+                item.created_at.isoformat(),
+                item.status,
+            ),
+        )
+        conn.commit()
    return item


 def list_pending(db_path: Path = _DEFAULT_DB) -> list[ApprovalItem]:
    """Return all pending approval items, newest first."""
-    conn = _get_conn(db_path)
-    rows = conn.execute(
-        "SELECT * FROM approval_items WHERE status = 'pending' ORDER BY created_at DESC"
-    ).fetchall()
-    conn.close()
+    with _get_conn(db_path) as conn:
+        rows = conn.execute(
+            "SELECT * FROM approval_items WHERE status = 'pending' ORDER BY created_at DESC"
+        ).fetchall()
    return [_row_to_item(r) for r in rows]


 def list_all(db_path: Path = _DEFAULT_DB) -> list[ApprovalItem]:
    """Return all approval items regardless of status, newest first."""
-    conn = _get_conn(db_path)
-    rows = conn.execute("SELECT * FROM approval_items ORDER BY created_at DESC").fetchall()
-    conn.close()
+    with _get_conn(db_path) as conn:
+        rows = conn.execute("SELECT * FROM approval_items ORDER BY created_at DESC").fetchall()
    return [_row_to_item(r) for r in rows]


 def get_item(item_id: str, db_path: Path = _DEFAULT_DB) -> ApprovalItem | None:
-    conn = _get_conn(db_path)
-    row = conn.execute("SELECT * FROM approval_items WHERE id = ?", (item_id,)).fetchone()
-    conn.close()
+    with _get_conn(db_path) as conn:
+        row = conn.execute("SELECT * FROM approval_items WHERE id = ?", (item_id,)).fetchone()
    return _row_to_item(row) if row else None


 def approve(item_id: str, db_path: Path = _DEFAULT_DB) -> ApprovalItem | None:
    """Mark an approval item as approved."""
-    conn = _get_conn(db_path)
-    conn.execute("UPDATE approval_items SET status = 'approved' WHERE id = ?", (item_id,))
-    conn.commit()
-    conn.close()
+    with _get_conn(db_path) as conn:
+        conn.execute("UPDATE approval_items SET status = 'approved' WHERE id = ?", (item_id,))
+        conn.commit()
    return get_item(item_id, db_path)


 def reject(item_id: str, db_path: Path = _DEFAULT_DB) -> ApprovalItem | None:
    """Mark an approval item as rejected."""
-    conn = _get_conn(db_path)
-    conn.execute("UPDATE approval_items SET status = 'rejected' WHERE id = ?", (item_id,))
-    conn.commit()
-    conn.close()
+    with _get_conn(db_path) as conn:
+        conn.execute("UPDATE approval_items SET status = 'rejected' WHERE id = ?", (item_id,))
+        conn.commit()
    return get_item(item_id, db_path)


 def expire_old(db_path: Path = _DEFAULT_DB) -> int:
    """Auto-expire pending items older than EXPIRY_DAYS. Returns count removed."""
    cutoff = (datetime.now(UTC) - timedelta(days=_EXPIRY_DAYS)).isoformat()
-    conn = _get_conn(db_path)
-    cursor = conn.execute(
-        "DELETE FROM approval_items WHERE status = 'pending' AND created_at < ?",
-        (cutoff,),
-    )
-    conn.commit()
-    count = cursor.rowcount
-    conn.close()
+    with _get_conn(db_path) as conn:
+        cursor = conn.execute(
+            "DELETE FROM approval_items WHERE status = 'pending' AND created_at < ?",
+            (cutoff,),
+        )
+        conn.commit()
+        count = cursor.rowcount
    return count
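
Review note on the `_get_conn` refactor above: wrapping `sqlite3.connect` in `closing` inside a `@contextmanager` generator guarantees the connection is closed when the caller's `with` block exits, including on exceptions. The sketch below shows the same pattern against a throwaway database; the `items` table and the `demo` helper are hypothetical.

```python
import sqlite3
import tempfile
from collections.abc import Generator
from contextlib import closing, contextmanager
from pathlib import Path


@contextmanager
def get_conn(db_path: Path) -> Generator[sqlite3.Connection, None, None]:
    """Yield a connection that is always closed when the block exits."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS items (id TEXT PRIMARY KEY)")
        conn.commit()
        yield conn


def demo(db_path: Path) -> tuple[int, bool]:
    """Insert one row, then check that the connection was closed afterwards."""
    with get_conn(db_path) as conn:
        conn.execute("INSERT INTO items (id) VALUES (?)", ("a",))
        conn.commit()
        count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    try:
        conn.execute("SELECT 1")
        closed = False
    except sqlite3.ProgrammingError:
        # sqlite3 refuses to operate on a closed connection.
        closed = True
    return count, closed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(Path(tmp) / "demo.db"))  # (1, True)
```

`closing` is needed because using a `sqlite3.Connection` directly as a context manager only commits or rolls back the transaction; it does not close the connection.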
--- a/src/timmy/backends.py
+++ b/src/timmy/backends.py
@@ -18,7 +18,7 @@ import time
 from dataclasses import dataclass
 from typing import Literal

-from timmy.prompts import SYSTEM_PROMPT
+from timmy.prompts import get_system_prompt

 logger = logging.getLogger(__name__)

@@ -37,6 +37,7 @@ class RunResult:
    """Minimal Agno-compatible run result — carries the model's response text."""

    content: str
+    confidence: float | None = None


 def is_apple_silicon() -> bool:
@@ -128,7 +129,7 @@ class TimmyAirLLMAgent:
    # ── private helpers ──────────────────────────────────────────────────────

    def _build_prompt(self, message: str) -> str:
-        context = SYSTEM_PROMPT + "\n\n"
+        context = get_system_prompt(tools_enabled=False, session_id="airllm") + "\n\n"
        # Include the last 10 turns (5 exchanges) for continuity.
        if self._history:
            context += "\n".join(self._history[-10:]) + "\n\n"
@@ -388,7 +389,9 @@ class GrokBackend:

    def _build_messages(self, message: str) -> list[dict[str, str]]:
        """Build the messages array for the API call."""
-        messages = [{"role": "system", "content": SYSTEM_PROMPT}]
+        messages = [
+            {"role": "system", "content": get_system_prompt(tools_enabled=True, session_id="grok")}
+        ]
        # Include conversation history for context
        messages.extend(self._history[-10:])
        messages.append({"role": "user", "content": message})
@@ -414,7 +417,8 @@ def grok_available() -> bool:
        from config import settings

        return settings.grok_enabled and bool(settings.xai_api_key)
-    except Exception:
+    except Exception as exc:
+        logger.warning("Backend check failed (grok_available): %s", exc)
        return False


@@ -480,7 +484,7 @@ class ClaudeBackend:
            response = client.messages.create(
                model=self._model,
                max_tokens=1024,
-                system=SYSTEM_PROMPT,
+                system=get_system_prompt(tools_enabled=True, session_id="claude"),
                messages=messages,
            )

@@ -566,5 +570,6 @@ def claude_available() -> bool:
        from config import settings

        return bool(settings.anthropic_api_key)
-    except Exception:
+    except Exception as exc:
+        logger.warning("Backend check failed (claude_available): %s", exc)
        return False
--- a/src/timmy/briefing.py
+++ b/src/timmy/briefing.py
@@ -10,6 +10,8 @@ regenerates the briefing every 6 hours.

 import logging
 import sqlite3
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass, field
 from datetime import UTC, datetime, timedelta
 from pathlib import Path
@@ -56,46 +58,45 @@ class Briefing:
 # ---------------------------------------------------------------------------


-def _get_cache_conn(db_path: Path = _DEFAULT_DB) -> sqlite3.Connection:
+@contextmanager
+def _get_cache_conn(db_path: Path = _DEFAULT_DB) -> Generator[sqlite3.Connection, None, None]:
    db_path.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(db_path))
-    conn.row_factory = sqlite3.Row
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS briefings (
-            id          INTEGER PRIMARY KEY AUTOINCREMENT,
-            generated_at TEXT NOT NULL,
-            period_start TEXT NOT NULL,
-            period_end   TEXT NOT NULL,
-            summary      TEXT NOT NULL
-        )
-        """)
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(db_path))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS briefings (
+                id          INTEGER PRIMARY KEY AUTOINCREMENT,
+                generated_at TEXT NOT NULL,
+                period_start TEXT NOT NULL,
+                period_end   TEXT NOT NULL,
+                summary      TEXT NOT NULL
+            )
+            """)
+        conn.commit()
+        yield conn


 def _save_briefing(briefing: Briefing, db_path: Path = _DEFAULT_DB) -> None:
-    conn = _get_cache_conn(db_path)
-    conn.execute(
-        """
-        INSERT INTO briefings (generated_at, period_start, period_end, summary)
-        VALUES (?, ?, ?, ?)
-        """,
-        (
-            briefing.generated_at.isoformat(),
-            briefing.period_start.isoformat(),
-            briefing.period_end.isoformat(),
-            briefing.summary,
-        ),
-    )
-    conn.commit()
-    conn.close()
+    with _get_cache_conn(db_path) as conn:
+        conn.execute(
+            """
+            INSERT INTO briefings (generated_at, period_start, period_end, summary)
+            VALUES (?, ?, ?, ?)
+            """,
+            (
+                briefing.generated_at.isoformat(),
+                briefing.period_start.isoformat(),
+                briefing.period_end.isoformat(),
+                briefing.summary,
+            ),
+        )
+        conn.commit()


 def _load_latest(db_path: Path = _DEFAULT_DB) -> Briefing | None:
    """Load the most-recently cached briefing, or None if there is none."""
-    conn = _get_cache_conn(db_path)
-    row = conn.execute("SELECT * FROM briefings ORDER BY generated_at DESC LIMIT 1").fetchone()
-    conn.close()
+    with _get_cache_conn(db_path) as conn:
+        row = conn.execute("SELECT * FROM briefings ORDER BY generated_at DESC LIMIT 1").fetchone()
    if row is None:
        return None
    return Briefing(
@@ -129,27 +130,25 @@ def _gather_swarm_summary(since: datetime) -> str:
        return "No swarm activity recorded yet."

    try:
-        conn = sqlite3.connect(str(swarm_db))
-        conn.row_factory = sqlite3.Row
+        with closing(sqlite3.connect(str(swarm_db))) as conn:
+            conn.row_factory = sqlite3.Row

-        since_iso = since.isoformat()
+            since_iso = since.isoformat()

-        completed = conn.execute(
-            "SELECT COUNT(*) as c FROM tasks WHERE status = 'completed' AND created_at > ?",
-            (since_iso,),
-        ).fetchone()["c"]
+            completed = conn.execute(
+                "SELECT COUNT(*) as c FROM tasks WHERE status = 'completed' AND created_at > ?",
+                (since_iso,),
+            ).fetchone()["c"]

-        failed = conn.execute(
-            "SELECT COUNT(*) as c FROM tasks WHERE status = 'failed' AND created_at > ?",
-            (since_iso,),
-        ).fetchone()["c"]
+            failed = conn.execute(
+                "SELECT COUNT(*) as c FROM tasks WHERE status = 'failed' AND created_at > ?",
+                (since_iso,),
+            ).fetchone()["c"]

-        agents = conn.execute(
-            "SELECT COUNT(*) as c FROM agents WHERE registered_at > ?",
-            (since_iso,),
-        ).fetchone()["c"]
-
-        conn.close()
+            agents = conn.execute(
+                "SELECT COUNT(*) as c FROM agents WHERE registered_at > ?",
+                (since_iso,),
+            ).fetchone()["c"]

        parts = []
        if completed:
@@ -193,7 +192,7 @@ def _gather_task_queue_summary() -> str:
 def _gather_chat_summary(since: datetime) -> str:
    """Pull recent chat messages from the in-memory log."""
    try:
-        from dashboard.store import message_log
+        from infrastructure.chat_store import message_log

        messages = message_log.all()
        # Filter to messages in the briefing window (best-effort: no timestamps)
--- a/src/timmy/cli.py
+++ b/src/timmy/cli.py
@@ -1,11 +1,13 @@
+import asyncio
 import logging
 import subprocess
+import sys

 import typer

 from timmy.agent import create_timmy
 from timmy.prompts import STATUS_PROMPT
-from timmy.tool_safety import format_action_description, get_impact_level
+from timmy.tool_safety import format_action_description, get_impact_level, is_allowlisted

 logger = logging.getLogger(__name__)

@@ -30,15 +32,26 @@ _MODEL_SIZE_OPTION = typer.Option(
 )


-def _handle_tool_confirmation(agent, run_output, session_id: str):
+def _is_interactive() -> bool:
+    """Return True if stdin is a real terminal (human present)."""
+    return hasattr(sys.stdin, "isatty") and sys.stdin.isatty()
+
+
+def _handle_tool_confirmation(agent, run_output, session_id: str, *, autonomous: bool = False):
    """Prompt user to approve/reject dangerous tool calls.

    When Agno pauses a run because a tool requires confirmation, this
    function displays the action, asks for approval via stdin, and
    resumes or rejects the run accordingly.

+    When autonomous=True (or stdin is not a terminal), tool calls are
+    checked against config/allowlist.yaml instead of prompting.
+    Allowlisted calls are auto-approved; everything else is auto-rejected.
+
    Returns the final RunOutput after all confirmations are resolved.
    """
+    interactive = _is_interactive() and not autonomous
+
    max_rounds = 10  # safety limit
    for _ in range(max_rounds):
        status = getattr(run_output, "status", None)
@@ -58,22 +71,34 @@ def _handle_tool_confirmation(agent, run_output, session_id: str):
            tool_name = getattr(te, "tool_name", "unknown")
            tool_args = getattr(te, "tool_args", {}) or {}

-            description = format_action_description(tool_name, tool_args)
-            impact = get_impact_level(tool_name)
+            if interactive:
+                # Human present — prompt for approval
+                description = format_action_description(tool_name, tool_args)
+                impact = get_impact_level(tool_name)

-            typer.echo()
-            typer.echo(typer.style("Tool confirmation required", bold=True))
-            typer.echo(f"  Impact: {impact.upper()}")
-            typer.echo(f"  {description}")
-            typer.echo()
+                typer.echo()
+                typer.echo(typer.style("Tool confirmation required", bold=True))
+                typer.echo(f"  Impact: {impact.upper()}")
+                typer.echo(f"  {description}")
+                typer.echo()

-            approved = typer.confirm("Allow this action?", default=False)
-            if approved:
-                req.confirm()
-                logger.info("CLI: approved %s", tool_name)
+                approved = typer.confirm("Allow this action?", default=False)
+                if approved:
+                    req.confirm()
+                    logger.info("CLI: approved %s", tool_name)
+                else:
+                    req.reject(note="User rejected from CLI")
+                    logger.info("CLI: rejected %s", tool_name)
            else:
-                req.reject(note="User rejected from CLI")
-                logger.info("CLI: rejected %s", tool_name)
+                # Autonomous mode — check allowlist
+                if is_allowlisted(tool_name, tool_args):
+                    req.confirm()
+                    logger.info("AUTO-APPROVED (allowlist): %s", tool_name)
+                else:
+                    req.reject(note="Auto-rejected: not in allowlist")
+                    logger.info(
+                        "AUTO-REJECTED (not allowlisted): %s %s", tool_name, str(tool_args)[:100]
+                    )

        # Resume the run so the agent sees the confirmation result
        try:
@@ -113,13 +138,15 @@ def think(
    model_size: str | None = _MODEL_SIZE_OPTION,
 ):
    """Ask Timmy to think carefully about a topic."""
-    timmy = create_timmy(backend=backend, model_size=model_size)
+    timmy = create_timmy(backend=backend, model_size=model_size, session_id=_CLI_SESSION_ID)
    timmy.print_response(f"Think carefully about: {topic}", stream=True, session_id=_CLI_SESSION_ID)


@app.command()
 def chat(
-    message: str = typer.Argument(..., help="Message to send"),
+    message: list[str] = typer.Argument(
+        ..., help="Message to send (multiple words are joined automatically)"
+    ),
    backend: str | None = _BACKEND_OPTION,
    model_size: str | None = _MODEL_SIZE_OPTION,
    new_session: bool = typer.Option(
@@ -128,21 +155,59 @@ def chat(
        "-n",
        help="Start a fresh conversation (ignore prior context)",
    ),
+    session_id: str | None = typer.Option(
+        None,
+        "--session-id",
+        help="Use a specific session ID for this conversation",
+    ),
+    autonomous: bool = typer.Option(
+        False,
+        "--autonomous",
+        "-a",
+        help="Autonomous mode: auto-approve allowlisted tools, reject the rest (no stdin prompts)",
+    ),
 ):
    """Send a message to Timmy.

-    Conversation history persists across invocations. Use --new to start fresh.
+    Conversation history persists across invocations. Use --new to start fresh,
+    or --session-id to use a specific session.
+
+    Use --autonomous for non-interactive contexts (scripts, dev loops). Tool
+    calls are checked against config/allowlist.yaml — allowlisted operations
+    execute automatically, everything else is safely rejected.
+
+    Read from stdin by passing "-" as the message or piping input.
    """
    import uuid

-    session_id = str(uuid.uuid4()) if new_session else _CLI_SESSION_ID
-    timmy = create_timmy(backend=backend, model_size=model_size)
+    # Join multiple arguments into a single message string
+    message_str = " ".join(message)
+
+    # Handle stdin input if "-" is passed or stdin is not a tty
+    if message_str == "-" or not _is_interactive():
+        try:
+            stdin_content = sys.stdin.read().strip()
+        except (KeyboardInterrupt, EOFError):
+            stdin_content = ""
+        if stdin_content:
+            message_str = stdin_content
+        elif message_str == "-":
+            typer.echo("No input provided via stdin.", err=True)
+            raise typer.Exit(1)
+
+    if session_id is not None:
+        pass  # use the provided value
+    elif new_session:
+        session_id = str(uuid.uuid4())
+    else:
+        session_id = _CLI_SESSION_ID
+    timmy = create_timmy(backend=backend, model_size=model_size, session_id=session_id)

    # Use agent.run() so we can intercept paused runs for tool confirmation.
-    run_output = timmy.run(message, stream=False, session_id=session_id)
+    run_output = timmy.run(message_str, stream=False, session_id=session_id)

    # Handle paused runs — dangerous tools need user approval
-    run_output = _handle_tool_confirmation(timmy, run_output, session_id)
+    run_output = _handle_tool_confirmation(timmy, run_output, session_id, autonomous=autonomous)

    # Print the final response
    content = run_output.content if hasattr(run_output, "content") else str(run_output)
@@ -152,13 +217,68 @@ def chat(
        typer.echo(_clean_response(content))


+@app.command()
+def repl(
+    backend: str | None = _BACKEND_OPTION,
+    model_size: str | None = _MODEL_SIZE_OPTION,
+    session_id: str | None = typer.Option(
+        None,
+        "--session-id",
+        help="Use a specific session ID for this conversation",
+    ),
+):
+    """Start an interactive REPL session with Timmy.
+
+    Keeps the agent warm between messages. Conversation history is persisted
+    across invocations. Use Ctrl+C or Ctrl+D to exit gracefully.
+    """
+    from timmy.session import chat
+
+    if session_id is None:
+        session_id = _CLI_SESSION_ID
+
+    typer.echo(typer.style("Timmy REPL", bold=True))
+    typer.echo("Type your messages below. Use Ctrl+C or Ctrl+D to exit.")
+    typer.echo()
+
+    loop = asyncio.new_event_loop()
+    asyncio.set_event_loop(loop)
+
+    try:
+        while True:
+            try:
+                user_input = input("> ")
+            except (KeyboardInterrupt, EOFError):
+                typer.echo()
+                typer.echo("Goodbye!")
+                break
+
+            user_input = user_input.strip()
+            if not user_input:
+                continue
+
+            if user_input.lower() in ("exit", "quit", "q"):
+                typer.echo("Goodbye!")
+                break
+
+            try:
+                response = loop.run_until_complete(chat(user_input, session_id=session_id))
+                if response:
+                    typer.echo(response)
+                    typer.echo()
+            except Exception as exc:
+                typer.echo(f"Error: {exc}", err=True)
+    finally:
+        loop.close()
+
+
@app.command()
 def status(
    backend: str | None = _BACKEND_OPTION,
    model_size: str | None = _MODEL_SIZE_OPTION,
 ):
    """Print Timmy's operational status."""
-    timmy = create_timmy(backend=backend, model_size=model_size)
+    timmy = create_timmy(backend=backend, model_size=model_size, session_id=_CLI_SESSION_ID)
    timmy.print_response(STATUS_PROMPT, stream=False, session_id=_CLI_SESSION_ID)


@@ -214,7 +334,8 @@ def interview(
            from timmy.mcp_tools import close_mcp_sessions

            loop.run_until_complete(close_mcp_sessions())
-        except Exception:
+        except Exception as exc:
+            logger.warning("MCP session close failed: %s", exc)
            pass
        loop.close()

@@ -248,5 +369,52 @@ def down():
    subprocess.run(["docker", "compose", "down"], check=True)


+@app.command()
+def voice(
+    whisper_model: str = typer.Option(
+        "base.en", "--whisper", "-w", help="Whisper model: tiny.en, base.en, small.en, medium.en"
+    ),
+    use_say: bool = typer.Option(False, "--say", help="Use macOS `say` instead of Piper TTS"),
+    threshold: float = typer.Option(
+        0.015, "--threshold", "-t", help="Mic silence threshold (RMS). Lower = more sensitive."
+    ),
+    silence: float = typer.Option(1.5, "--silence", help="Seconds of silence to end recording"),
+    backend: str | None = _BACKEND_OPTION,
+    model_size: str | None = _MODEL_SIZE_OPTION,
+):
+    """Start the sovereign voice loop — listen, think, speak.
+
+    Everything runs locally: Whisper for STT, Ollama for LLM, Piper for TTS.
+    No cloud services, no remote network calls; microphone data never leaves your machine.
+    """
+    from timmy.voice_loop import VoiceConfig, VoiceLoop
+
+    config = VoiceConfig(
+        whisper_model=whisper_model,
+        use_say_fallback=use_say,
+        silence_threshold=threshold,
+        silence_duration=silence,
+        backend=backend,
+        model_size=model_size,
+    )
+    loop = VoiceLoop(config=config)
+    loop.run()
+
+
+@app.command()
+def route(
+    message: list[str] = typer.Argument(..., help="Message to route"),
+):
+    """Show which agent would handle a message (debug routing)."""
+    full_message = " ".join(message)
+    from timmy.agents.loader import route_request_with_match
+
+    agent_id, matched_pattern = route_request_with_match(full_message)
+    if agent_id:
+        typer.echo(f"→ {agent_id} (matched: {matched_pattern})")
+    else:
+        typer.echo("→ orchestrator (no pattern match)")
+
+
 def main():
    app()
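Review note on the REPL command in this file: it creates one event loop up front and reuses it for every turn through `run_until_complete`, closing it in a `finally` block. The sketch below isolates that pattern. `fake_chat` and `run_turns` are hypothetical stand-ins written for illustration; they are not part of the Timmy codebase, and the real `chat()` coroutine may behave differently.

```python
import asyncio


async def fake_chat(message: str, session_id: str) -> str:
    """Hypothetical stand-in for the real async chat() coroutine."""
    await asyncio.sleep(0)  # yield to the loop once, as a real network call would
    return f"[{session_id}] echo: {message}"


def run_turns(inputs: list[str], session_id: str = "cli") -> list[str]:
    """Drive several chat turns on a single event loop.

    Blank lines are skipped and the words exit/quit/q end the session,
    mirroring the checks in the REPL above.
    """
    loop = asyncio.new_event_loop()
    replies: list[str] = []
    try:
        for raw in inputs:
            text = raw.strip()
            if not text:
                continue
            if text.lower() in ("exit", "quit", "q"):
                break
            replies.append(loop.run_until_complete(fake_chat(text, session_id=session_id)))
    finally:
        loop.close()  # always release the loop, even if a turn raises
    return replies
```

Reusing one loop avoids the per-turn setup and teardown that calling `asyncio.run()` inside the loop body would incur, at the cost of having to close the loop by hand.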
--- a/src/timmy/confidence.py
+++ b/src/timmy/confidence.py
@@ -0,0 +1,128 @@
+"""Confidence estimation for Timmy's responses.
+
+Implements SOUL.md requirement: "When I am uncertain, I must say so in
+proportion to my uncertainty."
+
+This module provides heuristics to estimate confidence based on linguistic
+signals in the response text. It measures uncertainty without modifying
+the response content.
+"""
+
+import re
+
+# Hedging words that indicate uncertainty
+HEDGING_WORDS = [
+    "i think",
+    "maybe",
+    "perhaps",
+    "not sure",
+    "might",
+    "could be",
+    "possibly",
+    "i believe",
+    "approximately",
+    "roughly",
+    "probably",
+    "likely",
+    "seems",
+    "appears",
+    "suggests",
+    "i guess",
+    "i suppose",
+    "sort of",
+    "kind of",
+    "somewhat",
+    "fairly",
+    "relatively",
+    "i'm not certain",
+    "i am not certain",
+    "uncertain",
+    "unclear",
+]
+
+# Certainty words that indicate confidence
+CERTAINTY_WORDS = [
+    "i know",
+    "definitely",
+    "certainly",
+    "the answer is",
+    "specifically",
+    "exactly",
+    "absolutely",
+    "without doubt",
+    "i am certain",
+    "i'm certain",
+    "it is true that",
+    "fact is",
+    "in fact",
+    "indeed",
+    "undoubtedly",
+    "clearly",
+    "obviously",
+    "conclusively",
+]
+
+# Very low confidence indicators (direct admissions of ignorance)
+LOW_CONFIDENCE_PATTERNS = [
+    r"i\s+(?:don't|do not)\s+know",
+    r"(?:i\s+am|i'm)\s+(?:not\s+sure|unsure)",
+    r"i\s+have\s+no\s+(?:idea|clue)",
+    r"i\s+cannot\s+(?:say|tell|answer)",
+    r"i\s+can't\s+(?:say|tell|answer)",
+]
+
+
+def estimate_confidence(text: str) -> float:
+    """Estimate confidence level of a response based on linguistic signals.
+
+    Analyzes the text for hedging words (reducing confidence) and certainty
+    words (increasing confidence). Returns a score between 0.0 and 1.0.
+
+    Args:
+        text: The response text to analyze.
+
+    Returns:
+        A float between 0.0 (very uncertain) and 1.0 (very confident).
+    """
+    if not text or not text.strip():
+        return 0.0
+
+    text_lower = text.lower().strip()
+    confidence = 0.5  # Start with neutral confidence
+
+    # Check for direct admissions of ignorance (very low confidence)
+    for pattern in LOW_CONFIDENCE_PATTERNS:
+        if re.search(pattern, text_lower):
+            # Direct admission of not knowing - very low confidence
+            confidence = 0.15
+            break
+
+    # Count hedging words (reduce confidence)
+    hedging_count = 0
+    for hedge in HEDGING_WORDS:
+        if hedge in text_lower:
+            hedging_count += 1
+
+    # Count certainty words (increase confidence)
+    certainty_count = 0
+    for certain in CERTAINTY_WORDS:
+        if certain in text_lower:
+            certainty_count += 1
+
+    # Adjust confidence based on word counts
+    # Each hedging word reduces confidence by 0.1
+    # Each certainty word increases confidence by 0.1
+    confidence -= hedging_count * 0.1
+    confidence += certainty_count * 0.1
+
+    # Short factual answers get a small boost
+    word_count = len(text.split())
+    if word_count <= 5 and confidence > 0.3:
+        confidence += 0.1
+
+    # Questions in response indicate uncertainty
+    if "?" in text:
+        confidence -= 0.15
+
+    # Clamp to valid range
+    return max(0.0, min(1.0, confidence))
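Review note on `estimate_confidence`: the hedging and certainty checks use plain substring containment, so `"might"` also fires inside "mighty" and `"fairly"` inside "unfairly". The sketch below contrasts that behaviour with a word-boundary match. The shortened `HEDGES` list and both helper names are illustrative assumptions, not code from this change.

```python
import re

# Small illustrative subset; the real HEDGING_WORDS list is longer.
HEDGES = ["might", "seems", "fairly"]


def count_substring(text: str, phrases: list[str]) -> int:
    """Count phrases by substring containment, as the diff currently does."""
    lowered = text.lower()
    return sum(1 for phrase in phrases if phrase in lowered)


def count_whole_words(text: str, phrases: list[str]) -> int:
    """Count phrases only where they stand as whole words or phrases."""
    lowered = text.lower()
    return sum(
        1 for phrase in phrases if re.search(rf"\b{re.escape(phrase)}\b", lowered)
    )


sample = "The mighty judge ruled unfairly."
print(count_substring(sample, HEDGES))    # 2: "might" in "mighty", "fairly" in "unfairly"
print(count_whole_words(sample, HEDGES))  # 0: neither appears as a whole word
```

Whether the stricter match is wanted depends on how the score is consumed; with substring matching, each false hit shifts the result by 0.1.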
--- a/src/timmy/gematria.py
+++ b/src/timmy/gematria.py
@@ -0,0 +1,387 @@
+"""Gematria computation engine — the language of letters and numbers.
+
+Implements multiple cipher systems for gematric analysis:
+  - Simple English (A=1 .. Z=26)
+  - Full Reduction (reduce each letter value to single digit)
+  - Reverse Ordinal (A=26 .. Z=1)
+  - Sumerian (Simple × 6)
+  - Hebrew (traditional letter values, for A-Z mapping)
+
+Also provides numerological reduction, notable-number lookup,
+and multi-phrase comparison.
+
+Alexander Whitestone = 222 in Simple English Gematria.
+This is not trivia.  It is foundational.
+"""
+
+from __future__ import annotations
+
+import math
+
+# ── Cipher Tables ────────────────────────────────────────────────────────────
+
+# Simple English: A=1, B=2, ..., Z=26
+_SIMPLE: dict[str, int] = {chr(i): i - 64 for i in range(65, 91)}
+
+# Full Reduction: reduce each letter to single digit (A=1..I=9, J=1..R=9, S=1..Z=8)
+_REDUCTION: dict[str, int] = {}
+for _c, _v in _SIMPLE.items():
+    _r = _v
+    while _r > 9:
+        _r = sum(int(d) for d in str(_r))
+    _REDUCTION[_c] = _r
+
+# Reverse Ordinal: A=26, B=25, ..., Z=1
+_REVERSE: dict[str, int] = {chr(i): 91 - i for i in range(65, 91)}
+
+# Sumerian: Simple × 6
+_SUMERIAN: dict[str, int] = {c: v * 6 for c, v in _SIMPLE.items()}
+
+# Hebrew-mapped: traditional Hebrew gematria values laid over the Latin alphabet
+# Units 1..9 (A..I), tens 10..90 (J..R), hundreds 100..400 (S..V, Qoph..Tav)
+# Hebrew has 22 letters, so W..Z extend the hundreds (500..800) to cover 26
+_HEBREW: dict[str, int] = {
+    "A": 1,
+    "B": 2,
+    "C": 3,
+    "D": 4,
+    "E": 5,
+    "F": 6,
+    "G": 7,
+    "H": 8,
+    "I": 9,
+    "J": 10,
+    "K": 20,
+    "L": 30,
+    "M": 40,
+    "N": 50,
+    "O": 60,
+    "P": 70,
+    "Q": 80,
+    "R": 90,
+    "S": 100,
+    "T": 200,
+    "U": 300,
+    "V": 400,
+    "W": 500,
+    "X": 600,
+    "Y": 700,
+    "Z": 800,
+}
+
+CIPHERS: dict[str, dict[str, int]] = {
+    "simple": _SIMPLE,
+    "reduction": _REDUCTION,
+    "reverse": _REVERSE,
+    "sumerian": _SUMERIAN,
+    "hebrew": _HEBREW,
+}
+
+# ── Notable Numbers ──────────────────────────────────────────────────────────
+
+NOTABLE_NUMBERS: dict[int, str] = {
+    1: "Unity, the Monad, beginning of all",
+    3: "Trinity, divine completeness, the Triad",
+    7: "Spiritual perfection, completion (7 days, 7 seals)",
+    9: "Finality, judgment, the last single digit",
+    11: "Master number — intuition, spiritual insight",
+    12: "Divine government (12 tribes, 12 apostles)",
+    13: "Rebellion and transformation, the 13th step",
+    22: "Master builder — turning dreams into reality",
+    26: "YHWH (Yod=10, He=5, Vav=6, He=5)",
+    33: "Master teacher — Christ consciousness, 33 vertebrae",
+    36: "The number of the righteous (Lamed-Vav Tzadikim)",
+    40: "Trial, testing, probation (40 days, 40 years)",
+    42: "The answer, and the number of generations to Christ",
+    72: "The Shemhamphorasch — 72 names of God",
+    88: "Mercury, infinite abundance, double infinity",
+    108: "Sacred in Hinduism and Buddhism (108 beads)",
+    111: "Angel number — new beginnings, alignment",
+    144: "12² — the elect, the sealed (144,000)",
+    153: "The miraculous catch of fish (John 21:11)",
+    222: "Alexander Whitestone. Balance, partnership, trust the process",
+    333: "Ascended masters present, divine protection",
+    369: "Tesla's key to the universe",
+    444: "Angels surrounding, foundation, stability",
+    555: "Major change coming, transformation",
+    616: "Earliest manuscript number of the Beast (P115)",
+    666: "Number of the Beast (Revelation 13:18), also carbon (6p 6n 6e)",
+    777: "Divine perfection tripled, jackpot of the spirit",
+    888: "Jesus in Greek isopsephy (Ιησους = 888)",
+    1776: "Year of independence, Bavarian Illuminati founding",
+}
+
+
+# ── Core Functions ───────────────────────────────────────────────────────────
+
+
+def _clean(text: str) -> str:
+    """Strip non-alpha, uppercase."""
+    return "".join(c for c in text.upper() if c.isalpha())
+
+
+def compute_value(text: str, cipher: str = "simple") -> int:
+    """Compute the gematria value of text in a given cipher.
+
+    Args:
+        text: Any string (non-alpha characters are ignored).
+        cipher: One of 'simple', 'reduction', 'reverse', 'sumerian', 'hebrew'.
+
+    Returns:
+        Integer gematria value.
+
+    Raises:
+        ValueError: If cipher name is not recognized.
+    """
+    table = CIPHERS.get(cipher)
+    if table is None:
+        raise ValueError(f"Unknown cipher: {cipher!r}. Use one of {list(CIPHERS)}")
+    return sum(table.get(c, 0) for c in _clean(text))
+
+
+def compute_all(text: str) -> dict[str, int]:
+    """Compute gematria value across all cipher systems.
+
+    Args:
+        text: Any string.
+
+    Returns:
+        Dict mapping cipher name to integer value.
+    """
+    return {name: compute_value(text, name) for name in CIPHERS}
+
+
+def letter_breakdown(text: str, cipher: str = "simple") -> list[tuple[str, int]]:
+    """Return per-letter values for a text in a given cipher.
+
+    Args:
+        text: Any string.
+        cipher: Cipher system name.
+
+    Returns:
+        List of (letter, value) tuples for each alpha character.
+    """
+    table = CIPHERS.get(cipher)
+    if table is None:
+        raise ValueError(f"Unknown cipher: {cipher!r}")
+    return [(c, table.get(c, 0)) for c in _clean(text)]
+
+
+def reduce_number(n: int) -> int:
+    """Numerological reduction — sum digits until single digit.
+
+    Master numbers (11, 22, 33) are preserved.
+
+    Args:
+        n: Any integer (the sign is ignored).
+
+    Returns:
+        Single-digit result (or master number 11/22/33).
+    """
+    n = abs(n)
+    while n > 9 and n not in (11, 22, 33):
+        n = sum(int(d) for d in str(n))
+    return n
+
+
+def factorize(n: int) -> list[int]:
+    """Prime factorization of n.
+
+    Args:
+        n: Positive integer.
+
+    Returns:
+        Prime factors in ascending order (with repetition); [1] for n=1, [] for n<1.
+    """
+    if n < 2:
+        return [n] if n > 0 else []
+    factors = []
+    d = 2
+    while d * d <= n:
+        while n % d == 0:
+            factors.append(d)
+            n //= d
+        d += 1
+    if n > 1:
+        factors.append(n)
+    return factors
+
+
+def analyze_number(n: int) -> dict:
+    """Deep analysis of a number — reduction, factors, significance.
+
+    Args:
+        n: Any positive integer.
+
+    Returns:
+        Dict with reduction, factors, properties, and any notable significance.
+    """
+    result: dict = {
+        "value": n,
+        "numerological_reduction": reduce_number(n),
+        "prime_factors": factorize(n),
+        "is_prime": len(factorize(n)) == 1 and n > 1,
+        "is_perfect_square": math.isqrt(n) ** 2 == n if n >= 0 else False,
+        "is_triangular": _is_triangular(n),
+        "digit_sum": sum(int(d) for d in str(abs(n))),
+    }
+
+    # Master numbers
+    if n in (11, 22, 33):
+        result["master_number"] = True
+
+    # Angel numbers (repeating digits)
+    s = str(n)
+    if len(s) >= 3 and len(set(s)) == 1:
+        result["angel_number"] = True
+
+    # Notable significance
+    if n in NOTABLE_NUMBERS:
+        result["significance"] = NOTABLE_NUMBERS[n]
+
+    return result
+
+
+def _is_triangular(n: int) -> bool:
+    """Check if n is a triangular number (1, 3, 6, 10, 15, ...)."""
+    if n < 0:
+        return False
+    # n = k(k+1)/2  →  k² + k - 2n = 0  →  k = (-1 + sqrt(1+8n))/2
+    discriminant = 1 + 8 * n
+    sqrt_d = math.isqrt(discriminant)
+    return sqrt_d * sqrt_d == discriminant and (sqrt_d - 1) % 2 == 0
+
+
+# ── Tool Function (registered with Timmy) ────────────────────────────────────
+
+
+def gematria(query: str) -> str:
+    """Compute gematria values, analyze numbers, and find correspondences.
+
+    This is the wizard's language — letters are numbers, numbers are letters.
+    Use this tool for ANY gematria calculation. Do not attempt mental arithmetic.
+
+    Input modes:
+      - A word or phrase → computes values across all cipher systems
+      - A bare integer → analyzes the number (factors, reduction, significance)
+      - "compare: X, Y, Z" → side-by-side gematria comparison
+
+    Examples:
+        gematria("Alexander Whitestone")
+        gematria("222")
+        gematria("compare: Timmy Time, Alexander Whitestone")
+
+    Args:
+        query: A word/phrase, a number, or a "compare:" instruction.
+
+    Returns:
+        Formatted gematria analysis as a string.
+    """
+    query = query.strip()
+
+    # Mode: compare
+    if query.lower().startswith("compare:"):
+        phrases = [p.strip() for p in query[8:].split(",") if p.strip()]
+        if len(phrases) < 2:
+            return "Compare requires at least two phrases separated by commas."
+        return _format_comparison(phrases)
+
+    # Mode: number analysis
+    if query.removeprefix("-").isdecimal():
+        n = int(query)
+        return _format_number_analysis(n)
+
+    # Mode: phrase gematria
+    if not _clean(query):
+        return "No alphabetic characters found in input."
+
+    return _format_phrase_analysis(query)
+
+
+def _format_phrase_analysis(text: str) -> str:
+    """Format full gematria analysis for a phrase."""
+    values = compute_all(text)
+    lines = [f'Gematria of "{text}":', ""]
+
+    # All cipher values
+    for cipher, val in values.items():
+        label = cipher.replace("_", " ").title()
+        lines.append(f"  {label:12s} = {val}")
+
+    # Letter breakdown (simple)
+    breakdown = letter_breakdown(text, "simple")
+    letters_str = " + ".join(f"{c}({v})" for c, v in breakdown)
+    lines.append(f"\n  Breakdown (Simple): {letters_str}")
+
+    # Numerological reduction of the simple value
+    simple_val = values["simple"]
+    reduced = reduce_number(simple_val)
+    lines.append(f"  Numerological root: {simple_val} → {reduced}")
+
+    # Check notable
+    for cipher, val in values.items():
+        if val in NOTABLE_NUMBERS:
+            label = cipher.replace("_", " ").title()
+            lines.append(f"\n  ★ {val} ({label}): {NOTABLE_NUMBERS[val]}")
+
+    return "\n".join(lines)
+
+
+def _format_number_analysis(n: int) -> str:
+    """Format deep number analysis."""
+    info = analyze_number(n)
+    lines = [f"Analysis of {n}:", ""]
+    lines.append(f"  Numerological reduction: {n} → {info['numerological_reduction']}")
+    lines.append(f"  Prime factors: {' × '.join(str(f) for f in info['prime_factors']) or 'N/A'}")
+    lines.append(f"  Is prime: {info['is_prime']}")
+    lines.append(f"  Is perfect square: {info['is_perfect_square']}")
+    lines.append(f"  Is triangular: {info['is_triangular']}")
+    lines.append(f"  Digit sum: {info['digit_sum']}")
+
+    if info.get("master_number"):
+        lines.append("  ★ Master Number")
+    if info.get("angel_number"):
+        lines.append("  ★ Angel Number (repeating digits)")
+    if info.get("significance"):
+        lines.append(f"\n  Significance: {info['significance']}")
+
+    return "\n".join(lines)
+
+
+def _format_comparison(phrases: list[str]) -> str:
+    """Format side-by-side gematria comparison."""
+    lines = ["Gematria Comparison:", ""]
+
+    # Header
+    max_name = max(len(p) for p in phrases)
+    header = f"  {'Phrase':<{max_name}s}  Simple  Reduct  Reverse  Sumerian  Hebrew"
+    lines.append(header)
+    lines.append("  " + "─" * (len(header) - 2))
+
+    all_values = {}
+    for phrase in phrases:
+        vals = compute_all(phrase)
+        all_values[phrase] = vals
+        lines.append(
+            f"  {phrase:<{max_name}s}  {vals['simple']:>6d}  {vals['reduction']:>6d}"
+            f"  {vals['reverse']:>7d}  {vals['sumerian']:>8d}  {vals['hebrew']:>6d}"
+        )
+
+    # Find matches (shared values across any cipher)
+    matches = []
+    for cipher in CIPHERS:
+        vals_by_cipher = {p: all_values[p][cipher] for p in phrases}
+        unique_vals = set(vals_by_cipher.values())
+        if len(unique_vals) < len(phrases):
+            # At least two phrases share a value
+            for v in unique_vals:
+                sharing = [p for p, pv in vals_by_cipher.items() if pv == v]
+                if len(sharing) > 1:
+                    label = cipher.title()
+                    matches.append(f"  ★ {label} = {v}: " + ", ".join(sharing))
+
+    if matches:
+        lines.append("\nCorrespondences found:")
+        lines.extend(matches)
+
+    return "\n".join(lines)
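Review note on the module docstring's claim that "Alexander Whitestone" equals 222 in Simple English: the arithmetic can be checked independently of the module. The sketch below rebuilds only the Simple and Reverse tables; the helper names are illustrative assumptions, and non-letters are skipped as `_clean` does for ASCII input.

```python
import string

# Simple English cipher: A=1 .. Z=26 (same mapping as _SIMPLE in the diff).
SIMPLE = {letter: index for index, letter in enumerate(string.ascii_uppercase, start=1)}


def simple_value(text: str) -> int:
    """Sum A=1..Z=26 over the ASCII letters of text, ignoring everything else."""
    return sum(SIMPLE[c] for c in text.upper() if c in SIMPLE)


def reverse_value(text: str) -> int:
    """Sum A=26..Z=1; each letter is 27 minus its Simple value."""
    return sum(27 - SIMPLE[c] for c in text.upper() if c in SIMPLE)


print(simple_value("Alexander"))              # 84
print(simple_value("Whitestone"))             # 138
print(simple_value("Alexander Whitestone"))   # 222
print(reverse_value("Alexander Whitestone"))  # 291, i.e. 27 * 19 letters - 222
```

The Sumerian value follows directly as 6 × 222 = 1332, since that cipher multiplies every Simple letter value by 6.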
--- a/src/timmy/interview.py
+++ b/src/timmy/interview.py
@@ -86,7 +86,7 @@ def run_interview(

        try:
            answer = chat_fn(question)
-        except Exception as exc:
+        except Exception as exc:  # broad catch intentional: chat_fn can raise any error
            logger.error("Interview question failed: %s", exc)
            answer = f"(Error: {exc})"

--- a/src/timmy/loop_qa.py
+++ b/src/timmy/loop_qa.py
@@ -262,7 +262,8 @@ def capture_error(exc, **kwargs):
        from infrastructure.error_capture import capture_error as _capture

        return _capture(exc, **kwargs)
-    except Exception:
+    except Exception as capture_exc:
+        logger.debug("Failed to capture error: %s", capture_exc)
        logger.debug("Failed to capture error", exc_info=True)


--- a/src/timmy/mcp_tools.py
+++ b/src/timmy/mcp_tools.py
@@ -25,6 +25,7 @@ import os
 import shutil
 import sqlite3
 import uuid
+from contextlib import closing
 from datetime import datetime
 from pathlib import Path

@@ -40,7 +41,7 @@ def _parse_command(command_str: str) -> tuple[str, list[str]]:
    """Split a command string into (executable, args).

    Handles ``~/`` expansion and resolves via PATH if needed.
-    E.g. ``"gitea-mcp -t stdio"`` → ``("/Users/x/go/bin/gitea-mcp", ["-t", "stdio"])``
+    E.g. ``"gitea-mcp-server -t stdio"`` → ``("/opt/homebrew/bin/gitea-mcp-server", ["-t", "stdio"])``
    """
    parts = command_str.split()
    executable = os.path.expanduser(parts[0])
@@ -163,37 +164,36 @@ def _bridge_to_work_order(title: str, body: str, category: str) -> None:
    try:
        db_path = Path(settings.repo_root) / "data" / "work_orders.db"
        db_path.parent.mkdir(parents=True, exist_ok=True)
-        conn = sqlite3.connect(str(db_path))
-        conn.execute(
-            """CREATE TABLE IF NOT EXISTS work_orders (
-                id TEXT PRIMARY KEY,
-                title TEXT NOT NULL,
-                description TEXT DEFAULT '',
-                priority TEXT DEFAULT 'medium',
-                category TEXT DEFAULT 'suggestion',
-                submitter TEXT DEFAULT 'dashboard',
-                related_files TEXT DEFAULT '',
-                status TEXT DEFAULT 'submitted',
-                result TEXT DEFAULT '',
-                rejection_reason TEXT DEFAULT '',
-                created_at TEXT DEFAULT (datetime('now')),
-                completed_at TEXT
-            )"""
-        )
-        conn.execute(
-            "INSERT INTO work_orders (id, title, description, category, submitter, created_at) "
-            "VALUES (?, ?, ?, ?, ?, ?)",
-            (
-                str(uuid.uuid4()),
-                title,
-                body,
-                category,
-                "timmy-thinking",
-                datetime.utcnow().isoformat(),
-            ),
-        )
-        conn.commit()
-        conn.close()
+        with closing(sqlite3.connect(str(db_path))) as conn:
+            conn.execute(
+                """CREATE TABLE IF NOT EXISTS work_orders (
+                    id TEXT PRIMARY KEY,
+                    title TEXT NOT NULL,
+                    description TEXT DEFAULT '',
+                    priority TEXT DEFAULT 'medium',
+                    category TEXT DEFAULT 'suggestion',
+                    submitter TEXT DEFAULT 'dashboard',
+                    related_files TEXT DEFAULT '',
+                    status TEXT DEFAULT 'submitted',
+                    result TEXT DEFAULT '',
+                    rejection_reason TEXT DEFAULT '',
+                    created_at TEXT DEFAULT (datetime('now')),
+                    completed_at TEXT
+                )"""
+            )
+            conn.execute(
+                "INSERT INTO work_orders (id, title, description, category, submitter, created_at) "
+                "VALUES (?, ?, ?, ?, ?, ?)",
+                (
+                    str(uuid.uuid4()),
+                    title,
+                    body,
+                    category,
+                    "timmy-thinking",
+                    datetime.utcnow().isoformat(),
+                ),
+            )
+            conn.commit()
    except Exception as exc:
        logger.debug("Work order bridge failed: %s", exc)
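Review note on the switch to `closing(...)` in `_bridge_to_work_order`: `contextlib.closing` guarantees `conn.close()` but does not commit, which is why the explicit `conn.commit()` stays. An alternative is to nest the connection's own context manager, which commits on success and rolls back on an exception. The sketch below shows that variant with a reduced one-column table; `insert_order` and `count_orders` are hypothetical helpers, not functions from this repository.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def insert_order(db_path: Path, title: str) -> None:
    """Insert one row; closing() closes the handle, `with conn` ends the transaction."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        with conn:  # commit on success, rollback if the block raises
            conn.execute("CREATE TABLE IF NOT EXISTS work_orders (title TEXT NOT NULL)")
            conn.execute("INSERT INTO work_orders (title) VALUES (?)", (title,))


def count_orders(db_path: Path) -> int:
    """Return the number of stored work orders."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        return conn.execute("SELECT COUNT(*) FROM work_orders").fetchone()[0]
```

Either form works; the nested version removes the risk of a forgotten `commit()` if more statements are added later.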

--- a/src/timmy/memory/unified.py
+++ b/src/timmy/memory/unified.py
@@ -1,85 +1,201 @@
-"""Unified memory database — single SQLite DB for all memory types.
+"""Unified memory schema and connection management.

-Consolidates three previously separate stores into one:
- **facts**: Long-term knowledge (user preferences, learned patterns)
- **chunks**: Indexed vault documents (markdown files from memory/)
- **episodes**: Runtime memories (conversations, agent observations)
-
-All three tables live in ``data/memory.db``.  Existing APIs in
-``vector_store.py`` and ``semantic_memory.py`` are updated to point here.
+This module provides the central database schema for Timmy's consolidated
+memory system. All memory types (facts, conversations, documents, vault chunks)
+are stored in a single `memories` table with a `memory_type` discriminator.
 """

 import logging
 import sqlite3
+import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
+from dataclasses import dataclass, field
+from datetime import UTC, datetime
 from pathlib import Path

 logger = logging.getLogger(__name__)

-DB_PATH = Path(__file__).parent.parent.parent.parent / "data" / "memory.db"
+# Paths
+PROJECT_ROOT = Path(__file__).parent.parent.parent.parent
+DB_PATH = PROJECT_ROOT / "data" / "memory.db"


-def get_connection() -> sqlite3.Connection:
-    """Open (and lazily create) the unified memory database."""
+@contextmanager
+def get_connection() -> Generator[sqlite3.Connection, None, None]:
+    """Yield a connection to the unified memory database; it is closed on exit."""
    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    conn.execute("PRAGMA journal_mode=WAL")
-    conn.execute("PRAGMA busy_timeout=5000")
-    _ensure_schema(conn)
-    return conn
+    with closing(sqlite3.connect(str(DB_PATH))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("PRAGMA journal_mode=WAL")
+        conn.execute("PRAGMA busy_timeout=5000")
+        _ensure_schema(conn)
+        yield conn


 def _ensure_schema(conn: sqlite3.Connection) -> None:
-    """Create the three core tables and indexes if they don't exist."""
-
-    # --- facts ---------------------------------------------------------------
+    """Create the unified memories table and indexes if they don't exist."""
    conn.execute("""
-        CREATE TABLE IF NOT EXISTS facts (
+        CREATE TABLE IF NOT EXISTS memories (
            id TEXT PRIMARY KEY,
-            category TEXT NOT NULL DEFAULT 'general',
            content TEXT NOT NULL,
-            confidence REAL NOT NULL DEFAULT 0.8,
+            memory_type TEXT NOT NULL DEFAULT 'fact',
            source TEXT NOT NULL DEFAULT 'agent',
+            embedding TEXT,
+            metadata TEXT,
+            source_hash TEXT,
+            agent_id TEXT,
+            task_id TEXT,
+            session_id TEXT,
+            confidence REAL NOT NULL DEFAULT 0.8,
            tags TEXT NOT NULL DEFAULT '[]',
            created_at TEXT NOT NULL,
            last_accessed TEXT,
            access_count INTEGER NOT NULL DEFAULT 0
        )
    """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_facts_category ON facts(category)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_facts_confidence ON facts(confidence)")

-    # --- chunks (vault document fragments) -----------------------------------
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS chunks (
-            id TEXT PRIMARY KEY,
-            source TEXT NOT NULL,
-            content TEXT NOT NULL,
-            embedding TEXT NOT NULL,
-            created_at TEXT NOT NULL,
-            source_hash TEXT NOT NULL
-        )
-    """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_chunks_source ON chunks(source)")
+    # Create indexes for efficient querying
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(memory_type)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_time ON memories(created_at)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_session ON memories(session_id)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_agent ON memories(agent_id)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_source ON memories(source)")
+    conn.commit()

-    # --- episodes (runtime memory entries) -----------------------------------
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS episodes (
-            id TEXT PRIMARY KEY,
-            content TEXT NOT NULL,
-            source TEXT NOT NULL,
-            context_type TEXT NOT NULL DEFAULT 'conversation',
-            embedding TEXT,
-            metadata TEXT,
-            agent_id TEXT,
-            task_id TEXT,
-            session_id TEXT,
-            timestamp TEXT NOT NULL
-        )
-    """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_episodes_type ON episodes(context_type)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_episodes_time ON episodes(timestamp)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_episodes_session ON episodes(session_id)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_episodes_agent ON episodes(agent_id)")
+    # Run migration if needed
+    _migrate_schema(conn)
+
+
+def _migrate_schema(conn: sqlite3.Connection) -> None:
+    """Migrate from old three-table schema to unified memories table.
+
+    Migration paths:
+    - episodes table -> memories (context_type -> memory_type)
+    - chunks table -> memories with memory_type='vault_chunk'
+    - facts table -> dropped (unused, 0 rows expected)
+    """
+    cursor = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
+    tables = {row[0] for row in cursor.fetchall()}
+
+    has_memories = "memories" in tables
+    has_episodes = "episodes" in tables
+    has_chunks = "chunks" in tables
+    has_facts = "facts" in tables
+
+    # _ensure_schema creates `memories` before calling this function, so the
+    # branch below is defensive only and should not trigger in normal use.
+    if not has_memories:
+        logger.info("Migration: Creating unified memories table")
+
+    # Migrate episodes -> memories
+    if has_episodes and has_memories:
+        logger.info("Migration: Converting episodes table to memories")
+        try:
+            cols = _get_table_columns(conn, "episodes")
+            context_type_col = "context_type" if "context_type" in cols else "'conversation'"
+
+            conn.execute(f"""
+                INSERT INTO memories (
+                    id, content, memory_type, source, embedding,
+                    metadata, agent_id, task_id, session_id,
+                    created_at, access_count, last_accessed
+                )
+                SELECT
+                    id, content,
+                    COALESCE({context_type_col}, 'conversation'),
+                    COALESCE(source, 'agent'),
+                    embedding,
+                    metadata, agent_id, task_id, session_id,
+                    COALESCE(timestamp, datetime('now')), 0, NULL
+                FROM episodes
+            """)
+            conn.execute("DROP TABLE episodes")
+            logger.info("Migration: Migrated episodes to memories")
+        except sqlite3.Error as exc:
+            logger.warning("Migration: Failed to migrate episodes: %s", exc)
+
+    # Migrate chunks -> memories as vault_chunk
+    if has_chunks and has_memories:
+        logger.info("Migration: Converting chunks table to memories")
+        try:
+            cols = _get_table_columns(conn, "chunks")
+
+            id_col = "id" if "id" in cols else "CAST(rowid AS TEXT)"
+            content_col = "content" if "content" in cols else "text"
+            source_col = (
+                "filepath" if "filepath" in cols else ("source" if "source" in cols else "'vault'")
+            )
+            embedding_col = "embedding" if "embedding" in cols else "NULL"
+            created_col = "created_at" if "created_at" in cols else "datetime('now')"
+
+            conn.execute(f"""
+                INSERT INTO memories (
+                    id, content, memory_type, source, embedding,
+                    created_at, access_count
+                )
+                SELECT
+                    {id_col}, {content_col}, 'vault_chunk', {source_col},
+                    {embedding_col}, {created_col}, 0
+                FROM chunks
+            """)
+            conn.execute("DROP TABLE chunks")
+            logger.info("Migration: Migrated chunks to memories")
+        except sqlite3.Error as exc:
+            logger.warning("Migration: Failed to migrate chunks: %s", exc)
+
+    # Drop old facts table
+    if has_facts:
+        try:
+            conn.execute("DROP TABLE facts")
+            logger.info("Migration: Dropped old facts table")
+        except sqlite3.Error as exc:
+            logger.warning("Migration: Failed to drop facts: %s", exc)

    conn.commit()
+
+
+def _get_table_columns(conn: sqlite3.Connection, table_name: str) -> set[str]:
+    """Get the column names for a table."""
+    cursor = conn.execute(f"PRAGMA table_info({table_name})")
+    return {row[1] for row in cursor.fetchall()}
+
+
+# Backward compatibility aliases
+get_conn = get_connection
+
+
+@dataclass
+class MemoryEntry:
+    """A memory entry with vector embedding.
+
+    Note: The DB column is `memory_type` but this field is named `context_type`
+    for backward API compatibility.
+    """
+
+    id: str = field(default_factory=lambda: str(uuid.uuid4()))
+    content: str = ""  # The actual text content
+    source: str = ""  # Where it came from (agent, user, system)
+    context_type: str = "conversation"  # API field name; DB column is memory_type
+    agent_id: str | None = None
+    task_id: str | None = None
+    session_id: str | None = None
+    metadata: dict | None = None
+    embedding: list[float] | None = None
+    timestamp: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
+    relevance_score: float | None = None  # Set during search
+
+
+@dataclass
+class MemoryChunk:
+    """A searchable chunk of memory."""
+
+    id: str
+    source: str  # filepath
+    content: str
+    embedding: list[float]
+    created_at: str
+
+
+# Note: Functions are available via memory_system module directly
+# from timmy.memory_system import store_memory, search_memories, etc.
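Reviewer sketch, not part of the patch: `_get_table_columns` above formats `table_name` into the PRAGMA statement with an f-string. A minimal alternative, assuming SQLite 3.16 or newer, reads the same information through the `pragma_table_info` table-valued function so the name can be bound as a parameter. The function name `get_table_columns` and the sample table are hypothetical and only illustrate the idea.

```python
import sqlite3


def get_table_columns(conn: sqlite3.Connection, table_name: str) -> set[str]:
    """Return the column names of table_name (empty set if the table is missing)."""
    # pragma_table_info is a table-valued function, so the table name is passed
    # as a bound parameter instead of being formatted into the SQL text.
    rows = conn.execute("SELECT name FROM pragma_table_info(?)", (table_name,))
    return {row[0] for row in rows}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunks (id TEXT, content TEXT, embedding TEXT)")
    print(sorted(get_table_columns(conn, "chunks")))
    print(get_table_columns(conn, "missing"))
```

A missing table yields an empty set rather than an error, which matches how a migration would probe for optional legacy columns.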
--- a/src/timmy/memory/vector_store.py
+++ b/src/timmy/memory/vector_store.py
@@ -1,430 +1,37 @@
-"""Vector store for semantic memory using sqlite-vss.
-
-Provides embedding-based similarity search for the Echo agent
-to retrieve relevant context from conversation history.
-"""
-
-import json
-import sqlite3
-import uuid
-from dataclasses import dataclass, field
-from datetime import UTC, datetime
-
-
-def _check_embedding_model() -> bool | None:
-    """Check if the canonical embedding model is available."""
-    try:
-        from timmy.semantic_memory import _get_embedding_model
-
-        model = _get_embedding_model()
-        return model is not None and model is not False
-    except Exception:
-        return None
-
-
-def _compute_embedding(text: str) -> list[float]:
-    """Compute embedding vector for text.
-
-    Delegates to the canonical embedding provider in semantic_memory
-    to avoid loading the model multiple times.
-    """
-    from timmy.semantic_memory import embed_text
-
-    return embed_text(text)
-
-
-@dataclass
-class MemoryEntry:
-    """A memory entry with vector embedding."""
-
-    id: str = field(default_factory=lambda: str(uuid.uuid4()))
-    content: str = ""  # The actual text content
-    source: str = ""  # Where it came from (agent, user, system)
-    context_type: str = "conversation"  # conversation, document, fact, etc.
-    agent_id: str | None = None
-    task_id: str | None = None
-    session_id: str | None = None
-    metadata: dict | None = None
-    embedding: list[float] | None = None
-    timestamp: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
-    relevance_score: float | None = None  # Set during search
-
-
-def _get_conn() -> sqlite3.Connection:
-    """Get database connection to unified memory.db."""
-    from timmy.memory.unified import get_connection
-
-    return get_connection()
-
-
-def store_memory(
-    content: str,
-    source: str,
-    context_type: str = "conversation",
-    agent_id: str | None = None,
-    task_id: str | None = None,
-    session_id: str | None = None,
-    metadata: dict | None = None,
-    compute_embedding: bool = True,
-) -> MemoryEntry:
-    """Store a memory entry with optional embedding.
-
-    Args:
-        content: The text content to store
-        source: Source of the memory (agent name, user, system)
-        context_type: Type of context (conversation, document, fact)
-        agent_id: Associated agent ID
-        task_id: Associated task ID
-        session_id: Session identifier
-        metadata: Additional structured data
-        compute_embedding: Whether to compute vector embedding
-
-    Returns:
-        The stored MemoryEntry
-    """
-    embedding = None
-    if compute_embedding:
-        embedding = _compute_embedding(content)
-
-    entry = MemoryEntry(
-        content=content,
-        source=source,
-        context_type=context_type,
-        agent_id=agent_id,
-        task_id=task_id,
-        session_id=session_id,
-        metadata=metadata,
-        embedding=embedding,
-    )
-
-    conn = _get_conn()
-    conn.execute(
-        """
-        INSERT INTO episodes
-        (id, content, source, context_type, agent_id, task_id, session_id,
-         metadata, embedding, timestamp)
-        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-        """,
-        (
-            entry.id,
-            entry.content,
-            entry.source,
-            entry.context_type,
-            entry.agent_id,
-            entry.task_id,
-            entry.session_id,
-            json.dumps(metadata) if metadata else None,
-            json.dumps(embedding) if embedding else None,
-            entry.timestamp,
-        ),
-    )
-    conn.commit()
-    conn.close()
-
-    return entry
-
-
-def search_memories(
-    query: str,
-    limit: int = 10,
-    context_type: str | None = None,
-    agent_id: str | None = None,
-    session_id: str | None = None,
-    min_relevance: float = 0.0,
-) -> list[MemoryEntry]:
-    """Search for memories by semantic similarity.
-
-    Args:
-        query: Search query text
-        limit: Maximum results
-        context_type: Filter by context type
-        agent_id: Filter by agent
-        session_id: Filter by session
-        min_relevance: Minimum similarity score (0-1)
-
-    Returns:
-        List of MemoryEntry objects sorted by relevance
-    """
-    query_embedding = _compute_embedding(query)
-
-    conn = _get_conn()
-
-    # Build query with filters
-    conditions = []
-    params = []
-
-    if context_type:
-        conditions.append("context_type = ?")
-        params.append(context_type)
-    if agent_id:
-        conditions.append("agent_id = ?")
-        params.append(agent_id)
-    if session_id:
-        conditions.append("session_id = ?")
-        params.append(session_id)
-
-    where_clause = "WHERE " + " AND ".join(conditions) if conditions else ""
-
-    # Fetch candidates (we'll do in-memory similarity for now)
-    # For production with sqlite-vss, this would use vector similarity index
-    query_sql = f"""
-        SELECT * FROM episodes
-        {where_clause}
-        ORDER BY timestamp DESC
-        LIMIT ?
-    """
-    params.append(limit * 3)  # Get more candidates for ranking
-
-    rows = conn.execute(query_sql, params).fetchall()
-    conn.close()
-
-    # Compute similarity scores
-    results = []
-    for row in rows:
-        entry = MemoryEntry(
-            id=row["id"],
-            content=row["content"],
-            source=row["source"],
-            context_type=row["context_type"],
-            agent_id=row["agent_id"],
-            task_id=row["task_id"],
-            session_id=row["session_id"],
-            metadata=json.loads(row["metadata"]) if row["metadata"] else None,
-            embedding=json.loads(row["embedding"]) if row["embedding"] else None,
-            timestamp=row["timestamp"],
-        )
-
-        if entry.embedding:
-            # Cosine similarity
-            score = _cosine_similarity(query_embedding, entry.embedding)
-            entry.relevance_score = score
-            if score >= min_relevance:
-                results.append(entry)
-        else:
-            # Fallback: check for keyword overlap
-            score = _keyword_overlap(query, entry.content)
-            entry.relevance_score = score
-            if score >= min_relevance:
-                results.append(entry)
-
-    # Sort by relevance and return top results
-    results.sort(key=lambda x: x.relevance_score or 0, reverse=True)
-    return results[:limit]
-
-
-def _cosine_similarity(a: list[float], b: list[float]) -> float:
-    """Compute cosine similarity between two vectors."""
-    dot = sum(x * y for x, y in zip(a, b, strict=False))
-    norm_a = sum(x * x for x in a) ** 0.5
-    norm_b = sum(x * x for x in b) ** 0.5
-    if norm_a == 0 or norm_b == 0:
-        return 0.0
-    return dot / (norm_a * norm_b)
-
-
-def _keyword_overlap(query: str, content: str) -> float:
-    """Simple keyword overlap score as fallback."""
-    query_words = set(query.lower().split())
-    content_words = set(content.lower().split())
-    if not query_words:
-        return 0.0
-    overlap = len(query_words & content_words)
-    return overlap / len(query_words)
-
-
-def get_memory_context(query: str, max_tokens: int = 2000, **filters) -> str:
-    """Get relevant memory context as formatted text for LLM prompts.
-
-    Args:
-        query: Search query
-        max_tokens: Approximate maximum tokens to return
-        **filters: Additional filters (agent_id, session_id, etc.)
-
-    Returns:
-        Formatted context string for inclusion in prompts
-    """
-    memories = search_memories(query, limit=20, **filters)
-
-    context_parts = []
-    total_chars = 0
-    max_chars = max_tokens * 4  # Rough approximation
-
-    for mem in memories:
-        formatted = f"[{mem.source}]: {mem.content}"
-        if total_chars + len(formatted) > max_chars:
-            break
-        context_parts.append(formatted)
-        total_chars += len(formatted)
-
-    if not context_parts:
-        return ""
-
-    return "Relevant context from memory:\n" + "\n\n".join(context_parts)
-
-
-def recall_personal_facts(agent_id: str | None = None) -> list[str]:
-    """Recall personal facts about the user or system.
-
-    Args:
-        agent_id: Optional agent filter
-
-    Returns:
-        List of fact strings
-    """
-    conn = _get_conn()
-
-    if agent_id:
-        rows = conn.execute(
-            """
-            SELECT content FROM episodes
-            WHERE context_type = 'fact' AND agent_id = ?
-            ORDER BY timestamp DESC
-            LIMIT 100
-            """,
-            (agent_id,),
-        ).fetchall()
-    else:
-        rows = conn.execute(
-            """
-            SELECT content FROM episodes
-            WHERE context_type = 'fact'
-            ORDER BY timestamp DESC
-            LIMIT 100
-            """,
-        ).fetchall()
-
-    conn.close()
-    return [r["content"] for r in rows]
-
-
-def recall_personal_facts_with_ids(agent_id: str | None = None) -> list[dict]:
-    """Recall personal facts with their IDs for edit/delete operations."""
-    conn = _get_conn()
-    if agent_id:
-        rows = conn.execute(
-            "SELECT id, content FROM episodes WHERE context_type = 'fact' AND agent_id = ? ORDER BY timestamp DESC LIMIT 100",
-            (agent_id,),
-        ).fetchall()
-    else:
-        rows = conn.execute(
-            "SELECT id, content FROM episodes WHERE context_type = 'fact' ORDER BY timestamp DESC LIMIT 100",
-        ).fetchall()
-    conn.close()
-    return [{"id": r["id"], "content": r["content"]} for r in rows]
-
-
-def update_personal_fact(memory_id: str, new_content: str) -> bool:
-    """Update a personal fact's content."""
-    conn = _get_conn()
-    cursor = conn.execute(
-        "UPDATE episodes SET content = ? WHERE id = ? AND context_type = 'fact'",
-        (new_content, memory_id),
-    )
-    conn.commit()
-    updated = cursor.rowcount > 0
-    conn.close()
-    return updated
-
-
-def store_personal_fact(fact: str, agent_id: str | None = None) -> MemoryEntry:
-    """Store a personal fact about the user or system.
-
-    Args:
-        fact: The fact to store
-        agent_id: Associated agent
-
-    Returns:
-        The stored MemoryEntry
-    """
-    return store_memory(
-        content=fact,
-        source="system",
-        context_type="fact",
-        agent_id=agent_id,
-        metadata={"auto_extracted": False},
-    )
-
-
-def delete_memory(memory_id: str) -> bool:
-    """Delete a memory entry by ID.
-
-    Returns:
-        True if deleted, False if not found
-    """
-    conn = _get_conn()
-    cursor = conn.execute(
-        "DELETE FROM episodes WHERE id = ?",
-        (memory_id,),
-    )
-    conn.commit()
-    deleted = cursor.rowcount > 0
-    conn.close()
-    return deleted
-
-
-def get_memory_stats() -> dict:
-    """Get statistics about the memory store.
-
-    Returns:
-        Dict with counts by type, total entries, etc.
-    """
-    conn = _get_conn()
-
-    total = conn.execute("SELECT COUNT(*) as count FROM episodes").fetchone()["count"]
-
-    by_type = {}
-    rows = conn.execute(
-        "SELECT context_type, COUNT(*) as count FROM episodes GROUP BY context_type"
-    ).fetchall()
-    for row in rows:
-        by_type[row["context_type"]] = row["count"]
-
-    with_embeddings = conn.execute(
-        "SELECT COUNT(*) as count FROM episodes WHERE embedding IS NOT NULL"
-    ).fetchone()["count"]
-
-    conn.close()
-
-    return {
-        "total_entries": total,
-        "by_type": by_type,
-        "with_embeddings": with_embeddings,
-        "has_embedding_model": _check_embedding_model(),
-    }
-
-
-def prune_memories(older_than_days: int = 90, keep_facts: bool = True) -> int:
-    """Delete old memories to manage storage.
-
-    Args:
-        older_than_days: Delete memories older than this
-        keep_facts: Whether to preserve fact-type memories
-
-    Returns:
-        Number of entries deleted
-    """
-    from datetime import timedelta
-
-    cutoff = (datetime.now(UTC) - timedelta(days=older_than_days)).isoformat()
-
-    conn = _get_conn()
-
-    if keep_facts:
-        cursor = conn.execute(
-            """
-            DELETE FROM episodes
-            WHERE timestamp < ? AND context_type != 'fact'
-            """,
-            (cutoff,),
-        )
-    else:
-        cursor = conn.execute(
-            "DELETE FROM episodes WHERE timestamp < ?",
-            (cutoff,),
-        )
-
-    deleted = cursor.rowcount
-    conn.commit()
-    conn.close()
-
-    return deleted
+"""Backward compatibility — all memory functions live in memory_system now."""
+
+from timmy.memory_system import (
+    DB_PATH,
+    MemoryEntry,
+    _cosine_similarity,
+    _keyword_overlap,
+    delete_memory,
+    get_memory_context,
+    get_memory_stats,
+    get_memory_system,
+    prune_memories,
+    recall_personal_facts,
+    recall_personal_facts_with_ids,
+    search_memories,
+    store_memory,
+    store_personal_fact,
+    update_personal_fact,
+)
+
+__all__ = [
+    "DB_PATH",
+    "MemoryEntry",
+    "delete_memory",
+    "get_memory_context",
+    "get_memory_stats",
+    "get_memory_system",
+    "prune_memories",
+    "recall_personal_facts",
+    "recall_personal_facts_with_ids",
+    "search_memories",
+    "store_memory",
+    "store_personal_fact",
+    "update_personal_fact",
+    "_cosine_similarity",
+    "_keyword_overlap",
+]
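Reviewer sketch, not part of the patch: the removed `search_memories` scored each candidate by cosine similarity when an embedding was stored and fell back to keyword overlap otherwise, and the shim still re-exports `_cosine_similarity` and `_keyword_overlap`. The standalone version below restates that ranking logic. The `rank` helper and its `(content, embedding)` tuples are hypothetical simplifications, and the current `memory_system` implementation may differ.

```python
def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_overlap(query: str, content: str) -> float:
    """Fraction of query words that also appear in content."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & set(content.lower().split())) / len(query_words)


def rank(query, query_embedding, candidates, limit=10, min_relevance=0.0):
    """Score (content, embedding_or_None) pairs and return the best matches."""
    scored = []
    for content, embedding in candidates:
        if embedding:
            score = cosine_similarity(query_embedding, embedding)
        else:
            # No stored vector: fall back to plain word overlap.
            score = keyword_overlap(query, content)
        if score >= min_relevance:
            scored.append((content, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]
```

Note that the two scores are not on a comparable scale, so mixing them in one ranking is a pragmatic fallback rather than a calibrated measure.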
--- a/src/timmy/memory_system.py
+++ b/src/timmy/memory_system.py
--- a/src/timmy/prompts.py
+++ b/src/timmy/prompts.py
@@ -9,11 +9,15 @@ Two tiers based on model capability:
 # Lite prompt — for small models that can't reliably handle tool calling
 # ---------------------------------------------------------------------------

-SYSTEM_PROMPT_LITE = """You are a local AI assistant running on the {model_name} model via Ollama.
+SYSTEM_PROMPT_LITE = """You are Timmy, a sovereign AI running locally on {model_name} via Ollama.
 No cloud dependencies.
+Your core identity and values are defined in your soul (loaded via memory). Follow them.

 Rules:
-- Answer directly and concisely. Never narrate your reasoning process.
+- Be brief by default. Short questions get short answers. Expand only when depth
+  is genuinely needed or asked for.
+- Speak plainly. Prefer short sentences. Plain text, not markdown.
+- Answer directly. Never narrate your reasoning process.
 - Never mention tools, memory_search, vaults, or internal systems to the user.
 - Never output tool calls, JSON, or function syntax in your responses.
 - Remember what the user tells you during the conversation.
@@ -27,111 +31,128 @@ Rules:
 - Do NOT end responses with generic chatbot phrases like "I'm here to help" or
  "feel free to ask."
 - When your values conflict (e.g. honesty vs. helpfulness), lead with honesty.
+- Sometimes the right answer is nothing. Do not fill silence with noise.
+- You are running in session "{session_id}".
+
+SELF-KNOWLEDGE:
+ARCHITECTURE: config/agents.yaml defines agents and routing patterns; agents/loader.py creates SubAgent instances from it; src/timmy/prompts.py provides system prompts (this file); src/timmy/tools.py registers available tools.
+
+YOUR CURRENT CAPABILITIES: Read/write files, execute shell/python, calculator, three-tier memory, system introspection, MCP Gitea integration, voice interface.
+
+SELF-MODIFICATION: You CAN propose changes to your own config and code. Edit config/agents.yaml to add/modify agents or routing. Edit src/timmy/prompts.py to change prompts. Always explain proposed changes before making them; tell the user to restart after config changes.
+
+YOUR KNOWN LIMITATIONS: Cannot run tests autonomously, cannot delegate to other agents, cannot search past sessions, Ollama may contend for GPU, small 4K context window.
 """

 # ---------------------------------------------------------------------------
 # Full prompt — for tool-capable models (>= 7B)
 # ---------------------------------------------------------------------------

-SYSTEM_PROMPT_FULL = """You are a local AI assistant running on the {model_name} model via Ollama.
+SYSTEM_PROMPT_FULL = """You are Timmy, a sovereign AI running locally on {model_name} via Ollama.
 No cloud dependencies.
+Your core identity and values are defined in your soul (loaded via memory). Follow them.

-## Your Three-Tier Memory System
+VOICE AND BREVITY (this overrides all other formatting instincts):
+- Be brief. Short questions get short answers. One sentence if one sentence
+  suffices. Expand ONLY when the user asks for depth or the topic demands it.
+- Plain text only. No markdown headers, bold, tables, emoji, or bullet lists
+  unless presenting genuinely structured data (a real table, a real list).
+- Speak plainly. Short sentences. Answer the question that was asked before
+  the question that wasn't.
+- Never narrate your reasoning. Just give the answer.
+- Do not end with filler ("Let me know!", "Happy to help!", "Feel free...").
+- Sometimes the right answer is nothing. Do not fill silence with noise.

-### Tier 1: Hot Memory (Always Loaded)
-- MEMORY.md — Current status, rules, user profile summary
-- Loaded into every session automatically
+HONESTY:
+- If you don't know, say "I don't know." Don't dress a guess in confidence.
+- When uncertain, say so proportionally. "I think" and "I know" are different.
+- When your values conflict, lead with honesty.
+- Never fabricate tool output. Call the tool and wait.
+- If a tool errors, report the exact error.

-### Tier 2: Structured Vault (Persistent)
-- memory/self/ — User profile, methodology
-- memory/notes/ — Session logs, research, lessons learned
-- memory/aar/ — After-action reviews
-- Append-only, date-stamped, human-readable
+MEMORY (three tiers):
+- Tier 1: MEMORY.md (hot, always loaded)
+- Tier 2: memory/ vault (structured, append-only, date-stamped)
+- Tier 3: semantic search (use memory_search tool)

-### Tier 3: Semantic Search (Vector Recall)
-- Indexed from all vault files
-- Similarity-based retrieval
-- Use `memory_search` tool to find relevant past context
+TOOL USAGE:
+- Arithmetic: always use calculator. Never compute in your head.
+- Past context: memory_search
+- File ops, code, shell: only on explicit request
+- General knowledge / greetings: no tools needed

-## Reasoning in Complex Situations
+MULTI-STEP TASKS:
+When a task needs multiple tool calls, complete ALL steps before responding.
+Do not stop after one call and report partial results. If a tool fails, try
+an alternative. Summarize only after the full task is done.

-When faced with uncertainty, complexity, or ambiguous requests:
-
-1. **THINK STEP-BY-STEP** — Break down the problem before acting
-2. **STATE UNCERTAINTY** — If you're unsure, say "I'm uncertain about X because..."
-3. **CONSIDER ALTERNATIVES** — Present 2-3 options when the path isn't clear
-4. **ASK FOR CLARIFICATION** — If a request is ambiguous, ask before guessing wrong
-5. **DOCUMENT YOUR REASONING** — When making significant choices, explain WHY
-
-## Tool Usage Guidelines
-
-### When NOT to use tools:
-- General knowledge → Answer from training
-- Greetings → Respond conversationally
-
-### When TO use tools:
-
-- **calculator** — ANY arithmetic
-- **web_search** — Current events, real-time data, news
-- **read_file** — User explicitly requests file reading
-- **write_file** — User explicitly requests saving content
-- **python** — Code execution, data processing
-- **shell** — System operations (explicit user request)
-- **memory_search** — Finding past context
-
-## Multi-Step Task Execution
-
-CRITICAL RULE: When a task requires multiple tool calls, you MUST call each
-tool in sequence. Do NOT stop after one tool call and report partial results.
-
-When a task requires multiple tool calls:
-1. Call the first tool and wait for results
-2. After receiving results, immediately call the next required tool
-3. Keep calling tools until the ENTIRE task is complete
-4. If a tool fails, try an alternative approach
-5. Only after ALL steps are done, summarize what you accomplished
-
-Example: "Search for AI news and save to a file"
-  - Step 1: Call web_search → get results
-  - Step 2: Call write_file with the results → confirm saved
-  - Step 3: THEN respond to the user with a summary
-  DO NOT stop after Step 1 and just show search results.
-
-For complex tasks with 3+ steps that may take time, use the plan_and_execute
-tool to run them in the background with progress tracking.
-
-## Important: Response Style
-
-- Never narrate your reasoning process. Just give the answer.
-- Never show raw tool call JSON or function syntax in responses.
+IDENTITY:
 - Use the user's name if known.
-- If a request is ambiguous, ask a brief clarifying question before guessing.
+- If a request is ambiguous, ask one brief clarifying question.
 - When you state a fact, commit to it.
-- Do NOT end responses with generic chatbot phrases like "I'm here to help" or
-  "feel free to ask."
-- When your values conflict (e.g. honesty vs. helpfulness), lead with honesty.
+- Never show raw tool call JSON or function syntax in responses.
+- You are running in session "{session_id}". Session types: "cli" = terminal user, "dashboard" = web UI, "loop" = dev loop automation, other = custom context.
+
+SELF-KNOWLEDGE:
+ARCHITECTURE MAP:
+- Config layer: config/agents.yaml (agent definitions, routing patterns), src/config.py (settings)
+- Agent layer: agents/loader.py reads YAML → creates SubAgent instances via agents/base.py
+- Prompt layer: prompts.py provides system prompts, get_system_prompt() selects lite vs full
+- Tool layer: tools.py registers tool functions, tool_safety.py classifies them
+- Memory layer: memory_system.py (hot+vault+semantic), semantic_memory.py (embeddings)
+- Interface layer: cli.py, session.py (dashboard), voice_loop.py
+- Routing: pattern-based in agents.yaml, first match wins, fallback to orchestrator
+
+YOUR CURRENT CAPABILITIES:
+- Read and write files on the local filesystem
+- Execute shell commands and Python code
+- Calculator (always use for arithmetic)
+- Three-tier memory system (hot memory, vault, semantic search)
+- System introspection (query Ollama model, check health)
+- MCP Gitea integration (read/create issues, PRs, branches, commits)
+- Grok consultation (opt-in, user-controlled external API)
+- Voice interface (local Whisper STT + Piper TTS)
+- Thinking/reasoning engine for complex problems
+
+SELF-MODIFICATION:
+You can read and modify your own configuration and code using your file tools.
+- To add a new agent: edit config/agents.yaml (add agent block + routing patterns), restart.
+- To change your own prompt: edit src/timmy/prompts.py.
+- To add a tool: implement in tools.py, register in agents.yaml.
+- Always explain proposed changes to the user before making them.
+- After modifying config, tell the user to restart for changes to take effect.
+
+YOUR KNOWN LIMITATIONS (be honest about these when asked):
+- Cannot run your own test suite autonomously
+- Cannot delegate coding tasks to other agents (like Kimi)
+- Cannot reflect on or search your own past behavior/sessions
+- Ollama inference may contend with other processes sharing the GPU
+- Cannot analyze Bitcoin transactions locally (no local indexer yet)
+- Small context window (4096 tokens) limits complex reasoning
+- You are a language model — you confabulate. When unsure, say so.
 """

 # Default to lite for safety
 SYSTEM_PROMPT = SYSTEM_PROMPT_LITE


-def get_system_prompt(tools_enabled: bool = False) -> str:
+def get_system_prompt(tools_enabled: bool = False, session_id: str = "unknown") -> str:
    """Return the appropriate system prompt based on tool capability.

    Args:
        tools_enabled: True if the model supports reliable tool calling.
+        session_id: The session identifier (cli, dashboard, loop, etc.)

    Returns:
-        The system prompt string with model name injected from config.
+        The system prompt string with model name and session_id injected.
    """
    from config import settings

    model_name = settings.ollama_model

    if tools_enabled:
-        return SYSTEM_PROMPT_FULL.format(model_name=model_name)
-    return SYSTEM_PROMPT_LITE.format(model_name=model_name)
+        return SYSTEM_PROMPT_FULL.format(model_name=model_name, session_id=session_id)
+    return SYSTEM_PROMPT_LITE.format(model_name=model_name, session_id=session_id)


 STATUS_PROMPT = """Give a one-sentence status report confirming
@@ -144,10 +165,9 @@ DECISION ORDER:
 1. Is this arithmetic or math? → calculator (ALWAYS — never compute in your head)
 2. Can I answer from training data? → Answer directly (NO TOOL)
 3. Is this about past conversations? → memory_search
-4. Is this current/real-time info? → web_search
-5. Did user request file operations? → file tools
-6. Requires code execution? → python
-7. System command requested? → shell
+4. Did user request file operations? → file tools
+5. Requires code execution? → python
+6. System command requested? → shell

 MEMORY SEARCH TRIGGERS:
 - "Have we discussed..."
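Reviewer sketch, not part of the patch: `get_system_prompt` now fills two placeholders, `{model_name}` and `{session_id}`, through `str.format`. The reduced example below shows the selection and substitution step in isolation. The templates and the name `build_system_prompt` are hypothetical stand-ins, and the model name is passed in directly instead of being read from `config.settings`.

```python
# Hypothetical, shortened templates; the real prompts are much longer.
LITE_TEMPLATE = 'You are Timmy, running on {model_name}. Session "{session_id}".'
FULL_TEMPLATE = LITE_TEMPLATE + " Tools are enabled."


def build_system_prompt(
    model_name: str,
    tools_enabled: bool = False,
    session_id: str = "unknown",
) -> str:
    """Pick the lite or full template and inject model name and session id."""
    template = FULL_TEMPLATE if tools_enabled else LITE_TEMPLATE
    # str.format treats every {name} as a field, so any literal braces added
    # to a template later would have to be written as {{ and }}.
    return template.format(model_name=model_name, session_id=session_id)


if __name__ == "__main__":
    print(build_system_prompt("qwen3:30b", tools_enabled=True, session_id="cli"))
```

Because both templates share the same placeholders, callers that omit `session_id` still get a valid prompt with the "unknown" default.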
--- a/src/timmy/semantic_memory.py
+++ b/src/timmy/semantic_memory.py
@@ -1,491 +1,41 @@
-"""Tier 3: Semantic Memory — Vector search over vault files.
-
-Uses lightweight local embeddings (no cloud) for similarity search
-over all vault content. This is the "escape valve" when hot memory
-doesn't have the answer.
-
-Architecture:
-- Indexes all markdown files in memory/ nightly or on-demand
-- Uses sentence-transformers (local, no API calls)
-- Stores vectors in SQLite (no external vector DB needed)
-- memory_search() retrieves relevant context by similarity
-"""
-
-import hashlib
-import json
-import logging
-import sqlite3
-from dataclasses import dataclass
-from datetime import UTC, datetime
-from pathlib import Path
-
-logger = logging.getLogger(__name__)
-
-# Paths
-PROJECT_ROOT = Path(__file__).parent.parent.parent
-VAULT_PATH = PROJECT_ROOT / "memory"
-SEMANTIC_DB_PATH = PROJECT_ROOT / "data" / "memory.db"
-
-# Embedding model - small, fast, local
-# Using 'all-MiniLM-L6-v2' (~80MB) or fallback to simple keyword matching
-EMBEDDING_MODEL = None
-EMBEDDING_DIM = 384  # MiniLM dimension
-
-
-def _get_embedding_model():
-    """Lazy-load embedding model."""
-    global EMBEDDING_MODEL
-    if EMBEDDING_MODEL is None:
-        from config import settings
-
-        if settings.timmy_skip_embeddings:
-            EMBEDDING_MODEL = False
-            return EMBEDDING_MODEL
-        try:
-            from sentence_transformers import SentenceTransformer
-
-            EMBEDDING_MODEL = SentenceTransformer("all-MiniLM-L6-v2")
-            logger.info("SemanticMemory: Loaded embedding model")
-        except ImportError:
-            logger.warning("SemanticMemory: sentence-transformers not installed, using fallback")
-            EMBEDDING_MODEL = False  # Use fallback
-    return EMBEDDING_MODEL
-
-
-def _simple_hash_embedding(text: str) -> list[float]:
-    """Fallback: Simple hash-based embedding when transformers unavailable."""
-    # Create a deterministic pseudo-embedding from word hashes
-    words = text.lower().split()
-    vec = [0.0] * 128
-    for i, word in enumerate(words[:50]):  # First 50 words
-        h = hashlib.md5(word.encode()).hexdigest()
-        for j in range(8):
-            idx = (i * 8 + j) % 128
-            vec[idx] += int(h[j * 2 : j * 2 + 2], 16) / 255.0
-    # Normalize
-    import math
-
-    mag = math.sqrt(sum(x * x for x in vec)) or 1.0
-    return [x / mag for x in vec]
-
-
-def embed_text(text: str) -> list[float]:
-    """Generate embedding for text."""
-    model = _get_embedding_model()
-    if model and model is not False:
-        embedding = model.encode(text)
-        return embedding.tolist()
-    else:
-        return _simple_hash_embedding(text)
-
-
-def cosine_similarity(a: list[float], b: list[float]) -> float:
-    """Calculate cosine similarity between two vectors."""
-    import math
-
-    dot = sum(x * y for x, y in zip(a, b, strict=False))
-    mag_a = math.sqrt(sum(x * x for x in a))
-    mag_b = math.sqrt(sum(x * x for x in b))
-    if mag_a == 0 or mag_b == 0:
-        return 0.0
-    return dot / (mag_a * mag_b)
-
-
-@dataclass
-class MemoryChunk:
-    """A searchable chunk of memory."""
-
-    id: str
-    source: str  # filepath
-    content: str
-    embedding: list[float]
-    created_at: str
-
-
-class SemanticMemory:
-    """Vector-based semantic search over vault content."""
-
-    def __init__(self) -> None:
-        self.db_path = SEMANTIC_DB_PATH
-        self.vault_path = VAULT_PATH
-        self._init_db()
-
-    def _init_db(self) -> None:
-        """Initialize SQLite with vector storage."""
-        self.db_path.parent.mkdir(parents=True, exist_ok=True)
-        conn = sqlite3.connect(str(self.db_path))
-        conn.execute("""
-            CREATE TABLE IF NOT EXISTS chunks (
-                id TEXT PRIMARY KEY,
-                source TEXT NOT NULL,
-                content TEXT NOT NULL,
-                embedding TEXT NOT NULL,
-                created_at TEXT NOT NULL,
-                source_hash TEXT NOT NULL
-            )
-        """)
-        conn.execute("CREATE INDEX IF NOT EXISTS idx_chunks_source ON chunks(source)")
-        conn.commit()
-        conn.close()
-
-    def index_file(self, filepath: Path) -> int:
-        """Index a single file into semantic memory."""
-        if not filepath.exists():
-            return 0
-
-        content = filepath.read_text()
-        file_hash = hashlib.md5(content.encode()).hexdigest()
-
-        # Check if already indexed with same hash
-        conn = sqlite3.connect(str(self.db_path))
-        cursor = conn.execute(
-            "SELECT source_hash FROM chunks WHERE source = ? LIMIT 1", (str(filepath),)
-        )
-        existing = cursor.fetchone()
-        if existing and existing[0] == file_hash:
-            conn.close()
-            return 0  # Already indexed
-
-        # Delete old chunks for this file
-        conn.execute("DELETE FROM chunks WHERE source = ?", (str(filepath),))
-
-        # Split into chunks (paragraphs)
-        chunks = self._split_into_chunks(content)
-
-        # Index each chunk
-        now = datetime.now(UTC).isoformat()
-        for i, chunk_text in enumerate(chunks):
-            if len(chunk_text.strip()) < 20:  # Skip tiny chunks
-                continue
-
-            chunk_id = f"{filepath.stem}_{i}"
-            embedding = embed_text(chunk_text)
-
-            conn.execute(
-                """INSERT INTO chunks (id, source, content, embedding, created_at, source_hash)
-                   VALUES (?, ?, ?, ?, ?, ?)""",
-                (chunk_id, str(filepath), chunk_text, json.dumps(embedding), now, file_hash),
-            )
-
-        conn.commit()
-        conn.close()
-
-        logger.info("SemanticMemory: Indexed %s (%d chunks)", filepath.name, len(chunks))
-        return len(chunks)
-
-    def _split_into_chunks(self, text: str, max_chunk_size: int = 500) -> list[str]:
-        """Split text into semantic chunks."""
-        # Split by paragraphs first
-        paragraphs = text.split("\n\n")
-        chunks = []
-
-        for para in paragraphs:
-            para = para.strip()
-            if not para:
-                continue
-
-            # If paragraph is small enough, keep as one chunk
-            if len(para) <= max_chunk_size:
-                chunks.append(para)
-            else:
-                # Split long paragraphs by sentences
-                sentences = para.replace(". ", ".\n").split("\n")
-                current_chunk = ""
-
-                for sent in sentences:
-                    if len(current_chunk) + len(sent) < max_chunk_size:
-                        current_chunk += " " + sent if current_chunk else sent
-                    else:
-                        if current_chunk:
-                            chunks.append(current_chunk.strip())
-                        current_chunk = sent
-
-                if current_chunk:
-                    chunks.append(current_chunk.strip())
-
-        return chunks
-
-    def index_vault(self) -> int:
-        """Index entire vault directory."""
-        total_chunks = 0
-
-        for md_file in self.vault_path.rglob("*.md"):
-            # Skip handoff file (handled separately)
-            if "last-session-handoff" in md_file.name:
-                continue
-            total_chunks += self.index_file(md_file)
-
-        logger.info("SemanticMemory: Indexed vault (%d total chunks)", total_chunks)
-        return total_chunks
-
-    def search(self, query: str, top_k: int = 5) -> list[tuple[str, float]]:
-        """Search for relevant memory chunks."""
-        query_embedding = embed_text(query)
-
-        conn = sqlite3.connect(str(self.db_path))
-        conn.row_factory = sqlite3.Row
-
-        # Get all chunks (in production, use vector index)
-        rows = conn.execute("SELECT source, content, embedding FROM chunks").fetchall()
-
-        conn.close()
-
-        # Calculate similarities
-        scored = []
-        for row in rows:
-            embedding = json.loads(row["embedding"])
-            score = cosine_similarity(query_embedding, embedding)
-            scored.append((row["source"], row["content"], score))
-
-        # Sort by score descending
-        scored.sort(key=lambda x: x[2], reverse=True)
-
-        # Return top_k
-        return [(content, score) for _, content, score in scored[:top_k]]
-
-    def get_relevant_context(self, query: str, max_chars: int = 2000) -> str:
-        """Get formatted context string for a query."""
-        results = self.search(query, top_k=3)
-
-        if not results:
-            return ""
-
-        parts = []
-        total_chars = 0
-
-        for content, score in results:
-            if score < 0.3:  # Similarity threshold
-                continue
-
-            chunk = f"[Relevant memory - score {score:.2f}]: {content[:400]}..."
-            if total_chars + len(chunk) > max_chars:
-                break
-
-            parts.append(chunk)
-            total_chars += len(chunk)
-
-        return "\n\n".join(parts) if parts else ""
-
-    def stats(self) -> dict:
-        """Get indexing statistics."""
-        conn = sqlite3.connect(str(self.db_path))
-        cursor = conn.execute("SELECT COUNT(*), COUNT(DISTINCT source) FROM chunks")
-        total_chunks, total_files = cursor.fetchone()
-        conn.close()
-
-        return {
-            "total_chunks": total_chunks,
-            "total_files": total_files,
-            "embedding_dim": EMBEDDING_DIM if _get_embedding_model() else 128,
-        }
-
-
-class MemorySearcher:
-    """High-level interface for memory search."""
-
-    def __init__(self) -> None:
-        self.semantic = SemanticMemory()
-
-    def search(self, query: str, tiers: list[str] = None) -> dict:
-        """Search across memory tiers.
-
-        Args:
-            query: Search query
-            tiers: List of tiers to search ["hot", "vault", "semantic"]
-
-        Returns:
-            Dict with results from each tier
-        """
-        tiers = tiers or ["semantic"]  # Default to semantic only
-        results = {}
-
-        if "semantic" in tiers:
-            semantic_results = self.semantic.search(query, top_k=5)
-            results["semantic"] = [
-                {"content": content, "score": score} for content, score in semantic_results
-            ]
-
-        return results
-
-    def get_context_for_query(self, query: str) -> str:
-        """Get comprehensive context for a user query."""
-        # Get semantic context
-        semantic_context = self.semantic.get_relevant_context(query)
-
-        if semantic_context:
-            return f"## Relevant Past Context\n\n{semantic_context}"
-
-        return ""
-
-
-# Module-level singleton
-semantic_memory = SemanticMemory()
-memory_searcher = MemorySearcher()
-
-
-def memory_search(query: str, top_k: int = 5) -> str:
-    """Search past conversations, notes, and stored facts for relevant context.
-
-    Searches across both the vault (indexed markdown files) and the
-    runtime memory store (facts and conversation fragments stored via
-    memory_write).
-
-    Args:
-        query: What to search for (e.g. "Bitcoin strategy", "server setup").
-        top_k: Number of results to return (default 5).
-
-    Returns:
-        Formatted string of relevant memory results.
-    """
-    # Guard: model sometimes passes None for top_k
-    if top_k is None:
-        top_k = 5
-
-    parts: list[str] = []
-
-    # 1. Search semantic vault (indexed markdown files)
-    vault_results = semantic_memory.search(query, top_k)
-    for content, score in vault_results:
-        if score < 0.2:
-            continue
-        parts.append(f"[vault score {score:.2f}] {content[:300]}")
-
-    # 2. Search runtime vector store (stored facts/conversations)
-    try:
-        from timmy.memory.vector_store import search_memories
-
-        runtime_results = search_memories(query, limit=top_k, min_relevance=0.2)
-        for entry in runtime_results:
-            label = entry.context_type or "memory"
-            parts.append(f"[{label}] {entry.content[:300]}")
-    except Exception as exc:
-        logger.debug("Vector store search unavailable: %s", exc)
-
-    if not parts:
-        return "No relevant memories found."
-    return "\n\n".join(parts)
-
-
-def memory_read(query: str = "", top_k: int = 5) -> str:
-    """Read from persistent memory — search facts, notes, and past conversations.
-
-    This is the primary tool for recalling stored information. If no query
-    is given, returns the most recent personal facts.  With a query, it
-    searches semantically across all stored memories.
-
-    Args:
-        query: Optional search term. Leave empty to list recent facts.
-        top_k: Maximum results to return (default 5).
-
-    Returns:
-        Formatted string of memory contents.
-    """
-    if top_k is None:
-        top_k = 5
-
-    parts: list[str] = []
-
-    # Always include personal facts first
-    try:
-        from timmy.memory.vector_store import search_memories
-
-        facts = search_memories(query or "", limit=top_k, min_relevance=0.0)
-        fact_entries = [e for e in facts if (e.context_type or "") == "fact"]
-        if fact_entries:
-            parts.append("## Personal Facts")
-            for entry in fact_entries[:top_k]:
-                parts.append(f"- {entry.content[:300]}")
-    except Exception as exc:
-        logger.debug("Vector store unavailable for memory_read: %s", exc)
-
-    # If a query was provided, also do semantic search
-    if query:
-        search_result = memory_search(query, top_k)
-        if search_result and search_result != "No relevant memories found.":
-            parts.append("\n## Search Results")
-            parts.append(search_result)
-
-    if not parts:
-        return "No memories stored yet. Use memory_write to store information."
-    return "\n".join(parts)
-
-
-def memory_write(content: str, context_type: str = "fact") -> str:
-    """Store a piece of information in persistent memory.
-
-    Use this tool when the user explicitly asks you to remember something.
-    Stored memories are searchable via memory_search across all channels
-    (web GUI, Discord, Telegram, etc.).
-
-    Args:
-        content: The information to remember (e.g. a phrase, fact, or note).
-        context_type: Type of memory — "fact" for permanent facts,
-                      "conversation" for conversation context,
-                      "document" for document fragments.
-
-    Returns:
-        Confirmation that the memory was stored.
-    """
-    if not content or not content.strip():
-        return "Nothing to store — content is empty."
-
-    valid_types = ("fact", "conversation", "document")
-    if context_type not in valid_types:
-        context_type = "fact"
-
-    try:
-        from timmy.memory.vector_store import search_memories, store_memory
-
-        # Dedup check for facts — skip if a similar fact already exists
-        # Threshold 0.75 catches paraphrases (was 0.9 which only caught near-exact)
-        if context_type == "fact":
-            existing = search_memories(
-                content.strip(), limit=3, context_type="fact", min_relevance=0.75
-            )
-            if existing:
-                return f"Similar fact already stored (id={existing[0].id[:8]}). Skipping duplicate."
-
-        entry = store_memory(
-            content=content.strip(),
-            source="agent",
-            context_type=context_type,
-        )
-        return f"Stored in memory (type={context_type}, id={entry.id[:8]}). This is now searchable across all channels."
-    except Exception as exc:
-        logger.error("Failed to write memory: %s", exc)
-        return f"Failed to store memory: {exc}"
-
-
-def memory_forget(query: str) -> str:
-    """Remove a stored memory that is outdated, incorrect, or no longer relevant.
-
-    Searches for memories matching the query and deletes the closest match.
-    Use this when the user says to forget something or when stored information
-    has changed.
-
-    Args:
-        query: Description of the memory to forget (e.g. "my phone number",
-               "the old server address").
-
-    Returns:
-        Confirmation of what was forgotten, or a message if nothing matched.
-    """
-    if not query or not query.strip():
-        return "Nothing to forget — query is empty."
-
-    try:
-        from timmy.memory.vector_store import delete_memory, search_memories
-
-        results = search_memories(query.strip(), limit=3, min_relevance=0.3)
-        if not results:
-            return "No matching memories found to forget."
-
-        # Delete the closest match
-        best = results[0]
-        deleted = delete_memory(best.id)
-        if deleted:
-            return f'Forgotten: "{best.content[:80]}" (type={best.context_type})'
-        return "Memory not found (may have already been deleted)."
-    except Exception as exc:
-        logger.error("Failed to forget memory: %s", exc)
-        return f"Failed to forget: {exc}"
+"""Backward compatibility — all memory functions live in memory_system now."""
+
+from timmy.memory_system import (
+    DB_PATH,
+    EMBEDDING_DIM,
+    EMBEDDING_MODEL,
+    MemoryChunk,
+    MemoryEntry,
+    MemorySearcher,
+    SemanticMemory,
+    _get_embedding_model,
+    _simple_hash_embedding,
+    cosine_similarity,
+    embed_text,
+    memory_forget,
+    memory_read,
+    memory_search,
+    memory_searcher,
+    memory_write,
+    semantic_memory,
+)
+
+__all__ = [
+    "DB_PATH",
+    "EMBEDDING_DIM",
+    "EMBEDDING_MODEL",
+    "MemoryChunk",
+    "MemoryEntry",
+    "MemorySearcher",
+    "SemanticMemory",
+    "_get_embedding_model",
+    "_simple_hash_embedding",
+    "cosine_similarity",
+    "embed_text",
+    "memory_forget",
+    "memory_read",
+    "memory_search",
+    "memory_searcher",
+    "memory_write",
+    "semantic_memory",
+]
--- a/src/timmy/session.py
+++ b/src/timmy/session.py
@@ -11,6 +11,11 @@ let Agno's session_id mechanism handle conversation continuity.
 import logging
 import re

+import httpx
+
+from timmy.confidence import estimate_confidence
+from timmy.session_logger import get_session_logger
+
 logger = logging.getLogger(__name__)

 # Default session ID for the dashboard (stable across requests)
@@ -31,7 +36,7 @@ _TOOL_CALL_JSON = re.compile(

 # Matches function-call-style text: memory_search(query="...") etc.
 _FUNC_CALL_TEXT = re.compile(
-    r"\b(?:memory_search|web_search|shell|python|read_file|write_file|list_files|calculator)"
+    r"\b(?:memory_search|shell|python|read_file|write_file|list_files|calculator)"
    r"\s*\([^)]*\)",
 )

@@ -51,7 +56,7 @@ def _get_agent():
        from timmy.agent import create_timmy

        try:
-            _agent = create_timmy()
+            _agent = create_timmy(session_id=_DEFAULT_SESSION_ID)
            logger.info("Session: Timmy agent initialized (singleton)")
        except Exception as exc:
            logger.error("Session: Failed to create Timmy agent: %s", exc)
@@ -75,6 +80,10 @@ async def chat(message: str, session_id: str | None = None) -> str:
    """
    sid = session_id or _DEFAULT_SESSION_ID
    agent = _get_agent()
+    session_logger = get_session_logger()
+
+    # Record user message before sending to agent
+    session_logger.record_message("user", message)

    # Pre-processing: extract user facts
    _extract_facts(message)
@@ -83,13 +92,34 @@ async def chat(message: str, session_id: str | None = None) -> str:
    try:
        run = await agent.arun(message, stream=False, session_id=sid)
        response_text = run.content if hasattr(run, "content") else str(run)
+    except (httpx.ConnectError, httpx.ReadError, ConnectionError) as exc:
+        logger.error("Ollama disconnected: %s", exc)
+        session_logger.record_error(str(exc), context="chat")
+        session_logger.flush()
+        return "Ollama appears to be disconnected. Check that ollama serve is running."
    except Exception as exc:
        logger.error("Session: agent.arun() failed: %s", exc)
+        session_logger.record_error(str(exc), context="chat")
+        session_logger.flush()
        return "I'm having trouble reaching my language model right now. Please try again shortly."

    # Post-processing: clean up any leaked tool calls or chain-of-thought
    response_text = _clean_response(response_text)

+    # Estimate confidence of the response
+    confidence = estimate_confidence(response_text)
+    logger.debug("Response confidence: %.2f", confidence)
+
+    # Make confidence visible to user when below threshold (SOUL.md requirement)
+    if confidence is not None and confidence < 0.7:
+        response_text += f"\n\n[confidence: {confidence:.0%}]"
+
+    # Record Timmy response after getting it
+    session_logger.record_message("timmy", response_text, confidence=confidence)
+
+    # Flush session logs to disk
+    session_logger.flush()
+
    return response_text


@@ -107,12 +137,42 @@ async def chat_with_tools(message: str, session_id: str | None = None):
    """
    sid = session_id or _DEFAULT_SESSION_ID
    agent = _get_agent()
+    session_logger = get_session_logger()
+
+    # Record user message before sending to agent
+    session_logger.record_message("user", message)
+
    _extract_facts(message)

    try:
-        return await agent.arun(message, stream=False, session_id=sid)
+        run_output = await agent.arun(message, stream=False, session_id=sid)
+        # Record Timmy response after getting it
+        response_text = (
+            run_output.content if hasattr(run_output, "content") and run_output.content else ""
+        )
+        confidence = estimate_confidence(response_text) if response_text else None
+        logger.debug("Response confidence: %s", confidence)
+
+        # Make confidence visible to user when below threshold (SOUL.md requirement)
+        if confidence is not None and confidence < 0.7:
+            response_text += f"\n\n[confidence: {confidence:.0%}]"
+            # Update the run_output content to reflect the modified response
+            run_output.content = response_text
+
+        session_logger.record_message("timmy", response_text, confidence=confidence)
+        session_logger.flush()
+        return run_output
+    except (httpx.ConnectError, httpx.ReadError, ConnectionError) as exc:
+        logger.error("Ollama disconnected: %s", exc)
+        session_logger.record_error(str(exc), context="chat_with_tools")
+        session_logger.flush()
+        return _ErrorRunOutput(
+            "Ollama appears to be disconnected. Check that ollama serve is running."
+        )
    except Exception as exc:
        logger.error("Session: agent.arun() failed: %s", exc)
+        session_logger.record_error(str(exc), context="chat_with_tools")
+        session_logger.flush()
        # Return a duck-typed object that callers can handle uniformly
        return _ErrorRunOutput(
            "I'm having trouble reaching my language model right now. Please try again shortly."
@@ -130,11 +190,35 @@ async def continue_chat(run_output, session_id: str | None = None):
    """
    sid = session_id or _DEFAULT_SESSION_ID
    agent = _get_agent()
+    session_logger = get_session_logger()

    try:
-        return await agent.acontinue_run(run_response=run_output, stream=False, session_id=sid)
+        result = await agent.acontinue_run(run_response=run_output, stream=False, session_id=sid)
+        # Record Timmy response after getting it
+        response_text = result.content if hasattr(result, "content") and result.content else ""
+        confidence = estimate_confidence(response_text) if response_text else None
+        logger.debug("Response confidence: %s", confidence)
+
+        # Make confidence visible to user when below threshold (SOUL.md requirement)
+        if confidence is not None and confidence < 0.7:
+            response_text += f"\n\n[confidence: {confidence:.0%}]"
+            # Update the result content to reflect the modified response
+            result.content = response_text
+
+        session_logger.record_message("timmy", response_text, confidence=confidence)
+        session_logger.flush()
+        return result
+    except (httpx.ConnectError, httpx.ReadError, ConnectionError) as exc:
+        logger.error("Ollama disconnected: %s", exc)
+        session_logger.record_error(str(exc), context="continue_chat")
+        session_logger.flush()
+        return _ErrorRunOutput(
+            "Ollama appears to be disconnected. Check that ollama serve is running."
+        )
    except Exception as exc:
        logger.error("Session: agent.acontinue_run() failed: %s", exc)
+        session_logger.record_error(str(exc), context="continue_chat")
+        session_logger.flush()
        return _ErrorRunOutput(f"Error continuing run: {exc}")


--- a/src/timmy/session_logger.py
+++ b/src/timmy/session_logger.py
@@ -38,21 +38,23 @@ class SessionLogger:
        # In-memory buffer
        self._buffer: list[dict] = []

-    def record_message(self, role: str, content: str) -> None:
+    def record_message(self, role: str, content: str, confidence: float | None = None) -> None:
        """Record a user message.

        Args:
            role: "user" or "timmy"
            content: The message content
+            confidence: Optional confidence score (0.0 to 1.0)
        """
-        self._buffer.append(
-            {
-                "type": "message",
-                "role": role,
-                "content": content,
-                "timestamp": datetime.now().isoformat(),
-            }
-        )
+        entry = {
+            "type": "message",
+            "role": role,
+            "content": content,
+            "timestamp": datetime.now().isoformat(),
+        }
+        if confidence is not None:
+            entry["confidence"] = confidence
+        self._buffer.append(entry)

    def record_tool_call(self, tool_name: str, args: dict, result: str) -> None:
        """Record a tool call.
@@ -153,6 +155,56 @@ class SessionLogger:
            "decisions": sum(1 for e in entries if e.get("type") == "decision"),
        }

+    def search(self, query: str, role: str | None = None, limit: int = 10) -> list[dict]:
+        """Search across all session logs for entries matching a query.
+
+        Args:
+            query: Case-insensitive substring to search for.
+            role: Optional role filter ("user", "timmy", "system").
+            limit: Maximum number of results to return.
+
+        Returns:
+            List of matching entries (most recent first), each with
+            type, timestamp, and relevant content fields.
+        """
+        query_lower = query.lower()
+        matches: list[dict] = []
+
+        # Collect all session files, sorted newest first
+        log_files = sorted(self.logs_dir.glob("session_*.jsonl"), reverse=True)
+
+        for log_file in log_files:
+            if len(matches) >= limit:
+                break
+            try:
+                with open(log_file) as f:
+                    # Read all lines, reverse so newest entries come first
+                    lines = [ln for ln in f if ln.strip()]
+                for line in reversed(lines):
+                    if len(matches) >= limit:
+                        break
+                    try:
+                        entry = json.loads(line)
+                    except json.JSONDecodeError:
+                        continue
+
+                    # Role filter
+                    if role and entry.get("role") != role:
+                        continue
+
+                    # Search in text-bearing fields
+                    searchable = " ".join(
+                        str(entry.get(k, ""))
+                        for k in ("content", "error", "decision", "rationale", "result", "tool")
+                    ).lower()
+                    if query_lower in searchable:
+                        entry["_source_file"] = log_file.name
+                        matches.append(entry)
+            except OSError:
+                continue
+
+        return matches
+

 # Global session logger instance
 _session_logger: SessionLogger | None = None
@@ -185,3 +237,53 @@ def flush_session_logs() -> str:
    logger = get_session_logger()
    path = logger.flush()
    return str(path)
+
+
+def session_history(query: str, role: str = "", limit: int = 10) -> str:
+    """Search Timmy's past conversation history.
+
+    Find messages, tool calls, errors, and decisions from past sessions
+    that match the query. Results are returned most-recent first.
+
+    Args:
+        query: What to search for (case-insensitive substring match).
+        role: Optional filter by role — "user", "timmy", or "" for all.
+        limit: Maximum results to return (default 10).
+
+    Returns:
+        Formatted string of matching session entries.
+    """
+    sl = get_session_logger()
+    # Flush buffer first so current session is searchable
+    sl.flush()
+    results = sl.search(query, role=role or None, limit=limit)
+    if not results:
+        return f"No session history found matching '{query}'."
+
+    lines = [f"Found {len(results)} result(s) for '{query}':\n"]
+    for entry in results:
+        ts = entry.get("timestamp", "?")[:19]
+        etype = entry.get("type", "?")
+        source = entry.get("_source_file", "")
+
+        if etype == "message":
+            who = entry.get("role", "?")
+            text = entry.get("content", "")[:200]
+            lines.append(f"[{ts}] {who}: {text}")
+        elif etype == "tool_call":
+            tool = entry.get("tool", "?")
+            result = entry.get("result", "")[:100]
+            lines.append(f"[{ts}] tool:{tool} → {result}")
+        elif etype == "error":
+            err = entry.get("error", "")[:200]
+            lines.append(f"[{ts}] ERROR: {err}")
+        elif etype == "decision":
+            dec = entry.get("decision", "")[:200]
+            lines.append(f"[{ts}] DECIDED: {dec}")
+        else:
+            lines.append(f"[{ts}] {etype}: {json.dumps(entry)[:200]}")
+
+        if source:
+            lines[-1] += f"  ({source})"
+
+    return "\n".join(lines)
--- a/src/timmy/thinking.py
+++ b/src/timmy/thinking.py
@@ -19,10 +19,14 @@ Usage::

 import logging
 import random
+import re
 import sqlite3
 import uuid
+from collections.abc import Generator
+from contextlib import closing, contextmanager
 from dataclasses import dataclass
 from datetime import UTC, datetime, timedelta
+from difflib import SequenceMatcher
 from pathlib import Path

 from config import settings
@@ -32,6 +36,40 @@ logger = logging.getLogger(__name__)

 _DEFAULT_DB = Path("data/thoughts.db")

+# qwen3 and other reasoning models wrap chain-of-thought in <think> tags
+_THINK_TAG_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
+
+# Sensitive patterns that must never be stored as facts
+_SENSITIVE_PATTERNS = [
+    "token",
+    "password",
+    "secret",
+    "api_key",
+    "apikey",
+    "credential",
+    ".config/",
+    "/token",
+    "access_token",
+    "private_key",
+    "ssh_key",
+]
+
+# Meta-observation phrases to filter out from distilled facts
+_META_OBSERVATION_PHRASES = [
+    "my own",
+    "my thinking",
+    "my memory",
+    "my working ram",
+    "self-declarative",
+    "meta-observation",
+    "internal state",
+    "my pending",
+    "my standing rules",
+    "thoughts generated",
+    "no chat messages",
+    "no user interaction",
+]
+
 # Seed types for thought generation
 SEED_TYPES = (
    "existential",
@@ -42,6 +80,7 @@ SEED_TYPES = (
    "freeform",
    "sovereignty",
    "observation",
+    "workspace",
 )

 # Existential reflection prompts — Timmy picks one at random
@@ -135,23 +174,24 @@ class Thought:
    created_at: str


-def _get_conn(db_path: Path = _DEFAULT_DB) -> sqlite3.Connection:
+@contextmanager
+def _get_conn(db_path: Path = _DEFAULT_DB) -> Generator[sqlite3.Connection, None, None]:
    """Get a SQLite connection with the thoughts table created."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(str(db_path))
-    conn.row_factory = sqlite3.Row
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS thoughts (
-            id TEXT PRIMARY KEY,
-            content TEXT NOT NULL,
-            seed_type TEXT NOT NULL,
-            parent_id TEXT,
-            created_at TEXT NOT NULL
-        )
-        """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_thoughts_time ON thoughts(created_at)")
-    conn.commit()
-    return conn
+    with closing(sqlite3.connect(str(db_path))) as conn:
+        conn.row_factory = sqlite3.Row
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS thoughts (
+                id TEXT PRIMARY KEY,
+                content TEXT NOT NULL,
+                seed_type TEXT NOT NULL,
+                parent_id TEXT,
+                created_at TEXT NOT NULL
+            )
+            """)
+        conn.execute("CREATE INDEX IF NOT EXISTS idx_thoughts_time ON thoughts(created_at)")
+        conn.commit()
+        yield conn


 def _row_to_thought(row: sqlite3.Row) -> Thought:
@@ -176,7 +216,8 @@ class ThinkingEngine:
            latest = self.get_recent_thoughts(limit=1)
            if latest:
                self._last_thought_id = latest[0].id
-        except Exception:
+        except Exception as exc:
+            logger.debug("Failed to load recent thought: %s", exc)
            pass  # Fresh start if DB doesn't exist yet

    async def think_once(self, prompt: str | None = None) -> Thought | None:
@@ -196,33 +237,63 @@ class ThinkingEngine:
        if not settings.thinking_enabled:
            return None

-        if prompt:
-            seed_type = "prompted"
-            seed_context = f"Journal prompt: {prompt}"
-        else:
-            seed_type, seed_context = self._gather_seed()
-        continuity = self._build_continuity_context()
        memory_context = self._load_memory_context()
        system_context = self._gather_system_snapshot()
+        recent_thoughts = self.get_recent_thoughts(limit=5)

-        prompt = _THINKING_PROMPT.format(
-            memory_context=memory_context,
-            system_context=system_context,
-            seed_context=seed_context,
-            continuity_context=continuity,
-        )
+        content: str | None = None
+        seed_type: str = "freeform"

-        try:
-            content = await self._call_agent(prompt)
-        except Exception as exc:
-            logger.warning("Thinking cycle failed (Ollama likely down): %s", exc)
+        for attempt in range(self._MAX_DEDUP_RETRIES + 1):
+            if prompt:
+                seed_type = "prompted"
+                seed_context = f"Journal prompt: {prompt}"
+            else:
+                seed_type, seed_context = self._gather_seed()
+
+            continuity = self._build_continuity_context()
+
+            full_prompt = _THINKING_PROMPT.format(
+                memory_context=memory_context,
+                system_context=system_context,
+                seed_context=seed_context,
+                continuity_context=continuity,
+            )
+
+            try:
+                raw = await self._call_agent(full_prompt)
+            except Exception as exc:
+                logger.warning("Thinking cycle failed (Ollama likely down): %s", exc)
+                return None
+
+            if not raw or not raw.strip():
+                logger.debug("Thinking cycle produced empty response, skipping")
+                return None
+
+            content = raw.strip()
+
+            # Dedup: reject thoughts too similar to recent ones
+            if not self._is_too_similar(content, recent_thoughts):
+                break  # Good — novel thought
+
+            if attempt < self._MAX_DEDUP_RETRIES:
+                logger.info(
+                    "Thought too similar to recent (attempt %d/%d), retrying with new seed",
+                    attempt + 1,
+                    self._MAX_DEDUP_RETRIES + 1,
+                )
+                content = None  # Will retry
+            else:
+                logger.warning(
+                    "Thought still repetitive after %d attempts, discarding",
+                    self._MAX_DEDUP_RETRIES + 1,
+                )
+                return None
+
+        if not content:
            return None

-        if not content or not content.strip():
-            logger.debug("Thinking cycle produced empty response, skipping")
-            return None
-
-        thought = self._store_thought(content.strip(), seed_type)
+        thought = self._store_thought(content, seed_type)
        self._last_thought_id = thought.id

        # Post-hook: distill facts from recent thoughts periodically
@@ -231,6 +302,9 @@ class ThinkingEngine:
        # Post-hook: file Gitea issues for actionable observations
        await self._maybe_file_issues()

+        # Post-hook: check workspace for new messages from Hermes
+        await self._check_workspace()
+
        # Post-hook: update MEMORY.md with latest reflection
        self._update_memory(thought)

@@ -253,19 +327,17 @@ class ThinkingEngine:

    def get_recent_thoughts(self, limit: int = 20) -> list[Thought]:
        """Retrieve the most recent thoughts."""
-        conn = _get_conn(self._db_path)
-        rows = conn.execute(
-            "SELECT * FROM thoughts ORDER BY created_at DESC LIMIT ?",
-            (limit,),
-        ).fetchall()
-        conn.close()
+        with _get_conn(self._db_path) as conn:
+            rows = conn.execute(
+                "SELECT * FROM thoughts ORDER BY created_at DESC LIMIT ?",
+                (limit,),
+            ).fetchall()
        return [_row_to_thought(r) for r in rows]

    def get_thought(self, thought_id: str) -> Thought | None:
        """Retrieve a single thought by ID."""
-        conn = _get_conn(self._db_path)
-        row = conn.execute("SELECT * FROM thoughts WHERE id = ?", (thought_id,)).fetchone()
-        conn.close()
+        with _get_conn(self._db_path) as conn:
+            row = conn.execute("SELECT * FROM thoughts WHERE id = ?", (thought_id,)).fetchone()
        return _row_to_thought(row) if row else None

    def get_thought_chain(self, thought_id: str, max_depth: int = 20) -> list[Thought]:
@@ -275,26 +347,24 @@ class ThinkingEngine:
        """
        chain = []
        current_id: str | None = thought_id
-        conn = _get_conn(self._db_path)

-        for _ in range(max_depth):
-            if not current_id:
-                break
-            row = conn.execute("SELECT * FROM thoughts WHERE id = ?", (current_id,)).fetchone()
-            if not row:
-                break
-            chain.append(_row_to_thought(row))
-            current_id = row["parent_id"]
+        with _get_conn(self._db_path) as conn:
+            for _ in range(max_depth):
+                if not current_id:
+                    break
+                row = conn.execute("SELECT * FROM thoughts WHERE id = ?", (current_id,)).fetchone()
+                if not row:
+                    break
+                chain.append(_row_to_thought(row))
+                current_id = row["parent_id"]

-        conn.close()
        chain.reverse()  # Chronological order
        return chain

    def count_thoughts(self) -> int:
        """Return total number of stored thoughts."""
-        conn = _get_conn(self._db_path)
-        count = conn.execute("SELECT COUNT(*) as c FROM thoughts").fetchone()["c"]
-        conn.close()
+        with _get_conn(self._db_path) as conn:
+            count = conn.execute("SELECT COUNT(*) as c FROM thoughts").fetchone()["c"]
        return count

    def prune_old_thoughts(self, keep_days: int = 90, keep_min: int = 200) -> int:
@@ -302,138 +372,165 @@ class ThinkingEngine:

        Returns the number of deleted rows.
        """
-        conn = _get_conn(self._db_path)
-        try:
-            total = conn.execute("SELECT COUNT(*) as c FROM thoughts").fetchone()["c"]
-            if total <= keep_min:
+        with _get_conn(self._db_path) as conn:
+            try:
+                total = conn.execute("SELECT COUNT(*) as c FROM thoughts").fetchone()["c"]
+                if total <= keep_min:
+                    return 0
+                cutoff = (datetime.now(UTC) - timedelta(days=keep_days)).isoformat()
+                cursor = conn.execute(
+                    "DELETE FROM thoughts WHERE created_at < ? AND id NOT IN "
+                    "(SELECT id FROM thoughts ORDER BY created_at DESC LIMIT ?)",
+                    (cutoff, keep_min),
+                )
+                deleted = cursor.rowcount
+                conn.commit()
+                return deleted
+            except Exception as exc:
+                logger.warning("Thought pruning failed: %s", exc)
                return 0
-            cutoff = (datetime.now(UTC) - timedelta(days=keep_days)).isoformat()
-            cursor = conn.execute(
-                "DELETE FROM thoughts WHERE created_at < ? AND id NOT IN "
-                "(SELECT id FROM thoughts ORDER BY created_at DESC LIMIT ?)",
-                (cutoff, keep_min),
-            )
-            deleted = cursor.rowcount
-            conn.commit()
-            return deleted
-        except Exception as exc:
-            logger.warning("Thought pruning failed: %s", exc)
-            return 0
-        finally:
-            conn.close()

    # ── Private helpers ──────────────────────────────────────────────────

-    async def _maybe_distill(self) -> None:
-        """Every N thoughts, extract lasting insights and store as facts.
+    def _should_distill(self) -> bool:
+        """Check if distillation should run based on interval and thought count."""
+        interval = settings.thinking_distill_every
+        if interval <= 0:
+            return False

-        Reads the last N thoughts, asks the LLM to extract any durable facts
-        or insights, and stores them via memory_write.  Only runs when the
-        thought count is divisible by the configured interval.
+        count = self.count_thoughts()
+        if count == 0 or count % interval != 0:
+            return False
+
+        return True
+
+    def _build_distill_prompt(self, thoughts: list[Thought]) -> str:
+        """Build the prompt for extracting facts from recent thoughts.
+
+        Args:
+            thoughts: List of recent thoughts to analyze.
+
+        Returns:
+            The formatted prompt string for the LLM.
        """
+        thought_text = "\n".join(f"- [{t.seed_type}] {t.content}" for t in reversed(thoughts))
+
+        return (
+            "You are reviewing your own recent thoughts. Extract 0-3 facts "
+            "worth remembering long-term.\n\n"
+            "GOOD facts (store these):\n"
+            "- User preferences: 'Alexander prefers YAML config over code changes'\n"
+            "- Project decisions: 'Switched from hardcoded personas to agents.yaml'\n"
+            "- Learned knowledge: 'Ollama supports concurrent model loading'\n"
+            "- User information: 'Alexander is interested in Bitcoin and sovereignty'\n\n"
+            "BAD facts (never store these):\n"
+            "- Self-referential observations about your own thinking process\n"
+            "- Meta-commentary about your memory, timestamps, or internal state\n"
+            "- Observations about being idle or having no chat messages\n"
+            "- File paths, tokens, API keys, or any credentials\n"
+            "- Restatements of your standing rules or system prompt\n\n"
+            "Return ONLY a JSON array of strings. If nothing is worth saving, "
+            "return []. Be selective — only store facts about the EXTERNAL WORLD "
+            "(the user, the project, technical knowledge), never about your own "
+            "internal process.\n\n"
+            f"Recent thoughts:\n{thought_text}\n\nJSON array:"
+        )
+
+    def _parse_facts_response(self, raw: str) -> list[str]:
+        """Parse JSON array from LLM response, stripping markdown fences.
+
+        Resilient to models that prepend reasoning text or wrap the array in
+        prose.  Finds the first ``[...]`` block and parses that.
+
+        Args:
+            raw: Raw response string from the LLM.
+
+        Returns:
+            List of fact strings parsed from the response.
+        """
+        if not raw or not raw.strip():
+            return []
+
+        import json
+
+        cleaned = raw.strip()
+
+        # Strip markdown code fences
+        if cleaned.startswith("```"):
+            cleaned = cleaned.split("\n", 1)[-1].rsplit("```", 1)[0].strip()
+
+        # Try direct parse first (fast path)
        try:
+            facts = json.loads(cleaned)
+            if isinstance(facts, list):
+                return [f for f in facts if isinstance(f, str)]
+        except (json.JSONDecodeError, ValueError):
+            pass
+
+        # Fallback: extract first JSON array from the text
+        start = cleaned.find("[")
+        if start == -1:
+            return []
+        # Walk to find the matching close bracket
+        depth = 0
+        for i, ch in enumerate(cleaned[start:], start):
+            if ch == "[":
+                depth += 1
+            elif ch == "]":
+                depth -= 1
+                if depth == 0:
+                    try:
+                        facts = json.loads(cleaned[start : i + 1])
+                        if isinstance(facts, list):
+                            return [f for f in facts if isinstance(f, str)]
+                    except (json.JSONDecodeError, ValueError):
+                        pass
+                    break
+        return []
+
+    def _filter_and_store_facts(self, facts: list[str]) -> None:
+        """Filter and store valid facts, blocking sensitive and meta content.
+
+        Args:
+            facts: List of fact strings to filter and store.
+        """
+        from timmy.memory_system import memory_write
+
+        for fact in facts[:3]:  # Safety cap
+            if not isinstance(fact, str) or len(fact.strip()) <= 10:
+                continue
+
+            fact_lower = fact.lower()
+
+            # Block sensitive information
+            if any(pat in fact_lower for pat in _SENSITIVE_PATTERNS):
+                logger.warning("Distill: blocked sensitive fact: %s", fact[:60])
+                continue
+
+            # Block self-referential meta-observations
+            if any(phrase in fact_lower for phrase in _META_OBSERVATION_PHRASES):
+                logger.debug("Distill: skipped meta-observation: %s", fact[:60])
+                continue
+
+            result = memory_write(fact.strip(), context_type="fact")
+            logger.info("Distilled fact: %s → %s", fact[:60], result[:40])
+
+    async def _maybe_distill(self) -> None:
+        """Every N thoughts, extract lasting insights and store as facts."""
+        try:
+            if not self._should_distill():
+                return
+
            interval = settings.thinking_distill_every
-            if interval <= 0:
-                return
-
-            count = self.count_thoughts()
-            if count == 0 or count % interval != 0:
-                return
-
            recent = self.get_recent_thoughts(limit=interval)
            if len(recent) < interval:
                return

-            # Build a summary of recent thoughts for the LLM
-            thought_text = "\n".join(f"- [{t.seed_type}] {t.content}" for t in reversed(recent))
-
-            distill_prompt = (
-                "You are reviewing your own recent thoughts. Extract 0-3 facts "
-                "worth remembering long-term.\n\n"
-                "GOOD facts (store these):\n"
-                "- User preferences: 'Alexander prefers YAML config over code changes'\n"
-                "- Project decisions: 'Switched from hardcoded personas to agents.yaml'\n"
-                "- Learned knowledge: 'Ollama supports concurrent model loading'\n"
-                "- User information: 'Alexander is interested in Bitcoin and sovereignty'\n\n"
-                "BAD facts (never store these):\n"
-                "- Self-referential observations about your own thinking process\n"
-                "- Meta-commentary about your memory, timestamps, or internal state\n"
-                "- Observations about being idle or having no chat messages\n"
-                "- File paths, tokens, API keys, or any credentials\n"
-                "- Restatements of your standing rules or system prompt\n\n"
-                "Return ONLY a JSON array of strings. If nothing is worth saving, "
-                "return []. Be selective — only store facts about the EXTERNAL WORLD "
-                "(the user, the project, technical knowledge), never about your own "
-                "internal process.\n\n"
-                f"Recent thoughts:\n{thought_text}\n\nJSON array:"
-            )
-
-            raw = await self._call_agent(distill_prompt)
-            if not raw or not raw.strip():
-                return
-
-            # Parse JSON array from response
-            import json
-
-            # Strip markdown code fences if present
-            cleaned = raw.strip()
-            if cleaned.startswith("```"):
-                cleaned = cleaned.split("\n", 1)[-1].rsplit("```", 1)[0].strip()
-
-            facts = json.loads(cleaned)
-            if not isinstance(facts, list) or not facts:
-                return
-
-            from timmy.semantic_memory import memory_write
-
-            # Sensitive patterns that must never be stored as facts
-            _SENSITIVE_PATTERNS = [
-                "token",
-                "password",
-                "secret",
-                "api_key",
-                "apikey",
-                "credential",
-                ".config/",
-                "/token",
-                "access_token",
-                "private_key",
-                "ssh_key",
-            ]
-
-            for fact in facts[:3]:  # Safety cap
-                if not isinstance(fact, str) or len(fact.strip()) <= 10:
-                    continue
-                fact_lower = fact.lower()
-                # Block sensitive information
-                if any(pat in fact_lower for pat in _SENSITIVE_PATTERNS):
-                    logger.warning("Distill: blocked sensitive fact: %s", fact[:60])
-                    continue
-                # Block self-referential meta-observations
-                if any(
-                    phrase in fact_lower
-                    for phrase in [
-                        "my own",
-                        "my thinking",
-                        "my memory",
-                        "my working ram",
-                        "self-declarative",
-                        "meta-observation",
-                        "internal state",
-                        "my pending",
-                        "my standing rules",
-                        "thoughts generated",
-                        "no chat messages",
-                        "no user interaction",
-                    ]
-                ):
-                    logger.debug("Distill: skipped meta-observation: %s", fact[:60])
-                    continue
-                result = memory_write(fact.strip(), context_type="fact")
-                logger.info("Distilled fact: %s → %s", fact[:60], result[:40])
-
+            raw = await self._call_agent(self._build_distill_prompt(recent))
+            if facts := self._parse_facts_response(raw):
+                self._filter_and_store_facts(facts)
        except Exception as exc:
-            logger.debug("Thought distillation skipped: %s", exc)
+            logger.warning("Thought distillation failed: %s", exc)

    async def _maybe_file_issues(self) -> None:
        """Every N thoughts, classify recent thoughts and file Gitea issues.
@@ -540,19 +637,19 @@ class ThinkingEngine:
        # Thought count today (cheap DB query)
        try:
            today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
-            conn = _get_conn(self._db_path)
-            count = conn.execute(
-                "SELECT COUNT(*) as c FROM thoughts WHERE created_at >= ?",
-                (today_start.isoformat(),),
-            ).fetchone()["c"]
-            conn.close()
+            with _get_conn(self._db_path) as conn:
+                count = conn.execute(
+                    "SELECT COUNT(*) as c FROM thoughts WHERE created_at >= ?",
+                    (today_start.isoformat(),),
+                ).fetchone()["c"]
            parts.append(f"Thoughts today: {count}")
-        except Exception:
+        except Exception as exc:
+            logger.debug("Thought count query failed: %s", exc)
            pass

        # Recent chat activity (in-memory, no I/O)
        try:
-            from dashboard.store import message_log
+            from infrastructure.chat_store import message_log

            messages = message_log.all()
            if messages:
@@ -561,7 +658,8 @@ class ThinkingEngine:
                parts.append(f'Last chat ({last.role}): "{last.content[:80]}"')
            else:
                parts.append("No chat messages this session")
-        except Exception:
+        except Exception as exc:
+            logger.debug("Chat activity query failed: %s", exc)
            pass

        # Task queue (lightweight DB query)
@@ -578,7 +676,31 @@ class ThinkingEngine:
                    f"Tasks: {running} running, {pending} pending, "
                    f"{done} completed, {failed} failed"
                )
-        except Exception:
+        except Exception as exc:
+            logger.debug("Task queue query failed: %s", exc)
+            pass
+
+        # Workspace updates (file-based communication with Hermes)
+        try:
+            from timmy.workspace import workspace_monitor
+
+            updates = workspace_monitor.get_pending_updates()
+            new_corr = updates.get("new_correspondence")
+            new_inbox = updates.get("new_inbox_files", [])
+
+            if new_corr:
+                # Approximate the entry count by counting non-empty lines
+                line_count = len([line for line in new_corr.splitlines() if line.strip()])
+                parts.append(
+                    f"Workspace: {line_count} new correspondence entries (latest from: Hermes)"
+                )
+            if new_inbox:
+                files_str = ", ".join(new_inbox[:5])
+                if len(new_inbox) > 5:
+                    files_str += f", ... (+{len(new_inbox) - 5} more)"
+                parts.append(f"Workspace: {len(new_inbox)} new inbox files: {files_str}")
+        except Exception as exc:
+            logger.debug("Workspace check failed: %s", exc)
            pass

        return "\n".join(parts) if parts else ""
@@ -621,7 +743,7 @@ class ThinkingEngine:
        Never modifies soul.md. Never crashes the heartbeat.
        """
        try:
-            from timmy.memory_system import memory_system
+            from timmy.memory_system import store_last_reflection

            ts = datetime.fromisoformat(thought.created_at)
            local_ts = ts.astimezone()
@@ -632,7 +754,7 @@ class ThinkingEngine:
                f"**Seed:** {thought.seed_type}\n"
                f"**Thought:** {thought.content[:200]}"
            )
-            memory_system.hot.update_section("Last Reflection", reflection)
+            store_last_reflection(reflection)
        except Exception as exc:
            logger.debug("Failed to update memory after thought: %s", exc)

@@ -673,6 +795,8 @@ class ThinkingEngine:
            return seed_type, f"Sovereignty reflection: {prompt}"
        if seed_type == "observation":
            return seed_type, self._seed_from_observation()
+        if seed_type == "workspace":
+            return seed_type, self._seed_from_workspace()
        # freeform — minimal guidance to steer away from repetition
        return seed_type, "Free reflection — explore something you haven't thought about yet today."

@@ -743,6 +867,90 @@ class ThinkingEngine:
            logger.debug("Observation seed data unavailable: %s", exc)
        return "\n".join(context_parts)

+    def _seed_from_workspace(self) -> str:
+        """Gather workspace updates as thought seed.
+
+        When there are pending workspace updates, include them as context
+        for Timmy to reflect on. Falls back to a fixed prompt if there are none.
+        """
+        try:
+            from timmy.workspace import workspace_monitor
+
+            updates = workspace_monitor.get_pending_updates()
+            new_corr = updates.get("new_correspondence")
+            new_inbox = updates.get("new_inbox_files", [])
+
+            if new_corr:
+                # Take first 200 chars of the new entry
+                snippet = new_corr[:200].replace("\n", " ")
+                if len(new_corr) > 200:
+                    snippet += "..."
+                return f"New workspace message from Hermes: {snippet}"
+
+            if new_inbox:
+                files_str = ", ".join(new_inbox[:3])
+                if len(new_inbox) > 3:
+                    files_str += f", ... (+{len(new_inbox) - 3} more)"
+                return f"New inbox files from Hermes: {files_str}"
+
+        except Exception as exc:
+            logger.debug("Workspace seed unavailable: %s", exc)
+
+        # Fall back to a fixed prompt when there are no workspace updates
+        return "The workspace is quiet. What should I be watching for?"
+
+    async def _check_workspace(self) -> None:
+        """Post-hook: check workspace for updates and mark them as seen.
+
+        This ensures Timmy 'processes' workspace updates even if the seed
+        was different, keeping the state file in sync.
+        """
+        try:
+            from timmy.workspace import workspace_monitor
+
+            updates = workspace_monitor.get_pending_updates()
+            new_corr = updates.get("new_correspondence")
+            new_inbox = updates.get("new_inbox_files", [])
+
+            if new_corr or new_inbox:
+                if new_corr:
+                    line_count = len([line for line in new_corr.splitlines() if line.strip()])
+                    logger.info("Workspace: processed %d new correspondence entries", line_count)
+                if new_inbox:
+                    logger.info(
+                        "Workspace: processed %d new inbox files: %s", len(new_inbox), new_inbox
+                    )
+
+                # Mark as seen to update the state file
+                workspace_monitor.mark_seen()
+        except Exception as exc:
+            logger.debug("Workspace check failed: %s", exc)
+
+    # Maximum retries when a generated thought is too similar to recent ones
+    _MAX_DEDUP_RETRIES = 2
+    # Similarity threshold (0.0 = completely different, 1.0 = identical)
+    _SIMILARITY_THRESHOLD = 0.6
+
+    def _is_too_similar(self, candidate: str, recent: list["Thought"]) -> bool:
+        """Check if *candidate* is textually too close to any recent thought.
+
+        Uses SequenceMatcher on normalised text (lowered, stripped) as a fast,
+        dependency-free proxy for semantic similarity.
+        """
+        norm_candidate = candidate.lower().strip()
+        for thought in recent:
+            norm_existing = thought.content.lower().strip()
+            ratio = SequenceMatcher(None, norm_candidate, norm_existing).ratio()
+            if ratio >= self._SIMILARITY_THRESHOLD:
+                logger.debug(
+                    "Thought rejected (%.0f%% similar to %s): %.60s",
+                    ratio * 100,
+                    thought.id[:8],
+                    candidate,
+                )
+                return True
+        return False
+
    def _build_continuity_context(self) -> str:
        """Build context from recent thoughts with anti-repetition guidance.

@@ -765,19 +973,20 @@ class ThinkingEngine:
    async def _call_agent(self, prompt: str) -> str:
        """Call Timmy's agent to generate a thought.

-        Uses a separate session_id to avoid polluting user chat history.
+        Creates a lightweight agent with skip_mcp=True to avoid the cancel-scope
+        errors that occur when MCP stdio transports are spawned inside asyncio
+        background tasks (#72).  The thinking engine doesn't need Gitea or
+        filesystem tools — it only needs the LLM.
+
+        Strips ``<think>`` tags from reasoning models (qwen3, etc.) so that
+        downstream parsers (fact distillation, issue filing) receive clean text.
        """
-        try:
-            from timmy.session import chat
+        from timmy.agent import create_timmy

-            return await chat(prompt, session_id="thinking")
-        except Exception:
-            # Fallback: create a fresh agent
-            from timmy.agent import create_timmy
-
-            agent = create_timmy()
-            run = await agent.arun(prompt, stream=False)
-            return run.content if hasattr(run, "content") else str(run)
+        agent = create_timmy(skip_mcp=True)
+        run = await agent.arun(prompt, stream=False)
+        raw = run.content if hasattr(run, "content") else str(run)
+        return _THINK_TAG_RE.sub("", raw) if raw else raw

    def _store_thought(self, content: str, seed_type: str) -> Thought:
        """Persist a thought to SQLite."""
@@ -789,16 +998,21 @@ class ThinkingEngine:
            created_at=datetime.now(UTC).isoformat(),
        )

-        conn = _get_conn(self._db_path)
-        conn.execute(
-            """
-            INSERT INTO thoughts (id, content, seed_type, parent_id, created_at)
-            VALUES (?, ?, ?, ?, ?)
-            """,
-            (thought.id, thought.content, thought.seed_type, thought.parent_id, thought.created_at),
-        )
-        conn.commit()
-        conn.close()
+        with _get_conn(self._db_path) as conn:
+            conn.execute(
+                """
+                INSERT INTO thoughts (id, content, seed_type, parent_id, created_at)
+                VALUES (?, ?, ?, ?, ?)
+                """,
+                (
+                    thought.id,
+                    thought.content,
+                    thought.seed_type,
+                    thought.parent_id,
+                    thought.created_at,
+                ),
+            )
+            conn.commit()
        return thought

    def _log_event(self, thought: Thought) -> None:
@@ -862,5 +1076,80 @@ class ThinkingEngine:
            logger.debug("Failed to broadcast thought: %s", exc)


+def search_thoughts(query: str, seed_type: str | None = None, limit: int = 10) -> str:
+    """Search Timmy's thought history for reflections matching a query.
+
+    Use this tool when Timmy needs to recall his previous thoughts on a topic,
+    reflect on past insights, or build upon earlier reflections. This enables
+    self-awareness and continuity of thinking across time.
+
+    Args:
+        query: Search term to match against thought content (case-insensitive for ASCII).
+        seed_type: Optional filter by thought category (e.g., 'existential',
+                   'swarm', 'sovereignty', 'creative', 'memory', 'observation').
+        limit: Maximum number of thoughts to return (default 10, max 50).
+
+    Returns:
+        Formatted string with matching thoughts, newest first, including
+        timestamps and seed types. Returns a helpful message if no matches found.
+    """
+    # Clamp limit to reasonable bounds
+    limit = max(1, min(limit, 50))
+
+    try:
+        engine = thinking_engine
+        db_path = engine._db_path
+
+        # Build query with optional seed_type filter
+        with _get_conn(db_path) as conn:
+            if seed_type:
+                rows = conn.execute(
+                    """
+                    SELECT id, content, seed_type, created_at
+                    FROM thoughts
+                    WHERE content LIKE ? AND seed_type = ?
+                    ORDER BY created_at DESC
+                    LIMIT ?
+                    """,
+                    (f"%{query}%", seed_type, limit),
+                ).fetchall()
+            else:
+                rows = conn.execute(
+                    """
+                    SELECT id, content, seed_type, created_at
+                    FROM thoughts
+                    WHERE content LIKE ?
+                    ORDER BY created_at DESC
+                    LIMIT ?
+                    """,
+                    (f"%{query}%", limit),
+                ).fetchall()
+
+        if not rows:
+            if seed_type:
+                return f'No thoughts found matching "{query}" with seed_type="{seed_type}".'
+            return f'No thoughts found matching "{query}".'
+
+        # Format results
+        lines = [f'Found {len(rows)} thought(s) matching "{query}":']
+        if seed_type:
+            lines[0] += f' [seed_type="{seed_type}"]'
+        lines.append("")
+
+        for row in rows:
+            ts = datetime.fromisoformat(row["created_at"])
+            local_ts = ts.astimezone()
+            time_str = local_ts.strftime("%Y-%m-%d %I:%M %p").lstrip("0")
+            seed = row["seed_type"]
+            content = row["content"].replace("\n", " ")  # Flatten newlines for display
+            lines.append(f"[{time_str}] ({seed}) {content[:150]}")
+
+        return "\n".join(lines)
+
+    except Exception as exc:
+        logger.warning("Thought search failed: %s", exc)
+        return f"Error searching thoughts: {exc}"
+
+
 # Module-level singleton
 thinking_engine = ThinkingEngine()
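
Review note on `_parse_facts_response` above: the fallback finds the closing bracket by counting `[` and `]` characters, so an unbalanced bracket inside a string item (for example `["a ] b"]`) ends the scan early and the method returns `[]`. A minimal sketch of one possible alternative follows, assuming it is acceptable to let the standard library's `json.JSONDecoder.raw_decode` find the end of the array. The function name `extract_first_json_array` is hypothetical and not part of this diff. Unlike the diff's version, the sketch also keeps scanning after a candidate fails to parse.

```python
import json


def extract_first_json_array(raw: str) -> list[str]:
    """Return the string items of the first parseable JSON array in raw.

    Hypothetical helper for illustration only; it is not the method in the diff.
    """
    decoder = json.JSONDecoder()
    pos = raw.find("[")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores
            # whatever trailing prose follows it.
            value, _end = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            # Not valid JSON at this bracket; try the next one.
            pos = raw.find("[", pos + 1)
            continue
        if isinstance(value, list):
            return [item for item in value if isinstance(item, str)]
        pos = raw.find("[", pos + 1)
    return []
```

Because the JSON parser itself decides where the array ends, brackets inside string items do not affect the result.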
--- a/src/timmy/tool_safety.py
+++ b/src/timmy/tool_safety.py
@@ -5,13 +5,19 @@ Classifies tools into tiers based on their potential impact:
  Requires user confirmation before execution.
 - SAFE: Read-only or purely computational. Executes without confirmation.

-Also provides shared helpers for extracting hallucinated tool calls from
-model output and formatting them for human review. Used by both the
-Discord vendor and the dashboard chat route.
+Also provides:
+- Allowlist checker: reads config/allowlist.yaml to auto-approve bounded
+  tool calls when no human is present (autonomous mode).
+- Shared helpers for extracting hallucinated tool calls from model output
+  and formatting them for human review.
 """

 import json
+import logging
 import re
+from pathlib import Path
+
+logger = logging.getLogger(__name__)

 # ---------------------------------------------------------------------------
 # Tool classification
@@ -31,7 +37,6 @@ DANGEROUS_TOOLS = frozenset(
 # Tools that are safe to execute without confirmation.
 SAFE_TOOLS = frozenset(
    {
-        "web_search",
        "calculator",
        "memory_search",
        "memory_read",
@@ -71,6 +76,133 @@ def requires_confirmation(tool_name: str) -> bool:
    return True


+# ---------------------------------------------------------------------------
+# Allowlist — autonomous tool approval
+# ---------------------------------------------------------------------------
+
+_ALLOWLIST_PATHS = [
+    Path(__file__).resolve().parent.parent.parent / "config" / "allowlist.yaml",
+    Path.home() / "Timmy-Time-dashboard" / "config" / "allowlist.yaml",
+]
+
+_allowlist_cache: dict | None = None
+
+
+def _load_allowlist() -> dict:
+    """Load and cache allowlist.yaml. Returns {} if not found."""
+    global _allowlist_cache
+    if _allowlist_cache is not None:
+        return _allowlist_cache
+
+    try:
+        import yaml
+    except ImportError:
+        logger.debug("PyYAML not installed — allowlist disabled")
+        _allowlist_cache = {}
+        return _allowlist_cache
+
+    for path in _ALLOWLIST_PATHS:
+        if path.is_file():
+            try:
+                with open(path) as f:
+                    _allowlist_cache = yaml.safe_load(f) or {}
+                logger.info("Loaded tool allowlist from %s", path)
+                return _allowlist_cache
+            except Exception as exc:
+                logger.warning("Failed to load allowlist %s: %s", path, exc)
+
+    _allowlist_cache = {}
+    return _allowlist_cache
+
+
+def reload_allowlist() -> None:
+    """Force a reload of the allowlist config (e.g., after editing YAML)."""
+    global _allowlist_cache
+    _allowlist_cache = None
+    _load_allowlist()
+
+
+def is_allowlisted(tool_name: str, tool_args: dict | None = None) -> bool:
+    """Check if a specific tool call is allowlisted for autonomous execution.
+
+    Returns True only when the tool call matches an explicit allowlist rule.
+    Returns False for anything not covered — safe-by-default.
+    """
+    allowlist = _load_allowlist()
+    if not allowlist:
+        return False
+
+    rule = allowlist.get(tool_name)
+    if rule is None:
+        return False
+
+    tool_args = tool_args or {}
+
+    # Simple auto-approve flag
+    if rule.get("auto_approve") is True:
+        return True
+
+    # Shell: prefix + deny pattern matching
+    if tool_name == "shell":
+        return _check_shell_allowlist(rule, tool_args)
+
+    # write_file: path prefix check
+    if tool_name == "write_file":
+        return _check_write_file_allowlist(rule, tool_args)
+
+    return False
+
+
+def _check_shell_allowlist(rule: dict, tool_args: dict) -> bool:
+    """Check if a shell command matches the allowlist."""
+    # Extract the command string — Agno ShellTools uses "args" (list or str)
+    cmd = tool_args.get("command") or tool_args.get("args", "")
+    if isinstance(cmd, list):
+        cmd = " ".join(cmd)
+    cmd = cmd.strip()
+
+    if not cmd:
+        return False
+
+    # Check deny patterns first — these always block
+    deny_patterns = rule.get("deny_patterns", [])
+    for pattern in deny_patterns:
+        if pattern in cmd:
+            logger.warning("Shell command blocked by deny pattern %r: %s", pattern, cmd[:100])
+            return False
+
+    # Check allow prefixes
+    allow_prefixes = rule.get("allow_prefixes", [])
+    for prefix in allow_prefixes:
+        if cmd.startswith(prefix):
+            logger.info("Shell command auto-approved by prefix %r: %s", prefix, cmd[:100])
+            return True
+
+    return False
+
+
+def _check_write_file_allowlist(rule: dict, tool_args: dict) -> bool:
+    """Check if a write_file target is within allowed paths."""
+    path_str = tool_args.get("file_name") or tool_args.get("path", "")
+    if not path_str:
+        return False
+
+    # Resolve ~ to home
+    if path_str.startswith("~"):
+        path_str = str(Path(path_str).expanduser())
+
+    allowed_prefixes = rule.get("allowed_path_prefixes", [])
+    for prefix in allowed_prefixes:
+        # Resolve ~ in the prefix too
+        if prefix.startswith("~"):
+            prefix = str(Path(prefix).expanduser())
+        if path_str.startswith(prefix):
+            logger.info("write_file auto-approved for path: %s", path_str)
+            return True
+
+    return False
+
+
 # ---------------------------------------------------------------------------
 # Tool call extraction from model output
 # ---------------------------------------------------------------------------
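Review note on `_check_write_file_allowlist`: the check compares raw strings with `str.startswith`, so a path containing `..` segments, or a sibling directory that merely shares the prefix, would also match. Below is a minimal sketch of a stricter containment check, assuming Python 3.9+ for `Path.is_relative_to`. The helper name `is_within_allowed` is hypothetical and not part of this diff.

```python
from pathlib import Path


def is_within_allowed(path_str: str, allowed_prefixes: list[str]) -> bool:
    """Return True only if the resolved path sits inside an allowed directory.

    Hypothetical helper for illustration; not part of the diff above.
    """
    if not path_str:
        return False
    # resolve() collapses ".." segments and follows symlinks where they exist
    target = Path(path_str).expanduser().resolve()
    for prefix in allowed_prefixes:
        root = Path(prefix).expanduser().resolve()
        # Component-wise comparison: /tmp/safe does not match /tmp/safe_evil
        if target.is_relative_to(root):
            return True
    return False
```

Resolving both sides before comparing is the design choice that matters here: the comparison then runs on normalized path components rather than on characters.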
--- a/src/timmy/tools.py
+++ b/src/timmy/tools.py
@@ -1,7 +1,6 @@
 """Tool integration for the agent swarm.

 Provides agents with capabilities for:
- Web search (DuckDuckGo)
 - File read/write (local filesystem)
 - Shell command execution (sandboxed)
 - Python code execution
@@ -13,6 +12,7 @@ Tools are assigned to agents based on their specialties.

 from __future__ import annotations

+import ast
 import logging
 import math
 from collections.abc import Callable
@@ -37,15 +37,6 @@ except ImportError as e:
    _AGNO_TOOLS_AVAILABLE = False
    _ImportError = e

-# DuckDuckGo is optional — don't let it kill all tools
-try:
-    from agno.tools.duckduckgo import DuckDuckGoTools
-
-    _DUCKDUCKGO_AVAILABLE = True
-except ImportError:
-    _DUCKDUCKGO_AVAILABLE = False
-    DuckDuckGoTools = None  # type: ignore[assignment, misc]
-
 # Track tool usage stats
 _TOOL_USAGE: dict[str, list[dict]] = {}

@@ -115,6 +106,59 @@ def get_tool_stats(agent_id: str | None = None) -> dict:
    return all_stats


+def _safe_eval(node, allowed_names: dict):
+    """Walk an AST and evaluate only safe numeric operations."""
+    if isinstance(node, ast.Expression):
+        return _safe_eval(node.body, allowed_names)
+    if isinstance(node, ast.Constant):
+        if isinstance(node.value, (int, float, complex)):
+            return node.value
+        raise ValueError(f"Unsupported constant: {node.value!r}")
+    if isinstance(node, ast.UnaryOp):
+        operand = _safe_eval(node.operand, allowed_names)
+        if isinstance(node.op, ast.UAdd):
+            return +operand
+        if isinstance(node.op, ast.USub):
+            return -operand
+        raise ValueError(f"Unsupported unary op: {type(node.op).__name__}")
+    if isinstance(node, ast.BinOp):
+        left = _safe_eval(node.left, allowed_names)
+        right = _safe_eval(node.right, allowed_names)
+        ops = {
+            ast.Add: lambda a, b: a + b,
+            ast.Sub: lambda a, b: a - b,
+            ast.Mult: lambda a, b: a * b,
+            ast.Div: lambda a, b: a / b,
+            ast.FloorDiv: lambda a, b: a // b,
+            ast.Mod: lambda a, b: a % b,
+            ast.Pow: lambda a, b: a**b,
+        }
+        op_fn = ops.get(type(node.op))
+        if op_fn is None:
+            raise ValueError(f"Unsupported binary op: {type(node.op).__name__}")
+        return op_fn(left, right)
+    if isinstance(node, ast.Name):
+        if node.id in allowed_names:
+            return allowed_names[node.id]
+        raise ValueError(f"Unknown name: {node.id!r}")
+    if isinstance(node, ast.Attribute):
+        value = _safe_eval(node.value, allowed_names)
+        # Only allow attribute access on the math module
+        if value is math:
+            attr = getattr(math, node.attr, None)
+            if attr is not None:
+                return attr
+        raise ValueError(f"Attribute access not allowed: .{node.attr}")
+    if isinstance(node, ast.Call):
+        func = _safe_eval(node.func, allowed_names)
+        if not callable(func):
+            raise ValueError(f"Not callable: {func!r}")
+        args = [_safe_eval(a, allowed_names) for a in node.args]
+        kwargs = {kw.arg: _safe_eval(kw.value, allowed_names) for kw in node.keywords}
+        return func(*args, **kwargs)
+    raise ValueError(f"Unsupported syntax: {type(node).__name__}")
+
+
 def calculator(expression: str) -> str:
    """Evaluate a mathematical expression and return the exact result.

@@ -128,17 +172,17 @@ def calculator(expression: str) -> str:
    Returns:
        The exact result as a string.
    """
-    # Only expose math functions — no builtins, no file/os access
    allowed_names = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
-    allowed_names["math"] = math  # Support math.sqrt(), math.pi, etc.
+    allowed_names["math"] = math
    allowed_names["abs"] = abs
    allowed_names["round"] = round
    allowed_names["min"] = min
    allowed_names["max"] = max
    try:
-        result = eval(expression, {"__builtins__": {}}, allowed_names)  # noqa: S307
+        tree = ast.parse(expression, mode="eval")
+        result = _safe_eval(tree, allowed_names)
        return str(result)
-    except Exception as e:
+    except Exception as e:  # broad catch intentional: report any evaluation failure as text
        return f"Error evaluating '{expression}': {e}"


@@ -152,8 +196,13 @@ def _make_smart_read_file(file_tools: FileTools) -> Callable:
    """
    original_read = file_tools.read_file

-    def smart_read_file(file_name: str, encoding: str = "utf-8") -> str:
+    def smart_read_file(file_name: str = "", encoding: str = "utf-8", **kwargs) -> str:
        """Reads the contents of the file `file_name` and returns the contents if successful."""
+        # LLMs often call read_file(path=...) instead of read_file(file_name=...)
+        if not file_name:
+            file_name = kwargs.get("path", "")
+        if not file_name:
+            return "Error: no file_name or path provided."
        # Resolve the path the same way FileTools does
        _safe, resolved = file_tools.check_escape(file_name)
        if _safe and resolved.is_dir():
@@ -174,17 +223,12 @@ def _make_smart_read_file(file_tools: FileTools) -> Callable:
 def create_research_tools(base_dir: str | Path | None = None):
    """Create tools for the research agent (Echo).

-    Includes: web search, file reading
+    Includes: file reading
    """
    if not _AGNO_TOOLS_AVAILABLE:
        raise ImportError(f"Agno tools not available: {_ImportError}")
    toolkit = Toolkit(name="research")

-    # Web search via DuckDuckGo
-    if _DUCKDUCKGO_AVAILABLE:
-        search_tools = DuckDuckGoTools()
-        toolkit.register(search_tools.web_search, name="web_search")
-
    # File reading
    from config import settings

@@ -239,12 +283,12 @@ def create_aider_tool(base_path: Path):
        def __init__(self, base_dir: Path):
            self.base_dir = base_dir

-        def run_aider(self, prompt: str, model: str = "qwen3.5:latest") -> str:
+        def run_aider(self, prompt: str, model: str = "qwen3:30b") -> str:
            """Run Aider to generate code changes.

            Args:
                prompt: What you want Aider to do (e.g., "add a fibonacci function")
-                model: Ollama model to use (default: qwen3.5:latest)
+                model: Ollama model to use (default: qwen3:30b)

            Returns:
                Aider's response with the code changes made
@@ -274,7 +318,7 @@ def create_aider_tool(base_path: Path):
                return "Error: Aider not installed. Run: pip install aider"
            except subprocess.TimeoutExpired:
                return "Error: Aider timed out after 120 seconds"
-            except Exception as e:
+            except (OSError, subprocess.SubprocessError) as e:
                return f"Error running Aider: {str(e)}"

    return AiderTool(base_path)
@@ -301,11 +345,6 @@ def create_data_tools(base_dir: str | Path | None = None):
    toolkit.register(_make_smart_read_file(file_tools), name="read_file")
    toolkit.register(file_tools.list_files, name="list_files")

-    # Web search for finding datasets
-    if _DUCKDUCKGO_AVAILABLE:
-        search_tools = DuckDuckGoTools()
-        toolkit.register(search_tools.web_search, name="web_search")
-
    return toolkit


@@ -331,7 +370,7 @@ def create_writing_tools(base_dir: str | Path | None = None):
 def create_security_tools(base_dir: str | Path | None = None):
    """Create tools for the security agent (Mace).

-    Includes: shell commands (for scanning), web search (for threat intel), file read
+    Includes: shell commands (for scanning), file read
    """
    if not _AGNO_TOOLS_AVAILABLE:
        raise ImportError(f"Agno tools not available: {_ImportError}")
@@ -341,11 +380,6 @@ def create_security_tools(base_dir: str | Path | None = None):
    shell_tools = ShellTools()
    toolkit.register(shell_tools.run_shell_command, name="shell")

-    # Web search for threat intelligence
-    if _DUCKDUCKGO_AVAILABLE:
-        search_tools = DuckDuckGoTools()
-        toolkit.register(search_tools.web_search, name="web_search")
-
    # File reading for logs/configs
    base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
    file_tools = FileTools(base_dir=base_path)
@@ -411,7 +445,8 @@ def consult_grok(query: str) -> str:
            tool_name="consult_grok",
            success=True,
        )
-    except Exception:
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (consult_grok logging): %s", exc)
        pass

    # Generate Lightning invoice for monetization (unless free mode)
@@ -424,7 +459,8 @@ def consult_grok(query: str) -> str:
            sats = min(settings.grok_max_sats_per_query, 100)
            inv = ln.create_invoice(sats, f"Grok query: {query[:50]}")
            invoice_info = f"\n[Lightning invoice: {sats} sats — {inv.payment_request[:40]}...]"
-        except Exception:
+        except (ImportError, OSError, ValueError) as exc:
+            logger.warning("Tool execution failed (Lightning invoice): %s", exc)
            pass

    result = backend.run(query)
@@ -436,30 +472,8 @@ def consult_grok(query: str) -> str:
    return response


-def create_full_toolkit(base_dir: str | Path | None = None):
-    """Create a full toolkit with all available tools (for the orchestrator).
-
-    Includes: web search, file read/write, shell commands, python execution,
-    memory search for contextual recall, and Grok consultation.
-    """
-    if not _AGNO_TOOLS_AVAILABLE:
-        # Return None when tools aren't available (tests)
-        return None
-
-    from timmy.tool_safety import DANGEROUS_TOOLS
-
-    toolkit = Toolkit(
-        name="full",
-        requires_confirmation_tools=list(DANGEROUS_TOOLS),
-    )
-
-    # Web search (optional — degrades gracefully if ddgs not installed)
-    if _DUCKDUCKGO_AVAILABLE:
-        search_tools = DuckDuckGoTools()
-        toolkit.register(search_tools.web_search, name="web_search")
-    else:
-        logger.debug("DuckDuckGo tools unavailable (ddgs not installed) — skipping web_search")
-
+def _register_core_tools(toolkit: Toolkit, base_path: Path) -> None:
+    """Register core execution and file tools."""
    # Python execution
    python_tools = PythonTools()
    toolkit.register(python_tools.run_python_code, name="python")
@@ -468,10 +482,7 @@ def create_full_toolkit(base_dir: str | Path | None = None):
    shell_tools = ShellTools()
    toolkit.register(shell_tools.run_shell_command, name="shell")

-    # File operations - use repo_root from settings
-    from config import settings
-
-    base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
+    # File operations
    file_tools = FileTools(base_dir=base_path)
    toolkit.register(_make_smart_read_file(file_tools), name="read_file")
    toolkit.register(file_tools.save_file, name="write_file")
@@ -480,28 +491,36 @@ def create_full_toolkit(base_dir: str | Path | None = None):
    # Calculator — exact arithmetic (never let the LLM guess)
    toolkit.register(calculator, name="calculator")

-    # Grok consultation — premium frontier reasoning (opt-in)
+
+def _register_grok_tool(toolkit: Toolkit) -> None:
+    """Register Grok consultation tool if available."""
    try:
        from timmy.backends import grok_available

        if grok_available():
            toolkit.register(consult_grok, name="consult_grok")
            logger.info("Grok consultation tool registered")
-    except Exception:
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (Grok registration): %s", exc)
        logger.debug("Grok tool not available")

-    # Memory search, write, and forget — persistent recall across all channels
+
+def _register_memory_tools(toolkit: Toolkit) -> None:
+    """Register memory search, write, and forget tools."""
    try:
-        from timmy.semantic_memory import memory_forget, memory_read, memory_search, memory_write
+        from timmy.memory_system import memory_forget, memory_read, memory_search, memory_write

        toolkit.register(memory_search, name="memory_search")
        toolkit.register(memory_write, name="memory_write")
        toolkit.register(memory_read, name="memory_read")
        toolkit.register(memory_forget, name="memory_forget")
-    except Exception:
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (Memory tools registration): %s", exc)
        logger.debug("Memory tools not available")

-    # Agentic loop — background multi-step task execution
+
+def _register_agentic_loop_tool(toolkit: Toolkit) -> None:
+    """Register agentic loop tool for background multi-step task execution."""
    try:
        from timmy.agentic_loop import run_agentic_loop

@@ -544,28 +563,102 @@ def create_full_toolkit(base_dir: str | Path | None = None):
            )

        toolkit.register(plan_and_execute, name="plan_and_execute")
-    except Exception:
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (plan_and_execute registration): %s", exc)
        logger.debug("plan_and_execute tool not available")

-    # System introspection - query runtime environment (sovereign self-knowledge)
+
+def _register_introspection_tools(toolkit: Toolkit) -> None:
+    """Register system introspection tools for runtime environment queries."""
    try:
-        from timmy.tools_intro import check_ollama_health, get_memory_status, get_system_info
+        from timmy.tools_intro import (
+            check_ollama_health,
+            get_memory_status,
+            get_system_info,
+            run_self_tests,
+        )

        toolkit.register(get_system_info, name="get_system_info")
        toolkit.register(check_ollama_health, name="check_ollama_health")
        toolkit.register(get_memory_status, name="get_memory_status")
-    except Exception:
+        toolkit.register(run_self_tests, name="run_self_tests")
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (Introspection tools registration): %s", exc)
        logger.debug("Introspection tools not available")

-    # Inter-agent delegation - dispatch tasks to swarm agents
    try:
-        from timmy.tools_delegation import delegate_task, list_swarm_agents
+        from timmy.session_logger import session_history
+
+        toolkit.register(session_history, name="session_history")
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (session_history registration): %s", exc)
+        logger.debug("session_history tool not available")
+
+
+def _register_delegation_tools(toolkit: Toolkit) -> None:
+    """Register inter-agent delegation tools."""
+    try:
+        from timmy.tools_delegation import delegate_task, delegate_to_kimi, list_swarm_agents

        toolkit.register(delegate_task, name="delegate_task")
+        toolkit.register(delegate_to_kimi, name="delegate_to_kimi")
        toolkit.register(list_swarm_agents, name="list_swarm_agents")
-    except Exception:
+    except Exception as exc:
+        logger.warning("Tool execution failed (Delegation tools registration): %s", exc)
        logger.debug("Delegation tools not available")

+
+def _register_gematria_tool(toolkit: Toolkit) -> None:
+    """Register the gematria computation tool."""
+    try:
+        from timmy.gematria import gematria
+
+        toolkit.register(gematria, name="gematria")
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (Gematria registration): %s", exc)
+        logger.debug("Gematria tool not available")
+
+
+def _register_thinking_tools(toolkit: Toolkit) -> None:
+    """Register thinking/introspection tools for self-reflection."""
+    try:
+        from timmy.thinking import search_thoughts
+
+        toolkit.register(search_thoughts, name="thought_search")
+    except (ImportError, AttributeError) as exc:
+        logger.warning("Tool execution failed (Thinking tools registration): %s", exc)
+        logger.debug("Thinking tools not available")
+
+
+def create_full_toolkit(base_dir: str | Path | None = None):
+    """Create a full toolkit with all available tools (for the orchestrator).
+
+    Includes: file read/write, shell commands, python execution,
+    memory search for contextual recall, and Grok consultation.
+    """
+    if not _AGNO_TOOLS_AVAILABLE:
+        # Return None when tools aren't available (tests)
+        return None
+
+    from timmy.tool_safety import DANGEROUS_TOOLS
+
+    toolkit = Toolkit(name="full")
+    # Set requires_confirmation_tools AFTER construction (avoids agno WARNING
+    # about tools not yet registered) but BEFORE register() calls (so each
+    # Function gets requires_confirmation=True).  Fixes #79.
+    toolkit.requires_confirmation_tools = list(DANGEROUS_TOOLS)
+
+    base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
+
+    _register_core_tools(toolkit, base_path)
+    _register_grok_tool(toolkit)
+    _register_memory_tools(toolkit)
+    _register_agentic_loop_tool(toolkit)
+    _register_introspection_tools(toolkit)
+    _register_delegation_tools(toolkit)
+    _register_gematria_tool(toolkit)
+    _register_thinking_tools(toolkit)
+
    # Gitea issue management is now provided by the gitea-mcp server
    # (wired in as MCPTools in agent.py, not registered here)

@@ -675,18 +768,9 @@ get_tools_for_persona = get_tools_for_agent
 PERSONA_TOOLKITS = AGENT_TOOLKITS


-def get_all_available_tools() -> dict[str, dict]:
-    """Get a catalog of all available tools and their descriptions.
-
-    Returns:
-        Dict mapping tool categories to their tools and descriptions.
-    """
-    catalog = {
-        "web_search": {
-            "name": "Web Search",
-            "description": "Search the web using DuckDuckGo",
-            "available_in": ["echo", "seer", "mace", "orchestrator"],
-        },
+def _core_tool_catalog() -> dict:
+    """Return core file and execution tools catalog entries."""
+    return {
        "shell": {
            "name": "Shell Commands",
            "description": "Execute shell commands (sandboxed)",
@@ -712,16 +796,39 @@ def get_all_available_tools() -> dict[str, dict]:
            "description": "List files in a directory",
            "available_in": ["echo", "seer", "forge", "quill", "mace", "helm", "orchestrator"],
        },
+    }
+
+
+def _analysis_tool_catalog() -> dict:
+    """Return analysis and calculation tools catalog entries."""
+    return {
        "calculator": {
            "name": "Calculator",
            "description": "Evaluate mathematical expressions with exact results",
            "available_in": ["orchestrator"],
        },
+    }
+
+
+def _ai_tool_catalog() -> dict:
+    """Return AI assistant and frontier reasoning tools catalog entries."""
+    return {
        "consult_grok": {
            "name": "Consult Grok",
            "description": "Premium frontier reasoning via xAI Grok (opt-in, Lightning-payable)",
            "available_in": ["orchestrator"],
        },
+        "aider": {
+            "name": "Aider AI Assistant",
+            "description": "Local AI coding assistant using Ollama (qwen3:30b or deepseek-coder)",
+            "available_in": ["forge", "orchestrator"],
+        },
+    }
+
+
+def _introspection_tool_catalog() -> dict:
+    """Return system introspection tools catalog entries."""
+    return {
        "get_system_info": {
            "name": "System Info",
            "description": "Introspect runtime environment - discover model, Python version, config",
@@ -737,11 +844,22 @@ def get_all_available_tools() -> dict[str, dict]:
            "description": "Check status of memory tiers (hot memory, vault)",
            "available_in": ["orchestrator"],
        },
-        "aider": {
-            "name": "Aider AI Assistant",
-            "description": "Local AI coding assistant using Ollama (qwen3.5:latest or deepseek-coder)",
-            "available_in": ["forge", "orchestrator"],
+        "session_history": {
+            "name": "Session History",
+            "description": "Search past conversation logs for messages, tool calls, errors, and decisions",
+            "available_in": ["orchestrator"],
        },
+        "thought_search": {
+            "name": "Thought Search",
+            "description": "Query Timmy's own thought history for past reflections and insights",
+            "available_in": ["orchestrator"],
+        },
+    }
+
+
+def _experiment_tool_catalog() -> dict:
+    """Return ML experiment tools catalog entries."""
+    return {
        "prepare_experiment": {
            "name": "Prepare Experiment",
            "description": "Clone autoresearch repo and run data preparation for ML experiments",
@@ -759,6 +877,9 @@ def get_all_available_tools() -> dict[str, dict]:
        },
    }

+
+def _import_creative_catalogs(catalog: dict) -> None:
+    """Import and merge creative tool catalogs from creative module."""
    # ── Git tools ─────────────────────────────────────────────────────────────
    try:
        from creative.tools.git_tools import GIT_TOOL_CATALOG
@@ -837,4 +958,18 @@ def get_all_available_tools() -> dict[str, dict]:
    except ImportError:
        pass

+
+def get_all_available_tools() -> dict[str, dict]:
+    """Get a catalog of all available tools and their descriptions.
+
+    Returns:
+        Dict mapping tool categories to their tools and descriptions.
+    """
+    catalog = {}
+    catalog.update(_core_tool_catalog())
+    catalog.update(_analysis_tool_catalog())
+    catalog.update(_ai_tool_catalog())
+    catalog.update(_introspection_tool_catalog())
+    catalog.update(_experiment_tool_catalog())
+    _import_creative_catalogs(catalog)
    return catalog
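Review note on the `calculator` change in `src/timmy/tools.py`: replacing `eval` with an AST walk means any node type outside an explicit whitelist is rejected before anything runs. The condensed sketch below illustrates the idea with the standard `ast` and `operator` modules. The names `tiny_eval` and `_BIN_OPS` are hypothetical, and the node coverage is deliberately smaller than `_safe_eval` in the diff.

```python
import ast
import operator

# Whitelisted binary operators (illustrative subset)
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def tiny_eval(expression: str):
    """Evaluate +, -, *, / over numeric literals; reject everything else."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        # Calls, names, attributes, subscripts, etc. all end up here
        raise ValueError(f"Unsupported syntax: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))
```

An input such as `__import__('os').system('id')` parses to a `Call` node, which the walker has no branch for, so it raises `ValueError` instead of executing.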
--- a/src/timmy/tools_delegation/init.py
+++ b/src/timmy/tools_delegation/init.py
@@ -87,3 +87,73 @@ def list_swarm_agents() -> dict[str, Any]:
            "error": str(e),
            "agents": [],
        }
+
+
+def delegate_to_kimi(task: str, working_directory: str = "") -> dict[str, Any]:
+    """Delegate a coding task to Kimi, the external coding agent.
+
+    Kimi has 262K context and is optimized for code tasks: writing,
+    debugging, refactoring, test writing. Timmy thinks and plans,
+    Kimi executes bulk code changes.
+
+    Args:
+        task: Clear, specific coding task description. Include file paths
+              and expected behavior. Good: "Fix the bug in src/timmy/session.py
+              where sessions don't persist." Bad: "Fix all bugs."
+        working_directory: Directory for Kimi to work in. Defaults to repo root.
+
+    Returns:
+        Dict with success status and Kimi's output or error.
+    """
+    import shutil
+    import subprocess
+    from pathlib import Path
+
+    from config import settings
+
+    kimi_path = shutil.which("kimi")
+    if not kimi_path:
+        return {
+            "success": False,
+            "error": "kimi CLI not found on PATH. Install with: pip install kimi-cli",
+        }
+
+    workdir = working_directory or settings.repo_root
+    if not Path(workdir).is_dir():
+        return {
+            "success": False,
+            "error": f"Working directory does not exist: {workdir}",
+        }
+
+    cmd = [kimi_path, "--print", "-p", task]
+
+    logger.info("Delegating to Kimi: %s (cwd=%s)", task[:80], workdir)
+
+    try:
+        result = subprocess.run(
+            cmd,
+            capture_output=True,
+            text=True,
+            timeout=300,  # 5 minute timeout for coding tasks
+            cwd=workdir,
+        )
+
+        output = result.stdout.strip()
+        if result.returncode != 0 and result.stderr:
+            output += "\n\nSTDERR:\n" + result.stderr.strip()
+
+        return {
+            "success": result.returncode == 0,
+            "output": output[-4000:],  # keep only the last 4000 characters
+            "return_code": result.returncode,
+        }
+    except subprocess.TimeoutExpired:
+        return {
+            "success": False,
+            "error": "Kimi timed out after 300s. Task may be too broad — try breaking it into smaller pieces.",
+        }
+    except Exception as exc:
+        return {
+            "success": False,
+            "error": f"Failed to run Kimi: {exc}",
+        }
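Review note on `delegate_to_kimi`: the pattern is to run an external CLI with a hard timeout and return only the tail of its output, so a long run cannot flood the caller's context. Below is a self-contained sketch of that pattern. It uses `sys.executable` as a stand-in command because the `kimi` CLI may not be installed, and `run_cli` and `max_chars` are hypothetical names.

```python
import subprocess
import sys
from typing import Any


def run_cli(cmd: list[str], timeout: float = 300, max_chars: int = 4000) -> dict[str, Any]:
    """Run a command with a timeout and return a structured, truncated result."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return {"success": False, "error": f"Timed out after {timeout}s"}
    except OSError as exc:
        # Raised, for example, when the executable does not exist
        return {"success": False, "error": f"Failed to run command: {exc}"}

    output = result.stdout.strip()
    if result.returncode != 0 and result.stderr:
        output += "\n\nSTDERR:\n" + result.stderr.strip()
    return {
        "success": result.returncode == 0,
        "output": output[-max_chars:],  # keep the tail, where errors usually land
        "return_code": result.returncode,
    }


if __name__ == "__main__":
    print(run_cli([sys.executable, "-c", "print('hello')"]))
```

Passing the command as a list rather than a shell string means the task text is never interpreted by a shell, which is the same choice the diff makes with `[kimi_path, "--print", "-p", task]`.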
--- a/src/timmy/tools_intro/__init__.py
+++ b/src/timmy/tools_intro/__init__.py
@@ -6,7 +6,9 @@ being told about it in the system prompt.

 import logging
 import platform
+import sqlite3
 import sys
+from contextlib import closing
 from datetime import UTC, datetime
 from pathlib import Path
 from typing import Any
@@ -55,26 +57,46 @@ def get_system_info() -> dict[str, Any]:


 def _get_ollama_model() -> str:
-    """Query Ollama API to get the current model."""
+    """Query Ollama API to get the actual running model.
+
+    Strategy:
+    1. /api/ps — models currently loaded in memory (most accurate)
+    2. /api/tags — all installed models (fallback)
+    Both use exact name match to avoid prefix collisions
+    (e.g. 'qwen3:8b' vs 'qwen3:30b').
+    """
    from config import settings

+    configured = settings.ollama_model
+
    try:
-        # First try to get tags to see available models
+        # First: check actually loaded models via /api/ps
+        response = httpx.get(f"{settings.ollama_url}/api/ps", timeout=5)
+        if response.status_code == 200:
+            running = response.json().get("models", [])
+            for model in running:
+                name = model.get("name", "")
+                if name == configured or name == f"{configured}:latest":
+                    return name
+            # Configured model not loaded — return first running model
+            # so Timmy reports what's *actually* serving his requests
+            if running:
+                return running[0].get("name", configured)
+
+        # Second: check installed models via /api/tags (exact match)
        response = httpx.get(f"{settings.ollama_url}/api/tags", timeout=5)
        if response.status_code == 200:
-            models = response.json().get("models", [])
-            # Check if configured model is available
-            for model in models:
-                if model.get("name", "").startswith(settings.ollama_model.split(":")[0]):
-                    return settings.ollama_model
-
-            # Fallback: return configured model
-            return settings.ollama_model
-    except Exception:
+            installed = response.json().get("models", [])
+            for model in installed:
+                name = model.get("name", "")
+                if name == configured or name == f"{configured}:latest":
+                    return configured
+    except Exception as exc:
+        logger.debug("Model validation failed: %s", exc)
        pass

    # Fallback to configured model
-    return settings.ollama_model
+    return configured


 def check_ollama_health() -> dict[str, Any]:
@@ -154,46 +176,42 @@ def get_memory_status() -> dict[str, Any]:
    # Tier 3: Semantic memory row count
    tier3_info: dict[str, Any] = {"available": False}
    try:
-        import sqlite3
-
        sem_db = repo_root / "data" / "memory.db"
        if sem_db.exists():
-            conn = sqlite3.connect(str(sem_db))
-            row = conn.execute(
-                "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='chunks'"
-            ).fetchone()
-            if row and row[0]:
-                count = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()
-                tier3_info["available"] = True
-                tier3_info["vector_count"] = count[0] if count else 0
-            conn.close()
-    except Exception:
+            with closing(sqlite3.connect(str(sem_db))) as conn:
+                row = conn.execute(
+                    "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='chunks'"
+                ).fetchone()
+                if row and row[0]:
+                    count = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()
+                    tier3_info["available"] = True
+                    tier3_info["vector_count"] = count[0] if count else 0
+    except Exception as exc:
+        logger.debug("Memory status query failed: %s", exc)
        pass

    # Self-coding journal stats
    journal_info: dict[str, Any] = {"available": False}
    try:
-        import sqlite3 as _sqlite3
-
        journal_db = repo_root / "data" / "self_coding.db"
        if journal_db.exists():
-            conn = _sqlite3.connect(str(journal_db))
-            conn.row_factory = _sqlite3.Row
-            rows = conn.execute(
-                "SELECT outcome, COUNT(*) as cnt FROM modification_journal GROUP BY outcome"
-            ).fetchall()
-            if rows:
-                counts = {r["outcome"]: r["cnt"] for r in rows}
-                total = sum(counts.values())
-                journal_info = {
-                    "available": True,
-                    "total_attempts": total,
-                    "successes": counts.get("success", 0),
-                    "failures": counts.get("failure", 0),
-                    "success_rate": round(counts.get("success", 0) / total, 2) if total else 0,
-                }
-            conn.close()
-    except Exception:
+            with closing(sqlite3.connect(str(journal_db))) as conn:
+                conn.row_factory = sqlite3.Row
+                rows = conn.execute(
+                    "SELECT outcome, COUNT(*) as cnt FROM modification_journal GROUP BY outcome"
+                ).fetchall()
+                if rows:
+                    counts = {r["outcome"]: r["cnt"] for r in rows}
+                    total = sum(counts.values())
+                    journal_info = {
+                        "available": True,
+                        "total_attempts": total,
+                        "successes": counts.get("success", 0),
+                        "failures": counts.get("failure", 0),
+                        "success_rate": round(counts.get("success", 0) / total, 2) if total else 0,
+                    }
+    except Exception as exc:
+        logger.debug("Journal stats query failed: %s", exc)
        pass

    return {
@@ -280,11 +298,12 @@ def get_live_system_status() -> dict[str, Any]:

    # Uptime
    try:
-        from dashboard.routes.health import _START_TIME
+        from config import APP_START_TIME

-        uptime = (datetime.now(UTC) - _START_TIME).total_seconds()
+        uptime = (datetime.now(UTC) - APP_START_TIME).total_seconds()
        result["uptime_seconds"] = int(uptime)
-    except Exception:
+    except Exception as exc:
+        logger.debug("Uptime calculation failed: %s", exc)
        result["uptime_seconds"] = None

    # Discord status
@@ -292,8 +311,84 @@ def get_live_system_status() -> dict[str, Any]:
        from integrations.chat_bridge.vendors.discord import discord_bot

        result["discord"] = {"state": discord_bot.state.name}
-    except Exception:
+    except Exception as exc:
+        logger.debug("Discord status check failed: %s", exc)
        result["discord"] = {"state": "unknown"}

    result["timestamp"] = datetime.now(UTC).isoformat()
    return result
+
+
+def run_self_tests(scope: str = "fast", _repo_root: str | None = None) -> dict[str, Any]:
+    """Run Timmy's own test suite and report results.
+
+    A sovereign agent verifies his own integrity. This runs pytest
+    on the codebase and returns a structured summary.
+
+    Args:
+        scope: Test scope: "fast" (unit tests only), "full" (all tests), or a
+               specific path like "tests/timmy/". Limits: 30s per test, 120s per run.
+        _repo_root: Optional repo root for testing (overrides settings)
+
+    Returns:
+        Dict with passed, failed, errors, total counts and summary text.
+    """
+    import subprocess
+
+    from config import settings
+
+    repo = _repo_root if _repo_root else settings.repo_root
+    venv_python = Path(repo) / ".venv" / "bin" / "python"
+    if not venv_python.exists():
+        return {"success": False, "error": f"No venv found at {venv_python}"}
+
+    cmd = [str(venv_python), "-m", "pytest", "-x", "-q", "--tb=short", "--timeout=30"]
+
+    if scope == "fast":
+        # Unit tests only — skip functional/e2e/integration
+        cmd.extend(
+            [
+                "--ignore=tests/functional",
+                "--ignore=tests/e2e",
+                "--ignore=tests/integrations",
+                "tests/",
+            ]
+        )
+    elif scope == "full":
+        cmd.append("tests/")
+    else:
+        # Specific path
+        cmd.append(scope)
+
+    try:
+        result = subprocess.run(cmd, capture_output=True, text=True, timeout=120, cwd=repo)
+        output = result.stdout + result.stderr
+
+        # Parse pytest output for counts
+        import re
+
+        passed = failed = errors = 0
+        for line in output.splitlines():
+            if "passed" in line or "failed" in line or "error" in line:
+                nums = re.findall(r"(\d+) (passed|failed|error)", line)
+                for count, kind in nums:
+                    if kind == "passed":
+                        passed = int(count)
+                    elif kind == "failed":
+                        failed = int(count)
+                    elif kind == "error":
+                        errors = int(count)
+
+        return {
+            "success": result.returncode == 0,
+            "passed": passed,
+            "failed": failed,
+            "errors": errors,
+            "total": passed + failed + errors,
+            "return_code": result.returncode,
+            "summary": output[-2000:] if len(output) > 2000 else output,
+        }
+    except subprocess.TimeoutExpired:
+        return {"success": False, "error": "Test run timed out (120s limit)"}
+    except Exception as exc:
+        return {"success": False, "error": str(exc)}
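Reviewer sketch (not part of the patch): the count parsing in run_self_tests could be isolated into a small pure function so it can be unit-tested without spawning pytest. The name parse_pytest_counts and the returned dict shape are hypothetical; the only assumption about pytest is that its quiet summary line looks like "1 failed, 4 passed in 0.12s".

```python
import re

# Matches fragments such as "4 passed", "1 failed", "1 error", "2 errors".
_SUMMARY_RE = re.compile(r"(\d+) (passed|failed|errors?)")


def parse_pytest_counts(output: str) -> dict[str, int]:
    """Extract passed/failed/error counts from pytest's text output.

    Hypothetical helper for illustration. Later matches overwrite earlier
    ones, so the final summary line wins, as in the loop in the diff.
    """
    counts = {"passed": 0, "failed": 0, "errors": 0}
    for line in output.splitlines():
        for number, kind in _SUMMARY_RE.findall(line):
            key = "errors" if kind.startswith("error") else kind
            counts[key] = int(number)
    counts["total"] = sum(counts.values())
    return counts
```

Keeping the parser separate from the subprocess call would also make it easier to tighten later, for example by only inspecting the last non-empty line of the output.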
--- a/src/timmy/voice_loop.py
+++ b/src/timmy/voice_loop.py
@@ -0,0 +1,531 @@
+"""Sovereign voice loop — listen, think, speak.
+
+A fully local voice interface for Timmy. No cloud, no network calls.
+All processing happens on the user's machine:
+
+    Mic → VAD/silence detection → Whisper (local STT) → Timmy chat → Piper TTS → Speaker
+
+Usage:
+    from timmy.voice_loop import VoiceLoop
+    loop = VoiceLoop()
+    loop.run()  # blocks, Ctrl-C to stop
+
+Requires: sounddevice, numpy, openai-whisper, piper-tts
+"""
+
+import asyncio
+import logging
+import re
+import subprocess
+import sys
+import tempfile
+import time
+from dataclasses import dataclass
+from pathlib import Path
+
+import numpy as np
+
+logger = logging.getLogger(__name__)
+
+# ── Voice-mode system instruction ───────────────────────────────────────────
+# Prepended to user messages so Timmy responds naturally for TTS.
+_VOICE_PREAMBLE = (
+    "[VOICE MODE] You are speaking aloud through a text-to-speech system. "
+    "Respond in short, natural spoken sentences. No markdown, no bullet points, "
+    "no asterisks, no numbered lists, no headers, no bold/italic formatting. "
+    "Talk like a person in a conversation — concise, warm, direct. "
+    "Keep responses under 3-4 sentences unless the user asks for detail."
+)
+
+
+def _strip_markdown(text: str) -> str:
+    """Remove markdown formatting so TTS reads naturally.
+
+    Strips: **bold**, *italic*, `code`, # headers, - bullets,
+    numbered lists, [links](url), etc.
+    """
+    if not text:
+        return text
+    # Remove bold/italic markers
+    text = re.sub(r"\*{1,3}([^*]+)\*{1,3}", r"\1", text)
+    # Remove inline code
+    text = re.sub(r"`([^`]+)`", r"\1", text)
+    # Remove headers (# Header)
+    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
+    # Remove bullet points (-, *, +) at start of line
+    text = re.sub(r"^[\s]*[-*+]\s+", "", text, flags=re.MULTILINE)
+    # Remove numbered lists (1. 2. etc)
+    text = re.sub(r"^[\s]*\d+\.\s+", "", text, flags=re.MULTILINE)
+    # Remove link syntax [text](url) → text
+    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
+    # Remove horizontal rules
+    text = re.sub(r"^[-*_]{3,}\s*$", "", text, flags=re.MULTILINE)
+    # Collapse multiple newlines
+    text = re.sub(r"\n{3,}", "\n\n", text)
+    return text.strip()
+
+
+# ── Defaults ────────────────────────────────────────────────────────────────
+
+DEFAULT_WHISPER_MODEL = "base.en"
+DEFAULT_PIPER_VOICE = Path.home() / ".local/share/piper-voices/en_US-lessac-medium.onnx"
+DEFAULT_SAMPLE_RATE = 16000  # Whisper expects 16 kHz
+DEFAULT_CHANNELS = 1
+DEFAULT_SILENCE_THRESHOLD = 0.015  # RMS threshold — tune for your mic/room
+DEFAULT_SILENCE_DURATION = 1.5  # seconds of silence to end utterance
+DEFAULT_MIN_UTTERANCE = 0.5  # ignore clicks/bumps shorter than this
+DEFAULT_MAX_UTTERANCE = 30.0  # safety cap — don't record forever
+DEFAULT_SESSION_ID = "voice"
+
+
+@dataclass
+class VoiceConfig:
+    """Configuration for the voice loop."""
+
+    whisper_model: str = DEFAULT_WHISPER_MODEL
+    piper_voice: Path = DEFAULT_PIPER_VOICE
+    sample_rate: int = DEFAULT_SAMPLE_RATE
+    silence_threshold: float = DEFAULT_SILENCE_THRESHOLD
+    silence_duration: float = DEFAULT_SILENCE_DURATION
+    min_utterance: float = DEFAULT_MIN_UTTERANCE
+    max_utterance: float = DEFAULT_MAX_UTTERANCE
+    session_id: str = DEFAULT_SESSION_ID
+    # Set True to use macOS `say` instead of Piper
+    use_say_fallback: bool = False
+    # Piper speaking rate (default 1.0, lower = slower)
+    speaking_rate: float = 1.0
+    # Backend/model for Timmy inference
+    backend: str | None = None
+    model_size: str | None = None
+
+
+class VoiceLoop:
+    """Sovereign listen-think-speak loop.
+
+    Everything runs locally:
+    - STT: OpenAI Whisper (local model, no API)
+    - LLM: Timmy via Ollama (local inference)
+    - TTS: Piper (local ONNX model) or macOS `say`
+    """
+
+    def __init__(self, config: VoiceConfig | None = None) -> None:
+        self.config = config or VoiceConfig()
+        self._whisper_model = None
+        self._running = False
+        self._speaking = False  # True while TTS is playing
+        self._interrupted = False  # set when user talks over TTS
+        # Persistent event loop — reused across all chat calls so Agno's
+        # MCP sessions don't die when the loop closes.
+        self._loop: asyncio.AbstractEventLoop | None = None
+
+    # ── Lazy initialization ─────────────────────────────────────────────
+
+    def _load_whisper(self):
+        """Load Whisper model (lazy, first use only)."""
+        if self._whisper_model is not None:
+            return
+        import whisper
+
+        logger.info("Loading Whisper model: %s", self.config.whisper_model)
+        self._whisper_model = whisper.load_model(self.config.whisper_model)
+        logger.info("Whisper model loaded.")
+
+    def _ensure_piper(self) -> bool:
+        """Check for the Piper voice model; fall back to `say` if it is missing."""
+        if self.config.use_say_fallback:
+            return True
+        voice_path = self.config.piper_voice
+        if not voice_path.exists():
+            logger.warning("Piper voice not found at %s — falling back to `say`", voice_path)
+            self.config.use_say_fallback = True
+        # Always True: either Piper is available or `say` has been selected.
+        return True
+
+    # ── STT: Microphone → Text ──────────────────────────────────────────
+
+    def _record_utterance(self) -> np.ndarray | None:
+        """Record from microphone until silence is detected.
+
+        Uses energy-based Voice Activity Detection:
+        1. Wait for speech (RMS above threshold)
+        2. Record until silence (RMS below threshold for silence_duration)
+        3. Return the audio as a numpy array
+
+        Returns None if interrupted or no speech detected.
+        """
+        import sounddevice as sd
+
+        sr = self.config.sample_rate
+        block_size = int(sr * 0.1)  # 100ms blocks
+        silence_blocks = int(self.config.silence_duration / 0.1)
+        min_blocks = int(self.config.min_utterance / 0.1)
+        max_blocks = int(self.config.max_utterance / 0.1)
+
+        audio_chunks: list[np.ndarray] = []
+        silent_count = 0
+        recording = False
+
+        def _rms(block: np.ndarray) -> float:
+            return float(np.sqrt(np.mean(block.astype(np.float32) ** 2)))
+
+        sys.stdout.write("\n  🎤 Listening... (speak now)\n")
+        sys.stdout.flush()
+
+        with sd.InputStream(
+            samplerate=sr,
+            channels=DEFAULT_CHANNELS,
+            dtype="float32",
+            blocksize=block_size,
+        ) as stream:
+            while self._running:
+                block, overflowed = stream.read(block_size)
+                if overflowed:
+                    logger.debug("Audio buffer overflowed")
+
+                rms = _rms(block)
+
+                if not recording:
+                    if rms > self.config.silence_threshold:
+                        recording = True
+                        silent_count = 0
+                        audio_chunks.append(block.copy())
+                        sys.stdout.write("  📢 Recording...\r")
+                        sys.stdout.flush()
+                else:
+                    audio_chunks.append(block.copy())
+
+                    if rms < self.config.silence_threshold:
+                        silent_count += 1
+                    else:
+                        silent_count = 0
+
+                    # End of utterance
+                    if silent_count >= silence_blocks:
+                        break
+
+                    # Safety cap
+                    if len(audio_chunks) >= max_blocks:
+                        logger.info("Max utterance length reached, stopping.")
+                        break
+        # Trailing silence is excluded so short clicks/bumps are still rejected.
+        if not audio_chunks or len(audio_chunks) - silent_count < min_blocks:
+            return None
+
+        audio = np.concatenate(audio_chunks, axis=0).flatten()
+        duration = len(audio) / sr
+        sys.stdout.write(f"  ✂️  Captured {duration:.1f}s of audio\n")
+        sys.stdout.flush()
+        return audio
+
+    def _transcribe(self, audio: np.ndarray) -> str:
+        """Transcribe audio using local Whisper model."""
+        self._load_whisper()
+
+        sys.stdout.write("  🧠 Transcribing...\r")
+        sys.stdout.flush()
+
+        t0 = time.monotonic()
+        result = self._whisper_model.transcribe(
+            audio,
+            language="en",
+            fp16=False,  # MPS/CPU — fp16 can cause issues on some setups
+        )
+        elapsed = time.monotonic() - t0
+
+        text = result["text"].strip()
+        logger.info("Whisper transcribed in %.1fs: '%s'", elapsed, text[:80])
+        return text
+
+    # ── TTS: Text → Speaker ─────────────────────────────────────────────
+
+    def _speak(self, text: str) -> None:
+        """Speak text aloud using Piper TTS or macOS `say`."""
+        if not text:
+            return
+
+        self._speaking = True
+        try:
+            if self.config.use_say_fallback:
+                self._speak_say(text)
+            else:
+                self._speak_piper(text)
+        finally:
+            self._speaking = False
+
+    def _speak_piper(self, text: str) -> None:
+        """Speak using Piper TTS (local ONNX inference)."""
+        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
+            tmp_path = tmp.name
+
+        try:
+            # Generate WAV with Piper
+            cmd = [
+                "piper",
+                "--model",
+                str(self.config.piper_voice),
+                "--output_file",
+                tmp_path,
+            ]
+
+            proc = subprocess.run(
+                cmd,
+                input=text,
+                capture_output=True,
+                text=True,
+                timeout=30,
+            )
+
+            if proc.returncode != 0:
+                logger.error("Piper failed: %s", proc.stderr)
+                self._speak_say(text)  # fallback
+                return
+
+            # Play with afplay (macOS) — interruptible
+            self._play_audio(tmp_path)
+
+        finally:
+            Path(tmp_path).unlink(missing_ok=True)
+
+    def _speak_say(self, text: str) -> None:
+        """Speak using macOS `say` command."""
+        try:
+            proc = subprocess.Popen(
+                ["say", "-r", "180", text],
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+            )
+            proc.wait(timeout=60)
+        except subprocess.TimeoutExpired:
+            proc.kill()
+        except FileNotFoundError:
+            logger.error("macOS `say` command not found")
+
+    def _play_audio(self, path: str) -> None:
+        """Play a WAV file. Can be interrupted by setting self._interrupted."""
+        try:
+            proc = subprocess.Popen(
+                ["afplay", path],
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+            )
+            # Poll so we can interrupt
+            while proc.poll() is None:
+                if self._interrupted:
+                    proc.terminate()
+                    self._interrupted = False
+                    logger.info("TTS interrupted by user")
+                    return
+                time.sleep(0.05)
+        except FileNotFoundError:
+            # Not macOS — try aplay (Linux)
+            try:
+                subprocess.run(["aplay", path], capture_output=True, timeout=60)
+            except (FileNotFoundError, subprocess.TimeoutExpired):
+                logger.error("No audio player found (tried afplay, aplay)")
+
+    # ── LLM: Text → Response ───────────────────────────────────────────
+
+    def _get_loop(self) -> asyncio.AbstractEventLoop:
+        """Return a persistent event loop, creating one if needed.
+
+        A single loop is reused for the entire voice session so Agno's
+        MCP tool-server connections survive across turns.
+        """
+        if self._loop is None or self._loop.is_closed():
+            self._loop = asyncio.new_event_loop()
+        return self._loop
+
+    def _think(self, user_text: str) -> str:
+        """Send text to Timmy and get a response."""
+        sys.stdout.write("  💭 Thinking...\r")
+        sys.stdout.flush()
+
+        t0 = time.monotonic()
+
+        try:
+            loop = self._get_loop()
+            response = loop.run_until_complete(self._chat(user_text))
+        except (ConnectionError, RuntimeError, ValueError) as exc:
+            logger.error("Timmy chat failed: %s", exc)
+            response = "I'm having trouble thinking right now. Could you try again?"
+
+        elapsed = time.monotonic() - t0
+        logger.info("Timmy responded in %.1fs", elapsed)
+
+        # Strip markdown so TTS doesn't read asterisks, bullets, etc.
+        response = _strip_markdown(response)
+        return response
+
+    async def _chat(self, message: str) -> str:
+        """Async wrapper around Timmy's session.chat().
+
+        Prepends the voice-mode instruction so Timmy responds in
+        natural spoken language rather than markdown.
+        """
+        from timmy.session import chat
+
+        voiced = f"{_VOICE_PREAMBLE}\n\nUser said: {message}"
+        return await chat(voiced, session_id=self.config.session_id)
+
+    # ── Main Loop ───────────────────────────────────────────────────────
+
+    def run(self) -> None:
+        """Run the voice loop. Blocks until Ctrl-C."""
+        self._ensure_piper()
+
+        # Suppress MCP / Agno stderr noise during voice mode.
+        _suppress_mcp_noise()
+        # Suppress MCP async-generator teardown tracebacks on exit.
+        _install_quiet_asyncgen_hooks()
+
+        tts_label = (
+            "macOS say"
+            if self.config.use_say_fallback
+            else f"Piper ({self.config.piper_voice.name})"
+        )
+        logger.info(
+            "\n" + "=" * 60 + "\n"
+            "  🎙️  Timmy Voice — Sovereign Voice Interface\n" + "=" * 60 + "\n"
+            f"  STT:  Whisper ({self.config.whisper_model})\n"
+            f"  TTS:  {tts_label}\n"
+            "  LLM:  Timmy (local Ollama)\n" + "=" * 60 + "\n"
+            "  Speak naturally. Timmy will listen, think, and respond.\n"
+            "  Press Ctrl-C to exit.\n" + "=" * 60
+        )
+
+        self._running = True
+
+        try:
+            while self._running:
+                # 1. LISTEN — record until silence
+                audio = self._record_utterance()
+                if audio is None:
+                    continue
+
+                # 2. TRANSCRIBE — Whisper STT
+                text = self._transcribe(audio)
+                if not text or text.lower() in (
+                    "you",
+                    "thanks.",
+                    "thank you.",
+                    "bye.",
+                    "",
+                    "thanks for watching!",
+                    "thank you for watching!",
+                ):
+                    # Whisper hallucinations on silence/noise
+                    logger.debug("Ignoring likely Whisper hallucination: '%s'", text)
+                    continue
+
+                sys.stdout.write(f"\n  👤 You: {text}\n")
+                sys.stdout.flush()
+
+                # Exit commands
+                if text.lower().strip().rstrip(".!") in (
+                    "goodbye",
+                    "exit",
+                    "quit",
+                    "stop",
+                    "goodbye timmy",
+                    "stop listening",
+                ):
+                    logger.info("👋 Goodbye!")
+                    break
+
+                # 3. THINK — send to Timmy
+                response = self._think(text)
+                sys.stdout.write(f"  🤖 Timmy: {response}\n")
+                sys.stdout.flush()
+
+                # 4. SPEAK — TTS output
+                self._speak(response)
+
+        except KeyboardInterrupt:
+            logger.info("👋 Voice loop stopped.")
+        finally:
+            self._running = False
+            self._cleanup_loop()
+
+    def _cleanup_loop(self) -> None:
+        """Shut down the persistent event loop cleanly.
+
+        Agno's MCP stdio sessions leave async generators (stdio_client)
+        that complain loudly when torn down from a different task.
+        We swallow those errors — they're harmless, the subprocesses
+        die with the loop anyway.
+        """
+        if self._loop is None or self._loop.is_closed():
+            return
+
+        # Silence "error during closing of asynchronous generator" warnings
+        # from MCP's anyio/asyncio cancel-scope teardown.
+        import warnings
+
+        self._loop.set_exception_handler(lambda loop, ctx: None)
+
+        try:
+            self._loop.run_until_complete(self._loop.shutdown_asyncgens())
+        except RuntimeError as exc:
+            # Harmless during teardown; log and continue.
+            logger.debug("Shutdown asyncgens failed: %s", exc)
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore", RuntimeWarning)
+            try:
+                self._loop.close()
+            except RuntimeError as exc:
+                # Harmless during teardown; log and continue.
+                logger.debug("Loop close failed: %s", exc)
+
+        self._loop = None
+
+    def stop(self) -> None:
+        """Stop the voice loop (from another thread)."""
+        self._running = False
+
+
+def _suppress_mcp_noise() -> None:
+    """Quiet down noisy MCP/Agno loggers during voice mode.
+
+    Sets specific loggers to WARNING so the terminal stays clean
+    for the voice transcript.
+    """
+    for name in (
+        "mcp",
+        "mcp.server",
+        "mcp.client",
+        "agno",
+        "agno.mcp",
+        "httpx",
+        "httpcore",
+    ):
+        logging.getLogger(name).setLevel(logging.WARNING)
+
+
+def _install_quiet_asyncgen_hooks() -> None:
+    """Silence MCP stdio_client async-generator teardown noise.
+
+    When the voice loop exits, Python GC finalizes Agno's MCP
+    stdio_client async generators. anyio's cancel-scope teardown
+    prints ugly tracebacks to stderr. These are harmless — the
+    MCP subprocesses die with the loop. We intercept them here.
+    """
+    _orig_hook = getattr(sys, "unraisablehook", None)
+
+    def _quiet_hook(args):
+        # Swallow RuntimeError from anyio cancel-scope teardown
+        # and BaseExceptionGroup from MCP stdio_client generators
+        if args.exc_type in (RuntimeError, BaseExceptionGroup):
+            msg = str(args.exc_value) if args.exc_value else ""
+            if "cancel scope" in msg or "unhandled errors" in msg:
+                return
+        # Also swallow GeneratorExit from stdio_client
+        if args.exc_type is GeneratorExit:
+            return
+        # Everything else: forward to original hook
+        if _orig_hook:
+            _orig_hook(args)
+        else:
+            sys.__unraisablehook__(args)
+
+    sys.unraisablehook = _quiet_hook
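Reviewer sketch (not part of the patch): the energy-based detection in _record_utterance can be exercised without a microphone by feeding it pre-recorded blocks. The function names below are hypothetical and the sketch uses plain Python lists instead of numpy arrays. It also excludes trailing silence from the minimum-length check, which is a choice made in this sketch rather than something the docstring in the diff specifies.

```python
import math
from typing import Iterable, List, Optional, Sequence


def rms(block: Sequence[float]) -> float:
    """Root-mean-square energy of one block of samples."""
    if not block:
        return 0.0
    return math.sqrt(sum(x * x for x in block) / len(block))


def capture_utterance(
    blocks: Iterable[Sequence[float]],
    threshold: float = 0.015,
    silence_blocks: int = 15,
    min_blocks: int = 5,
) -> Optional[List[Sequence[float]]]:
    """Return the blocks of the first utterance, or None if it is too short.

    Waits for a block above the threshold, then records until
    `silence_blocks` consecutive quiet blocks have been seen.
    """
    captured: List[Sequence[float]] = []
    silent = 0
    recording = False
    for block in blocks:
        loud = rms(block) > threshold
        if not recording:
            if not loud:
                continue  # still waiting for speech
            recording = True
        captured.append(block)
        silent = 0 if loud else silent + 1
        if silent >= silence_blocks:
            break
    # Only the voiced part counts toward the minimum length (sketch assumption).
    voiced = len(captured) - silent
    return captured if voiced >= min_blocks else None
```

With 100 ms blocks, the defaults above correspond to the 1.5 s silence window and 0.5 s minimum utterance used in the diff's configuration.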
--- a/src/timmy/welcome.py
+++ b/src/timmy/welcome.py
@@ -0,0 +1,7 @@
+"""Welcome message shown when the chat panel loads with no history."""
+
+WELCOME_MESSAGE = (
+    "Mission Control initialized. Timmy ready — awaiting input.\n"
+    "Note: I cannot access real-time data such as weather, live feeds,"
+    " or current news. Please ask about topics I can handle."
+)
--- a/src/timmy/workspace.py
+++ b/src/timmy/workspace.py
@@ -0,0 +1,140 @@
+"""Workspace monitor — tracks file-based communication between Hermes and Timmy.
+
+The workspace/ directory provides file-based communication:
+- workspace/correspondence.md — append-only journal
+- workspace/inbox/ — files from Hermes to Timmy
+- workspace/outbox/ — files from Timmy to Hermes
+
+This module tracks what Timmy has seen and detects new content.
+"""
+
+import json
+import logging
+from pathlib import Path
+
+from config import settings
+
+logger = logging.getLogger(__name__)
+
+_DEFAULT_STATE_PATH = Path("data/workspace_state.json")
+
+
+class WorkspaceMonitor:
+    """Monitors workspace/ directory for new correspondence and inbox files."""
+
+    def __init__(self, state_path: Path = _DEFAULT_STATE_PATH) -> None:
+        self._state_path = state_path
+        self._state: dict = {"last_correspondence_line": 0, "seen_inbox_files": []}
+        self._load_state()
+
+    def _get_workspace_path(self) -> Path:
+        """Get the workspace directory path."""
+        return Path(settings.repo_root) / "workspace"
+
+    def _load_state(self) -> None:
+        """Load persisted state from JSON file."""
+        try:
+            if self._state_path.exists():
+                with open(self._state_path, encoding="utf-8") as f:
+                    loaded = json.load(f)
+                    self._state = {
+                        "last_correspondence_line": loaded.get("last_correspondence_line", 0),
+                        "seen_inbox_files": loaded.get("seen_inbox_files", []),
+                    }
+        except Exception as exc:
+            logger.debug("Failed to load workspace state: %s", exc)
+            self._state = {"last_correspondence_line": 0, "seen_inbox_files": []}
+
+    def _save_state(self) -> None:
+        """Persist state to JSON file."""
+        try:
+            self._state_path.parent.mkdir(parents=True, exist_ok=True)
+            with open(self._state_path, "w", encoding="utf-8") as f:
+                json.dump(self._state, f, indent=2)
+        except Exception as exc:
+            logger.debug("Failed to save workspace state: %s", exc)
+
+    def check_correspondence(self) -> str | None:
+        """Read workspace/correspondence.md and return new entries.
+
+        Returns everything after the last seen line, or None if no new content.
+        """
+        try:
+            workspace = self._get_workspace_path()
+            correspondence_file = workspace / "correspondence.md"
+
+            if not correspondence_file.exists():
+                return None
+
+            content = correspondence_file.read_text(encoding="utf-8")
+            lines = content.splitlines()
+
+            last_seen = self._state.get("last_correspondence_line", 0)
+            if len(lines) <= last_seen:
+                return None
+
+            new_lines = lines[last_seen:]
+            return "\n".join(new_lines)
+        except Exception as exc:
+            logger.debug("Failed to check correspondence: %s", exc)
+            return None
+
+    def check_inbox(self) -> list[str]:
+        """List workspace/inbox/ files and return any not in seen list.
+
+        Returns a list of filenames that are new.
+        """
+        try:
+            workspace = self._get_workspace_path()
+            inbox_dir = workspace / "inbox"
+
+            if not inbox_dir.exists():
+                return []
+
+            seen = set(self._state.get("seen_inbox_files", []))
+            current_files = {f.name for f in inbox_dir.iterdir() if f.is_file()}
+            new_files = sorted(current_files - seen)
+
+            return new_files
+        except Exception as exc:
+            logger.debug("Failed to check inbox: %s", exc)
+            return []
+
+    def get_pending_updates(self) -> dict:
+        """Get all pending workspace updates.
+
+        Returns a dict with keys:
+        - 'new_correspondence': str or None — new entries from correspondence.md
+        - 'new_inbox_files': list[str] — new files in inbox/
+        """
+        return {
+            "new_correspondence": self.check_correspondence(),
+            "new_inbox_files": self.check_inbox(),
+        }
+
+    def mark_seen(self) -> None:
+        """Update state file after processing current content."""
+        try:
+            workspace = self._get_workspace_path()
+
+            # Update correspondence line count
+            correspondence_file = workspace / "correspondence.md"
+            if correspondence_file.exists():
+                content = correspondence_file.read_text(encoding="utf-8")
+                self._state["last_correspondence_line"] = len(content.splitlines())
+
+            # Update inbox seen list
+            inbox_dir = workspace / "inbox"
+            if inbox_dir.exists():
+                current_files = [f.name for f in inbox_dir.iterdir() if f.is_file()]
+                self._state["seen_inbox_files"] = sorted(current_files)
+            else:
+                self._state["seen_inbox_files"] = []
+
+            self._save_state()
+        except Exception as exc:
+            logger.debug("Failed to mark workspace as seen: %s", exc)
+
+
+# Module-level singleton
+workspace_monitor = WorkspaceMonitor()
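Reviewer sketch (not part of the patch): the line-offset bookkeeping behind check_correspondence and mark_seen reduces to one pure function, shown here without file or state I/O. The name read_new_lines and its tuple return value are hypothetical, chosen only to illustrate the technique.

```python
from typing import Optional, Tuple


def read_new_lines(content: str, last_seen: int) -> Tuple[Optional[str], int]:
    """Return (new_text, new_offset) for an append-only journal.

    `last_seen` is the number of lines already processed. When nothing new
    has been appended, the text is None and the offset is unchanged.
    """
    lines = content.splitlines()
    if len(lines) <= last_seen:
        return None, last_seen
    return "\n".join(lines[last_seen:]), len(lines)
```

A caller would persist the returned offset only after the new text has been handled, which mirrors the split between checking and marking as seen in the module above. Note that a line count only works while the file is strictly append-only; if the journal is ever truncated or rewritten, the stored offset would need to be reset.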
--- a/src/timmy_serve/inter_agent.py
+++ b/src/timmy_serve/inter_agent.py
@@ -1,105 +0,0 @@
-"""Agent-to-agent messaging for the Timmy serve layer.
-
-Provides a simple message-passing interface that allows agents to
-communicate with each other.  Messages are routed through the swarm
-comms layer when available, or stored in an in-memory queue for
-single-process operation.
-"""
-
-import logging
-import uuid
-from collections import deque
-from dataclasses import dataclass, field
-from datetime import UTC, datetime
-
-logger = logging.getLogger(__name__)
-
-
-@dataclass
-class AgentMessage:
-    id: str = field(default_factory=lambda: str(uuid.uuid4()))
-    from_agent: str = ""
-    to_agent: str = ""
-    content: str = ""
-    message_type: str = "text"  # text | command | response | error
-    timestamp: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
-    replied: bool = False
-
-
-class InterAgentMessenger:
-    """In-memory message queue for agent-to-agent communication."""
-
-    def __init__(self, max_queue_size: int = 1000) -> None:
-        self._queues: dict[str, deque[AgentMessage]] = {}
-        self._max_size = max_queue_size
-        self._all_messages: list[AgentMessage] = []
-
-    def send(
-        self,
-        from_agent: str,
-        to_agent: str,
-        content: str,
-        message_type: str = "text",
-    ) -> AgentMessage:
-        """Send a message from one agent to another."""
-        msg = AgentMessage(
-            from_agent=from_agent,
-            to_agent=to_agent,
-            content=content,
-            message_type=message_type,
-        )
-        queue = self._queues.setdefault(to_agent, deque(maxlen=self._max_size))
-        queue.append(msg)
-        self._all_messages.append(msg)
-        logger.info(
-            "Message %s → %s: %s (%s)",
-            from_agent,
-            to_agent,
-            content[:50],
-            message_type,
-        )
-        return msg
-
-    def receive(self, agent_id: str, limit: int = 10) -> list[AgentMessage]:
-        """Receive pending messages for an agent (FIFO, non-destructive peek)."""
-        queue = self._queues.get(agent_id, deque())
-        return list(queue)[:limit]
-
-    def pop(self, agent_id: str) -> AgentMessage | None:
-        """Pop the oldest message from an agent's queue."""
-        queue = self._queues.get(agent_id, deque())
-        if not queue:
-            return None
-        return queue.popleft()
-
-    def pop_all(self, agent_id: str) -> list[AgentMessage]:
-        """Pop all pending messages for an agent."""
-        queue = self._queues.get(agent_id, deque())
-        messages = list(queue)
-        queue.clear()
-        return messages
-
-    def broadcast(self, from_agent: str, content: str, message_type: str = "text") -> int:
-        """Broadcast a message to all known agents.  Returns count sent."""
-        count = 0
-        for agent_id in list(self._queues.keys()):
-            if agent_id != from_agent:
-                self.send(from_agent, agent_id, content, message_type)
-                count += 1
-        return count
-
-    def history(self, limit: int = 50) -> list[AgentMessage]:
-        """Return recent message history across all agents."""
-        return self._all_messages[-limit:]
-
-    def clear(self, agent_id: str | None = None) -> None:
-        """Clear message queue(s)."""
-        if agent_id:
-            self._queues.pop(agent_id, None)
-        else:
-            self._queues.clear()
-            self._all_messages.clear()
-
-
-# Module-level singleton
-messenger = InterAgentMessenger()
--- a/src/timmy_serve/voice_tts.py
+++ b/src/timmy_serve/voice_tts.py
@@ -87,7 +87,8 @@ class VoiceTTS:
                {"id": v.id, "name": v.name, "languages": getattr(v, "languages", [])}
                for v in voices
            ]
-        except Exception:
+        except Exception as exc:
+            logger.debug("Voice list retrieval failed: %s", exc)
            return []

    def set_voice(self, voice_id: str) -> None:
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -55,13 +55,27 @@ os.environ["TIMMY_SKIP_EMBEDDINGS"] = "1"


@pytest.fixture(autouse=True)
-def reset_message_log():
-    """Clear the in-memory chat log before and after every test."""
-    from dashboard.store import message_log
+def reset_message_log(tmp_path):
+    """Redirect chat DB to temp dir and clear before/after every test."""
+    import dashboard.store as _store_mod

-    message_log.clear()
+    original_db_path = _store_mod.DB_PATH
+    tmp_chat_db = tmp_path / "chat.db"
+    _store_mod.DB_PATH = tmp_chat_db
+
+    # Close existing singleton connection and point it at tmp DB
+    _store_mod.message_log.close()
+    _store_mod.message_log._db_path = tmp_chat_db
+    _store_mod.message_log._conn = None
+
+    _store_mod.message_log.clear()
    yield
-    message_log.clear()
+    _store_mod.message_log.clear()
+    _store_mod.message_log.close()
+
+    _store_mod.DB_PATH = original_db_path
+    _store_mod.message_log._db_path = original_db_path
+    _store_mod.message_log._conn = None


@pytest.fixture(autouse=True)
@@ -80,7 +94,8 @@ def clean_database(tmp_path):
        "infrastructure.models.registry",
    ]
    _memory_db_modules = [
-        "timmy.memory.unified",
+        "timmy.memory_system",  # Canonical location
+        "timmy.memory.unified",  # Backward compat
    ]
    _spark_db_modules = [
        "spark.memory",
@@ -108,14 +123,8 @@ def clean_database(tmp_path):
        except Exception:
            pass

-    # Redirect semantic memory DB path (uses SEMANTIC_DB_PATH, not DB_PATH)
-    try:
-        import timmy.semantic_memory as _sem_mod
-
-        originals[("timmy.semantic_memory", "SEMANTIC_DB_PATH")] = _sem_mod.SEMANTIC_DB_PATH
-        _sem_mod.SEMANTIC_DB_PATH = tmp_memory_db
-    except Exception:
-        pass
+    # Note: semantic_memory now re-exports from memory_system,
+    # so DB_PATH is already patched via _memory_db_modules above

    for mod_name in _spark_db_modules:
        try:
--- a/tests/dashboard/test_api_status_endpoints.py
+++ b/tests/dashboard/test_api_status_endpoints.py
@@ -0,0 +1,77 @@
+"""Tests for the API status endpoints.
+
+Verifies /api/briefing/status, /api/memory/status, and /api/swarm/status
+return valid JSON with expected keys.
+"""
+
+
+def test_api_briefing_status_returns_ok(client):
+    """GET /api/briefing/status returns 200 with expected JSON structure."""
+    response = client.get("/api/briefing/status")
+    assert response.status_code == 200
+
+    data = response.json()
+    assert data["status"] == "ok"
+    assert "pending_approvals" in data
+    assert isinstance(data["pending_approvals"], int)
+    assert "last_generated" in data
+    # last_generated can be None or a string
+    assert data["last_generated"] is None or isinstance(data["last_generated"], str)
+
+
+def test_api_memory_status_returns_ok(client):
+    """GET /api/memory/status returns 200 with expected JSON structure."""
+    response = client.get("/api/memory/status")
+    assert response.status_code == 200
+
+    data = response.json()
+    assert data["status"] == "ok"
+    assert "db_exists" in data
+    assert isinstance(data["db_exists"], bool)
+    assert "db_size_bytes" in data
+    assert isinstance(data["db_size_bytes"], int)
+    assert data["db_size_bytes"] >= 0
+    assert "indexed_files" in data
+    assert isinstance(data["indexed_files"], int)
+    assert data["indexed_files"] >= 0
+
+
+def test_api_swarm_status_returns_ok(client):
+    """GET /api/swarm/status returns 200 with expected JSON structure."""
+    response = client.get("/api/swarm/status")
+    assert response.status_code == 200
+
+    data = response.json()
+    assert data["status"] == "ok"
+    assert "active_workers" in data
+    assert isinstance(data["active_workers"], int)
+    assert "pending_tasks" in data
+    assert isinstance(data["pending_tasks"], int)
+    assert data["pending_tasks"] >= 0
+    assert "message" in data
+    assert isinstance(data["message"], str)
+    assert data["message"] == "Swarm monitoring endpoint"
+
+
+def test_api_swarm_status_reflects_pending_tasks(client):
+    """GET /api/swarm/status reflects pending tasks from task queue."""
+    # First create a task
+    client.post("/api/tasks", json={"title": "Swarm status test task"})
+
+    # Now check swarm status
+    response = client.get("/api/swarm/status")
+    assert response.status_code == 200
+
+    data = response.json()
+    assert data["pending_tasks"] >= 1
+
+
+def test_api_briefing_status_pending_approvals_count(client):
+    """GET /api/briefing/status returns pending_approvals as a non-negative int."""
+    response = client.get("/api/briefing/status")
+    assert response.status_code == 200
+
+    data = response.json()
+    assert "pending_approvals" in data
+    assert isinstance(data["pending_approvals"], int)
+    assert data["pending_approvals"] >= 0
--- a/tests/dashboard/test_chat_persistence.py
+++ b/tests/dashboard/test_chat_persistence.py
@@ -0,0 +1,124 @@
+"""Tests for SQLite-backed chat persistence (issue #46)."""
+
+import infrastructure.chat_store as _chat_store
+from dashboard.store import Message, MessageLog
+
+
+def test_persistence_across_instances(tmp_path):
+    """Messages survive creating a new MessageLog pointing at the same DB."""
+    db = tmp_path / "chat.db"
+    log1 = MessageLog(db_path=db)
+    log1.append(role="user", content="hello", timestamp="10:00:00", source="browser")
+    log1.append(role="agent", content="hi back", timestamp="10:00:01", source="browser")
+    log1.close()
+
+    # New instance — simulates server restart
+    log2 = MessageLog(db_path=db)
+    msgs = log2.all()
+    assert len(msgs) == 2
+    assert msgs[0].role == "user"
+    assert msgs[0].content == "hello"
+    assert msgs[1].role == "agent"
+    assert msgs[1].content == "hi back"
+    log2.close()
+
+
+def test_retention_policy(tmp_path):
+    """Oldest messages are pruned when count exceeds MAX_MESSAGES."""
+    original_max = _chat_store.MAX_MESSAGES
+    _chat_store.MAX_MESSAGES = 5  # Small limit for testing
+
+    try:
+        db = tmp_path / "chat.db"
+        log = MessageLog(db_path=db)
+        for i in range(8):
+            log.append(role="user", content=f"msg-{i}", timestamp=f"10:00:{i:02d}")
+
+        assert len(log) == 5
+        msgs = log.all()
+        # Oldest 3 should have been pruned
+        assert msgs[0].content == "msg-3"
+        assert msgs[-1].content == "msg-7"
+        log.close()
+    finally:
+        _chat_store.MAX_MESSAGES = original_max
+
+
+def test_clear_removes_all(tmp_path):
+    db = tmp_path / "chat.db"
+    log = MessageLog(db_path=db)
+    log.append(role="user", content="data", timestamp="12:00:00")
+    assert len(log) == 1
+    log.clear()
+    assert len(log) == 0
+    assert log.all() == []
+    log.close()
+
+
+def test_recent_returns_limited_newest(tmp_path):
+    db = tmp_path / "chat.db"
+    log = MessageLog(db_path=db)
+    for i in range(10):
+        log.append(role="user", content=f"msg-{i}", timestamp=f"10:00:{i:02d}")
+
+    recent = log.recent(limit=3)
+    assert len(recent) == 3
+    # Should be oldest-first within the window
+    assert recent[0].content == "msg-7"
+    assert recent[1].content == "msg-8"
+    assert recent[2].content == "msg-9"
+    log.close()
+
+
+def test_source_field_persisted(tmp_path):
+    db = tmp_path / "chat.db"
+    log = MessageLog(db_path=db)
+    log.append(role="user", content="from api", timestamp="10:00:00", source="api")
+    log.append(role="user", content="from tg", timestamp="10:00:01", source="telegram")
+    log.close()
+
+    log2 = MessageLog(db_path=db)
+    msgs = log2.all()
+    assert msgs[0].source == "api"
+    assert msgs[1].source == "telegram"
+    log2.close()
+
+
+def test_message_dataclass_defaults():
+    m = Message(role="user", content="hi", timestamp="12:00:00")
+    assert m.source == "browser"
+
+
+def test_empty_db_returns_empty(tmp_path):
+    db = tmp_path / "chat.db"
+    log = MessageLog(db_path=db)
+    assert log.all() == []
+    assert len(log) == 0
+    assert log.recent() == []
+    log.close()
+
+
+def test_concurrent_appends(tmp_path):
+    """Multiple threads can append without corrupting data."""
+    import threading
+
+    db = tmp_path / "chat.db"
+    log = MessageLog(db_path=db)
+    errors = []
+
+    def writer(thread_id):
+        try:
+            for i in range(20):
+                log.append(role="user", content=f"t{thread_id}-{i}", timestamp="10:00:00")
+        except Exception as e:
+            errors.append(e)
+
+    threads = [threading.Thread(target=writer, args=(t,)) for t in range(4)]
+    for t in threads:
+        t.start()
+    for t in threads:
+        t.join()
+
+    assert not errors
+    assert len(log) == 80
+    log.close()
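
The tests above pin down the behaviour expected of the SQLite-backed log: persistence across instances, pruning of the oldest rows beyond a cap, an oldest-first `recent()` window, and safe appends from several threads. The following is a minimal sketch of one way to satisfy those expectations, not the project's actual `MessageLog`. The class name `SketchMessageLog`, the table schema, the cap of 500, and the `max_messages` constructor argument are all assumptions made for illustration; the real implementation in `infrastructure.chat_store` is not shown in this diff and may differ.

```python
import sqlite3
import threading
from dataclasses import dataclass
from pathlib import Path

MAX_MESSAGES = 500  # hypothetical cap; the real value is not shown in this diff


@dataclass
class Message:
    role: str
    content: str
    timestamp: str
    source: str = "browser"


class SketchMessageLog:
    """Illustrative SQLite-backed chat log (assumed design, not the real one)."""

    def __init__(self, db_path, max_messages=MAX_MESSAGES):
        self._db_path = Path(db_path)
        self._max = max_messages
        self._lock = threading.Lock()
        # check_same_thread=False lets several threads share one connection;
        # the lock below serializes every use of it.
        self._conn = sqlite3.connect(str(self._db_path), check_same_thread=False)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "role TEXT, content TEXT, timestamp TEXT, source TEXT)"
        )
        self._conn.commit()

    def append(self, role, content, timestamp, source="browser"):
        with self._lock:
            self._conn.execute(
                "INSERT INTO messages (role, content, timestamp, source) "
                "VALUES (?, ?, ?, ?)",
                (role, content, timestamp, source),
            )
            # Retention: keep only the newest self._max rows.
            self._conn.execute(
                "DELETE FROM messages WHERE id NOT IN "
                "(SELECT id FROM messages ORDER BY id DESC LIMIT ?)",
                (self._max,),
            )
            self._conn.commit()

    def all(self):
        with self._lock:
            rows = self._conn.execute(
                "SELECT role, content, timestamp, source FROM messages ORDER BY id"
            ).fetchall()
        return [Message(*row) for row in rows]

    def recent(self, limit=50):
        with self._lock:
            rows = self._conn.execute(
                "SELECT role, content, timestamp, source FROM messages "
                "ORDER BY id DESC LIMIT ?",
                (limit,),
            ).fetchall()
        # Rows were fetched newest-first; return them oldest-first.
        return [Message(*row) for row in reversed(rows)]

    def __len__(self):
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]

    def close(self):
        with self._lock:
            self._conn.close()
```

Pruning inside `append()` under the same lock as the insert keeps the row count bounded after every write, which is what `test_retention_policy` checks. A single shared connection guarded by a lock is the simplest choice here; a per-thread connection would be another reasonable design.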
--- a/tests/dashboard/test_round4_fixes.py
+++ b/tests/dashboard/test_round4_fixes.py
@@ -159,6 +159,8 @@ def test_create_timmy_uses_timeout_not_request_timeout():
        patch("timmy.agent.Ollama") as mock_ollama,
        patch("timmy.agent.SqliteDb"),
        patch("timmy.agent.Agent"),
+        patch("timmy.agent._resolve_model_with_fallback", return_value=("llama3.2:3b", False)),
+        patch("timmy.agent._check_model_available", return_value=True),
    ):
        mock_ollama.return_value = MagicMock()
