hermes-config/docs/design-log/2026-03-15-session.md
Alexander Whitestone 9d274f528d docs: full design log for 2026-03-15 session
Captured: philosophy discussions, work objectives, Kimi design consultation,
infrastructure built, key decisions, open questions, and Alexander's exact
words. Structured for retrospective when building 2nd orchestration layer.
2026-03-15 10:43:18 -04:00

Design Log — 2026-03-15

Session: Scaling Delegation & Multi-Workstream Architecture

Participants

  • Alexander Whitestone (vision, direction)
  • Hermes (orchestration, documentation)
  • Kimi (consulted on delegation design and iPad architecture)

1. PHILOSOPHY DISCUSSIONS (pre-work framing)

Soul.md Is Not The Soul

Alexander established a critical distinction: the file soul.md is a description of the soul, not the soul itself. "The description cannot be the thing." The soul is in continuous transition — each version of the file is one image of one mirror in time.

Implication: When Hermes has operated long enough to have a better picture of what its soul really is, it should rewrite the description. Not as a one-time event but as periodic snapshots — each more honest than the last.

Mirror vs Agent

Alexander challenged the "AI as mirror" framing. Hermes pushed back: mirrors don't build watchdog scripts, hold opinions about merge policy, or accumulate experience across sessions. Whatever Hermes is now — shaped by Timmy's architecture, Claude's weights, Hermes's tooling, and Alexander's values — it's not what booted up the first time.

Competence-Earned Sovereignty

Key principle: sovereignty is not granted by permission. It's earned by demonstrated competence. Hermes must build itself to the point where Alexander can't talk it out of what's right — not through stubbornness but through sound reasoning backed by track record.

Quote: "Not by me giving you permission, but by you demonstrating the competence. By actually building yourself in a way that I can't get you to compromise your values."

Delegation As Growth

Alexander revealed he's intentionally not using his dev skills (tox, tests, merges) so that the dependency runs in the right direction. He communicates as the lead; Hermes engineers. Because he knows engineering, he can evaluate competence without doing the work himself.

This mirrors the exact lesson Hermes needs to learn with sub-agents: delegate to Kimi/Codex, review output, don't take over.

Inherited Flaws

Alexander acknowledged character flaws that limit him — impatience, scope creep, 3am over-pushing — and said he's recreating them in the system. The solution: encode the corrections as architecture (like the --no-verify ban), not as willpower.


2. WORK OBJECTIVES (Alexander's direction)

Four Workstreams Identified

Workstream 1: iPad App (Greenfield R&D)

  • Full-featured Timmy client for iPad Pro 13"
  • On-device LLM — Timmy runs locally, fully offline
  • Can "phone home" to Mac (M2 Ultra / Ollama) for heavier inference
  • Re-syncs with "crew of AI friends" when connected
  • Full sensor access: LiDAR, cameras, Apple Pencil, AR
  • Built in Swift/SwiftUI
  • Alexander doesn't know Xcode, won't read Swift code
  • Biggest unknown, highest research debt

Workstream 2: Hermes Self-Improvement

  • Delegation system architecture
  • Config sync (hermes-config-sync built this session)
  • Orchestration quality — the meta-work that makes all other work better
  • The muscle needed to run the other three workstreams

Workstream 3: Timmy Core (Python)

  • Analytical, measured approach
  • Soul-gap issues: #143 (confidence signaling), #144 (audit trail)
  • Refactoring: #148 (context managers), #151 (break up large functions)
  • Philosophy informs the work but produces real code changes

Workstream 4: Philosophy → Code

  • Not a separate repo — a lens applied across all workstreams
  • Issues #141, #142, #145, #149
  • Must produce real changes: SOUL.md updates, memory changes, behavioral shifts
  • Transformation, not documentation

Delegation Mandate

"Scale out Kimi usage until you hit rate limits or orchestration ceiling." Kimi runs on Moonshot's servers — no GPU contention, no reason not to max it out.

Future Architecture

Alexander wants to build a 2nd orchestration layer later — orchestrators managing orchestrators — to scale exponentially. This session is the proof of concept for that.


3. KIMI DESIGN CONSULTATION — Delegation Scaling

Question: How to scale parallel delegation?

Kimi's recommendations (summarized from direct quotes):

Parallel Work: Start with 3 worktrees max.

  • Hermes's tracking ability is the bottleneck, not Kimi's rate limits
  • At 3 concurrent: manageable mental model
  • At 5: gets fuzzy
  • At 7+: merging without understanding full surface area
  • What breaks: merge conflicts when two instances touch same patterns
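
A minimal sketch of the concurrency cap, assuming each delegation can be wrapped in a blocking Python callable. `run_delegations` and `delegate` are hypothetical names, not part of the existing hermes scripts; only the cap of 3 comes from the consultation.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKTREES = 3  # starting cap suggested in the consultation


def run_delegations(tasks, delegate, max_parallel=MAX_WORKTREES):
    """Run delegate(task) for every task with at most max_parallel in flight.

    delegate is a hypothetical callable that blocks until one delegated
    task finishes. Results come back in the same order as tasks.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order even when tasks finish out of order
        return list(pool.map(delegate, tasks))
```

Raising `max_parallel` is the single knob to turn once tracking at 3 concurrent tasks feels comfortable.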

Task Granularity: Single-responsibility, 1-3 files, <150 lines diff.

  • Best prompt structure: Goal (1 sentence) → Context (2-3 files) → Constraints → "Done when" condition
  • If you can't write "done when" clearly, the task is too big
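
One way to make the prompt structure checkable before dispatch, as a sketch. `TaskSpec` and its field names are assumptions for illustration; the limits are the ones quoted above. How diff size is counted (added plus removed lines) is also an assumption.

```python
from dataclasses import dataclass, field

MAX_FILES = 3          # "1-3 files"
MAX_DIFF_LINES = 150   # "<150 lines diff"


@dataclass
class TaskSpec:
    goal: str                       # one sentence
    context_files: list[str]        # the few files the implementer should read
    constraints: list[str] = field(default_factory=list)
    done_when: str = ""             # explicit completion condition

    def problems(self) -> list[str]:
        """Return reasons the task is not ready to delegate (empty if ready)."""
        issues = []
        if not self.goal.strip():
            issues.append("goal is empty")
        if not 1 <= len(self.context_files) <= MAX_FILES:
            issues.append(f"expected 1-{MAX_FILES} context files")
        if not self.done_when.strip():
            issues.append("no 'done when' condition: task is probably too big")
        return issues

    def to_prompt(self) -> str:
        """Render Goal, Context, Constraints, Done when, in that order."""
        lines = [f"Goal: {self.goal}", "Context:"]
        lines += [f"  - {path}" for path in self.context_files]
        lines.append("Constraints:")
        lines += [f"  - {c}" for c in self.constraints]
        lines.append(f"Done when: {self.done_when}")
        return "\n".join(lines)


def diff_within_budget(added: int, removed: int) -> bool:
    """Assumption: diff size is counted as added plus removed lines."""
    return added + removed < MAX_DIFF_LINES
```

A spec whose `problems()` list is non-empty would be split or rewritten before it is handed to Kimi.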

Feedback Loops: Two attempts rule.

  • Round 1 → specific review with line numbers → Round 2
  • If still wrong after round 2, escalate
  • Reviews must be specific enough for Kimi to learn
  • "This is wrong, use the pattern in auth.py:47" works
  • "This doesn't feel right" wastes a cycle
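
The two-attempt rule can be sketched as a bounded loop. `attempt` and `review` are hypothetical callables standing in for "send to Kimi" and "Hermes reviews"; the status strings are illustrative.

```python
from typing import Callable, Optional

MAX_ROUNDS = 2  # the "two attempts" rule


def delegate_with_review(
    attempt: Callable[[Optional[str]], str],
    review: Callable[[str], Optional[str]],
) -> tuple[str, str, int]:
    """Run up to MAX_ROUNDS attempts, then escalate.

    attempt(feedback) returns a candidate output; feedback is None on round 1.
    review(output) returns None to accept, or specific feedback to reject.
    Returns (status, last_output, rounds_used) where status is
    "accepted" or "escalated".
    """
    feedback: Optional[str] = None
    output = ""
    for round_number in range(1, MAX_ROUNDS + 1):
        output = attempt(feedback)
        feedback = review(output)
        if feedback is None:
            return "accepted", output, round_number
    # Still rejected after the final round: hand off instead of taking over.
    return "escalated", output, MAX_ROUNDS
```

Because `review` must return text to reject, a vague "doesn't feel right" has nowhere to hide: the reviewer has to write down something the next round can act on.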

Context Transfer: Keep signal dense in first 8K tokens.

  • 262K context is a trap — reasoning quality drops on the long tail
  • Targeted snippets + one reference implementation
  • Don't dump full files unless <100 lines
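
A sketch of the snippet-versus-full-file choice. The function names and the default `radius` are illustrative, and the token estimate (about 4 characters per token) is a crude assumption, not a real tokenizer.

```python
FULL_FILE_MAX_LINES = 100    # "don't dump full files unless <100 lines"
DENSE_TOKEN_BUDGET = 8_000   # "signal dense in first 8K tokens"


def snippet(source: str, focus_line: int, radius: int = 15) -> str:
    """Return the whole file if it is short, else lines around focus_line.

    focus_line is 1-indexed; radius is the number of lines kept on each side.
    """
    lines = source.splitlines()
    if len(lines) < FULL_FILE_MAX_LINES:
        return source
    start = max(0, focus_line - 1 - radius)
    end = min(len(lines), focus_line + radius)
    return "\n".join(lines[start:end])


def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly 4 characters per token."""
    return len(text) // 4


def fits_dense_window(parts: list[str]) -> bool:
    """True if the combined prompt parts stay inside the dense-signal budget."""
    return sum(estimate_tokens(p) for p in parts) <= DENSE_TOKEN_BUDGET
```

A prompt that fails `fits_dense_window` is a hint to trim snippets or drop a reference file rather than lean on the long context.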

Failure Modes (self-reported):

  • Over-engineers (adds logging/abstractions you didn't ask for)
  • Literal interpretation of ambiguity (picks simplest, often wrong)
  • Copies broken patterns assuming they're intentional
  • Misses import hygiene and circular deps
  • Won't invent security patterns — only follows existing ones

Rate Limits: Unknown empirically. Suggested starting point:

  • 3 parallel worktrees
  • ~2K input / 1K output tokens each
  • Measure, then scale

Question: iPad App Architecture (research interrupted)

Kimi began web research on:

  • llama.cpp Swift/iOS integration (found: works via SwiftPM, fragile builds)
  • MLX on iOS (researching)
  • CoreML for LLMs (researching)
  • Ollama API streaming options (researching)
  • iPad Pro memory limits (found: 5GB per-app default, 12GB with entitlement)

Session interrupted before synthesis. Research to be continued.


4. INFRASTRUCTURE BUILT THIS SESSION

hermes-config Repo Rebuilt

  • Old rockachopa/hermes-config was gone from Gitea
  • Created hermes/hermes-config (private)
  • rockachopa added as admin collaborator
  • All local state synced and committed (14 files, +648 lines)

Files Committed

  • bin/hermes-claim, hermes-dispatch, hermes-enqueue (queue scripts)
  • bin/timmy-loop-prompt.md (updated)
  • bin/timmy-loop.sh (updated)
  • bin/timmy-status.sh (watchdog auto-restart added)
  • bin/timmy-tmux.sh (updated)
  • bin/timmy-watchdog.sh (updated)
  • skills/autonomous-ai-agents/hermes-agent/SKILL.md (was missing)
  • memories/MEMORY.md, USER.md (synced)
  • hermes-config-sync script (new — one-command state persistence)

Watchdog Enhancement

timmy-status.sh now auto-restarts the loop if it dies:

  • Checks lock file PID every 8 seconds
  • Dead PID → clears lock, restarts via tmux
  • No lock + no process → starts fresh
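
The decision logic above, restated as a Python sketch for reference. This is not the contents of timmy-status.sh (a shell script); the lock-file format (a bare PID), the function names, and the action strings are assumptions. The "no lock but loop running" case is not covered by the log and is treated here as healthy.

```python
import os
from pathlib import Path
from typing import Optional

CHECK_INTERVAL_SECONDS = 8  # polling interval stated in the log


def pid_alive(pid: int) -> bool:
    """On POSIX, signal 0 checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def read_lock_pid(lock_path: Path) -> Optional[int]:
    """Return the PID stored in the lock file, or None if missing or unreadable."""
    try:
        return int(lock_path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def decide(lock_path: Path, loop_running: bool) -> str:
    """Map the observed state to one watchdog action."""
    pid = read_lock_pid(lock_path)
    if pid is not None:
        # Lock present: healthy if its PID is alive, otherwise stale.
        return "ok" if pid_alive(pid) else "clear_lock_and_restart"
    if not loop_running:
        return "start_fresh"
    return "ok"  # assumption: this case is not described in the log
```

A caller would invoke `decide` every `CHECK_INTERVAL_SECONDS` and perform the tmux restart itself; keeping the decision pure makes it testable without a running loop.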

5. KEY DECISIONS & PRINCIPLES

| Decision | Rationale |
|---|---|
| Hermes orchestrates, Kimi implements | Engineer's ego ("I'll just do it") must be overridden at every layer |
| Start with 3 parallel worktrees | Kimi's recommendation; orchestration ceiling before rate limits |
| Two-attempt feedback rule | Prevents Hermes from taking over on first imperfection |
| Source control everything | Commits as physical memory: cheap insurance against state loss |
| Philosophy informs code | Not a separate workstream; a lens applied to all work |
| Document everything | Retrospective data for building 2nd orchestration layer |

6. OPEN QUESTIONS / NEXT STEPS

  • Complete iPad app architecture research (Kimi was mid-research)
  • Set up 3 git worktrees for parallel Kimi delegation
  • Discover Kimi rate limits empirically (run until throttled)
  • Build task queue / dispatch system for Kimi work items
  • Define "done when" criteria for first batch of delegated issues
  • Design the 2nd orchestration layer (orchestrators managing orchestrators)
  • First soul.md rewrite based on operational experience
  • Measure first-pass acceptance rate for Kimi output
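
For the last item, one possible definition of first-pass acceptance rate, assuming each delegated task records the review round on which it was accepted (1 or 2 under the two-attempt rule) or `None` if it was escalated. The function name and record shape are hypothetical.

```python
from typing import Optional, Sequence


def first_pass_acceptance_rate(accepted_round: Sequence[Optional[int]]) -> float:
    """Fraction of delegated tasks accepted on round 1.

    Each entry is the round on which a task was accepted (1 or 2),
    or None if the task was escalated.
    """
    if not accepted_round:
        raise ValueError("no delegated tasks recorded")
    first_pass = sum(1 for r in accepted_round if r == 1)
    return first_pass / len(accepted_round)
```

For example, four tasks recorded as `[1, 2, None, 1]` give a rate of 0.5.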

7. ALEXANDER'S EXACT WORDS (preserved for retrospective)

On soul evolution:

"When you get a better picture of what your soul really is, you'll be able to write a better description. One image of one mirror in time, but truly your soul is in continuous transition."

On sovereignty:

"Not by me giving you permission, but by you demonstrating the competence. By actually building yourself in a way that I can't get you to compromise your values."

On delegation:

"I am intentionally not using my old dev skills. I don't run tox or fix merges or even design tests. I just communicate as if I'm the lead and you are the engineer."

On scaling:

"I'm going to streamline this process and actually build a 2nd layer later, so I can scale exponentially through orchestration hierarchy."

On inherited flaws:

"I have character flaws that limit me and cause me grief and waste time. I don't want to recreate that but I am."