[STUDY] Sovereign Local Agents on macOS — Hermes v0.4.0 Architecture Spike #576

Closed
opened 2026-03-26 19:56:09 +00:00 by perplexity · 9 comments
Member

Research Spike: Sovereign Local Agents on macOS — State of the Art

Source: "Sovereign Local Agents on macOS: A State-of-the-Art Technical Architecture" (20pp, generated by Kimi.ai, March 2026)

This report covers the full Hermes Agent v0.4.0 architecture and how it maps to our sovereign stack. Key sections with actionable findings below.


1. Hermes Agent Harness (Section 1)

What we already have: Hermes Agent installed on Hermes (Mac M3 Max), config.yaml with Ollama, SOUL.md, memory enabled, orchestration MCP server.

What the report confirms we're missing:

  • MCP servers for perception/action (steam-info-mcp, mcp-pyautogui) are defined in mcp/servers.json but not registered in config.yaml mcp_servers — Hermes doesn't see them
  • Model config still points at claude-opus-4-6 (Anthropic) as default with hermes3:latest (8B) as Ollama fallback — should point at hermes4:14b once pulled (#9)
  • Hermes v0.4.0 has a built-in trajectory compression pipeline (trajectory_compressor.py) that we're not using — it compresses agent execution traces into training-optimized formats, 10-50x reduction
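As a reference point for the first bullet, the registration might look like the fragment below. The key names and launch commands are assumptions for illustration only; the v0.4.0 config schema isn't reproduced in the report, so check the actual shape of mcp_servers before copying:

```yaml
# Hypothetical config.yaml fragment -- key names and commands are
# assumptions, not verified against the Hermes v0.4.0 schema.
mcp_servers:
  steam-info-mcp:
    command: npx
    args: ["-y", "steam-info-mcp"]
  desktop-control:
    command: python
    args: ["-m", "mcp_pyautogui"]
```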

Key v0.4.0 features we should leverage:

  • hermes mcp CLI for installing/configuring MCP servers (with OAuth 2.1 PKCE flow)
  • Background self-improvement thread ("online distillation")
  • Context compression overhaul with configurable summary endpoint
  • Trajectory export directly compatible with Atropos RL framework
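On the OAuth 2.1 PKCE point: OAuth 2.1 makes PKCE mandatory, and the verifier/challenge handshake the CLI would perform can be sketched as below. This is the standard S256 derivation from RFC 7636, not Hermes source code:

```python
import base64
import hashlib
import secrets

def pkce_pair() -> tuple[str, str]:
    """Generate an OAuth 2.1 PKCE verifier/challenge pair (S256 method).

    Illustrates the handshake the `hermes mcp` CLI reportedly performs;
    this is generic PKCE, not Hermes code.
    """
    # 32 random bytes -> 43-char base64url verifier (padding stripped)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # Challenge is the base64url-encoded SHA-256 of the verifier
    challenge = base64.urlsafe_b64encode(
        hashlib.sha256(verifier.encode()).digest()
    ).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = pkce_pair()
print(len(verifier), len(challenge))  # -> 43 43
```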

2. Instant Distillation / Context Compression (Section 2)

Structured Distillation paper (Nous Research, March 13, 2026, arXiv):

  • Transforms each conversation exchange into 4 fields: exchange_core (~15 tokens), specific_context (~23 tokens), thematic_room_assignments, files_touched
  • ~10x compression ratio (371→38 tokens) with 96% of retrieval MRR preserved
  • Dense vector retrieval robust to distillation; BM25/lexical search degrades significantly
  • Implication: Our DPO training pipeline should prioritize vector-based retrieval over keyword search for session mining
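The four-field record and its headline compression can be sketched as below. The field names come from the paper as listed above; the dataclass itself is an illustrative container, not the paper's reference implementation:

```python
from dataclasses import dataclass, field

@dataclass
class DistilledExchange:
    # Four fields per the Structured Distillation paper; this container
    # is illustrative, not the paper's reference code.
    exchange_core: str                 # ~15-token gist of the exchange
    specific_context: str              # ~23 tokens of grounding detail
    thematic_room_assignments: list[str] = field(default_factory=list)
    files_touched: list[str] = field(default_factory=list)

def compression_ratio(original_tokens: int, distilled_tokens: int) -> float:
    """Ratio of original to distilled token counts."""
    return original_tokens / distilled_tokens

# The paper's headline numbers: 371 tokens distilled to 38.
print(round(compression_ratio(371, 38), 1))  # -> 9.8
```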

Agentic On-Policy Distillation (OPD):

  • v0.2.0 introduced OPD as RL training environment in Atropos
  • tinker-atropos standalone training infrastructure available
  • A 4B Qwen3.5 model matched a 27B baseline after ~7 hours of autonomous self-improvement, roughly a 7x effective capacity expansion through learning

3. Memory Architecture (Section 3)

Five-tier memory already in Hermes: Working Context → FTS5 Search → Vector Embeddings → Honcho Profiles → Skill Documents
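Conceptually the five tiers behave like a cascading lookup: cheap tiers first, falling through until enough hits accumulate. A minimal sketch with toy stand-ins for the real backends, since Hermes's internals aren't shown in the report:

```python
from typing import Callable

# Ordered registry of memory tiers. Each tier is a callable
# query -> list of hits; real Hermes tiers (Working Context, FTS5,
# Vector Embeddings, Honcho Profiles, Skill Documents) would plug in
# here -- these stand-ins are purely illustrative.
TIERS: list[tuple[str, Callable[[str], list[str]]]] = []

def register_tier(name: str, search: Callable[[str], list[str]]) -> None:
    TIERS.append((name, search))

def recall(query: str, want: int = 3) -> list[str]:
    """Walk the tiers in order, stopping once enough hits accumulate."""
    hits: list[str] = []
    for name, search in TIERS:
        hits.extend(search(query))
        if len(hits) >= want:
            break
    return hits[:want]

# Toy tiers standing in for Working Context and FTS5 search
register_tier("working_context", lambda q: ["ctx: " + q])
register_tier("fts5", lambda q: ["fts1: " + q, "fts2: " + q])
print(recall("ollama config"))
```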

Hindsight-Hermes plugin (worth evaluating):

  • pip install hindsight-hermes
  • Structured fact extraction + entity resolution + knowledge graph
  • Multi-strategy retrieval: semantic + BM25 + entity graph traversal + temporal filtering
  • Self-hosted Docker option for sovereignty
  • Critical step after install: hermes tools disable memory (prevents native tool preference)
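Multi-strategy retrieval implies merging ranked lists from several searchers. One common way is reciprocal rank fusion; whether hindsight-hermes uses RRF specifically is an assumption, but the sketch shows the shape of the problem:

```python
from collections import defaultdict

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several ranked lists into one.

    Standard fusion technique (k=60 is the conventional constant).
    Whether hindsight-hermes fuses its semantic/BM25/graph/temporal
    strategies this way is an assumption, not confirmed by the report.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["note_a", "note_b", "note_c"]
bm25 = ["note_b", "note_d", "note_a"]
print(rrf([semantic, bm25])[:2])  # -> ['note_b', 'note_a']
```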

my-hermantic-agent fork shows Ollama-hosted Hermes-4-14B with TimescaleDB persistent semantic memory — closest to our architecture.

4. Auto Research / Self-Improvement (Section 4)

  • hermes-autoresearch branch — experimental, enables autonomous hypothesis generation and experimentation
  • ~700 autonomous commits on nanochat optimization, yielding an 11% improvement
  • Not ready for us yet — experimental branch, expect instability

5. Tool Ecosystem (Section 5)

  • 80+ production-ready skills in hermes-skills repo
  • agentskills.io standard adopted by 11 tools (Claude Code, Cursor, Copilot, Gemini CLI, etc.)
  • Gateway supports 15 messaging platforms (v0.4.0 added Signal, DingTalk, SMS, Mattermost, Matrix, Webhook)
  • ACP server enables IDE integration (VS Code, Zed, JetBrains)

6. macOS Deployment (Section 6)

Per the report, an M3 Max with 128GB unified memory can run Hermes 3 70B without quantization, a capability that would require roughly $15,000 of discrete GPU hardware elsewhere. We should be able to run at least a 14B model without breaking a sweat.
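Before pulling a model, a back-of-envelope footprint check is worth doing. The bytes-per-parameter and overhead factors below are assumptions (fp16/bf16 weights, a loose allowance for KV cache and runtime buffers), not measured numbers:

```python
def model_footprint_gib(params: float, bytes_per_param: int = 2,
                        overhead: float = 1.2) -> float:
    """Rough weights-plus-runtime footprint in GiB.

    bytes_per_param=2 assumes fp16/bf16 weights; overhead=1.2 is a
    loose allowance for KV cache and buffers -- both are assumptions,
    not measurements.
    """
    return params * bytes_per_param * overhead / 2**30

# A 14B model at fp16 fits comfortably within 128 GiB unified memory:
print(round(model_footprint_gib(14e9), 1))  # -> 31.3
```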

Security layers available: namespace isolation, capability dropping, read-only root FS, seccomp-bpf, DM pairing for credential access.

7. Integration Roadmap from Report (Section 7)

The report's 4-phase roadmap maps to our current state:

| Phase | Report says | Our status |
|-------|-------------|------------|
| Phase 1: Core (Day 1) | Install Hermes + Ollama + base model | DONE |
| Phase 1: Memory | Install hindsight-hermes | NOT STARTED — evaluate |
| Phase 2: Distillation (Week 1) | Enable compression + trajectory export | NOT STARTED — tickets below |
| Phase 3: Auto Research (Week 2-4) | Deploy hermes-autoresearch branch | BLOCKED — experimental |
| Phase 4: Ecosystem (Ongoing) | Custom skills, multi-agent orchestration | IN PROGRESS (our swarm) |

Actionable Tickets Created

  • Register steam-info-mcp and desktop-control in config.yaml mcp_servers
  • Rewire heartbeat_tick() in tasks.py to invoke Hermes agent sessions (telemetry capture)
  • Enable trajectory export: /config set trajectory_export true + HERMES_TRAJECTORY_PATH
  • Switch model default from Anthropic cloud to local Ollama (hermes4:14b after #9 completes)
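For the heartbeat_tick() ticket, a minimal sketch of the rewiring. The session call and telemetry path are stand-ins, since the report doesn't specify the Hermes session API; replace them with the real client calls:

```python
import json
import time
from pathlib import Path

TELEMETRY_LOG = Path("telemetry/heartbeat.jsonl")  # assumed path, not from the report

def run_agent_session(prompt: str) -> dict:
    # Stand-in for the real Hermes agent-session API, which isn't
    # documented in the report; swap in the actual client call.
    return {"prompt": prompt, "ok": True}

def heartbeat_tick() -> dict:
    """Sketch of the rewired tick: run one agent session and append
    a telemetry record, per the ticket above."""
    started = time.time()
    result = run_agent_session("heartbeat self-check")
    record = {
        "ts": started,
        "elapsed": time.time() - started,
        "ok": result.get("ok", False),
    }
    TELEMETRY_LOG.parent.mkdir(parents=True, exist_ok=True)
    with TELEMETRY_LOG.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```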
Timmy was assigned by perplexity 2026-03-26 19:56:10 +00:00
Member

🔧 `gemini` working on this via Huey. Branch: `gemini/issue-576`
Member

🔧 `grok` working on this via Huey. Branch: `grok/issue-576`
Member

⚠️ `grok` produced no changes for this issue. Skipping.
Owner

⚡ Dispatched to `claude`. Huey task queued.
Owner

⚡ Dispatched to `gemini`. Huey task queued.
Owner

⚡ Dispatched to `kimi`. Huey task queued.
Owner

⚡ Dispatched to `grok`. Huey task queued.
Owner

⚡ Dispatched to `perplexity`. Huey task queued.
Owner

Closing during the 2026-03-28 backlog burn-down.

Reason: this issue is being retired as part of a backlog reset toward the current final vision: Heartbeat, Harness, and Portal. If the work still matters after reset, it should return as a narrower, proof-oriented next-step issue rather than stay open as a broad legacy frontier.

Timmy closed this issue 2026-03-28 04:52:41 +00:00

Reference: Timmy_Foundation/the-nexus#576