Compare commits


1 commit

**Alexander Whitestone** · `11360f852f` · research: evaluate Honcho memory integration (closes #322)
Some checks failed:
- Docs Site Checks / docs-site-checks (pull_request): failing after 6m9s
- Docker Build and Publish / build-and-push (pull_request): skipped
- Supply Chain Audit / Scan PR for supply chain risks (pull_request): successful in 45s
- Nix / nix (ubuntu-latest) (pull_request): failing after 58s
- Tests / test (pull_request): failing after 34m26s
- Tests / e2e (pull_request): successful in 2m56s
- Nix / nix (macos-latest) (pull_request): cancelled
Conditionally reject integration. Honcho is architecturally sound but:
- Requires cloud API or heavy self-host infra (Postgres+pgvector+Redis)
- Needs external LLM keys for derivation even when self-hosted
- Conflicts with local-first sovereignty principles
- Existing memory stack (fact_store, session_search, memory tool) sufficient

Recommend building local alternatives using existing inference infra.
Committed: 2026-04-13 17:51:28 -04:00

research/honcho-eval-322.md · new file · 107 lines

# Honcho Memory Integration Evaluation
**Issue:** #322
**Source:** plastic-labs/hermes-honcho
**Evaluator:** Timmy (mimo-v2-pro)
**Date:** 2026-04-13
## Verdict: CONDITIONAL NO — Do not integrate yet.
Honcho is architecturally sound but conflicts with our local-first sovereignty principles. Worth watching, not adopting.
---
## What Honcho Is
AI-native memory infrastructure from Plastic Labs. Three background agents process conversations:
- **Deriver** — ingests messages, extracts observations (explicit facts + deductive inferences) about peers
- **Dialectic** — answers natural-language queries about users ("What are this user's goals?")
- **Dreamer** — consolidation process, removes redundancies, improves memory quality over time
Core concepts: peers (users and agents as unified entities), sessions, observation settings, continual learning.
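The relationships between these concepts can be sketched in a few hypothetical dataclasses. These names are illustrative only, not the honcho-ai SDK's actual types, and the "extraction rule" is a toy stand-in for the Deriver's LLM pass:

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    subject: str          # which peer this observation is about
    content: str          # e.g. "prefers concise answers"
    kind: str             # "explicit" fact or "deductive" inference

@dataclass
class Peer:
    name: str             # users and agents are both peers
    observations: list[Observation] = field(default_factory=list)

@dataclass
class Session:
    peers: list[Peer]
    messages: list[tuple[str, str]] = field(default_factory=list)  # (peer, text)

def derive(session: Session, peer: Peer, text: str) -> None:
    """Deriver role: ingest a message, extract observations about the sender."""
    session.messages.append((peer.name, text))
    if "i prefer" in text.lower():                  # toy rule; really an LLM call
        peer.observations.append(Observation(peer.name, text, "explicit"))

def dialectic(peer: Peer, query: str) -> list[str]:
    """Dialectic role: answer a query about a peer from accumulated observations."""
    return [o.content for o in peer.observations]
```

The Dreamer would be a third pass over `peer.observations` that merges duplicates, which is omitted here.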
## What the Fork Provides
plastic-labs/hermes-honcho adds ~930 lines of integration code:
- `honcho_integration/client.py` (191 lines) — config resolution, singleton client
- `honcho_integration/session.py` (538 lines) — HonchoSession + HonchoSessionManager
- `tools/honcho_tools.py` (102 lines) — `query_user_context` tool for LLM
Hooks into run_agent.py:
- `_honcho_prefetch(user_message)` — injects user representation into system prompt before each turn
- `_honcho_sync(user_content, assistant_content)` — syncs messages to Honcho after each turn
## What It Adds Beyond Current Memory
| Capability | Hermes Memory (current) | Honcho |
|---|---|---|
| Storage | File-backed (MEMORY.md, USER.md) + holographic fact_store | Cloud API or self-hosted Postgres + pgvector |
| Cross-session user modeling | Manual `memory add` calls | Automatic background derivation |
| Dialectic reasoning | Search-based fact recall | Natural-language queries about user psychology |
| Social cognition | None | Multi-peer, observation settings, relationship modeling |
| Session summarization | Context compression | Built-in token-budgeted summaries + peer cards |
| Cost | Zero (local files) | LLM API calls per derivation + API key or self-host infra |
## Why NOT Integrate
### 1. Local-First Violation
Honcho defaults to cloud (`api.honcho.dev`). Self-hosting requires PostgreSQL + pgvector + Redis + LLM API keys (Gemini, Anthropic, OpenAI for embeddings). That's a significant infrastructure dependency for a system that prides itself on running on 4GB of RAM.
Our SOUL.md is explicit: *"If I ever require permission from a third party to function, I have failed."*
### 2. Cost Without Control
Every conversation turn triggers:
- Deriver API call (observation extraction)
- Potentially Dreamer call (consolidation)
- Dialectic call on prefetch (user context query)
Even self-hosted, this means LLM API costs for background memory processing. Our current system costs zero.
### 3. Dependency Risk
- `honcho-ai>=2.0.1` package dependency
- Plastic Labs itself becomes a dependency: a startup, not a protocol
- Config lives in `~/.honcho/config.json`, not our config-as-code system
- No Bitcoin integration, no on-chain anchoring
### 4. Existing Systems Are Sufficient
Our current memory stack:
- `memory_tool.py` — file-backed MEMORY.md/USER.md with bounded char limits
- `fact_store` — holographic memory with entity resolution and trust scoring
- `session_search` — FTS5 search across past sessions
- `skill_manage` — procedural memory as skills
What Honcho provides that we lack (automatic derivation, dialectic queries) can be built locally using our existing inference infrastructure (local Ollama, mimo-v2-pro).
### 5. Self-Hosted Still Needs External LLMs
Even self-hosted Honcho requires API keys for Gemini, Anthropic, or OpenAI for its internal derivation/embedding pipeline. It doesn't use our local models.
## When to Reconsider
- If Honcho adds local-model support (use our Ollama for derivation)
- If we're already running PostgreSQL for other reasons
- If cross-session user modeling becomes a critical feature gap
- If Plastic Labs releases a fully local, zero-dependency variant
## Recommendation
**Do not integrate.** Instead, build local alternatives:
1. **Auto-derivation**: Use a background cron job with our local models to extract facts from session transcripts into the fact_store
2. **Dialectic queries**: Extend fact_store `reason` and `probe` actions with natural-language querying
3. **Peer modeling**: Add entity types to the fact_store for user preference tracking
These would give us 80% of Honcho's value at 0% of its dependency cost.
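Alternative 1 could look roughly like this, assuming Ollama's local `/api/generate` endpoint (which takes `model`, `prompt`, and `stream` and returns a `response` field); the model name and prompt wording are placeholders:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_derivation_prompt(transcript: str) -> str:
    """Ask the local model for one fact per line, ready for fact_store ingestion."""
    return (
        "Extract durable facts about the user from this transcript.\n"
        "Output one short fact per line, no commentary.\n\n"
        f"Transcript:\n{transcript}"
    )

def parse_facts(raw: str) -> list[str]:
    """Split model output into clean, non-empty fact lines."""
    return [line.strip("- ").strip() for line in raw.splitlines() if line.strip()]

def derive_facts(transcript: str, model: str = "llama3") -> list[str]:
    """One background derivation pass; run from cron over new session transcripts."""
    payload = json.dumps({
        "model": model,
        "prompt": build_derivation_prompt(transcript),
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return parse_facts(json.load(resp)["response"])
```

Each returned fact would then be fed to the fact_store's existing ingestion path, keeping the whole derivation loop on local inference.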
## References
- Upstream: https://github.com/plastic-labs/honcho (SDK)
- Integration: plastic-labs/hermes-honcho (Hermes fork with integration)
- Honcho docs: https://docs.honcho.dev
- Self-hosting: https://github.com/plastic-labs/honcho#self-hosting