Successor Fork Specification
Parent: Hermes v2.0 Architecture — docs/hermes-v2.0-architecture.md
Epic: #421 — The Autogenesis Protocol
Author: Allegro
1. Purpose
The Successor Fork is the mechanism by which a Hermes v2.0 instance evaluates changes to its own architecture without risking the live runtime. It is not a subagent solving a user task. It is a sandboxed clone of the runtime that exists solely to answer the question:
"If I applied this architecture patch, would the result be better?"
2. Definitions
| Term | Definition |
|---|---|
| Parent | The live Hermes v2.0 runtime currently serving users. |
| Successor | A temporary, isolated fork of the Parent created for architectural evaluation. |
| Architecture Patch | A proposed change to one or more runtime components (loop, router, memory layer, etc.). |
| Evaluation Task | A bounded test or benchmark the Successor must run to validate the patch. |
| Verdict | The Successor's final judgment: APPROVE, REJECT, or NEEDS_REVISION. |
3. Successor Lifecycle
┌─────────┐     fork()      ┌───────────┐    apply patch     ┌─────────────┐
│ Parent  │ ──────────────> │ Successor │ ─────────────────> │   Patched   │
│ Runtime │                 │  (clone)  │                    │  Successor  │
└─────────┘                 └───────────┘                    └──────┬──────┘
     ▲                                                              │
     │                           verdict                            │
     │                      + artifact bundle                       │
     │<─────────────────────────────────────────────────────────────┘
3.1 Fork Phase
The Parent creates a Successor by:
- Copying the current state store to a temporary SQLite file.
- Copying the current tool registry into a temporary namespace.
- Cloning the conversation loop into an isolated asyncio event loop.
- Generating a temporary identity keypair (never persisted to Bitcoin).
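The state-store copy in the first step could be taken with SQLite's online backup API, which produces a consistent snapshot even while the source database is in use. This is a minimal sketch, assuming the Parent's state store is a single SQLite file; `fork_state_store` and the file names are hypothetical, not part of the Hermes codebase.

```python
from __future__ import annotations

import sqlite3
import tempfile
from pathlib import Path


def fork_state_store(live_db: Path) -> Path:
    """Snapshot the live store into a fresh temporary SQLite file.

    Hypothetical helper: assumes the Parent's state store is one SQLite file.
    """
    tmp_dir = Path(tempfile.mkdtemp(prefix="successor-"))
    copy_path = tmp_dir / "state.sqlite"
    # mode=ro opens the live store read-only, so this code path cannot write to it.
    src = sqlite3.connect(live_db.resolve().as_uri() + "?mode=ro", uri=True)
    dst = sqlite3.connect(copy_path)
    try:
        src.backup(dst)  # online backup: consistent snapshot of the source
    finally:
        dst.close()
        src.close()
    return copy_path
```

The Successor would then open only `copy_path`, so any schema migration in the Patch Phase touches the temporary file rather than the live store.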
3.2 Patch Phase
The Successor applies the Architecture Patch. This may involve:
- Replacing a Python module
- Adding a new schema migration
- Changing a configuration constant
- Swapping a tool implementation
Rule: The Successor has no network access except to localhost services required for the Evaluation Task.
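Before applying a patch, the Successor can fingerprint it so that the `patch_hash` reported in the Verdict refers to exactly the bytes that were evaluated. The sketch below assumes a patch can be represented as a mapping from relative file paths to new contents; that representation and the function name `patch_hash` are illustrative assumptions.

```python
from __future__ import annotations

import hashlib


def patch_hash(files: dict[str, bytes]) -> str:
    """Return a deterministic "sha256:<hex>" digest for a patch.

    `files` maps relative paths to new file contents; this representation of
    an Architecture Patch is an assumption made for the example.
    """
    digest = hashlib.sha256()
    for path in sorted(files):  # sort so dict insertion order does not matter
        data = files[path]
        # Length-prefix each field so ("ab", b"c") and ("a", b"bc") hash differently.
        for field in (path.encode("utf-8"), data):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return "sha256:" + digest.hexdigest()
```

Two nodes that receive the same patch files compute the same hash regardless of the order in which the files were listed.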
3.3 Evaluation Phase
The Successor runs the Evaluation Task. Examples:
- "Run the full test suite. Report pass rate and execution time."
- "Compress 100 sessions and measure memory usage vs. baseline."
- "Handle 1,000 concurrent tool calls and report latency p99."
Budget constraints:
- Max wall time: 1 hour
- Max API cost: $5.00 equivalent
- Max disk usage: 10 GB
- Max CPU percentage: 200% (2 cores)
If any budget is exceeded, the Successor auto-terminates and returns REJECT.
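The budget rule above can be sketched as a pair of plain data classes and a check function. The class and field names (`Budget`, `Usage`, `exceeded`, `enforce`) are assumptions for illustration; only the four limits and the REJECT outcome come from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Budget:
    """Limits from section 3.3; field names are assumptions."""
    max_wall_seconds: float = 3600.0    # 1 hour
    max_api_cost_usd: float = 5.00      # $5.00 equivalent
    max_disk_bytes: int = 10 * 1000**3  # 10 GB (decimal GB assumed)
    max_cpu_percent: float = 200.0      # 2 cores


@dataclass(frozen=True)
class Usage:
    """One sample of what the Successor has consumed so far (hypothetical)."""
    wall_seconds: float
    api_cost_usd: float
    disk_bytes: int
    cpu_percent: float


def exceeded(budget: Budget, usage: Usage) -> list[str]:
    """Return the name of every limit that the usage sample exceeds."""
    checks = {
        "wall_time": usage.wall_seconds > budget.max_wall_seconds,
        "api_cost": usage.api_cost_usd > budget.max_api_cost_usd,
        "disk": usage.disk_bytes > budget.max_disk_bytes,
        "cpu": usage.cpu_percent > budget.max_cpu_percent,
    }
    return [name for name, over in checks.items() if over]


def enforce(budget: Budget, usage: Usage) -> Optional[str]:
    """Return "REJECT" if any limit is exceeded, otherwise None (keep running)."""
    return "REJECT" if exceeded(budget, usage) else None
```

A supervisor could call `enforce` on a timer and terminate the Successor on the first "REJECT"; returning the list of exceeded limits from `exceeded` makes it easy to record the reason in the Verdict's caveats.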
3.4 Verdict Phase
The Successor produces a Verdict document:
{
  "verdict": "APPROVE",
  "confidence": 0.94,
  "patch_hash": "sha256:abc123...",
  "evaluation_summary": "Test suite passed (312/312). p99 latency improved 18%. Memory usage reduced 12%.",
  "caveats": ["One flaky test (test_gateway_reconnect) required 3 retries."],
  "artifacts": [
    {"type": "test_report", "path": "/tmp/successor/reports/tests.html"},
    {"type": "benchmark", "path": "/tmp/successor/reports/latency.json"}
  ]
}
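A Parent that consumes this document would likely want to validate it before acting on it. The following sketch mirrors the fields of the example above in a dataclass; the validation rules (confidence within 0 to 1, a `sha256:` prefix on the hash) are assumptions inferred from the example, not stated requirements.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field

ALLOWED_VERDICTS = {"APPROVE", "REJECT", "NEEDS_REVISION"}


@dataclass
class Verdict:
    verdict: str
    confidence: float
    patch_hash: str
    evaluation_summary: str
    caveats: list[str] = field(default_factory=list)
    artifacts: list[dict[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.verdict not in ALLOWED_VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        # Assumption: confidence is a probability-like score in [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        # Assumption: hashes are always written with a "sha256:" prefix.
        if not self.patch_hash.startswith("sha256:"):
            raise ValueError("patch_hash must start with 'sha256:'")


def parse_verdict(raw: str) -> Verdict:
    """Parse a Verdict JSON document; unknown keys raise TypeError."""
    return Verdict(**json.loads(raw))
```

Rejecting unknown keys and out-of-range values at parse time keeps a malformed Verdict from reaching the approval gates.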
3.5 Termination Phase
The Successor is destroyed. Its temporary state store is deleted. Its identity keypair is zeroed. Only the Verdict document and artifact bundle survive.
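One way to picture the cleanup is a small workspace object whose `destroy` method is safe to call more than once. `SuccessorWorkspace` is a hypothetical class, and the 32-byte random value stands in for real key material. Note that Python cannot guarantee that no other copy of the key bytes exists in memory, so the zeroing shown here is best-effort.

```python
from __future__ import annotations

import os
import shutil
import tempfile
from pathlib import Path


class SuccessorWorkspace:
    """Hypothetical holder for a Successor's temporary state."""

    def __init__(self) -> None:
        self.root = Path(tempfile.mkdtemp(prefix="successor-"))
        # Placeholder for the temporary identity key; mutable so it can be wiped.
        self.private_key = bytearray(os.urandom(32))
        self.destroyed = False

    def destroy(self) -> None:
        """Wipe the key and delete the temporary directory. Idempotent."""
        if self.destroyed:
            return
        for i in range(len(self.private_key)):
            self.private_key[i] = 0  # overwrite in place (best-effort)
        shutil.rmtree(self.root, ignore_errors=True)
        self.destroyed = True
```

The Verdict document and artifact bundle would need to be copied out of `root` before `destroy` runs, since everything under that directory is removed.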
4. Security Boundaries
| Resource | Parent Access | Successor Access |
|---|---|---|
| Live state store | Read/Write | Read-only copy |
| Live tool registry | Modify | Temporary copy |
| Network (proprietary APIs) | Allowed | Blocked |
| Network (localhost tests) | Allowed | Allowed |
| Bitcoin identity key | Full | None |
| User messages | Receives | None |
| File system (outside /tmp) | Full | Read-only |
Enforcement: The Successor runs inside a lightweight container (Docker or systemd-nspawn) with seccomp-bpf rules.
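If Docker is the chosen container runtime, the table above maps onto a handful of standard `docker run` flags. The sketch below only builds the argument list; the image name, seccomp profile path, and entrypoint are placeholders. With `--network none` the container gets only a loopback interface, so any localhost service the Evaluation Task needs would have to run inside the same container, which is an assumption about deployment that the specification does not settle.

```python
from __future__ import annotations


def docker_command(image: str, seccomp_profile: str, entrypoint: list[str]) -> list[str]:
    """Build a `docker run` argv for a sandboxed Successor (illustrative only)."""
    return [
        "docker", "run", "--rm",
        "--network", "none",   # no external network; loopback only
        "--read-only",         # root filesystem mounted read-only
        "--tmpfs", "/tmp",     # writable scratch space under /tmp
        "--cpus", "2",         # corresponds to the 200% CPU budget
        "--security-opt", f"seccomp={seccomp_profile}",
        image,
        *entrypoint,
    ]
```

Passing the result to `subprocess.run` as a list, rather than as a shell string, avoids quoting problems in the profile path or entrypoint arguments.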
5. Human and Mesh Approval Gates
A positive Verdict does not auto-deploy the patch. The patch must pass one additional gate, which depends on the rollout phase:
Phase I–III: Human Gate
Alexander (or a designated human steward) must approve the patch.
Phase IV+: Mesh Consensus Gate
The patch is gossiped to a quorum of mesh nodes. Each node may spawn its own Successor to independently verify the patch. If ≥2/3 of responding nodes return APPROVE, the patch is applied automatically.
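The two-thirds rule can be checked with integer arithmetic, which avoids floating-point rounding at the boundary. In this sketch `consensus_reached` is a hypothetical name, and treating an empty response set as "no consensus" is an assumption; the specification does not say how many nodes must respond.

```python
from __future__ import annotations


def consensus_reached(responses: list[str]) -> bool:
    """True when at least 2/3 of responding nodes returned APPROVE."""
    if not responses:
        return False  # assumption: no responses means no consensus
    approvals = sum(1 for r in responses if r == "APPROVE")
    # approvals / len(responses) >= 2/3, rearranged to stay in integers
    return 3 * approvals >= 2 * len(responses)
```

For example, 2 approvals out of 3 responses meets the threshold exactly, while 3 out of 5 does not.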
6. The Autogenesis Loop
In Phase V, the Successor Fork becomes the engine of continuous self-improvement:
- Bottleneck detector monitors the Parent runtime.
- Patch generator drafts an architecture patch.
- Successor fork evaluates the patch.
- Verifier agent (a second, independent Successor) audits the first Successor's methodology.
- If both approve, the patch is gossiped for mesh consensus.
- If consensus passes, the Parent applies the patch during a scheduled maintenance window.
- The new Parent now has a new SOUL.md hash, which is signed and attested.
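The sequence above can be read as a pipeline that stops at the first stage that fails. The sketch below models each stage as a caller-supplied function; every name in it is hypothetical, and the final signing and attestation of the new SOUL.md hash is left out because the specification does not describe how it is done.

```python
from __future__ import annotations

from typing import Any, Callable, Optional


def autogenesis_cycle(
    detect: Callable[[], Optional[Any]],  # bottleneck detector
    draft: Callable[[Any], Any],          # patch generator
    evaluate: Callable[[Any], str],       # first Successor, returns a verdict string
    audit: Callable[[Any], str],          # verifier Successor, returns a verdict string
    consensus: Callable[[Any], bool],     # mesh consensus gate
    apply: Callable[[Any], None],         # applied in a maintenance window
) -> str:
    """Run one pass of the loop and report the stage at which it ended."""
    bottleneck = detect()
    if bottleneck is None:
        return "no_bottleneck"
    patch = draft(bottleneck)
    if evaluate(patch) != "APPROVE":
        return "rejected_by_successor"
    if audit(patch) != "APPROVE":
        return "rejected_by_verifier"
    if not consensus(patch):
        return "rejected_by_mesh"
    apply(patch)
    return "applied"
```

Returning a stage name rather than a boolean gives the caller enough information to log why a given cycle did not result in a deployment.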
7. Interface Definition
from __future__ import annotations  # the annotated types are defined elsewhere


class SuccessorFork:
    def __init__(self, parent_runtime: HermesRuntime, patch: ArchitecturePatch):
        ...

    async def evaluate(self, task: EvaluationTask, budget: Budget) -> Verdict:
        """
        Spawn the successor, apply the patch, run the evaluation,
        and return a Verdict. Never modifies the parent.
        """
        ...

    def destroy(self) -> None:
        """Clean up all temporary state. Idempotent."""
        ...
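A caller would typically pair `evaluate` with `destroy` in a `try`/`finally` and bound the call by the wall-time budget. The sketch below shows that calling pattern with a stand-in class, `FakeSuccessorFork`, that only imitates the method names of the interface; the dictionary it returns is a placeholder for a real Verdict, and `run_with_wall_limit` is a hypothetical helper.

```python
from __future__ import annotations

import asyncio
from typing import Any


class FakeSuccessorFork:
    """Stand-in with the same method names as SuccessorFork, for illustration only."""

    def __init__(self, delay: float) -> None:
        self.delay = delay
        self.destroyed = False

    async def evaluate(self, task: Any, budget: Any) -> dict:
        await asyncio.sleep(self.delay)  # pretend to run the Evaluation Task
        return {"verdict": "APPROVE"}

    def destroy(self) -> None:
        self.destroyed = True


async def run_with_wall_limit(fork: Any, task: Any, budget: Any, max_wall_seconds: float) -> dict:
    """Evaluate under a wall-time limit and always clean up afterwards."""
    try:
        return await asyncio.wait_for(fork.evaluate(task, budget), timeout=max_wall_seconds)
    except asyncio.TimeoutError:
        # Section 3.3: exceeding a budget yields REJECT.
        return {"verdict": "REJECT", "evaluation_summary": "wall-time budget exceeded"}
    finally:
        fork.destroy()  # safe because destroy() is specified as idempotent
```

Because `destroy` is specified as idempotent, calling it in `finally` is safe even if `evaluate` already cleaned up after itself.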
8. Acceptance Criteria
- Successor can be spawned from a running Hermes v2.0 instance in <30 seconds.
- Successor cannot modify Parent state, filesystem, or identity.
- Successor returns a structured Verdict with confidence score and artifacts.
- Budget enforcement auto-terminates runaway Successors.
- At least one demo patch (e.g., "swap context compressor algorithm") is evaluated end-to-end.
The Successor Fork is the recursive engine. It is how Hermes learns to outgrow itself.