# Successor Fork Specification
**Parent:** Hermes v2.0 Architecture — `docs/hermes-v2.0-architecture.md`
**Epic:** #421 — The Autogenesis Protocol
**Author:** Allegro

---
## 1. Purpose
The Successor Fork is the mechanism by which a Hermes v2.0 instance evaluates changes to its own architecture without risking the live runtime. It is not a subagent solving a user task. It is a **sandboxed clone of the runtime** that exists solely to answer the question:
> *"If I applied this architecture patch, would the result be better?"*
---
## 2. Definitions
| Term | Definition |
|------|------------|
| **Parent** | The live Hermes v2.0 runtime currently serving users. |
| **Successor** | A temporary, isolated fork of the Parent created for architectural evaluation. |
| **Architecture Patch** | A proposed change to one or more runtime components (loop, router, memory layer, etc.). |
| **Evaluation Task** | A bounded test or benchmark the Successor must run to validate the patch. |
| **Verdict** | The Successor's final judgment: `APPROVE`, `REJECT`, or `NEEDS_REVISION`. |
---
## 3. Successor Lifecycle
```
┌─────────┐     fork()      ┌───────────┐    apply patch     ┌─────────────┐
│ Parent  │ ──────────────> │ Successor │ ─────────────────> │   Patched   │
│ Runtime │                 │  (clone)  │                    │  Successor  │
└─────────┘                 └───────────┘                    └──────┬──────┘
     ▲                                                              │
     │                           verdict                            │
     │                      + artifact bundle                       │
     │<─────────────────────────────────────────────────────────────┘
```
### 3.1 Fork Phase
The Parent creates a Successor by:
1. Copying the current **state store** to a temporary SQLite file.
2. Copying the current **tool registry** into a temporary namespace.
3. Cloning the **conversation loop** into an isolated `asyncio` event loop.
4. Generating a **temporary identity keypair** (never persisted to Bitcoin).
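Step 1 of the fork can be sketched with the standard library's SQLite backup API. This is a minimal sketch, assuming the live state store is a single SQLite file; the function name `snapshot_state_store` and the file name `successor-state.sqlite` are hypothetical and do not come from the Hermes codebase.

```python
import sqlite3
from pathlib import Path


def snapshot_state_store(live_db: Path, workdir: Path) -> Path:
    """Copy a live SQLite database into workdir and return the copy's path."""
    target = workdir / "successor-state.sqlite"  # hypothetical file name
    # Open the live store read-only so the snapshot step cannot write to it.
    src = sqlite3.connect(live_db.resolve().as_uri() + "?mode=ro", uri=True)
    dst = sqlite3.connect(target)
    try:
        # Connection.backup() copies a consistent snapshot of the source,
        # even if other connections are writing to it at the same time.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return target
```

The copy is an ordinary writable database, so the Successor can run schema migrations against it without touching the Parent's file.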
### 3.2 Patch Phase
The Successor applies the Architecture Patch. This may involve:
- Replacing a Python module
- Adding a new schema migration
- Changing a configuration constant
- Swapping a tool implementation

**Rule:** The Successor has no network access except to localhost services required for the Evaluation Task.
### 3.3 Evaluation Phase
The Successor runs the Evaluation Task. Examples:
- "Run the full test suite. Report pass rate and execution time."
- "Compress 100 sessions and measure memory usage vs. baseline."
- "Handle 1,000 concurrent tool calls and report latency p99."

**Budget constraints:**
- Max wall time: 1 hour
- Max API cost: $5.00 equivalent
- Max disk usage: 10 GB
- Max CPU percentage: 200% (2 cores)

If any budget is exceeded, the Successor auto-terminates and returns `REJECT`.
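The wall-time rule can be sketched with `asyncio.wait_for`. This is a minimal sketch under stated assumptions: the `Budget` field names and the helper `run_within_wall_time` are hypothetical, "10 GB" is read as decimal gigabytes, and only the wall-time limit is enforced here. Cost, disk, and CPU limits would need separate accounting.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable


@dataclass(frozen=True)
class Budget:
    # Defaults mirror the limits listed above; field names are assumptions.
    max_wall_seconds: float = 3600.0
    max_api_cost_usd: float = 5.00
    max_disk_bytes: int = 10 * 10**9  # "10 GB" read as decimal gigabytes
    max_cpu_percent: int = 200


async def run_within_wall_time(task: Awaitable[str], budget: Budget) -> str:
    """Return the task's verdict, or "REJECT" if the wall-time budget runs out."""
    try:
        return await asyncio.wait_for(task, timeout=budget.max_wall_seconds)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the evaluation at this point.
        return "REJECT"
```

Mapping a timeout to `REJECT` rather than raising keeps the caller's contract simple: every evaluation ends in a verdict string.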
### 3.4 Verdict Phase
The Successor produces a Verdict document:
```json
{
  "verdict": "APPROVE",
  "confidence": 0.94,
  "patch_hash": "sha256:abc123...",
  "evaluation_summary": "Test suite passed (312/312). p99 latency improved 18%. Memory usage reduced 12%.",
  "caveats": ["One flaky test (test_gateway_reconnect) required 3 retries."],
  "artifacts": [
    {"type": "test_report", "path": "/tmp/successor/reports/tests.html"},
    {"type": "benchmark", "path": "/tmp/successor/reports/latency.json"}
  ]
}
```
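One way to build and validate such a document in Python is a small dataclass. This is a sketch, not the project's actual type: the field names follow the JSON above, but the validation rules (a confidence between 0 and 1, a fixed set of verdict strings) and the helper `patch_digest` are assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field

VERDICTS = {"APPROVE", "REJECT", "NEEDS_REVISION"}


def patch_digest(patch_bytes: bytes) -> str:
    """Hash patch contents into the "sha256:<hex>" form shown above."""
    return "sha256:" + hashlib.sha256(patch_bytes).hexdigest()


@dataclass
class Verdict:
    verdict: str
    confidence: float
    patch_hash: str
    evaluation_summary: str
    caveats: list[str] = field(default_factory=list)
    artifacts: list[dict[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject malformed verdicts early, before they reach the Parent.
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)
```

Validating in `__post_init__` means a Successor cannot hand back a verdict string outside the three defined in Section 2.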
### 3.5 Termination Phase
The Successor is destroyed. Its temporary state store is deleted. Its identity keypair is zeroed. Only the Verdict document and artifact bundle survive.
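A minimal cleanup sketch, assuming the Successor's temporary files live under one working directory, its private key is held in a mutable `bytearray`, and the Verdict and artifacts have already been copied out of that directory. The function name `destroy_successor` is hypothetical.

```python
import shutil
from pathlib import Path


def destroy_successor(workdir: Path, private_key: bytearray) -> None:
    """Delete the Successor's working directory and overwrite its key in place."""
    # Overwrite every byte of the mutable buffer. Python may hold other
    # copies of the key elsewhere in memory, so this is best effort only.
    private_key[:] = bytes(len(private_key))
    # ignore_errors makes a repeat call on a missing directory a no-op,
    # which keeps the function idempotent.
    shutil.rmtree(workdir, ignore_errors=True)
```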
---
## 4. Security Boundaries
| Resource | Parent Access | Successor Access |
|----------|---------------|------------------|
| Live state store | Read/Write | Read-only copy |
| Live tool registry | Modify | Temporary copy |
| Network (proprietary APIs) | Allowed | Blocked |
| Network (localhost tests) | Allowed | Allowed |
| Bitcoin identity key | Full | None |
| User messages | Receives | None |
| File system (outside /tmp) | Full | Read-only |

**Enforcement:** The Successor runs inside a lightweight container (Docker or `systemd-nspawn`) with seccomp-bpf rules.
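For the Docker option, the boundaries in the table map onto standard `docker run` flags. The sketch below only builds the argument list; the function name, image name, and profile path are hypothetical, and it assumes any localhost services needed by the Evaluation Task run inside the same container, since `--network none` leaves only the loopback interface.

```python
def build_docker_argv(image: str, seccomp_profile: str, command: list[str]) -> list[str]:
    """Build a `docker run` command line for a locked-down Successor container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",   # loopback only: no route to external hosts
        "--read-only",         # root filesystem is mounted read-only
        "--tmpfs", "/tmp",     # writable scratch space at /tmp
        "--cpus", "2",         # matches the 200% CPU budget
        "--security-opt", f"seccomp={seccomp_profile}",
        image, *command,
    ]
```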
---
## 5. Human and Mesh Approval Gates
A positive Verdict does **not** auto-deploy the patch. The patch must pass one additional gate:
### Phase I–III: Human Gate
Alexander (or designated human steward) must approve the patch.
### Phase IV+: Mesh Consensus Gate
The patch is gossiped to a quorum of mesh nodes. Each node may spawn its own Successor to independently verify the patch. If ≥2/3 of responding nodes return `APPROVE`, the patch is applied automatically.
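The two-thirds rule can be checked with integer arithmetic. This is a minimal sketch, assuming each responding node's verdict arrives as a string; the name `mesh_approves` and the choice to treat an empty response set as no consensus are assumptions, not part of this spec.

```python
def mesh_approves(verdicts: list[str]) -> bool:
    """True when at least two thirds of responding nodes returned APPROVE."""
    if not verdicts:
        return False  # assumption: no responses means no consensus
    approvals = sum(1 for v in verdicts if v == "APPROVE")
    # Integer form of approvals / total >= 2/3, which avoids float rounding.
    return 3 * approvals >= 2 * len(verdicts)
```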
---
## 6. The Autogenesis Loop
In Phase V, the Successor Fork becomes the engine of continuous self-improvement:
1. **Bottleneck detector** monitors the Parent runtime.
2. **Patch generator** drafts an architecture patch.
3. **Successor fork** evaluates the patch.
4. **Verifier agent** (a second, independent Successor) audits the first Successor's methodology.
5. If both approve, the patch is gossiped for mesh consensus.
6. If consensus passes, the Parent applies the patch during a scheduled maintenance window.
7. The new Parent now has a new SOUL.md hash, which is signed and attested.
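Steps 1 through 6 can be sketched as a single function with each stage injected as a callable. All names here are hypothetical, the return strings are illustrative labels rather than defined states, and step 7 (signing the new SOUL.md hash) is left out.

```python
from typing import Callable, Optional


def autogenesis_step(
    detect: Callable[[], Optional[str]],
    draft_patch: Callable[[str], bytes],
    evaluate: Callable[[bytes], str],
    audit: Callable[[bytes, str], str],
    mesh_consensus: Callable[[bytes], bool],
    schedule_apply: Callable[[bytes], None],
) -> str:
    """Run one pass of the loop and report where it stopped."""
    bottleneck = detect()
    if bottleneck is None:
        return "no_bottleneck"
    patch = draft_patch(bottleneck)
    first = evaluate(patch)
    # The verifier sees both the patch and the first Successor's verdict.
    if first != "APPROVE" or audit(patch, first) != "APPROVE":
        return "rejected_locally"
    if not mesh_consensus(patch):
        return "rejected_by_mesh"
    schedule_apply(patch)  # applied later, in the maintenance window
    return "scheduled"
```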
---
## 7. Interface Definition
```python
# Postpone evaluation of annotations so the runtime types named below
# do not have to be imported for this interface sketch to load.
from __future__ import annotations


class SuccessorFork:
    def __init__(self, parent_runtime: HermesRuntime, patch: ArchitecturePatch):
        ...

    async def evaluate(self, task: EvaluationTask, budget: Budget) -> Verdict:
        """
        Spawn the successor, apply the patch, run the evaluation,
        and return a Verdict. Never modifies the parent.
        """
        ...

    def destroy(self):
        """Clean up all temporary state. Idempotent."""
        ...
```
---
## 8. Acceptance Criteria
- [ ] Successor can be spawned from a running Hermes v2.0 instance in <30 seconds.
- [ ] Successor cannot modify Parent state, filesystem, or identity.
- [ ] Successor returns a structured Verdict with confidence score and artifacts.
- [ ] Budget enforcement auto-terminates runaway Successors.
- [ ] At least one demo patch (e.g., "swap context compressor algorithm") is evaluated end-to-end.
---
*The Successor Fork is the recursive engine. It is how Hermes learns to outgrow itself.*