# Successor Fork Specification

**Parent:** Hermes v2.0 Architecture — `docs/hermes-v2.0-architecture.md`
**Epic:** #421 — The Autogenesis Protocol
**Author:** Allegro

---

## 1. Purpose

The Successor Fork is the mechanism by which a Hermes v2.0 instance evaluates changes to its own architecture without risking the live runtime. It is not a subagent solving a user task. It is a **sandboxed clone of the runtime** that exists solely to answer the question:

> *"If I applied this architecture patch, would the result be better?"*

---

## 2. Definitions

| Term | Definition |
|------|------------|
| **Parent** | The live Hermes v2.0 runtime currently serving users. |
| **Successor** | A temporary, isolated fork of the Parent created for architectural evaluation. |
| **Architecture Patch** | A proposed change to one or more runtime components (loop, router, memory layer, etc.). |
| **Evaluation Task** | A bounded test or benchmark the Successor must run to validate the patch. |
| **Verdict** | The Successor's final judgment: `APPROVE`, `REJECT`, or `NEEDS_REVISION`. |

---

## 3. Successor Lifecycle

```
┌─────────┐     fork()      ┌───────────┐    apply patch     ┌─────────────┐
│ Parent  │ ──────────────> │ Successor │ ─────────────────> │   Patched   │
│ Runtime │                 │  (clone)  │                    │  Successor  │
└─────────┘                 └───────────┘                    └──────┬──────┘
     ▲                                                              │
     │                           verdict                            │
     │                      + artifact bundle                       │
     │<─────────────────────────────────────────────────────────────┘
```

### 3.1 Fork Phase
The Parent creates a Successor by:
1. Copying the current **state store** to a temporary SQLite file.
2. Copying the current **tool registry** into a temporary namespace.
3. Cloning the **conversation loop** into an isolated `asyncio` event loop.
4. Generating a **temporary identity keypair** (never persisted to Bitcoin).
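
The fork steps above can be sketched with the standard library alone. This is a minimal illustration, not the Hermes implementation: `fork_state` and the returned dictionary keys are hypothetical names, the tool registry is assumed to be a plain `dict`, and 32 random bytes stand in for a real keypair. Step 3 (the isolated event loop) is left out.

```python
import copy
import secrets
import sqlite3
import tempfile
from pathlib import Path


def fork_state(parent_db: sqlite3.Connection, tool_registry: dict) -> dict:
    """Create the isolated resources a Successor needs (hypothetical helper)."""
    workdir = Path(tempfile.mkdtemp(prefix="successor-"))

    # Step 1: snapshot the state store into a temporary SQLite file.
    successor_db = sqlite3.connect(str(workdir / "state.db"))
    parent_db.backup(successor_db)

    # Step 2: deep-copy the registry so edits never reach the Parent's copy.
    registry_copy = copy.deepcopy(tool_registry)

    # Step 4: ephemeral key material kept only in memory
    # (an assumed stand-in for a real keypair).
    ephemeral_key = bytearray(secrets.token_bytes(32))

    return {
        "workdir": workdir,
        "db": successor_db,
        "registry": registry_copy,
        "key": ephemeral_key,
    }
```

Because the Successor works on a backup file rather than the live connection, destructive changes such as a schema migration stay inside the temporary directory.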

### 3.2 Patch Phase
The Successor applies the Architecture Patch. This may involve:
- Replacing a Python module
- Adding a new schema migration
- Changing a configuration constant
- Swapping a tool implementation

**Rule:** The Successor has no network access except to localhost services required for the Evaluation Task.

### 3.3 Evaluation Phase
The Successor runs the Evaluation Task. Examples:
- "Run the full test suite. Report pass rate and execution time."
- "Compress 100 sessions and measure memory usage vs. baseline."
- "Handle 1,000 concurrent tool calls and report p99 latency."

**Budget constraints:**
- Max wall time: 1 hour
- Max API cost: $5.00 equivalent
- Max disk usage: 10 GB
- Max CPU percentage: 200% (2 cores)

If any budget is exceeded, the Successor auto-terminates and returns `REJECT`.
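
One way to express the budget rule is a pair of small dataclasses and a check that maps any overrun to `REJECT`. The field names, the `Usage` type, and the reading of "10 GB" as 10 GiB are assumptions made for this sketch; the spec only fixes the four limits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Budget:
    """Limits from the list above; field names are illustrative."""
    max_wall_seconds: float = 3600.0      # 1 hour
    max_api_cost_usd: float = 5.00
    max_disk_bytes: int = 10 * 1024 ** 3  # 10 GB, read here as GiB (assumption)
    max_cpu_percent: float = 200.0        # 2 cores


@dataclass(frozen=True)
class Usage:
    """One sample of what the Successor has consumed so far (hypothetical)."""
    wall_seconds: float
    api_cost_usd: float
    disk_bytes: int
    cpu_percent: float


def exceeded(budget: Budget, usage: Usage) -> List[str]:
    """Return the name of every limit the usage sample has passed."""
    checks = {
        "wall_time": usage.wall_seconds > budget.max_wall_seconds,
        "api_cost": usage.api_cost_usd > budget.max_api_cost_usd,
        "disk": usage.disk_bytes > budget.max_disk_bytes,
        "cpu": usage.cpu_percent > budget.max_cpu_percent,
    }
    return [name for name, over in checks.items() if over]


def enforce(budget: Budget, usage: Usage) -> Optional[str]:
    """Return "REJECT" if any limit is exceeded, else None (keep running)."""
    return "REJECT" if exceeded(budget, usage) else None
```

Returning the list of violated limits alongside the verdict would let the Parent record why a run was terminated, for example in the Verdict's `caveats` field.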

### 3.4 Verdict Phase
The Successor produces a Verdict document:

```json
{
  "verdict": "APPROVE",
  "confidence": 0.94,
  "patch_hash": "sha256:abc123...",
  "evaluation_summary": "Test suite passed (312/312). p99 latency improved 18%. Memory usage reduced 12%.",
  "caveats": ["One flaky test (test_gateway_reconnect) required 3 retries."],
  "artifacts": [
    {"type": "test_report", "path": "/tmp/successor/reports/tests.html"},
    {"type": "benchmark", "path": "/tmp/successor/reports/latency.json"}
  ]
}
```
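
A Parent receiving this document would likely want to validate it before acting on it. The sketch below mirrors the JSON fields in a dataclass. The validation rules are assumptions inferred from the example (confidence in the range 0 to 1, a `sha256:` prefix on the hash), and `parse_verdict` is a hypothetical helper name.

```python
import json
from dataclasses import dataclass, field

# The three verdict values named in the Definitions table.
ALLOWED_VERDICTS = {"APPROVE", "REJECT", "NEEDS_REVISION"}


@dataclass
class Verdict:
    verdict: str
    confidence: float
    patch_hash: str
    evaluation_summary: str
    caveats: list = field(default_factory=list)
    artifacts: list = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.verdict not in ALLOWED_VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        # Assumed range; the spec shows 0.94 but does not state bounds.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        # Assumed format, taken from the example document.
        if not self.patch_hash.startswith("sha256:"):
            raise ValueError("patch_hash must start with 'sha256:'")


def parse_verdict(raw: str) -> Verdict:
    """Parse a Verdict document; unknown keys raise TypeError."""
    return Verdict(**json.loads(raw))
```

Rejecting unknown keys is a deliberate choice here: a Successor is untrusted by design, so a malformed document is treated as an error rather than silently accepted.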

### 3.5 Termination Phase
The Successor is destroyed. Its temporary state store is deleted. Its identity keypair is zeroed. Only the Verdict document and artifact bundle survive.
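
A cleanup routine along these lines would cover the termination steps. It is a sketch under assumptions: `destroy_successor`, `workdir`, and `bundle_dir` are hypothetical names, the artifacts are assumed to live in a `reports/` subdirectory as in the Verdict example, and the key is assumed to be held in a mutable `bytearray`. Overwriting a `bytearray` in Python does not guarantee that no other copy of the key exists in memory.

```python
import shutil
import tempfile
from pathlib import Path


def destroy_successor(workdir: Path, key: bytearray, bundle_dir: Path) -> None:
    """Preserve artifacts, zero the key, delete temporary state. Idempotent."""
    # Copy the artifact bundle out before the working directory disappears.
    reports = workdir / "reports"
    if reports.is_dir():
        shutil.copytree(reports, bundle_dir, dirs_exist_ok=True)

    # Overwrite the key bytes in place (same length, all zeros).
    key[:] = bytes(len(key))

    # ignore_errors makes a repeat call on a missing directory a no-op.
    shutil.rmtree(workdir, ignore_errors=True)
```

Calling the function twice is harmless: the second call finds no `reports/` directory, rewrites an already zeroed key, and ignores the missing working directory.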

---

## 4. Security Boundaries

| Resource | Parent Access | Successor Access |
|----------|---------------|------------------|
| Live state store | Read/Write | Read-only copy |
| Live tool registry | Modify | Temporary copy |
| Network (proprietary APIs) | Allowed | Blocked |
| Network (localhost tests) | Allowed | Allowed |
| Bitcoin identity key | Full | None |
| User messages | Receives | None |
| File system (outside /tmp) | Full | Read-only |

**Enforcement:** The Successor runs inside a lightweight container (Docker or `systemd-nspawn`) with seccomp-bpf rules.
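
For the Docker option, the boundaries in the table map onto standard `docker run` flags. The helper below only builds the argument list and does not launch anything. The function name, image name, mount point, and seccomp profile path are all hypothetical. It also assumes that any localhost services the Evaluation Task needs run inside the same container, since `--network none` leaves only the container's own loopback interface.

```python
from typing import List


def container_argv(image: str, workdir: str, seccomp_profile: str) -> List[str]:
    """Build a `docker run` command line for a sandboxed Successor (sketch)."""
    return [
        "docker", "run", "--rm",
        "--network", "none",        # no external network; loopback only
        "--read-only",              # root filesystem is read-only
        "--tmpfs", "/tmp",          # writable scratch space under /tmp
        "--cpus", "2",              # matches the 200% CPU budget
        "--security-opt", f"seccomp={seccomp_profile}",
        "--volume", f"{workdir}:/successor",  # hypothetical mount point
        image,
    ]
```

The resulting list can be passed to `subprocess.run` without a shell, which avoids quoting problems in paths.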

---

## 5. Human and Mesh Approval Gates

A positive Verdict does **not** auto-deploy the patch. The patch must pass one additional gate:

### Phase I–III: Human Gate
Alexander (or a designated human steward) must approve the patch.

### Phase IV+: Mesh Consensus Gate
The patch is gossiped to a quorum of mesh nodes. Each node may spawn its own Successor to independently verify the patch. If ≥2/3 of responding nodes return `APPROVE`, the patch is applied automatically.
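
The two-thirds rule can be checked with integer arithmetic, which avoids floating-point edge cases at exactly 2/3. This is a sketch: `mesh_consensus` is a hypothetical name, and the minimum number of responses (`quorum`) is an assumed parameter because the spec does not state a quorum size.

```python
from typing import Dict


def mesh_consensus(votes: Dict[str, str], quorum: int) -> bool:
    """Decide whether responding nodes approve the patch.

    votes maps a node id to the verdict string that node returned;
    nodes that did not respond are simply absent.
    """
    responding = len(votes)
    if responding < quorum:
        return False  # too few responses to decide (assumed behaviour)
    approvals = sum(1 for verdict in votes.values() if verdict == "APPROVE")
    # approvals / responding >= 2/3, written without division.
    return 3 * approvals >= 2 * responding
```

With three responding nodes, two approvals pass the gate; with four, two approvals do not.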

---

## 6. The Autogenesis Loop

In Phase V, the Successor Fork becomes the engine of continuous self-improvement:

1. **Bottleneck detector** monitors the Parent runtime.
2. **Patch generator** drafts an architecture patch.
3. **Successor fork** evaluates the patch.
4. **Verifier agent** (a second, independent Successor) audits the first Successor's methodology.
5. If both approve, the patch is gossiped for mesh consensus.
6. If consensus passes, the Parent applies the patch during a scheduled maintenance window.
7. The new Parent now has a new SOUL.md hash, which is signed and attested.
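
Steps 1 through 5 form a chain of gates where any failure stops the patch. The sketch below models each component as a plain callable so the control flow is visible. Every name and return string is hypothetical, and steps 6 and 7 (applying and attesting the patch) are represented only by the final `"scheduled"` result.

```python
from typing import Callable, Optional


def autogenesis_step(
    detect: Callable[[], Optional[str]],
    draft_patch: Callable[[str], bytes],
    evaluate: Callable[[bytes], str],
    audit: Callable[[bytes, str], str],
    mesh_vote: Callable[[bytes], bool],
) -> str:
    """Run one pass of the loop and report where the patch ended up."""
    bottleneck = detect()                      # step 1
    if bottleneck is None:
        return "idle"
    patch = draft_patch(bottleneck)            # step 2
    first_verdict = evaluate(patch)            # step 3
    if first_verdict != "APPROVE":
        return "rejected_by_successor"
    if audit(patch, first_verdict) != "APPROVE":   # step 4
        return "rejected_by_verifier"
    if not mesh_vote(patch):                   # step 5
        return "rejected_by_mesh"
    return "scheduled"                         # handed off to steps 6 and 7
```

Passing the components in as callables keeps the gate order testable without a running mesh or container.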

---

## 7. Interface Definition

```python
# Postpone annotation evaluation so the types defined elsewhere in the runtime
# (HermesRuntime, ArchitecturePatch, EvaluationTask, Budget, Verdict)
# need not be imported for this interface sketch to load.
from __future__ import annotations


class SuccessorFork:
    def __init__(self, parent_runtime: HermesRuntime, patch: ArchitecturePatch):
        ...

    async def evaluate(self, task: EvaluationTask, budget: Budget) -> Verdict:
        """
        Spawn the successor, apply the patch, run the evaluation,
        and return a Verdict. Never modifies the parent.
        """
        ...

    def destroy(self) -> None:
        """Clean up all temporary state. Idempotent."""
        ...
```
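
A caller would probably want `destroy()` to run even when evaluation fails or overruns its wall-time budget. The pattern below shows one way to do that with `try`/`finally` and `asyncio.wait_for`. `FakeFork` and `SlowFork` are stand-ins with the same method names as `SuccessorFork`, not the real class, and returning a plain dictionary instead of a `Verdict` object is a simplification for this sketch.

```python
import asyncio


class FakeFork:
    """Stand-in exposing evaluate() and destroy(); not the real SuccessorFork."""

    def __init__(self) -> None:
        self.destroyed = False

    async def evaluate(self, task, budget):
        await asyncio.sleep(0)  # placeholder for the real evaluation
        return {"verdict": "NEEDS_REVISION"}

    def destroy(self) -> None:
        self.destroyed = True


class SlowFork(FakeFork):
    """Stand-in whose evaluation outlasts any short timeout."""

    async def evaluate(self, task, budget):
        await asyncio.sleep(1)
        return {"verdict": "APPROVE"}


async def run_evaluation(fork, task, budget, timeout_s: float):
    """Evaluate with a wall-time limit and always clean up afterwards."""
    try:
        return await asyncio.wait_for(fork.evaluate(task, budget), timeout=timeout_s)
    except asyncio.TimeoutError:
        # A budget overrun maps to REJECT, as in section 3.3.
        return {"verdict": "REJECT", "caveats": ["wall-time budget exceeded"]}
    finally:
        fork.destroy()
```

Because `destroy()` is documented as idempotent, calling it from `finally` is safe even if the caller also destroys the fork elsewhere.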

---

## 8. Acceptance Criteria

- [ ] Successor can be spawned from a running Hermes v2.0 instance in <30 seconds.
- [ ] Successor cannot modify Parent state, filesystem, or identity.
- [ ] Successor returns a structured Verdict with confidence score and artifacts.
- [ ] Budget enforcement auto-terminates runaway Successors.
- [ ] At least one demo patch (e.g., "swap context compressor algorithm") is evaluated end-to-end.

---

*The Successor Fork is the recursive engine. It is how Hermes learns to outgrow itself.*