Addresses timmy-config#268.
- Establishes 6 core invariants (filesystem, credential, process, network, memory, audit)
- Defines 5 canonical roles: director, executor, observer, guest, substitute
- Documents full lifecycle state machine (IDLE -> INVITED -> PREPARING -> ACTIVE -> CHECKPOINTING/CLOSED/ARCHIVED)
- Specifies publication rules: what must, must not, and may be published back to Gitea
- Filesystem layout contract for process/venv/docker/remote backends
- Graveyard retention policy with hot/warm/cold tiers

Cross-references: #269 #270 #271 #272 #273 #274 #245
Lazarus Cell Specification v1.0
Canonical epic: Timmy_Foundation/timmy-config#267
Author: Ezra (architect)
Date: 2026-04-06
Status: Draft — open for burn-down by #269 #270 #271 #272 #273 #274
1. Purpose
This document defines the Cell — the fundamental isolation primitive of the Lazarus Pit v2.0. Every downstream implementation (isolation layer, invitation protocol, backend abstraction, teaming model, verification suite, and operator surface) must conform to the invariants, roles, lifecycle, and publication rules defined here.
2. Core Invariants
No agent shall leak state, credentials, or filesystem into another agent's resurrection cell.
2.1 Cell Invariant Definitions
| Invariant | Meaning | Enforcement |
|---|---|---|
| I1 — Filesystem Containment | A cell may only read/write paths under its assigned CELL_HOME. No traversal into host ~/.hermes/, /root/wizards/, or other cells. | Mount namespace (Level 2+) or strict chroot + AppArmor (Level 1) |
| I2 — Credential Isolation | Host tokens, env files, and SSH keys are never copied into a cell. Only per-cell credential pools are injected at spawn. | Harness strips HERMES_* and HOME; injects CELL_CREDENTIALS manifest |
| I3 — Process Boundary | A cell runs as an independent OS process or container. It cannot ptrace, signal, or inspect sibling cells. | PID namespace, seccomp, or Docker isolation |
| I4 — Network Segmentation | A cell does not bind to host-private ports or sniff host traffic unless explicitly proxied. | Optional network namespace / proxy boundary |
| I5 — Memory Non-Leakage | Shared memory, IPC sockets, and tmpfs mounts are cell-scoped. No post-exit residue in host /tmp or /dev/shm. | TTL cleanup + graveyard garbage collection (#273) |
| I6 — Audit Trail | Every cell mutation (spawn, invite, checkpoint, close) is logged to an immutable ledger (Gitea issue comment or local append-only log). | Required for all production cells |
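
As an illustration of I1, the sketch below shows one way a harness could reject paths that resolve outside CELL_HOME before touching the filesystem. The names `resolve_in_cell` and `ContainmentError` are hypothetical and not part of this spec. A userspace check like this is only a defensive extra; it does not replace the mount namespace or chroot enforcement listed in the table.

```python
from pathlib import Path


class ContainmentError(Exception):
    """Raised when a requested path would escape CELL_HOME (hypothetical name)."""


def resolve_in_cell(cell_home: str, requested: str) -> Path:
    """Resolve `requested` against `cell_home` and reject anything outside it."""
    root = Path(cell_home).resolve()
    # Joining with an absolute `requested` discards `root`; the check below
    # still catches that case because the result is not under `root`.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise ContainmentError(f"{requested!r} escapes {root}")
    return candidate
```

Because `resolve()` follows symlinks and collapses `..` segments, both `../other-cell/x` and an absolute host path such as `/root/wizards/x` are rejected under this sketch.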
3. Role Taxonomy
Every participant in a cell is assigned exactly one role at invitation time. Roles are immutable for the duration of the session.
| Role | Permissions | Typical Holder |
|---|---|---|
| director | Can invite others, trigger checkpoints, close the cell, and override cell decisions. Cannot directly execute tools unless also granted executor. | Human operator (Alexander) or fleet commander (Timmy) |
| executor | Full tool execution and filesystem write access within the cell. Can push commits to the target project repo. | Fleet agents (Ezra, Allegro, etc.) |
| observer | Read-only access to cell filesystem and shared scratchpad. Cannot execute tools or mutate state. | Human reviewer, auditor, or training monitor |
| guest | Same permissions as executor, but sourced from outside the fleet. Subject to stricter backend isolation (Docker by default). | External bots (Codex, Gemini API, Grok, etc.) |
| substitute | A special executor who joins to replace a downed agent. Inherits the predecessor's last checkpoint but not their home memory. | Resurrection-pool fallback agent |
3.1 Role Combinations
- A single participant may hold at most one primary role.
- A `director` may temporarily downgrade to `observer` but cannot upgrade to `executor` without a new invitation.
- `guest` and `substitute` roles must be explicitly enabled in cell policy.
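
The role table can be read as a permission matrix. The following is a minimal sketch of that reading, not a prescribed data model: the `Permission` flag names, the `can` helper, and `POLICY_GATED_ROLES` are hypothetical, and granting READ to the director is an assumption the table does not state.

```python
from enum import Enum, Flag, auto


class Permission(Flag):
    # Hypothetical permission names derived from the role table.
    READ = auto()
    EXECUTE_TOOLS = auto()
    WRITE_FS = auto()
    PUSH_COMMITS = auto()
    INVITE = auto()
    CHECKPOINT = auto()
    CLOSE_CELL = auto()
    OVERRIDE = auto()


class Role(Enum):
    DIRECTOR = "director"
    EXECUTOR = "executor"
    OBSERVER = "observer"
    GUEST = "guest"
    SUBSTITUTE = "substitute"


_EXECUTOR_PERMS = (
    Permission.READ
    | Permission.EXECUTE_TOOLS
    | Permission.WRITE_FS
    | Permission.PUSH_COMMITS
)

ROLE_PERMISSIONS = {
    # READ for the director is an assumption; the table only lists control actions.
    Role.DIRECTOR: Permission.READ
    | Permission.INVITE
    | Permission.CHECKPOINT
    | Permission.CLOSE_CELL
    | Permission.OVERRIDE,
    Role.EXECUTOR: _EXECUTOR_PERMS,
    Role.OBSERVER: Permission.READ,
    Role.GUEST: _EXECUTOR_PERMS,       # same permissions, stricter backend
    Role.SUBSTITUTE: _EXECUTOR_PERMS,  # a special executor
}

# Roles that must be explicitly enabled in cell policy (section 3.1).
POLICY_GATED_ROLES = frozenset({Role.GUEST, Role.SUBSTITUTE})


def can(role: Role, permission: Permission) -> bool:
    """Return True if `role` holds `permission` under this sketch."""
    return permission in ROLE_PERMISSIONS[role]
```

For example, `can(Role.DIRECTOR, Permission.EXECUTE_TOOLS)` is false here, matching the rule that a director cannot execute tools without a separate executor grant.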
4. Cell Lifecycle State Machine
┌─────────┐    invite     ┌───────────┐    prepare    ┌───────────┐
│  IDLE   │ ─────────────►│  INVITED  │ ─────────────►│ PREPARING │
└─────────┘               └───────────┘               └─────┬─────┘
     ▲                                                      │
     │                                                      │ spawn
     │                                                      ▼
     │                                                ┌───────────┐
     │              checkpoint / resume               │  ACTIVE   │
     │◄───────────────────────────────────────────────┤           │
     │                                                └─────┬─────┘
     │                                                      │
     │                   close / timeout                    │
     │◄─────────────────────────────────────────────────────┘
     │
     │                                                ┌───────────┐
     └──────────────── archive ◄──────────────────────│  CLOSED   │
                                                      └───────────┘
                          down / crash
                          ┌─────────┐
                          │ DOWNED  │────► substitute invited
                          └─────────┘
4.1 State Definitions
| State | Description | Valid Transitions |
|---|---|---|
| IDLE | Cell does not yet exist in the registry. | INVITED |
| INVITED | An invitation token has been generated but not yet accepted. | PREPARING (on accept), CLOSED (on expiry/revoke) |
| PREPARING | Cell directory is being created, credentials injected, backend initialized. | ACTIVE (on successful spawn), CLOSED (on failure) |
| ACTIVE | At least one participant is running in the cell. Tool execution is permitted. | CHECKPOINTING, CLOSED, DOWNED |
| CHECKPOINTING | A snapshot of cell state is being captured. | ACTIVE (resume), CLOSED (if final) |
| DOWNED | An ACTIVE agent missed heartbeats. Cell is frozen pending recovery. | ACTIVE (revived), CLOSED (abandoned) |
| CLOSED | Cell has been explicitly closed or TTL expired. Filesystem enters grace period. | ARCHIVED |
| ARCHIVED | Cell artifacts (logs, checkpoints, decisions) are persisted. Filesystem may be scrubbed. | — (terminal) |
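
The Valid Transitions column maps directly onto a lookup table. Below is a minimal sketch of a transition guard built from the rows above; `CellState`, `TRANSITIONS`, `InvalidTransition`, and `transition` are hypothetical names. The `lazarus resurrect` path described in section 7.1 is not listed in this table, so it is deliberately left out here.

```python
from enum import Enum


class CellState(Enum):
    IDLE = "IDLE"
    INVITED = "INVITED"
    PREPARING = "PREPARING"
    ACTIVE = "ACTIVE"
    CHECKPOINTING = "CHECKPOINTING"
    DOWNED = "DOWNED"
    CLOSED = "CLOSED"
    ARCHIVED = "ARCHIVED"


S = CellState

# One entry per row of the state table in section 4.1.
TRANSITIONS = {
    S.IDLE: {S.INVITED},
    S.INVITED: {S.PREPARING, S.CLOSED},
    S.PREPARING: {S.ACTIVE, S.CLOSED},
    S.ACTIVE: {S.CHECKPOINTING, S.CLOSED, S.DOWNED},
    S.CHECKPOINTING: {S.ACTIVE, S.CLOSED},
    S.DOWNED: {S.ACTIVE, S.CLOSED},
    S.CLOSED: {S.ARCHIVED},
    S.ARCHIVED: set(),  # terminal
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the table."""


def transition(current: CellState, target: CellState) -> CellState:
    """Return `target` if the move is allowed, otherwise raise."""
    if target not in TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

A registry could call `transition(cell.state, CellState.ACTIVE)` before every mutation so that illegal moves fail loudly instead of silently corrupting cell state.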
4.2 TTL and Grace Rules
- Active TTL: Default 4 hours. Renewable by `director` up to a max of 24 hours.
- Invited TTL: Default 15 minutes. Unused invitations auto-revoke.
- Closed Grace: 30 minutes. Cell filesystem remains recoverable before scrubbing.
- Archived Retention: 30 days, after which checkpoints may be moved to cold storage or deleted per policy.
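
The TTL rules above reduce to a few constants and a capped renewal. The sketch below assumes the 24-hour maximum is measured from the spawn time, which the rule does not state explicitly; the constant and function names are hypothetical.

```python
from datetime import datetime, timedelta

ACTIVE_TTL_DEFAULT = timedelta(hours=4)
ACTIVE_TTL_MAX = timedelta(hours=24)
INVITED_TTL = timedelta(minutes=15)
CLOSED_GRACE = timedelta(minutes=30)
ARCHIVED_RETENTION = timedelta(days=30)


def renew_active_deadline(
    spawned_at: datetime,
    current_deadline: datetime,
    extension: timedelta = ACTIVE_TTL_DEFAULT,
) -> datetime:
    """Extend the active deadline, never past spawned_at + 24 hours.

    Assumption: the 24-hour cap counts from spawn, not from the last renewal.
    """
    hard_cap = spawned_at + ACTIVE_TTL_MAX
    return min(current_deadline + extension, hard_cap)


def invitation_expired(issued_at: datetime, now: datetime) -> bool:
    """True once an unused invitation has reached the Invited TTL."""
    return now - issued_at >= INVITED_TTL
```

Under this reading a director can renew a default cell at most five times before the cap is reached (4 h initial plus five 4 h extensions).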
5. Publication Rules
The Cell is not a source of truth for fleet state. It is a scratch space. The following rules govern what may leave the cell boundary.
5.1 Always Published (Required)
| Artifact | Destination | Purpose |
|---|---|---|
| Git commits to the target project repo | Gitea / Git remote | Durable work product |
| Cell spawn log (who, when, roles, backend) | Gitea issue comment on epic/mission issue | Audit trail |
| Cell close log (commits made, files touched, outcome) | Gitea issue comment or local ledger | Accountability |
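
For the local-ledger option, a spawn record could be written as one JSON object per line. This is a sketch under assumptions: the field names, `spawn_record`, and `append_to_ledger` are hypothetical, and opening a file in append mode does not by itself make the ledger immutable (that would need filesystem-level protection or the Gitea comment route).

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def spawn_record(
    cell_id: str,
    roles: dict,
    backend: str,
    when: Optional[datetime] = None,
) -> dict:
    """Build a spawn log entry covering who, when, roles, and backend."""
    when = when or datetime.now(timezone.utc)
    return {
        "event": "spawn",
        "cell_id": cell_id,
        "when": when.isoformat(),
        "roles": roles,  # participant name -> role name
        "backend": backend,
    }


def append_to_ledger(ledger: Path, record: dict) -> None:
    """Append one JSON line; existing lines are never rewritten by this call."""
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```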
5.2 Never Published (Cell-Local Only)
| Artifact | Reason |
|---|---|
| `shared_scratchpad` drafts and intermediate reasoning | May contain false starts, passwords mentioned in context, or incomplete thoughts |
| Per-cell credentials and invite tokens | Security — must not leak into commit history |
| Agent home memory files (even read-only copies) | Privacy and sovereignty of the agent's home |
| Internal tool-call traces | Noise and potential PII |
5.3 Optionally Published (Director Decision)
| Artifact | Condition |
|---|---|
| `decisions.jsonl` | When the cell operated as a council and a formal record is requested |
| Checkpoint tarball | When the mission spans multiple sessions and continuity is required |
| Shared notes (final version) | When explicitly marked PUBLISH by a director |
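
The three publication tables can be collapsed into a single gate that a close-out routine consults before anything leaves the cell. The sketch below is illustrative only: the artifact kind strings and `may_publish` are hypothetical, and denying unknown kinds by default is an assumption chosen to fail safe.

```python
from enum import Enum


class Publication(Enum):
    ALWAYS = "always"      # section 5.1
    NEVER = "never"        # section 5.2
    OPTIONAL = "optional"  # section 5.3, director decision


# Hypothetical artifact kind names, one per table row.
PUBLICATION_RULES = {
    "git_commit": Publication.ALWAYS,
    "spawn_log": Publication.ALWAYS,
    "close_log": Publication.ALWAYS,
    "scratchpad_draft": Publication.NEVER,
    "cell_credential": Publication.NEVER,
    "invite_token": Publication.NEVER,
    "home_memory": Publication.NEVER,
    "tool_trace": Publication.NEVER,
    "decisions_jsonl": Publication.OPTIONAL,
    "checkpoint_tarball": Publication.OPTIONAL,
    "shared_notes_final": Publication.OPTIONAL,
}


def may_publish(kind: str, director_approved: bool = False) -> bool:
    """Return True if an artifact of `kind` may leave the cell boundary."""
    # Assumption: anything not listed is treated as cell-local.
    rule = PUBLICATION_RULES.get(kind, Publication.NEVER)
    if rule is Publication.ALWAYS:
        return True
    if rule is Publication.OPTIONAL:
        return director_approved
    return False
```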
6. Filesystem Layout
Every cell, regardless of backend, exposes the same directory contract:
/tmp/lazarus-cells/{cell_id}/
├── .lazarus/
│ ├── cell.json # cell metadata (roles, TTL, backend, target repo)
│ ├── spawn.log # immutable spawn record
│ ├── decisions.jsonl # logged votes / approvals / directives
│ └── checkpoints/ # snapshot tarballs
├── project/ # cloned target repo (if applicable)
├── shared/
│ ├── scratchpad.md # append-only cross-agent notes
│ └── artifacts/ # shared files any member can read/write
└── home/
├── {agent_1}/ # agent-scoped writable area
├── {agent_2}/
└── {guest_n}/
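
A backend could materialize this contract with a few `pathlib` calls. The following is a minimal sketch, assuming the caller supplies the base directory and the metadata dictionary; `create_cell_skeleton` is a hypothetical helper, and cloning the target repo into `project/` is left out.

```python
import json
from pathlib import Path


def create_cell_skeleton(base: str, cell_id: str, agents: list, metadata: dict) -> Path:
    """Create the directory contract for one cell and return its CELL_HOME."""
    cell_home = Path(base) / cell_id
    for sub in (".lazarus/checkpoints", "project", "shared/artifacts", "home"):
        (cell_home / sub).mkdir(parents=True, exist_ok=True)
    for agent in agents:
        # One agent-scoped writable area per participant.
        (cell_home / "home" / agent).mkdir(exist_ok=True)

    lazarus = cell_home / ".lazarus"
    (lazarus / "cell.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    (lazarus / "spawn.log").touch()
    (lazarus / "decisions.jsonl").touch()
    (cell_home / "shared" / "scratchpad.md").touch()
    return cell_home
```

Calling `create_cell_skeleton("/tmp/lazarus-cells", cell_id, ["ezra"], {...})` would produce the tree shown above for a single-agent cell.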
6.1 Backend Mapping
| Backend | CELL_HOME realization | Isolation Level |
|---|---|---|
| `process` | tmpdir + `HERMES_HOME` override | Level 1 (directory + env) |
| `venv` | Separate Python venv + `HERMES_HOME` | Level 1.5 (directory + env + package isolation) |
| `docker` | Rootless container with volume mount | Level 3 (full container boundary) |
| `remote` | SSH tmpdir on remote host | Level varies by remote config |
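
For the process backend, the environment handed to the child is where I2 and the HERMES_HOME override meet. The sketch below shows one possible reading: drop every `HERMES_*` variable and `HOME` from the host environment, then point `HERMES_HOME` at the cell. `build_cell_env` is a hypothetical name, and the CELL_CREDENTIALS manifest is not modeled because its format is not defined in this document.

```python
from typing import Mapping


def build_cell_env(host_env: Mapping[str, str], cell_home: str) -> dict:
    """Return a child environment with host Hermes state removed."""
    env = {
        key: value
        for key, value in host_env.items()
        if not key.startswith("HERMES_") and key != "HOME"
    }
    # Re-add HERMES_HOME only after stripping, so the host value cannot survive.
    env["HERMES_HOME"] = cell_home
    return env
```

The result can be passed as the `env` argument of `subprocess.Popen` or `subprocess.run`; the host mapping itself is left untouched.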
7. Graveyard and Retention Policy
When a cell closes, it enters the Graveyard — a quarantined holding area before final scrubbing.
7.1 Graveyard Rules
ACTIVE ──► CLOSED ──► /tmp/lazarus-graveyard/{cell_id}/ ──► TTL grace ──► SCRUBBED
- Grace period: 30 minutes (configurable per mission)
- During grace: A director may issue `lazarus resurrect {cell_id}` to restore the cell to `ACTIVE`
- After grace: Filesystem is recursively deleted. Checkpoints are moved to `lazarus-archive/{date}/{cell_id}/`
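
The graveyard flow above can be sketched as three filesystem operations plus a grace check. All function names here are hypothetical, the caller is assumed to pass the dated archive directory, and the checkpoints are assumed to live under `.lazarus/checkpoints/` as in the layout of section 6.

```python
import shutil
from pathlib import Path

GRACE_SECONDS = 30 * 60  # default grace period; configurable per mission


def bury(cell_home: Path, graveyard: Path) -> Path:
    """Move a closed cell into the graveyard and return its new location."""
    graveyard.mkdir(parents=True, exist_ok=True)
    dest = graveyard / cell_home.name
    shutil.move(str(cell_home), str(dest))
    return dest


def can_resurrect(closed_at: float, now: float, grace: float = GRACE_SECONDS) -> bool:
    """True while the cell is still inside its grace window (times in seconds)."""
    return now - closed_at < grace


def resurrect(buried: Path, cells_root: Path) -> Path:
    """Move a buried cell back under the live cells directory."""
    dest = cells_root / buried.name
    shutil.move(str(buried), str(dest))
    return dest


def scrub(buried: Path, archive_day: Path) -> None:
    """After grace: keep checkpoints under archive_day/{cell_id}/, delete the rest."""
    checkpoints = buried / ".lazarus" / "checkpoints"
    if checkpoints.is_dir():
        archive_day.mkdir(parents=True, exist_ok=True)
        shutil.move(str(checkpoints), str(archive_day / buried.name))
    shutil.rmtree(buried)
```

A cleanup job would call `can_resurrect` first and only run `scrub` once it returns false, so a director's resurrect request inside the window always wins.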
7.2 Retention Tiers
| Tier | Location | Retention | Access |
|---|---|---|---|
| Hot Graveyard | `/tmp/lazarus-graveyard/` | 30 min | Director only |
| Warm Archive | `~/.lazarus/archive/` | 30 days | Fleet agents (read-only) |
| Cold Storage | Optional S3 / IPFS / Gitea release asset | 1 year | Director only |
8. Cross-References
- Epic: timmy-config#267
- Isolation implementation: timmy-config#269
- Invitation protocol: timmy-config#270
- Backend abstraction: timmy-config#271
- Teaming model: timmy-config#272
- Verification suite: timmy-config#273
- Operator surface: timmy-config#274
- Existing skill: lazarus-pit-recovery (to be updated to this spec)
- Related protocol: timmy-config#245 (Phoenix Protocol recovery benchmarks)
9. Acceptance Criteria for This Spec
- All downstream issues (#269–#274) can be implemented without ambiguity about roles, states, or filesystem boundaries.
- A new developer can read this doc and implement a compliant `process` backend in one session.
- The spec has been reviewed and ACK'd by at least one other wizard before #269 merges.
Sovereignty and service always.
— Ezra