timmy-config/docs/architecture/LAZARUS-CELL-SPEC.md
Ezra ee749e0b93 [LAZARUS][SPEC] Define cell contract, roles, lifecycle, and publication rules
Addresses timmy-config#268.

- Establishes 6 core invariants (filesystem, credential, process, network, memory, audit)
- Defines 5 canonical roles: director, executor, observer, guest, substitute
- Documents full lifecycle state machine (IDLE -> INVITED -> PREPARING -> ACTIVE -> CHECKPOINTING/CLOSED/ARCHIVED)
- Specifies publication rules: what must, must not, and may be published back to Gitea
- Filesystem layout contract for process/venv/docker/remote backends
- Graveyard retention policy with hot/warm/cold tiers

Cross-references: #269 #270 #271 #272 #273 #274 #245
2026-04-06 16:56:43 +00:00


Lazarus Cell Specification v1.0

Canonical epic: Timmy_Foundation/timmy-config#267
Author: Ezra (architect)
Date: 2026-04-06
Status: Draft — open for burn-down by #269 #270 #271 #272 #273 #274


1. Purpose

This document defines the Cell — the fundamental isolation primitive of the Lazarus Pit v2.0. Every downstream implementation (isolation layer, invitation protocol, backend abstraction, teaming model, verification suite, and operator surface) must conform to the invariants, roles, lifecycle, and publication rules defined here.


2. Core Invariants

No agent shall leak state, credentials, or filesystem into another agent's resurrection cell.

2.1 Cell Invariant Definitions

| ID | Invariant | Meaning | Enforcement |
| --- | --- | --- | --- |
| I1 | Filesystem Containment | A cell may only read/write paths under its assigned CELL_HOME. No traversal into host ~/.hermes/, /root/wizards/, or other cells. | Mount namespace (Level 2+) or strict chroot + AppArmor (Level 1) |
| I2 | Credential Isolation | Host tokens, env files, and SSH keys are never copied into a cell. Only per-cell credential pools are injected at spawn. | Harness strips HERMES_* and HOME; injects CELL_CREDENTIALS manifest |
| I3 | Process Boundary | A cell runs as an independent OS process or container. It cannot ptrace, signal, or inspect sibling cells. | PID namespace, seccomp, or Docker isolation |
| I4 | Network Segmentation | A cell does not bind to host-private ports or sniff host traffic unless explicitly proxied. | Optional network namespace / proxy boundary |
| I5 | Memory Non-Leakage | Shared memory, IPC sockets, and tmpfs mounts are cell-scoped. No post-exit residue in host /tmp or /dev/shm. | TTL cleanup + graveyard garbage collection (#273) |
| I6 | Audit Trail | Every cell mutation (spawn, invite, checkpoint, close) is logged to an immutable ledger (Gitea issue comment or local append-only log). | Required for all production cells |
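
As an illustration of I1 and I2 only, the sketch below shows an application-level path check and a spawn-environment filter. It is not the enforcement mechanism named in the table (namespaces, chroot, AppArmor); the function names `is_within_cell` and `build_cell_env` are hypothetical, and treating CELL_HOME and CELL_CREDENTIALS as environment variables is an assumption.

```python
from pathlib import Path
from typing import Dict


def is_within_cell(cell_home: str, candidate: str) -> bool:
    """I1 sketch: True only if candidate resolves to a path under cell_home.

    Relative candidates are interpreted relative to cell_home. resolve()
    collapses ".." segments and follows symlinks before the comparison.
    """
    home = Path(cell_home).resolve()
    target = (home / candidate).resolve()
    return target == home or home in target.parents


def build_cell_env(host_env: Dict[str, str], cell_home: str,
                   credentials_manifest: str) -> Dict[str, str]:
    """I2 sketch: drop HERMES_* and HOME, then inject per-cell values.

    Assumption: the manifest is passed as a path in CELL_CREDENTIALS.
    """
    env = {
        key: value
        for key, value in host_env.items()
        if not key.startswith("HERMES_") and key != "HOME"
    }
    env["CELL_HOME"] = cell_home
    env["CELL_CREDENTIALS"] = credentials_manifest
    return env
```

A harness could pass the result of `build_cell_env(dict(os.environ), ...)` as the `env` argument of `subprocess.Popen`. A check like `is_within_cell` is defense in depth at best; it does not replace the kernel-level enforcement listed above.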

3. Role Taxonomy

Every participant in a cell is assigned exactly one role at invitation time. Roles are immutable for the duration of the session.

| Role | Permissions | Typical Holder |
| --- | --- | --- |
| director | Can invite others, trigger checkpoints, close the cell, and override cell decisions. Cannot directly execute tools unless also granted executor. | Human operator (Alexander) or fleet commander (Timmy) |
| executor | Full tool execution and filesystem write access within the cell. Can push commits to the target project repo. | Fleet agents (Ezra, Allegro, etc.) |
| observer | Read-only access to cell filesystem and shared scratchpad. Cannot execute tools or mutate state. | Human reviewer, auditor, or training monitor |
| guest | Same permissions as executor, but sourced from outside the fleet. Subject to stricter backend isolation (Docker by default). | External bots (Codex, Gemini API, Grok, etc.) |
| substitute | A special executor who joins to replace a downed agent. Inherits the predecessor's last checkpoint but not their home memory. | Resurrection-pool fallback agent |
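
The role table can be read as a permission matrix. The following is a minimal sketch of one possible encoding; the flag names (`invite`, `checkpoint`, `close`, `execute_tools`, `write_fs`, `read_fs`) and the `can` helper are hypothetical, and granting read access to every role is an assumption the spec does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    DIRECTOR = "director"
    EXECUTOR = "executor"
    OBSERVER = "observer"
    GUEST = "guest"
    SUBSTITUTE = "substitute"


@dataclass(frozen=True)
class Permissions:
    invite: bool = False
    checkpoint: bool = False
    close: bool = False
    execute_tools: bool = False
    write_fs: bool = False
    read_fs: bool = True  # assumption: every role can at least read


# guest and substitute carry executor permissions; the differences the spec
# describes (isolation level, inherited checkpoint) live outside this matrix.
PERMISSIONS = {
    Role.DIRECTOR: Permissions(invite=True, checkpoint=True, close=True),
    Role.EXECUTOR: Permissions(execute_tools=True, write_fs=True),
    Role.OBSERVER: Permissions(),
    Role.GUEST: Permissions(execute_tools=True, write_fs=True),
    Role.SUBSTITUTE: Permissions(execute_tools=True, write_fs=True),
}


def can(role: Role, action: str) -> bool:
    """Look up a single permission flag for a role."""
    return getattr(PERMISSIONS[role], action)
```

Freezing the dataclass mirrors the rule that roles are immutable for the duration of a session.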

3.1 Role Combinations

  • A single participant may hold at most one primary role.
  • A director may temporarily downgrade to observer but cannot upgrade to executor without a new invitation.
  • guest and substitute roles must be explicitly enabled in cell policy.

4. Cell Lifecycle State Machine

┌─────────┐    invite     ┌───────────┐   prepare   ┌───────────┐
│  IDLE   │ ─────────────►│  INVITED  │ ───────────►│ PREPARING │
└─────────┘               └───────────┘             └─────┬─────┘
     ▲                                                    │
     │                                                    │ spawn
     │                                                    ▼
     │                                               ┌─────────┐
     │           checkpoint / resume                 │  ACTIVE │
     │◄──────────────────────────────────────────────┤         │
     │                                               └────┬────┘
     │                                                    │
     │              close / timeout                       │
     │◄───────────────────────────────────────────────────┘
     │
     │                                               ┌─────────┐
     └──────────────── archive ◄────────────────────│ CLOSED  │
                                                     └─────────┘
                              down / crash
                              ┌─────────┐
                              │ DOWNED  │────► substitute invited
                              └─────────┘

4.1 State Definitions

| State | Description | Valid Transitions |
| --- | --- | --- |
| IDLE | Cell does not yet exist in the registry. | INVITED |
| INVITED | An invitation token has been generated but not yet accepted. | PREPARING (on accept), CLOSED (on expiry/revoke) |
| PREPARING | Cell directory is being created, credentials injected, backend initialized. | ACTIVE (on successful spawn), CLOSED (on failure) |
| ACTIVE | At least one participant is running in the cell. Tool execution is permitted. | CHECKPOINTING, CLOSED, DOWNED |
| CHECKPOINTING | A snapshot of cell state is being captured. | ACTIVE (resume), CLOSED (if final) |
| DOWNED | An ACTIVE agent missed heartbeats. Cell is frozen pending recovery. | ACTIVE (revived), CLOSED (abandoned) |
| CLOSED | Cell has been explicitly closed or its TTL has expired. Filesystem enters grace period. | ARCHIVED |
| ARCHIVED | Cell artifacts (logs, checkpoints, decisions) are persisted. Filesystem may be scrubbed. | None (terminal) |
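
The state definitions in 4.1 translate directly into a transition table. The sketch below encodes only the transitions listed there; the names `State`, `TRANSITIONS`, `InvalidTransition`, and `advance` are illustrative, and the resurrect path described in section 7.1 is deliberately not modeled.

```python
from enum import Enum


class State(Enum):
    IDLE = "IDLE"
    INVITED = "INVITED"
    PREPARING = "PREPARING"
    ACTIVE = "ACTIVE"
    CHECKPOINTING = "CHECKPOINTING"
    DOWNED = "DOWNED"
    CLOSED = "CLOSED"
    ARCHIVED = "ARCHIVED"


# One entry per state, copied from the "Valid Transitions" column in 4.1.
TRANSITIONS = {
    State.IDLE: {State.INVITED},
    State.INVITED: {State.PREPARING, State.CLOSED},
    State.PREPARING: {State.ACTIVE, State.CLOSED},
    State.ACTIVE: {State.CHECKPOINTING, State.CLOSED, State.DOWNED},
    State.CHECKPOINTING: {State.ACTIVE, State.CLOSED},
    State.DOWNED: {State.ACTIVE, State.CLOSED},
    State.CLOSED: {State.ARCHIVED},
    State.ARCHIVED: set(),  # terminal
}


class InvalidTransition(Exception):
    """Raised when a requested transition is not listed in 4.1."""


def advance(current: State, target: State) -> State:
    """Return target if the transition is allowed, otherwise raise."""
    if target not in TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

Keeping the table as data rather than branching logic makes it straightforward for a verification suite to compare an implementation against the spec.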

4.2 TTL and Grace Rules

  • Active TTL: Default 4 hours. Renewable by director up to a max of 24 hours.
  • Invited TTL: Default 15 minutes. Unused invitations auto-revoke.
  • Closed Grace: 30 minutes. Cell filesystem remains recoverable before scrubbing.
  • Archived Retention: 30 days, after which checkpoints may be moved to cold storage or deleted per policy.
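
The TTL rules above can be expressed as a handful of constants plus a clamped renewal. This is a sketch under one stated assumption: the 24-hour maximum is measured from the spawn time, which the spec does not say explicitly. The function names are hypothetical.

```python
from datetime import datetime, timedelta

ACTIVE_TTL_DEFAULT = timedelta(hours=4)
ACTIVE_TTL_MAX = timedelta(hours=24)
INVITED_TTL = timedelta(minutes=15)
CLOSED_GRACE = timedelta(minutes=30)
ARCHIVED_RETENTION = timedelta(days=30)


def renew_active_ttl(spawned_at: datetime, current_expiry: datetime,
                     extension: timedelta = ACTIVE_TTL_DEFAULT) -> datetime:
    """Extend the expiry, clamped to 24 hours after spawn (assumed baseline)."""
    hard_limit = spawned_at + ACTIVE_TTL_MAX
    return min(current_expiry + extension, hard_limit)


def invitation_expired(issued_at: datetime, now: datetime) -> bool:
    """True once an unaccepted invitation has reached its 15-minute TTL."""
    return now - issued_at >= INVITED_TTL
```

If the maximum were instead meant as a cap on each individual renewal, only `renew_active_ttl` would need to change.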

5. Publication Rules

The Cell is not a source of truth for fleet state. It is a scratch space. The following rules govern what may leave the cell boundary.

5.1 Always Published (Required)

| Artifact | Destination | Purpose |
| --- | --- | --- |
| Git commits to the target project repo | Gitea / Git remote | Durable work product |
| Cell spawn log (who, when, roles, backend) | Gitea issue comment on epic/mission issue | Audit trail |
| Cell close log (commits made, files touched, outcome) | Gitea issue comment or local ledger | Accountability |
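
For the local-ledger variant of the spawn log, one JSON object per line appended to a file is a simple fit. The sketch below records the four fields the table names (who, when, roles, backend); the key names and the `append_spawn_record` helper are assumptions, and opening a file in append mode does not by itself make the ledger immutable.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Optional


def append_spawn_record(log_path: str, cell_id: str,
                        participants: Dict[str, str], backend: str,
                        now: Optional[datetime] = None) -> dict:
    """Append one spawn record as a JSON line and return it.

    participants maps a participant name to its role (who + roles).
    """
    record = {
        "event": "spawn",
        "cell_id": cell_id,
        "when": (now or datetime.now(timezone.utc)).isoformat(),
        "participants": participants,
        "backend": backend,
    }
    # Append mode never rewrites earlier lines written through this function.
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

The same record could be rendered as the body of a Gitea issue comment; that transport is left out here because the spec does not define it.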

5.2 Never Published (Cell-Local Only)

| Artifact | Reason |
| --- | --- |
| shared_scratchpad drafts and intermediate reasoning | May contain false starts, passwords mentioned in context, or incomplete thoughts |
| Per-cell credentials and invite tokens | Security: must not leak into commit history |
| Agent home memory files (even read-only copies) | Privacy and sovereignty of the agent's home |
| Internal tool-call traces | Noise and potential PII |

5.3 Optionally Published (Director Decision)

| Artifact | Condition |
| --- | --- |
| decisions.jsonl | When the cell operated as a council and a formal record is requested |
| Checkpoint tarball | When the mission spans multiple sessions and continuity is required |
| Shared notes (final version) | When explicitly marked PUBLISH by a director |
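
Taken together, sections 5.1 through 5.3 form a three-way classification that a publisher can consult before anything leaves the cell. The sketch below is one possible encoding; the artifact keys are hypothetical labels for the rows above, and defaulting unknown artifact kinds to "never" is an assumption chosen to fail closed.

```python
from enum import Enum


class Publication(Enum):
    ALWAYS = "always"
    NEVER = "never"
    OPTIONAL = "optional"


# Hypothetical keys, one per artifact row in sections 5.1 to 5.3.
RULES = {
    "project_commit": Publication.ALWAYS,
    "spawn_log": Publication.ALWAYS,
    "close_log": Publication.ALWAYS,
    "scratchpad_draft": Publication.NEVER,
    "cell_credentials": Publication.NEVER,
    "invite_token": Publication.NEVER,
    "home_memory": Publication.NEVER,
    "tool_call_trace": Publication.NEVER,
    "decisions_jsonl": Publication.OPTIONAL,
    "checkpoint_tarball": Publication.OPTIONAL,
    "shared_notes_final": Publication.OPTIONAL,
}


def may_publish(kind: str, director_approved: bool = False) -> bool:
    """Decide whether an artifact may cross the cell boundary."""
    rule = RULES.get(kind, Publication.NEVER)  # unknown kinds fail closed
    if rule is Publication.ALWAYS:
        return True
    if rule is Publication.OPTIONAL:
        return director_approved
    return False
```

Director approval has no effect on the "never" class, which matches the intent that credentials and scratchpad drafts stay cell-local regardless of who asks.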

6. Filesystem Layout

Every cell, regardless of backend, exposes the same directory contract:

/tmp/lazarus-cells/{cell_id}/
├── .lazarus/
│   ├── cell.json           # cell metadata (roles, TTL, backend, target repo)
│   ├── spawn.log           # immutable spawn record
│   ├── decisions.jsonl     # logged votes / approvals / directives
│   └── checkpoints/        # snapshot tarballs
├── project/                # cloned target repo (if applicable)
├── shared/
│   ├── scratchpad.md       # append-only cross-agent notes
│   └── artifacts/          # shared files any member can read/write
└── home/
    ├── {agent_1}/          # agent-scoped writable area
    ├── {agent_2}/
    └── {guest_n}/
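
A process backend could scaffold this tree with nothing beyond the standard library. The following sketch creates the directories and empty files shown above; `create_cell_layout` and its parameters are hypothetical, and cloning the target repo into project/ is left out.

```python
import json
from pathlib import Path
from typing import Iterable


def create_cell_layout(root: str, cell_id: str, members: Iterable[str],
                       metadata: dict) -> Path:
    """Create the directory contract for one cell and return its path.

    Raises FileExistsError if the cell was already scaffolded.
    """
    cell = Path(root) / cell_id
    lazarus = cell / ".lazarus"
    (lazarus / "checkpoints").mkdir(parents=True, exist_ok=False)
    (cell / "project").mkdir()
    (cell / "shared" / "artifacts").mkdir(parents=True)
    (cell / "home").mkdir()
    for member in members:
        (cell / "home" / member).mkdir()  # agent-scoped writable area

    (lazarus / "cell.json").write_text(json.dumps(metadata, indent=2))
    (lazarus / "spawn.log").touch()
    (lazarus / "decisions.jsonl").touch()
    (cell / "shared" / "scratchpad.md").touch()
    return cell
```

Calling `create_cell_layout("/tmp/lazarus-cells", cell_id, ...)` would reproduce the root shown above; file permissions and ownership, which the invariants in section 2 depend on, are not handled in this sketch.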

6.1 Backend Mapping

| Backend | CELL_HOME realization | Isolation Level |
| --- | --- | --- |
| process | tmpdir + HERMES_HOME override | Level 1 (directory + env) |
| venv | Separate Python venv + HERMES_HOME | Level 1.5 (directory + env + package isolation) |
| docker | Rootless container with volume mount | Level 3 (full container boundary) |
| remote | SSH tmpdir on remote host | Level varies by remote config |

7. Graveyard and Retention Policy

When a cell closes, it enters the Graveyard — a quarantined holding area before final scrubbing.

7.1 Graveyard Rules

ACTIVE ──► CLOSED ──► /tmp/lazarus-graveyard/{cell_id}/ ──► TTL grace ──► SCRUBBED
  • Grace period: 30 minutes (configurable per mission)
  • During grace: A director may issue lazarus resurrect {cell_id} to restore the cell to ACTIVE
  • After grace: Filesystem is recursively deleted. Checkpoints are moved to lazarus-archive/{date}/{cell_id}/
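
The resurrect rule combines a role check with a time window. A minimal sketch follows, assuming the window is half-open (a request at exactly 30 minutes is rejected) and that roles are passed as plain strings; `can_resurrect` is a hypothetical helper, not the implementation of the lazarus resurrect command.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(minutes=30)  # default; configurable per mission


def can_resurrect(role: str, closed_at: datetime, now: datetime,
                  grace: timedelta = GRACE_PERIOD) -> bool:
    """True if a director asks within the grace window after close."""
    within_grace = closed_at <= now < closed_at + grace
    return role == "director" and within_grace
```

Passing `grace` explicitly covers the per-mission override mentioned above.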

7.2 Retention Tiers

| Tier | Location | Retention | Access |
| --- | --- | --- | --- |
| Hot Graveyard | /tmp/lazarus-graveyard/ | 30 min | Director only |
| Warm Archive | ~/.lazarus/archive/ | 30 days | Fleet agents (read-only) |
| Cold Storage | Optional S3 / IPFS / Gitea release asset | 1 year | Director only |
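
One way to read the tier table is as a function from a cell's age to the tier it should currently occupy. The sketch below makes several assumptions the spec leaves open: all three limits are measured from the close time, "1 year" is taken as 365 days, and cold storage is treated as enabled. The name `retention_tier` is hypothetical.

```python
from datetime import timedelta
from typing import Optional

HOT_LIMIT = timedelta(minutes=30)
WARM_LIMIT = timedelta(days=30)
COLD_LIMIT = timedelta(days=365)  # assumption: "1 year" as 365 days


def retention_tier(age_since_close: timedelta) -> Optional[str]:
    """Return "hot", "warm", or "cold", or None once retention has lapsed."""
    if age_since_close < timedelta(0):
        raise ValueError("age_since_close must not be negative")
    if age_since_close < HOT_LIMIT:
        return "hot"
    if age_since_close < WARM_LIMIT:
        return "warm"
    if age_since_close < COLD_LIMIT:
        return "cold"
    return None
```

A deployment without cold storage would stop at the warm limit and delete per policy instead.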

8. Cross-References

  • Epic: timmy-config#267
  • Isolation implementation: timmy-config#269
  • Invitation protocol: timmy-config#270
  • Backend abstraction: timmy-config#271
  • Teaming model: timmy-config#272
  • Verification suite: timmy-config#273
  • Operator surface: timmy-config#274
  • Existing skill: lazarus-pit-recovery (to be updated to this spec)
  • Related protocol: timmy-config#245 (Phoenix Protocol recovery benchmarks)

9. Acceptance Criteria for This Spec

  • All downstream issues (#269 through #274) can be implemented without ambiguity about roles, states, or filesystem boundaries.
  • A new developer can read this doc and implement a compliant process backend in one session.
  • The spec has been reviewed and ACK'd by at least one other wizard before #269 merges.

Sovereignty and service always.

— Ezra