timmy-home/reports/production/2026-03-28-evennia-training-baseline.md
2026-03-28 15:33:43 -04:00


# Evennia Training Proof — 2026-03-28
Issue:
- #37 Hermes/Evennia telemetry, replay, and DPO/eval alignment
What this slice adds:
- canonical telemetry contract for the Evennia lane
- session-id sidecar mapping path
- sample trace generator
- deterministic replay/eval harness for world basics
- committed example trace/eval artifacts
Committed example artifacts:
- `training-data/evennia/examples/world-basics-trace.example.jsonl`
- `training-data/evennia/examples/world-basics-eval.example.json`
Final result:
- replay/eval now starts from a deterministic Gate anchor using a dedicated eval account (`TimmyEval`)
- sample trace generation succeeds
- world-basics eval passes cleanly
  - orientation: pass
  - navigation: pass
  - object inspection: pass
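
A pass/fail summary like the one above could be gated in a script. The following is a minimal sketch only: the schema of `world-basics-eval.example.json` is not shown in this report, so the flat `{check name: status}` dict and the helper `failing_checks` are hypothetical, and only the three check names and the `pass` status come from the list above.

```python
# Check names taken from the report; the flat {name: status} shape is an
# assumption, not the actual schema of the committed eval artifact.
REQUIRED_CHECKS = ("orientation", "navigation", "object inspection")


def failing_checks(results: dict) -> list:
    """Return the required checks that are missing or not marked 'pass'."""
    return [name for name in REQUIRED_CHECKS if results.get(name) != "pass"]


# Hypothetical results mirroring the outcome reported above.
reported = {
    "orientation": "pass",
    "navigation": "pass",
    "object inspection": "pass",
}
print(failing_checks(reported))
```

An empty list means the baseline is green; any returned name identifies a check that regressed or was never recorded.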
Canonical mapping:
- Hermes session id is the join key
- world events write to `~/.timmy/training-data/evennia/YYYYMMDD/<session_id>.jsonl`
- sidecar mapping file writes to `~/.timmy/training-data/evennia/YYYYMMDD/<session_id>.meta.json`
- the bridge binds the session id through `mcp_evennia_bind_session`
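
The path layout above can be sketched as a small helper. This is an illustration under stated assumptions, not the bridge's actual code: `session_paths` and `session_id_from_path` are hypothetical names, and only the directory layout (`YYYYMMDD/<session_id>.jsonl` plus a `.meta.json` sidecar) comes from the mapping described here.

```python
from datetime import date
from pathlib import Path
from typing import Tuple

# Root directory as written in the report; expanded to the user's home at call time.
EVENNIA_ROOT = Path("~/.timmy/training-data/evennia")


def session_paths(session_id: str, day: date, root: Path = EVENNIA_ROOT) -> Tuple[Path, Path]:
    """Return (event_log, sidecar) paths for one Hermes session id (hypothetical helper)."""
    day_dir = root.expanduser() / day.strftime("%Y%m%d")
    return day_dir / f"{session_id}.jsonl", day_dir / f"{session_id}.meta.json"


def session_id_from_path(path: Path) -> str:
    """Recover the join key from either file name (hypothetical helper)."""
    name = path.name
    # Check the longer suffix first so "x.meta.json" is not left as "x.meta".
    for suffix in (".meta.json", ".jsonl"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    raise ValueError(f"not an Evennia session file: {name}")


# Example with a made-up session id and the report's date.
events, sidecar = session_paths("abc123", date(2026, 3, 28))
print(events.name, sidecar.name)
```

Because both files share the same stem, the Hermes session id alone is enough to locate the event log and its sidecar for a given day.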
Why this matters:
- world interaction no longer disappears into an opaque side channel
- we now have a path from Hermes transcript -> Evennia event log -> replay/eval
- this complements rather than replaces NLE/MiniHack
- the persistent-world lane now has a real green baseline, not just an aspiration