timmy-home/reports/production/2026-03-28-evennia-training-baseline.md
2026-03-28 15:33:43 -04:00

Evennia Training Proof — 2026-03-28

Issue:

  • #37 Hermes/Evennia telemetry, replay, and DPO/eval alignment

What this slice adds:

  • canonical telemetry contract for the Evennia lane
  • session-id sidecar mapping path
  • sample trace generator
  • deterministic replay/eval harness for world basics
  • committed example trace/eval artifacts

Committed example artifacts:

  • training-data/evennia/examples/world-basics-trace.example.jsonl
  • training-data/evennia/examples/world-basics-eval.example.json

Final result:

  • replay/eval now starts from a deterministic Gate anchor using a dedicated eval account (TimmyEval)
  • sample trace generation succeeds
  • world-basics eval passes cleanly:
      • orientation: pass
      • navigation: pass
      • object inspection: pass
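
As a rough illustration of how a harness could reduce the three world-basics checks to a single verdict, here is a minimal sketch. The result shape (a flat mapping from check name to "pass" or "fail") and the names REQUIRED_CHECKS and failed_checks are assumptions made for this example; they are not taken from the committed eval artifact, whose actual schema is not shown in this report.

```python
# Hypothetical result shape: {"orientation": "pass", "navigation": "pass", ...}.
# The check names come from the report; the schema itself is assumed.
REQUIRED_CHECKS = ("orientation", "navigation", "object inspection")


def failed_checks(results: dict[str, str]) -> list[str]:
    """Return required checks that are missing or not marked 'pass'.

    Iterating over the fixed REQUIRED_CHECKS tuple keeps the output order
    stable from run to run, regardless of the input dict's key order.
    """
    return [name for name in REQUIRED_CHECKS if results.get(name) != "pass"]


def eval_passes(results: dict[str, str]) -> bool:
    """The eval is green only when every required check passed."""
    return not failed_checks(results)
```

Treating a missing check as a failure is a deliberate choice in this sketch: a harness that silently skips a check should not report a clean pass.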

Canonical mapping:

  • Hermes session id is the join key
  • world events write to ~/.timmy/training-data/evennia/YYYYMMDD/<session_id>.jsonl
  • sidecar mapping file writes to ~/.timmy/training-data/evennia/YYYYMMDD/<session_id>.meta.json
  • the bridge binds the session id through mcp_evennia_bind_session
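
The path layout above can be sketched in a few lines. This is an illustration under stated assumptions, not the bridge's actual code: the helper names session_paths and append_event are hypothetical, the event payload is whatever dict the caller passes in, and the call to mcp_evennia_bind_session is deliberately left out because its signature is not given in this report. Only the directory layout and the two file suffixes come from the mapping described above.

```python
import json
from datetime import date
from pathlib import Path

# Root directory as stated in the report; the helpers below are illustrative.
TRAINING_ROOT = Path.home() / ".timmy" / "training-data" / "evennia"


def session_paths(session_id: str, day: date,
                  root: Path = TRAINING_ROOT) -> tuple[Path, Path]:
    """Return (event log path, sidecar path) for one Hermes session id."""
    day_dir = root / day.strftime("%Y%m%d")  # YYYYMMDD directory
    return day_dir / f"{session_id}.jsonl", day_dir / f"{session_id}.meta.json"


def append_event(session_id: str, day: date, event: dict,
                 root: Path = TRAINING_ROOT) -> Path:
    """Append one world event as a single JSON line (hypothetical payload)."""
    log_path, _ = session_paths(session_id, day, root)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        # sort_keys gives byte-stable lines, which helps deterministic replay.
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return log_path
```

Because both files share the session id as their stem, a consumer can find the sidecar for any event log by swapping the suffix, without a separate index.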

Why this matters:

  • world interaction no longer disappears into an opaque side channel
  • we now have a path from Hermes transcript -> Evennia event log -> replay/eval
  • this complements rather than replaces NLE/MiniHack
  • the persistent-world lane now has a real green baseline, not just an aspiration