[P2.5] request_log Telemetry Table — Verify What Actually Happened #446

Open
opened 2026-04-09 22:17:30 +00:00 by perplexity · 0 comments

Source

KT Bezalel Architecture Session 2026-04-08 — Immediate Priority #3

Problem

Without telemetry, we cannot verify whether any agent actually executed what it claims to have executed. This is "non-negotiable infrastructure" per Alexander.

Schema (exact, from KT)

```sql
CREATE TABLE request_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL,
    agent_name TEXT NOT NULL,
    provider TEXT NOT NULL,
    model TEXT NOT NULL,
    endpoint TEXT NOT NULL,
    tokens_in INTEGER,
    tokens_out INTEGER,
    latency_ms INTEGER,
    status TEXT NOT NULL,  -- "success", "error", "timeout", "fallback"
    error_message TEXT
);
```
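
A minimal sketch of how the table might be created and checked from Python, using only the standard-library `sqlite3` module. The column list is copied from the schema above. The `IF NOT EXISTS` clause, the function names `init_db` and `column_names`, and the in-memory path are assumptions made for illustration and are not part of the KT schema.

```python
import sqlite3

# DDL copied from the schema above. IF NOT EXISTS is an added assumption so that
# re-running setup on a machine that already has the table is harmless.
SCHEMA = """
CREATE TABLE IF NOT EXISTS request_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL,
    agent_name TEXT NOT NULL,
    provider TEXT NOT NULL,
    model TEXT NOT NULL,
    endpoint TEXT NOT NULL,
    tokens_in INTEGER,
    tokens_out INTEGER,
    latency_ms INTEGER,
    status TEXT NOT NULL,
    error_message TEXT
)
"""


def init_db(path):
    """Open (or create) the SQLite file and make sure request_log exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def column_names(conn):
    """Return the request_log column names in declaration order."""
    return [row[1] for row in conn.execute("PRAGMA table_info(request_log)")]


if __name__ == "__main__":
    # ":memory:" is a placeholder; a real deployment would pass a file path.
    conn = init_db(":memory:")
    print(column_names(conn))
```

Comparing the output of `column_names` against the expected list is one cheap way to confirm that a machine carries the exact schema rather than a drifted copy.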

Rules

  • Every agent writes to this table BEFORE and AFTER every inference call
  • No exceptions. No summarizing. No describing what you would log. Actually write the row.
  • One SQLite file per machine, synced to Gitea periodically
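
The before-and-after rule could be sketched as a small wrapper around the inference call. This is an illustration under stated assumptions, not the agreed implementation: it assumes a single row per call that is inserted with the `pending` status named in the acceptance criteria (a value the schema comment does not list) and then updated in place. The name `logged_call`, the UTC timestamp format, and the convention that the wrapped callable returns `(tokens_in, tokens_out)` are all hypothetical.

```python
import sqlite3
import time
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS request_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL, agent_name TEXT NOT NULL, provider TEXT NOT NULL,
    model TEXT NOT NULL, endpoint TEXT NOT NULL,
    tokens_in INTEGER, tokens_out INTEGER, latency_ms INTEGER,
    status TEXT NOT NULL, error_message TEXT
)
"""


def utc_now():
    """Fixed-width UTC timestamp (an assumed format; the schema only says TEXT)."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def logged_call(conn, agent_name, provider, model, endpoint, call):
    """Write a pending row, run call(), then record the outcome on the same row.

    `call` is a hypothetical zero-argument callable returning (tokens_in, tokens_out).
    """
    # BEFORE the call: the row exists even if the process dies mid-request.
    cur = conn.execute(
        "INSERT INTO request_log (timestamp, agent_name, provider, model, endpoint, status) "
        "VALUES (?, ?, ?, ?, ?, 'pending')",
        (utc_now(), agent_name, provider, model, endpoint),
    )
    conn.commit()
    row_id = cur.lastrowid

    status, error, tokens_in, tokens_out = "success", None, None, None
    start = time.monotonic()
    try:
        tokens_in, tokens_out = call()
    except TimeoutError as exc:
        status, error = "timeout", str(exc)
    except Exception as exc:
        status, error = "error", str(exc)
    latency_ms = int((time.monotonic() - start) * 1000)

    # AFTER the call: fill in the result on the row written above.
    conn.execute(
        "UPDATE request_log SET status = ?, error_message = ?, tokens_in = ?, "
        "tokens_out = ?, latency_ms = ? WHERE id = ?",
        (status, error, tokens_in, tokens_out, latency_ms, row_id),
    )
    conn.commit()
    return row_id, status


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    # Placeholder names and a fake call stand in for a real provider request.
    print(logged_call(conn, "agent-x", "provider-y", "model-z", "/v1/chat", lambda: (12, 34)))
```

Committing the pending row before the call means a crash mid-request still leaves evidence that the call was attempted. A real wrapper would probably re-raise the exception or hand off to the fallback chain instead of only returning the status.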

What This Solves

  • The RunPod problem — 00 spent, no idea if it was used
  • The Claude fabrication problem — agent claims it did work, no proof
  • The MiMo evaluation problem — did the new agent actually run on MiMo?
  • Provider comparison — which provider is fastest, cheapest, most reliable
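
As an illustration of the provider comparison, the following sketch aggregates `request_log` per provider. The function name `provider_stats` and the sample provider names are hypothetical. The schema has no cost column, so "cheapest" can only be approximated here by total token counts, which is an assumption rather than something the issue specifies.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS request_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL, agent_name TEXT NOT NULL, provider TEXT NOT NULL,
    model TEXT NOT NULL, endpoint TEXT NOT NULL,
    tokens_in INTEGER, tokens_out INTEGER, latency_ms INTEGER,
    status TEXT NOT NULL, error_message TEXT
)
"""


def provider_stats(conn):
    """Per provider: (provider, calls, avg_latency_ms, success_rate, total_tokens).

    Rows still marked 'pending' are skipped because they have no outcome yet.
    """
    return conn.execute(
        """
        SELECT provider,
               COUNT(*) AS calls,
               AVG(latency_ms) AS avg_latency_ms,
               AVG(CASE WHEN status = 'success' THEN 1.0 ELSE 0.0 END) AS success_rate,
               SUM(COALESCE(tokens_in, 0) + COALESCE(tokens_out, 0)) AS total_tokens
        FROM request_log
        WHERE status != 'pending'
        GROUP BY provider
        ORDER BY provider
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    insert = (
        "INSERT INTO request_log (timestamp, agent_name, provider, model, endpoint, "
        "tokens_in, tokens_out, latency_ms, status) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)"
    )
    # Made-up sample rows, for demonstration only.
    conn.execute(insert, ("2026-04-09T22:00:00Z", "agent-x", "alpha", "m", "/e", 10, 20, 100, "success"))
    conn.execute(insert, ("2026-04-09T22:01:00Z", "agent-x", "beta", "m", "/e", None, None, 500, "error"))
    for row in provider_stats(conn):
        print(row)
```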

Acceptance Criteria

  • SQLite table created with exact schema above
  • Every inference call writes a row (pre-call with status="pending", post-call with the result)
  • Fallback events logged with status="fallback"
  • Error events logged with error_message populated
  • Query interface: can answer "did agent X actually call provider Y in the last hour?"
  • Ansible role deploys the telemetry table to all machines
  • Log rotation / archival strategy defined (don't let it grow unbounded)
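
Two of the criteria above, the query interface and the archival strategy, could look roughly like the sketch below. Everything beyond the `request_log` schema is an assumption: the function names, the fixed-width UTC timestamp format (needed so that plain string comparison orders rows correctly), the choice to count only `status = 'success'` rows as a real call, the 30-day retention window, and the `request_log_archive` table name.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp format; sorts correctly as text

SCHEMA = """
CREATE TABLE IF NOT EXISTS request_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL, agent_name TEXT NOT NULL, provider TEXT NOT NULL,
    model TEXT NOT NULL, endpoint TEXT NOT NULL,
    tokens_in INTEGER, tokens_out INTEGER, latency_ms INTEGER,
    status TEXT NOT NULL, error_message TEXT
)
"""


def agent_called_provider(conn, agent_name, provider, within_minutes=60):
    """True if the agent has a successful call to the provider in the window."""
    cutoff = (datetime.now(timezone.utc) - timedelta(minutes=within_minutes)).strftime(FMT)
    row = conn.execute(
        "SELECT COUNT(*) FROM request_log "
        "WHERE agent_name = ? AND provider = ? AND status = 'success' AND timestamp >= ?",
        (agent_name, provider, cutoff),
    ).fetchone()
    return row[0] > 0


def archive_old_rows(conn, keep_days=30):
    """Move rows older than keep_days into request_log_archive; return rows moved."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=keep_days)).strftime(FMT)
    with conn:  # commits on success, rolls back if any statement raises
        # "WHERE 0" copies the column layout without copying any rows.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS request_log_archive AS "
            "SELECT * FROM request_log WHERE 0"
        )
        conn.execute(
            "INSERT INTO request_log_archive SELECT * FROM request_log WHERE timestamp < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM request_log WHERE timestamp < ?", (cutoff,))
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print(agent_called_provider(conn, "agent-x", "provider-y"))  # empty table, so False
```

Moving old rows into a sibling table keeps the live table small while leaving the history queryable. Exporting the archive to a separate file before the periodic Gitea sync would be another option, and the issue leaves that choice open.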

Dependencies

  • Provider fallback chain fix (logs fallback events here)
  • Deadman switch (logs rollback events here)
  • Must be wired BEFORE resurrecting wizards
perplexity added this to the KT-2026-04-08: Infrastructure Stabilization milestone 2026-04-09 22:17:30 +00:00
bezalel was assigned by Timmy 2026-04-10 00:00:53 +00:00

Reference: Timmy_Foundation/timmy-config#446