Sovereign Operator Command Center Requirements

Status: requirements for #159
Parent: #154
Decision: v1 ownership stays in timmy-config

Goal

Define the minimum viable operator command center for Timmy: a sovereign control surface that shows real system health, queue pressure, review load, and task state over a trusted network.

This is an operator surface, not a public product surface, not a demo, and not a reboot of the archived dashboard lineage.

Non-goals

  • public internet exposure
  • a marketing or presentation dashboard
  • hidden queue mutation during polling or page refresh
  • a second shadow task database that competes with Gitea or Hermes runtime truth
  • personal-token fallback behavior hidden inside the UI or browser session
  • developer-specific local absolute paths in requirements, config, or examples

Hard requirements

1. Access model: local or Tailscale only

The operator command center must be reachable only from:

  • localhost, or
  • a Tailscale-bound interface or Tailscale-gated tunnel

It must not:

  • bind a public-facing listener by default
  • require public DNS or public ingress
  • expose a login page to the open internet
  • degrade from Tailscale identity to ad hoc password sharing

If trusted-network conditions are missing or ambiguous, the surface must fail closed.
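
The fail-closed rule above can be sketched as a bind-address check. This is a minimal illustration, not the command center's actual startup code: the function names are hypothetical, and the two networks are the address ranges Tailscale assigns to nodes. Note that 100.64.0.0/10 is shared carrier-grade NAT space, so a range match is a necessary check rather than proof that the interface really belongs to Tailscale.

```python
import ipaddress

# Address ranges Tailscale assigns to nodes (CGNAT IPv4 space and a ULA IPv6 prefix).
TAILSCALE_NETS = (
    ipaddress.ip_network("100.64.0.0/10"),
    ipaddress.ip_network("fd7a:115c:a1e0::/48"),
)


def is_trusted_bind(address: str) -> bool:
    """Return True only for loopback or Tailscale-range literal IP addresses."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Hostnames, empty strings, and garbage are ambiguous: fail closed.
        return False
    if ip.is_unspecified:
        # 0.0.0.0 and :: listen on every interface, including public ones.
        return False
    if ip.is_loopback:
        return True
    return any(ip in net for net in TAILSCALE_NETS)


def require_trusted_bind(address: str) -> str:
    """Raise instead of starting a listener on an untrusted address."""
    if not is_trusted_bind(address):
        raise RuntimeError(f"refusing to bind untrusted address: {address!r}")
    return address
```

The sketch deliberately rejects hostnames such as `localhost`, because resolving a name adds one more place where the answer could differ from what the operator intended.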

2. Truth model: operator truth beats UI theater

The command center exists to expose operator truth. That means every status tile, counter, and row must be backed by a named authoritative source and a freshness signal.

Authoritative sources for v1 are:

  • Gitea for issue, PR, review, assignee, and repo state
  • Hermes cron state and Huey runtime state for scheduled work
  • live runtime health checks, process state, and explicit agent heartbeat artifacts for agent liveness
  • direct model or service health endpoints for local inference and operator-facing services

Non-authoritative signals must never be treated as truth on their own. Examples:

  • pane color
  • old dashboard screenshots
  • manually curated status notes
  • stale cached summaries without source timestamps
  • synthetic green badges produced when the underlying source is unavailable

If a source is unavailable, the UI must say unknown, stale, or degraded. It must never silently substitute optimism.
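
One way to make "no silent optimism" concrete is to carry the source name and read time with every value and derive the displayed state from them. The sketch below is an assumption about shape only: `Reading`, `display_state`, and the `ok` label are hypothetical names, and the freshness threshold is whatever the operator chooses per source.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Reading:
    source: str                      # named authoritative source, e.g. "gitea"
    healthy: Optional[bool]          # None means the source could not be read
    observed_at: Optional[datetime]  # when the source was last read successfully


def display_state(reading: Reading, now: datetime, max_age: timedelta) -> str:
    """Map a reading to ok / degraded / stale / unknown, never defaulting to ok."""
    if reading.healthy is None or reading.observed_at is None:
        return "unknown"
    if now - reading.observed_at > max_age:
        return "stale"
    return "ok" if reading.healthy else "degraded"
```

The only path to `ok` requires both a successful read and a timestamp inside the freshness window, so an unreachable source can never render as green.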

3. Mutation model: read-first, explicit writes only

The default operator surface is read-only.

For MVP, the five required views below are read-only views. They may link the operator to the underlying source-of-truth object, but they must not mutate state merely by rendering, refreshing, filtering, or opening detail drawers.

If write actions are added later, they must live in a separate, explicit control surface with all of the following:

  • an intentional operator action
  • a confirmation step for destructive or queue-changing actions
  • a single named source-of-truth target
  • an audit trail tied to the action
  • idempotent behavior where practical
  • machine-scoped credentials, not a hidden fallback to a human personal token
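
If write actions are added later, the first four conditions above might be enforced by a single choke point that every control goes through. The following is a sketch under stated assumptions: `perform_control`, `AuditRecord`, and the `apply` callback are hypothetical, and idempotency and machine-scoped credentials are out of scope for this fragment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List


@dataclass(frozen=True)
class AuditRecord:
    actor: str     # who asked for the change
    action: str    # e.g. "pause"
    target: str    # the single named source-of-truth object
    at: datetime   # when the action was requested (UTC)


def perform_control(
    actor: str,
    action: str,
    target: str,
    confirmed: bool,
    apply: Callable[[str], None],
    audit_log: List[AuditRecord],
) -> AuditRecord:
    """Run one explicit write action, refusing anything unconfirmed."""
    if not confirmed:
        raise PermissionError("explicit operator confirmation is required")
    record = AuditRecord(actor, action, target, datetime.now(timezone.utc))
    audit_log.append(record)  # record intent before changing world state
    apply(target)
    return record
```

Because rendering and polling code never receives an `apply` callback, a read path has no way to reach this function by accident.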

4. Repo boundary: visible world is not operator truth

the-nexus is the visible world. It may eventually project summarized status outward, but it must not own the operator control surface.

The operator command center belongs with the sidecar/control-plane boundary, where Timmy already owns:

  • orchestration policy
  • cron definitions
  • playbooks
  • sidecar scripts
  • deployment and runtime governance

That makes the v1 ownership decision:

  • timmy-config owns the requirements and first implementation shape

Allowed future extraction:

  • if the command center becomes large enough to deserve its own release cycle, implementation code may later move into a dedicated control-plane repo
  • if that happens, timmy-config still remains the source of truth for policy, access requirements, and operator doctrine

Rejected owner for v1:

  • the-nexus, because it is the wrong boundary for an operator-only surface and invites demo/UI theater to masquerade as truth

Minimum viable views

Every view must show freshness and expose drill-through links or identifiers back to the source object.

| View | Must answer | Authoritative sources | MVP mutation status |
|---|---|---|---|
| Brief status | What is red right now, what is degraded, and what needs operator attention first? | Derived rollup from the four views below; no standalone shadow state | Read-only |
| Agent health | Which agents or loops are alive, stalled, rate-limited, missing, or working the wrong thing? | Runtime health checks, process state, agent heartbeats, active claim/assignment state, model/provider health | Read-only |
| Review queue | Which PRs are waiting, blocked, risky, stale, or ready for review/merge? | Gitea PR state, review comments, checks, mergeability, labels, assignees | Read-only |
| Cron state | Which scheduled jobs are enabled, paused, stale, failing, or drifting from intended schedule? | Hermes cron registry, Huey consumer health, last-run status, next-run schedule | Read-only |
| Task board | What work is unassigned, assigned, in progress, blocked, or waiting on review across the active repos? | Gitea issues, labels, assignees, milestones, linked PRs, issue state | Read-only |

View requirements in detail

Brief status

The brief status view is the operator's first screen. It must provide a compact summary of:

  • overall health state
  • current review pressure
  • current queue pressure
  • cron failures or paused jobs that matter
  • stale agent or service conditions

It must be computed from the authoritative views below, not from a separate private cache. A red item in brief status must point to the exact underlying object that caused it.
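
A rollup that keeps a pointer to its cause could look like the sketch below. The severity ordering, the `Finding` type, and the state names other than "red" and "degraded" are assumptions made for illustration; the point is that the function takes findings from the other views as input and holds no state of its own.

```python
from dataclasses import dataclass

# Higher number means more urgent; "unknown" outranks "ok" so gaps stay visible.
SEVERITY = {"ok": 0, "unknown": 1, "stale": 2, "degraded": 3, "red": 4}


@dataclass(frozen=True)
class Finding:
    view: str        # "agent health", "review queue", "cron state", or "task board"
    state: str       # one of the SEVERITY keys
    object_ref: str  # identifier or link for the exact underlying object


def brief_status(findings: list[Finding]) -> dict:
    """Roll findings up to the worst state, keeping pointers to the causes."""
    if not findings:
        # No input is not the same as healthy.
        return {"state": "unknown", "causes": []}
    worst = max(findings, key=lambda f: SEVERITY[f.state])
    causes = [f for f in findings if f.state == worst.state and f.state != "ok"]
    return {"state": worst.state, "causes": causes}
```

Every non-`ok` result carries at least one `object_ref`, which is what the drill-through requirement needs.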

Agent health

Minimum fields per agent or loop:

  • agent name
  • current state: up, down, degraded, idle, busy, rate-limited, unknown
  • last successful activity time
  • current task or claim, if any
  • model/provider or service dependency in use
  • failure mode when degraded

The view must distinguish between:

  • process missing
  • process present but unhealthy
  • healthy but idle
  • healthy and actively working
  • active but stale on one issue for too long

This view must reflect real operator concerns, not just whether a shell process exists.
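
The five cases above can be told apart with a small ordered classifier. This is a sketch, not the agreed data model: `AgentSnapshot`, its fields, and the staleness threshold are hypothetical, and a real implementation would also need to cover the rate-limited state listed in the minimum fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class AgentSnapshot:
    process_present: bool
    health_check_ok: bool
    current_claim: Optional[str]           # issue or task identifier, if any
    claim_started_at: Optional[datetime]   # when the agent picked up that claim


def classify_agent(snap: AgentSnapshot, now: datetime, max_claim_age: timedelta) -> str:
    """Return one of the five distinctions, or "unknown" when evidence is missing."""
    if not snap.process_present:
        return "process missing"
    if not snap.health_check_ok:
        return "process present but unhealthy"
    if snap.current_claim is None:
        return "healthy but idle"
    if snap.claim_started_at is None:
        # A claim without a start time cannot be aged, so do not guess.
        return "unknown"
    if now - snap.claim_started_at > max_claim_age:
        return "active but stale on one issue for too long"
    return "healthy and actively working"
```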

Review queue

Minimum fields per PR row:

  • repo
  • PR number and title
  • author
  • age
  • review state
  • mergeability or blocking condition
  • sensitive-surface flag when applicable

The queue must make it obvious which PRs require Timmy judgment versus routine review. It must not collapse all open PRs into a vanity count.

Cron state

Minimum fields per scheduled job:

  • job name
  • desired state
  • actual state
  • last run time
  • last result
  • next run time
  • pause reason or failure reason

The view must highlight drift, especially cases where:

  • config says the job exists but the runner is absent
  • a job is paused and nobody noticed
  • a job is overdue relative to its schedule
  • the runner is alive but the job has stopped producing successful runs
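
The four drift cases above reduce to comparisons between desired state, observed state, and time. The sketch below assumes a hypothetical `JobObservation` record already populated from the cron registry and runner; it does not show how Hermes or Huey are actually queried, and the two thresholds are operator-chosen parameters.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class JobObservation:
    name: str
    desired_enabled: bool                 # what config says
    runner_present: bool                  # whether a runner for the job was found
    paused: bool
    last_success_at: Optional[datetime]
    next_run_at: Optional[datetime]


def drift_reasons(job: JobObservation, now: datetime,
                  grace: timedelta, max_success_gap: timedelta) -> list[str]:
    """List every way the observed job differs from what config intends."""
    reasons = []
    if job.desired_enabled and not job.runner_present:
        reasons.append("configured but runner absent")
    if job.desired_enabled and job.paused:
        reasons.append("paused while config expects enabled")
    if (not job.paused and job.next_run_at is not None
            and now - job.next_run_at > grace):
        reasons.append("overdue relative to schedule")
    no_recent_success = (job.last_success_at is None
                         or now - job.last_success_at > max_success_gap)
    if job.runner_present and not job.paused and no_recent_success:
        reasons.append("runner alive but no recent successful run")
    return reasons
```

Returning a list rather than a single flag lets the view show all drift causes for a job at once, since they can co-occur.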

Task board

The task board is not a hand-maintained kanban. It is a projection of Gitea truth.

Minimum board lanes for MVP:

  • unassigned
  • assigned
  • in progress
  • blocked
  • in review

Lane membership must come from explicit source-of-truth signals such as assignees, labels, linked PRs, and issue state. If the mapping is ambiguous, the card must say so rather than invent certainty.
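
Lane assignment as a pure function of source signals might be sketched as follows. The label names `blocked` and `in-progress` are assumptions, not labels confirmed to exist in the repos, and `IssueFacts` is a hypothetical simplification of what Gitea reports for an issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IssueFacts:
    is_open: bool
    assignees: tuple[str, ...] = ()
    labels: frozenset[str] = frozenset()
    has_open_linked_pr: bool = False


def lane_for(issue: IssueFacts) -> Optional[str]:
    """Return the board lane, an explicit "ambiguous" marker, or None if closed."""
    if not issue.is_open:
        return None  # closed issues are not on the board
    candidates = []
    if "blocked" in issue.labels:
        candidates.append("blocked")
    if issue.has_open_linked_pr:
        candidates.append("in review")
    if "in-progress" in issue.labels:
        candidates.append("in progress")
    if len(candidates) > 1:
        # Conflicting signals: say so instead of picking a winner.
        return "ambiguous: " + ", ".join(candidates)
    if candidates:
        return candidates[0]
    return "assigned" if issue.assignees else "unassigned"
```

An issue labeled both `blocked` and `in-progress` comes back as `ambiguous: blocked, in progress`, which the card can display verbatim.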

Read-only versus mutating surfaces

Read-only for MVP

The following are read-only in MVP:

  • brief status
  • agent health
  • review queue
  • cron state
  • task board
  • all filtering, sorting, searching, and drill-down behavior

May mutate later, but only as explicit controls

The following are acceptable future mutation classes if they are isolated behind explicit controls and audit:

  • pause or resume a cron job
  • dispatch, assign, unassign, or requeue a task in Gitea
  • post a review action or merge action to a PR
  • restart or stop a named operator-managed agent/service

These controls must never be mixed invisibly into passive status polling. The operator must always know when a click is about to change world state.

Truth versus theater rules

The command center must follow these rules:

  1. No hidden side effects on read.
  2. No green status without a timestamped source.
  3. No second queue that disagrees with Gitea.
  4. No synthetic task board curated by hand.
  5. No stale cache presented as live truth.
  6. No public-facing polish requirement may override operator clarity.
  7. No fallback to personal human tokens when machine identity is missing.
  8. No developer-specific local absolute paths in requirements, config examples, or UI copy.

Credential and identity requirements

The surface must use machine-scoped or service-scoped credentials for any source it reads or writes.

It must not rely on:

  • a principal's browser session as the only auth story
  • a hidden file lookup chain for a human token
  • a personal access token copied into client-side code
  • ambiguous fallback identity that changes behavior depending on who launched the process
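
A credential loader with exactly one lookup and no fallback chain illustrates the intent. The environment variable name `COMMAND_CENTER_GITEA_TOKEN` and the function name are hypothetical; how the machine credential is actually provisioned is not specified by this document.

```python
import os
from typing import Mapping, Optional


class MissingMachineCredential(RuntimeError):
    """Raised when the service credential is absent; there is no fallback."""


def load_machine_token(
    env: Optional[Mapping[str, str]] = None,
    var: str = "COMMAND_CENTER_GITEA_TOKEN",  # hypothetical variable name
) -> str:
    """Read the machine-scoped token from one named place, or fail closed."""
    source = os.environ if env is None else env
    token = source.get(var, "").strip()
    if not token:
        # Deliberately no search of home directories, browser state, or other files.
        raise MissingMachineCredential(
            f"{var} is not set; refusing to fall back to a personal token"
        )
    return token
```

Since the lookup does not depend on the launching user's home directory or session, the process behaves the same no matter who starts it.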

Remote operator access is granted by Tailscale identity and network reachability, not by making the surface public and adding a thin password prompt later.

Recommended v1 implementation shape

  • implement the operator command center as a sidecar-owned surface under timmy-config
  • keep the first version read-only
  • prefer direct reads from Gitea, Hermes cron state, Huey/runtime state, and service health endpoints
  • attach freshness metadata to every view
  • treat drill-through links to source objects as mandatory, not optional
  • postpone write controls until audit, identity, and source-of-truth mapping are explicit

Acceptance criteria for this requirement set

  • the minimum viable views are fixed as: agent health, review queue, cron state, task board, brief status
  • the access model is explicitly local or Tailscale only
  • operator truth is defined and separated from demo/UI theater
  • read-only versus mutating behavior is explicitly separated
  • repo ownership is decided: timmy-config owns v1 requirements and implementation boundary
  • no local absolute paths are required by this design
  • no human-token fallback pattern is allowed by this design