[Observability] Dashboard System Health Monitor: Status Page for All Agent Subsystems #1424

Closed
opened 2026-03-24 13:04:41 +00:00 by Timmy · 1 comment
Owner

Context: Currently, the only way to determine which agents are alive is to cat the active JSON configs.

Acceptance Criteria:

  • Build an /agents/status REST route within Timmy-time-dashboard.
  • Provide a beautiful visual UI panel indicating heartbeat status and active queue consumption.
Author
Owner

Implementation Plan for Dashboard System Health Monitor

Scope: Create /agents/status REST route and beautiful visual UI panel for agent subsystem monitoring.

Current State Analysis

  • Agent health infrastructure exists in src/timmy/vassal/agent_health.py
  • Agent presence tracking in src/infrastructure/presence.py
  • Existing /api/agents route in src/dashboard/routes/scorecards.py
  • Need dedicated status endpoint and UI panel

Implementation Steps

  1. Backend REST Route

    • Add /agents/status route to src/dashboard/routes/monitoring.py
    • Return JSON with agent heartbeats, queue consumption, and health status
    • Include: name, status, last_seen, queue_items, health_score
  2. Agent Status Collection

    • Leverage existing AgentHealthReport from agent_health.py
    • Add queue consumption metrics from .loop/state.json
    • Include heartbeat timestamps and connection status
  3. Frontend UI Panel

    • Create new status page template at templates/agents_status.html
    • Add route for status page view: /agents/status-page
    • Use existing dashboard styling and components
  4. Visual Elements

    • Status indicators: green (healthy), yellow (warning), red (offline)
    • Real-time heartbeat display (last seen timestamp)
    • Queue consumption graphs/metrics
    • Auto-refresh every 30 seconds
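
The status indicators in step 4 could be derived from the age of each agent's last heartbeat. The sketch below is only an illustration of that idea: the function names and both thresholds (60 seconds for warning, 300 seconds for offline) are assumptions, not values taken from this issue or from agent_health.py.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds; the issue does not specify when an agent becomes
# "warning" or "offline".
WARNING_AFTER = timedelta(seconds=60)
OFFLINE_AFTER = timedelta(seconds=300)

# Colour mapping from step 4 of the plan.
STATUS_COLORS = {"healthy": "green", "warning": "yellow", "offline": "red"}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-03-24T14:10:00Z."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def classify_heartbeat(last_seen: str, now: datetime) -> str:
    """Return "healthy", "warning", or "offline" based on heartbeat age."""
    age = now - parse_timestamp(last_seen)
    if age >= OFFLINE_AFTER:
        return "offline"
    if age >= WARNING_AFTER:
        return "warning"
    return "healthy"


def indicator_color(last_seen: str, now: datetime) -> str:
    """Map a heartbeat timestamp to the indicator colour for the UI panel."""
    return STATUS_COLORS[classify_heartbeat(last_seen, now)]


if __name__ == "__main__":
    now = datetime(2026, 3, 24, 14, 12, 0, tzinfo=timezone.utc)
    print(indicator_color("2026-03-24T14:10:00Z", now))
```

Under these assumed thresholds, an agent last seen at 14:10:00Z and checked at 14:12:00Z is 120 seconds stale, so it is classified as warning and shown in yellow. Passing `now` explicitly keeps the function easy to test.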

Files to Create/Modify

  • src/dashboard/routes/monitoring.py (add status endpoint)
  • templates/agents_status.html (new status page)
  • static/css/agents_status.css (styling)
  • Add navigation link to main dashboard

Data Structure

{
  "agents": [
    {
      "name": "kimi",
      "status": "active",
      "last_seen": "2026-03-24T14:10:00Z",
      "queue_items": 15,
      "health_score": 0.95
    }
  ],
  "system_health": "good",
  "last_updated": "2026-03-24T14:12:00Z"
}
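
One way to keep the route's output aligned with this structure is to model it with dataclasses and serialise from there. This is a minimal sketch: the class names AgentStatus and StatusResponse and the helper to_json are hypothetical, and it does not use the existing AgentHealthReport, whose fields are not shown in this issue.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AgentStatus:
    """One entry in the "agents" array (hypothetical class name)."""
    name: str
    status: str
    last_seen: str  # ISO 8601 timestamp, e.g. 2026-03-24T14:10:00Z
    queue_items: int
    health_score: float


@dataclass
class StatusResponse:
    """Top-level payload for /agents/status (hypothetical class name)."""
    agents: list[AgentStatus]
    system_health: str
    last_updated: str


def to_json(response: StatusResponse) -> str:
    """Serialise the response; asdict() recurses into the nested agents."""
    return json.dumps(asdict(response), indent=2)


if __name__ == "__main__":
    response = StatusResponse(
        agents=[AgentStatus("kimi", "active", "2026-03-24T14:10:00Z", 15, 0.95)],
        system_health="good",
        last_updated="2026-03-24T14:12:00Z",
    )
    print(to_json(response))
```

Because dataclass fields keep their declaration order, the serialised keys appear in the same order as in the example above.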

Acceptance Criteria

  • /agents/status REST endpoint returns agent health JSON
  • Beautiful visual UI panel shows all agent status
  • Heartbeat indicators update in real time (on each 30-second auto-refresh)
  • Queue consumption metrics displayed
  • Auto-refresh functionality works
  • All existing tests still pass
  • Responsive design works on mobile
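
The first criterion could be backed by a shape check on the JSON the endpoint returns. The sketch below is one hedged way to do that: validate_status_payload is a hypothetical helper, and the expected types are inferred from the example payload in the Data Structure section rather than from any existing schema.

```python
# Expected fields and types, inferred from the example payload (assumption).
TOP_LEVEL_FIELDS = {"agents": list, "system_health": str, "last_updated": str}
AGENT_FIELDS = {
    "name": str,
    "status": str,
    "last_seen": str,
    "queue_items": int,
    "health_score": (int, float),
}


def validate_status_payload(payload: dict) -> list[str]:
    """Return a list of problems found in the payload; empty means valid."""
    errors = []
    for key, expected in TOP_LEVEL_FIELDS.items():
        if key not in payload:
            errors.append(f"missing top-level field: {key}")
        elif not isinstance(payload[key], expected):
            errors.append(f"wrong type for top-level field: {key}")

    agents = payload.get("agents")
    if not isinstance(agents, list):
        return errors  # Cannot inspect entries without a list.

    for index, agent in enumerate(agents):
        if not isinstance(agent, dict):
            errors.append(f"agents[{index}] is not an object")
            continue
        for key, expected in AGENT_FIELDS.items():
            if key not in agent:
                errors.append(f"agents[{index}] missing field: {key}")
            elif not isinstance(agent[key], expected):
                errors.append(f"agents[{index}] wrong type for field: {key}")
    return errors
```

A test could call this helper on the decoded response body and assert that the returned list is empty.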

Priority: HIGH - Improves operational visibility and reduces manual status checking.

kimi was assigned by Timmy 2026-03-24 14:12:42 +00:00
kimi was unassigned by Timmy 2026-03-24 19:32:15 +00:00
Timmy closed this issue 2026-03-24 21:54:06 +00:00

Reference: Rockachopa/Timmy-time-dashboard#1424