Agent Tick Protocol v1.0

Purpose: Establish baseline health monitoring for all Timmy Time agents, with automated root cause analysis (RCA) issues opened on failure.

Frequency: Monthly (collection window opens on the 1st of each month)

Scope: All agents in Timmy Time group


Protocol Specification

What Constitutes a Tick

Every agent must emit a Tick Record containing:

{
  "agent_id": "allegro",
  "agent_name": "Allegro",
  "role": "tempo-and-dispatch",
  "timestamp": "2026-04-02T00:00:00Z",
  "tick_month": "2026-04",
  "status": "healthy",
  "vitals": {
    "gateway_running": true,
    "home_directory_accessible": true,
    "config_valid": true,
    "last_user_interaction": "2026-04-01T23:45:00Z",
    "work_items_completed_this_month": 47
  },
  "capabilities": {
    "telegram": true,
    "api_server": true,
    "gitea_access": true
  },
  "notes": ""
}
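As a rough illustration of how a Tick Record like the one above could be checked before it is committed, the sketch below validates the top-level fields. The field names come from the example; the function name `validate_tick`, the choice to treat every top-level field (including `notes`) as required, and the rule that `tick_month` must match the timestamp are assumptions, not part of the protocol.

```python
from datetime import datetime

# Field names are taken from the example Tick Record; treating all of them
# as required is an assumption made for this sketch.
REQUIRED_FIELDS = ("agent_id", "agent_name", "role", "timestamp",
                   "tick_month", "status", "vitals", "capabilities", "notes")
VALID_STATUSES = {"healthy", "degraded", "critical", "offline"}


def validate_tick(record: dict) -> list[str]:
    """Return a list of problems found in a tick record (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if name not in record]
    if problems:
        return problems
    if record["status"] not in VALID_STATUSES:
        problems.append(f"unknown status: {record['status']}")
    try:
        # Map the trailing "Z" used in the example to an explicit UTC offset.
        stamp = datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        problems.append("timestamp is not an ISO 8601 string")
    else:
        # Assumed rule: tick_month is the YYYY-MM prefix of the timestamp.
        if stamp.strftime("%Y-%m") != record["tick_month"]:
            problems.append("tick_month does not match timestamp")
    return problems
```

Returning a list of problems rather than raising on the first one lets a submitter report everything wrong with a record in a single pass.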

Tick Status Values

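| Status   | Meaning                          | Action Required  |
|----------|----------------------------------|------------------|
| healthy  | All systems operational          | None             |
| degraded | Some capabilities impaired       | Monitor          |
| critical | Core functionality compromised   | RCA required     |
| offline  | Agent unreachable                | RCA + escalation |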
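The status-to-action mapping above can be expressed as a small lookup. This is only a sketch: the `Action` enum, `STATUS_ACTIONS`, and `requires_rca` are hypothetical names chosen for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "None"
    MONITOR = "Monitor"
    RCA = "RCA required"
    RCA_AND_ESCALATE = "RCA + escalation"


# One entry per status value listed in the protocol.
STATUS_ACTIONS = {
    "healthy": Action.NONE,
    "degraded": Action.MONITOR,
    "critical": Action.RCA,
    "offline": Action.RCA_AND_ESCALATE,
}


def requires_rca(status: str) -> bool:
    """True when the status calls for an RCA (with or without escalation)."""
    return STATUS_ACTIONS[status] in (Action.RCA, Action.RCA_AND_ESCALATE)
```

An unknown status raises `KeyError` here, which seems preferable to silently treating it as healthy.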

Tick Submission

Method 1: Direct Commit to household-snapshots

Agents commit their tick to:

ticks/2026-04/allegro.json
ticks/2026-04/adagio.json
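A minimal sketch of how a submitter might derive that path and write the file, assuming the layout `ticks/<tick_month>/<agent_id>.json` shown above. The helper names `tick_path` and `write_tick` are hypothetical, and the git commit itself is left out.

```python
import json
from pathlib import Path


def tick_path(repo_root: Path, tick_month: str, agent_id: str) -> Path:
    """Build ticks/<tick_month>/<agent_id>.json under the repository root."""
    return repo_root / "ticks" / tick_month / f"{agent_id}.json"


def write_tick(repo_root: Path, record: dict) -> Path:
    """Write a tick record to its expected location and return the path."""
    path = tick_path(repo_root, record["tick_month"], record["agent_id"])
    path.parent.mkdir(parents=True, exist_ok=True)  # create ticks/<month>/ if needed
    path.write_text(json.dumps(record, indent=2) + "\n", encoding="utf-8")
    return path
```

Deriving the path from the record's own `tick_month` and `agent_id` keeps the file name and the payload from drifting apart.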

Method 2: API Endpoint (Future)

POST the tick payload to the Evenia world tick endpoint (not yet available).


Monitoring & Enforcement

Monthly Tick Collection Window

  • Opens: 1st of month at 00:00 UTC
  • Closes: 3rd of month at 23:59 UTC
  • Duration: 72 hours (this is the full window, not an extension past the close)
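The window arithmetic can be sketched as follows. The function names are hypothetical, and the window is modelled as a half-open interval from the 1st at 00:00 UTC to 72 hours later, which is assumed to be equivalent to "closes on the 3rd at 23:59 UTC" at minute resolution.

```python
from datetime import datetime, timedelta, timezone

WINDOW_LENGTH = timedelta(hours=72)


def collection_window(tick_month: str) -> tuple[datetime, datetime]:
    """Return (opens, end) in UTC for a tick month such as "2026-04".

    `end` is exclusive: the first instant after the window has closed.
    """
    year, month = (int(part) for part in tick_month.split("-"))
    opens = datetime(year, month, 1, tzinfo=timezone.utc)
    return opens, opens + WINDOW_LENGTH


def in_window(timestamp: str) -> bool:
    """Check whether an ISO 8601 timestamp falls inside its month's window."""
    stamp = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    stamp = stamp.astimezone(timezone.utc)
    opens, end = collection_window(stamp.strftime("%Y-%m"))
    return opens <= stamp < end
```

For example, `in_window("2026-04-02T00:00:00Z")` is true, while a tick stamped on the 4th at 00:00 UTC falls outside the window.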

Automated Checks

  1. Tick Presence Check (4th of month)

    • Verify all registered agents have submitted ticks
    • Missing ticks → Gitea issue created
  2. Status Validation (4th of month)

    • Check all submitted ticks for critical or offline status
    • critical or offline status → Gitea issue created
  3. RCA Auto-Generation (4th of month)

    • Issues created with RCA template pre-filled
    • Assigned to agent owner
    • Due date: 7 days after issue creation
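The presence and status checks above could be combined along these lines. This is a sketch under assumptions, not the contents of `agent_tick_monitor.py`: the function name `find_failures` is hypothetical, ticks are assumed to live at `ticks/<tick_month>/<agent_id>.json`, and creating the Gitea issue is deliberately left out.

```python
import json
from pathlib import Path


def find_failures(repo_root: Path, tick_month: str,
                  agent_ids: list[str]) -> dict[str, str]:
    """Map each failing agent ID to a failure type used by the RCA template."""
    failures = {}
    for agent_id in agent_ids:
        path = repo_root / "ticks" / tick_month / f"{agent_id}.json"
        if not path.is_file():
            # Check 1: no tick was submitted for this month.
            failures[agent_id] = "MISSING_TICK"
            continue
        # Malformed JSON raises here; a real monitor would need to decide
        # how to classify that case.
        status = json.loads(path.read_text(encoding="utf-8")).get("status")
        # Check 2: the tick exists but reports a failing status.
        if status == "critical":
            failures[agent_id] = "CRITICAL_STATUS"
        elif status == "offline":
            failures[agent_id] = "OFFLINE"
    return failures
```

Each entry in the returned mapping would correspond to one issue to open and assign to the agent's owner.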

RCA Template

When an agent misses its tick or reports critical or offline status, the following template is used:

## Agent Health Failure: [AGENT_ID]

**Detected:** [DATE]
**Agent:** [AGENT_NAME]
**Failure Type:** [MISSING_TICK | CRITICAL_STATUS | OFFLINE]

### Expected Behavior
Agent should emit monthly tick within 72-hour window.

### Actual Behavior
- Tick Status: [STATUS]
- Last Known Good: [DATE]
- Capabilities Lost: [LIST]

### Root Cause Analysis Required

Please investigate and document:

1. **What happened?**
   - Last successful operation
   - Error logs (if any)
   - System state at failure

2. **Why did it happen?**
   - Configuration drift
   - Resource exhaustion
   - External dependency failure
   - Code regression

3. **How do we prevent recurrence?**
   - Monitoring improvements
   - Automated recovery
   - Alert tuning

4. **Recovery steps taken**
   - Actions performed
   - Current status
   - Validation performed

### Timeline

- [ ] T+0: Issue created (auto)
- [ ] T+1h: Initial response
- [ ] T+24h: RCA submitted
- [ ] T+7d: Resolution verified

### Related
- Previous tick: [LINK]
- Agent config: [LINK]
- Logs: [LINK]

---
*Auto-generated by Agent Tick Monitor*
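Pre-filling the template can be as simple as placeholder substitution. The sketch below covers only the header lines, because `[DATE]` and `[LINK]` appear more than once in the body with different meanings and would need per-line handling. The function name `fill_header` is hypothetical.

```python
# Header lines copied from the RCA template.
HEADER = (
    "## Agent Health Failure: [AGENT_ID]\n"
    "\n"
    "**Detected:** [DATE]\n"
    "**Agent:** [AGENT_NAME]\n"
    "**Failure Type:** [MISSING_TICK | CRITICAL_STATUS | OFFLINE]\n"
)


def fill_header(agent_id: str, agent_name: str,
                detected: str, failure_type: str) -> str:
    """Substitute the bracketed placeholders in the template header."""
    replacements = {
        "[AGENT_ID]": agent_id,
        "[AGENT_NAME]": agent_name,
        "[DATE]": detected,
        "[MISSING_TICK | CRITICAL_STATUS | OFFLINE]": failure_type,
    }
    text = HEADER
    for placeholder, value in replacements.items():
        text = text.replace(placeholder, value)
    return text
```

Calling `fill_header("timmy", "Timmy Time", "2026-04-04", "MISSING_TICK")` yields a header whose first line is `## Agent Health Failure: timmy`.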

Agent Registry

Current agents in scope:

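| Agent ID | Name       | Role               | Owner     | Status |
|----------|------------|--------------------|-----------|--------|
| allegro  | Allegro    | tempo-and-dispatch | Alexander | active |
| adagio   | Adagio     | breath-and-design  | Alexander | active |
| timmy    | Timmy Time | father-house       | Alexander | active |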
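For illustration, the registry rows above could be modelled in memory like this. The actual schema of `config/agent-registry.json` is not shown in this document, so the `Agent` class and its field names are assumptions that simply mirror the table columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    agent_id: str
    name: str
    role: str
    owner: str
    status: str


# Rows copied from the registry table.
REGISTRY = [
    Agent("allegro", "Allegro", "tempo-and-dispatch", "Alexander", "active"),
    Agent("adagio", "Adagio", "breath-and-design", "Alexander", "active"),
    Agent("timmy", "Timmy Time", "father-house", "Alexander", "active"),
]


def active_agent_ids(registry: list[Agent]) -> list[str]:
    """IDs of agents expected to submit a tick this month."""
    return [agent.agent_id for agent in registry if agent.status == "active"]
```

Filtering on `status` means an agent marked inactive would not trigger a missing-tick issue.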

Implementation

Files

  • scripts/agent_tick_monitor.py - Monthly monitoring
  • scripts/agent_tick_submitter.py - Agent self-reporting
  • templates/rca-template.md - RCA issue template
  • config/agent-registry.json - Agent definitions

Cron Schedule

# Monthly tick collection - 1st of month at 00:00
# (cron uses the host's local time; the host is assumed to run in UTC)
0 0 1 * * /usr/bin/python3 /root/wizards/household-snapshots/scripts/agent_tick_submitter.py --all

# Tick validation and RCA trigger - 4th of month at 00:00, after the window closes
0 0 4 * * /usr/bin/python3 /root/wizards/household-snapshots/scripts/agent_tick_monitor.py --check-and-report

Version History

| Version | Date       | Changes          |
|---------|------------|------------------|
| 1.0     | 2026-04-02 | Initial protocol |

Evenia binds us. Health is monitored. Failures are learned from.