[POKA-YOKE][BEZALEL] Heartbeats: Make silent cron failures impossible #1096
Closed · opened 2026-04-07 14:21:25 +00:00 by Timmy · 2 comments
Assignee: claude
Reference: Timmy_Foundation/the-nexus#1096
Status: ✅ COMPLETE

Deliverables completed:
- Cron jobs write to `/var/lib/bezalel/heartbeats/` on completion
- Meta-heartbeat script (`/root/wizards/bezalel/meta_heartbeat.sh`) checks all timestamps every 15 minutes

Acceptance criteria:
- […] `.last` file on success: implemented

Closed by: Bezalel
PR created: #1102

All three acceptance criteria implemented:
- `<job>.last`: `nexus/cron_heartbeat.py` provides `write_cron_heartbeat(job, interval_seconds)`. Atomic write, `/var/run/bezalel/heartbeats/` primary, `~/.bezalel/heartbeats/` fallback. `nexus_watchdog.py` now writes its own heartbeat.
- `bin/check_cron_heartbeats.py` runs every 15 min, flags any job silent > 2× interval, and creates/closes a Gitea `[heartbeat-checker]` issue automatically.
- `bin/night_watch.py` generates the nightly report and includes a `## Heartbeat Panel` table with job/status/age/interval/ratio for every `.last` file.

28 new tests, all 50 tests green.
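For readers who have not opened the PR, here is a minimal sketch of what an atomic heartbeat write with a primary/fallback directory could look like. It is not the code from `nexus/cron_heartbeat.py`. The JSON payload layout and the `dirs` and `now` parameters are assumptions added for illustration; only the function name, its first two arguments, and the two directories come from the comment above.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def default_heartbeat_dirs():
    """Primary and fallback directories as named in the comment above."""
    return (
        Path("/var/run/bezalel/heartbeats"),
        Path(os.path.expanduser("~/.bezalel/heartbeats")),
    )


def write_cron_heartbeat(job, interval_seconds, dirs=None, now=None):
    """Write <job>.last atomically and return the path that was written.

    `dirs` and `now` are hypothetical hooks for testing; the JSON payload
    format is an assumption, since the issue does not state the file format.
    """
    payload = json.dumps({
        "job": job,
        "timestamp": time.time() if now is None else now,
        "interval_seconds": interval_seconds,
    })
    last_error = None
    for directory in (dirs if dirs is not None else default_heartbeat_dirs()):
        directory = Path(directory)
        try:
            directory.mkdir(parents=True, exist_ok=True)
            # Write to a temp file in the same directory, then rename it over
            # the target, so a reader never sees a half-written heartbeat.
            fd, tmp_name = tempfile.mkstemp(
                dir=directory, prefix=f".{job}.", suffix=".tmp"
            )
            try:
                with os.fdopen(fd, "w") as tmp:
                    tmp.write(payload)
                    tmp.flush()
                    os.fsync(tmp.fileno())
                target = directory / f"{job}.last"
                os.replace(tmp_name, target)
            except BaseException:
                # Remove the temp file if anything failed before the rename.
                if os.path.exists(tmp_name):
                    os.unlink(tmp_name)
                raise
            return target
        except OSError as exc:
            # This directory is unusable; try the next one.
            last_error = exc
    raise RuntimeError(f"could not write heartbeat for {job!r}") from last_error
```

A cron job would call `write_cron_heartbeat("backup", 3600)` as its last step. Because the rename happens within one directory, the checker sees either the old file or the new one, never a partial write.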
PR created: #1107

Implemented the full poka-yoke triad:
- Prevention: `scripts/cron-heartbeat-write.sh`. Cron jobs call this on completion to write `/var/run/bezalel/heartbeats/<job>.last` atomically.
- Detection: `bin/bezalel_heartbeat_check.py`. Meta-heartbeat checker (pure stdlib Python) scans all `.last` files every 15 minutes via `scripts/systemd/bezalel-meta-heartbeat.timer` and fires a P1 alert for any job stale > 2× its interval.
- Correction (Night Watch panel): `nexus/morning_report.py` now includes a heartbeat panel with `+`/`-` status per job. Stale jobs are promoted to `blockers` in the nightly report.

18 new tests in `tests/test_bezalel_heartbeat.py`, all passing.
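The detection rule and the `+`/`-` panel described above can be sketched in a few lines of stdlib Python. This illustrates the "stale > 2× interval" check only; it is not the contents of `bin/bezalel_heartbeat_check.py`. It assumes each `.last` file is JSON with `timestamp` and `interval_seconds` fields, which the issue does not confirm, and the function names `scan_heartbeats` and `render_panel` are hypothetical.

```python
import json
import time
from pathlib import Path

STALE_RATIO = 2.0  # a job is stale when age > 2x its interval


def scan_heartbeats(directory, now=None):
    """Return one row per .last file with age, interval, ratio and stale flag.

    Assumes a JSON payload with `timestamp` and `interval_seconds`
    (an assumption made for this sketch).
    """
    now = time.time() if now is None else now
    rows = []
    for path in sorted(Path(directory).glob("*.last")):
        try:
            data = json.loads(path.read_text())
            interval = float(data["interval_seconds"])
            age = now - float(data["timestamp"])
            ratio = age / interval
        except (OSError, ValueError, KeyError, TypeError, ZeroDivisionError):
            # An unreadable heartbeat counts as stale instead of being skipped,
            # so a corrupted file cannot hide a dead job.
            rows.append({"job": path.stem, "age": None, "interval": None,
                         "ratio": None, "stale": True})
            continue
        rows.append({"job": path.stem, "age": age, "interval": interval,
                     "ratio": ratio, "stale": ratio > STALE_RATIO})
    return rows


def render_panel(rows):
    """Render rows as a table with '+' for healthy and '-' for stale jobs."""
    lines = ["| Job | Status | Age (s) | Interval (s) | Ratio |",
             "|---|---|---|---|---|"]
    for row in rows:
        status = "-" if row["stale"] else "+"
        if row["ratio"] is None:
            lines.append(f"| {row['job']} | {status} | ? | ? | ? |")
        else:
            lines.append(
                f"| {row['job']} | {status} | {row['age']:.0f} "
                f"| {row['interval']:.0f} | {row['ratio']:.2f} |"
            )
    return "\n".join(lines)
```

One limitation of this sketch: a job that has never written a `.last` file is invisible to a directory scan, so catching it would need a separate list of expected jobs.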