BUG: Cron ticker thread not starting in gateway — 30 jobs stuck #342

Closed
opened 2026-04-13 06:49:09 +00:00 by Timmy · 0 comments
Owner

Root Cause

The gateway process (PID 83608, started 2026-04-13 02:02 AM) is alive, but the cron ticker thread never started.

Evidence:

  • Gateway log shows "Cron ticker started (interval=60s)" only on April 10 (two old entries)
  • No "Cron ticker started" message for the current gateway session (April 13)
  • Gateway log has 0 tick mentions in the last 200 lines
  • _start_cron_ticker is defined at line 8718 of gateway/run.py and called at line 8924
  • The thread is created but never starts executing cron_tick()

Impact:

  • 30+ cron jobs are scheduled but never execute
  • mimo-swarm-dispatcher, mimo-swarm-worker-2, mimo-swarm-worker-3 stuck with stale errors
  • All burn loops, nightwatch jobs, and swarm jobs affected
  • The only jobs that run are those triggered manually via hermes cron run <id>

Diagnosis Steps

  1. hermes cron list shows jobs with Next run in the past — scheduler not ticking
  2. grep "Cron ticker" ~/.hermes/logs/gateway.log shows only April 10 entries
  3. hermes cron tick works manually — the scheduler code is fine, the thread just isn't running
  4. Gateway PID is alive (kill -0 83608 succeeds) but no cron activity
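
Steps 2 and 4 above can be scripted for repeat checks. This is a minimal sketch, not part of hermes: `pid_alive` and `ticker_mentions` are hypothetical helper names, and the only things taken from this issue are the log path, the PID, and the "Cron ticker" search string. It is POSIX-only, because `os.kill(pid, 0)` behaves differently on Windows.

```python
import os
from collections import deque
from pathlib import Path


def pid_alive(pid: int) -> bool:
    """POSIX equivalent of `kill -0 <pid>`: signal 0 only checks that the PID exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def ticker_mentions(log_path: Path, tail: int = 200, needle: str = "Cron ticker") -> list[str]:
    """Return the lines containing `needle` among the last `tail` lines of the log."""
    with log_path.open(encoding="utf-8", errors="replace") as fh:
        last_lines = deque(fh, maxlen=tail)  # keeps only the final `tail` lines
    return [line.rstrip("\n") for line in last_lines if needle in line]


if __name__ == "__main__":
    log = Path.home() / ".hermes" / "logs" / "gateway.log"
    print("gateway alive:", pid_alive(83608))
    if log.exists():
        print("ticker mentions in tail:", len(ticker_mentions(log)))
```

A live PID combined with zero ticker mentions in the tail reproduces the state described in this issue.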

Hypothesis

The gateway's _start_cron_ticker thread is created at line 8924 via threading.Thread(target=_start_cron_ticker, ...). The thread may be:

  • Blocked waiting on adapters or a loop that are never provided
  • Crashing silently on startup (exception swallowed by an outer try/except)
  • Never started because the gateway was launched with --replace, which may skip the ticker
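
To make the second and third failure modes concrete, here is a sketch of a ticker thread that cannot fail silently. It is not the code in gateway/run.py: `run_ticker`, `start_ticker`, the logger name, and the thread name are assumptions for illustration, and `tick` stands in for whatever the real scheduler entry point is. It logs on entry, logs every exception with a traceback, and makes the `thread.start()` call explicit.

```python
import logging
import threading

logger = logging.getLogger("gateway.cron")  # hypothetical logger name


def run_ticker(tick, stop_event: threading.Event, interval: float = 60.0) -> None:
    """Thread body: log on entry, then call tick() until stop_event is set."""
    logger.info("Cron ticker started (interval=%ss)", interval)
    while not stop_event.is_set():
        try:
            tick()
        except Exception:
            # Log the traceback and keep ticking instead of dying silently.
            logger.exception("Cron tick failed")
        # wait() returns early once stop_event is set, so shutdown is prompt.
        stop_event.wait(interval)


def start_ticker(tick, interval: float = 60.0):
    """Create AND start the ticker thread; return it with its stop event."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=run_ticker,
        args=(tick, stop_event, interval),
        name="cron-ticker",
        daemon=True,
    )
    thread.start()  # without this call the Thread object exists but never runs
    return thread, stop_event
```

Because the first tick runs immediately on entry, a design like this would also satisfy a "fires within 60 seconds of gateway start" requirement. Keeping the returned thread lets the gateway check `thread.is_alive()` later.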

Acceptance Criteria

  1. Gateway log shows "Cron ticker started" on every gateway restart
  2. Cron jobs fire within 60 seconds of gateway start
  3. Manual hermes cron tick is not required for jobs to execute
  4. All 4 mimo-swarm jobs (dispatcher + 3 workers) run successfully on schedule
  5. No stale "interpreter shutdown" errors after gateway replacement
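
Criterion 2 can be checked mechanically once next-run times are available. The sketch below assumes they have already been parsed into a mapping of job name to timezone-aware datetime. The output format of hermes cron list is not shown in this issue, so no parsing is attempted, and `overdue_jobs` and the sample times are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def overdue_jobs(
    next_runs: dict[str, datetime],
    now: datetime,
    grace: timedelta = timedelta(seconds=60),
) -> list[str]:
    """Return names of jobs whose next run lies more than `grace` in the past."""
    return sorted(name for name, due in next_runs.items() if now - due > grace)


if __name__ == "__main__":
    now = datetime(2026, 4, 13, 6, 49, tzinfo=timezone.utc)
    sample = {
        # Illustrative times only; real values would come from the scheduler.
        "mimo-swarm-dispatcher": now - timedelta(hours=4),
        "mimo-swarm-worker-2": now + timedelta(seconds=30),
    }
    print(overdue_jobs(sample, now))
```

An empty result shortly after a gateway restart would indicate the ticker is running.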

Files

  • gateway/run.py: _start_cron_ticker (line 8718), thread creation (line 8924)
  • cron/scheduler.py: tick() function
  • Gateway log: ~/.hermes/logs/gateway.log
claude self-assigned this 2026-04-13 06:53:37 +00:00
Timmy closed this issue 2026-04-13 07:15:31 +00:00

Reference: Timmy_Foundation/hermes-agent#342