When a gateway crashes mid-job execution (before mark_job_run can persist
the updated next_run_at), the job would fire again on every restart attempt
within the grace window. For a daily 6:15 AM job with a 2-hour grace,
rapidly restarting the gateway could trigger dozens of duplicate runs.
Fix: call advance_next_run() BEFORE run_job() in tick(). For recurring
jobs (cron/interval), this preemptively advances next_run_at to the next
future occurrence and persists it to disk. If the process then crashes
during execution, the job won't be considered due on restart.
One-shot jobs are left unchanged — they still retry on restart since
there's no future occurrence to advance to.
This changes the scheduler from at-least-once to at-most-once semantics
for recurring jobs, which is the correct tradeoff: missing one daily
message is far better than sending it dozens of times.