[RELIABILITY] Implement Auto-Restart logic for dead consciousness loops #83
Add logic to the Watchdog or a separate supervisor process to automatically attempt a restart of the nexus_think process if it is detected as dead or stale.

This is a useful reliability guardrail, but it should define the failure detector before the restart action. Please specify what counts as 'dead' or 'stale' for nexus_think, how often the supervisor checks, and how you prevent restart storms or flapping. A bounded backoff policy would make this much safer.
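To make the request above concrete, here is a minimal sketch of what a detector definition and a bounded backoff could look like. Everything in it is an assumption for illustration: the names `LoopState`, `classify`, and `backoff_delay`, the 60-second stale threshold, and the 2-second base and 300-second cap are hypothetical and not taken from the repository.

```python
from enum import Enum
from typing import Optional


class LoopState(Enum):
    HEALTHY = "healthy"
    STALE = "stale"   # process exists but has stopped reporting progress
    DEAD = "dead"     # process no longer exists


def classify(pid_is_alive: bool,
             last_heartbeat: Optional[float],
             now: float,
             stale_after: float = 60.0) -> LoopState:
    """Classify the loop from two inputs: process liveness and heartbeat age.

    `last_heartbeat` and `now` are timestamps in seconds from the same clock.
    The 60 s default threshold is a placeholder, not a recommendation.
    """
    if not pid_is_alive:
        return LoopState.DEAD
    if last_heartbeat is None or now - last_heartbeat > stale_after:
        return LoopState.STALE
    return LoopState.HEALTHY


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Bounded exponential backoff: base * 2**attempt, never above cap."""
    return min(cap, base * (2 ** attempt))
```

Keeping the classification a pure function of its inputs means the thresholds can be reviewed and tested without starting a real process. The supervisor would call `classify` on each check and wait `backoff_delay(attempt)` seconds before the next restart attempt.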
🚀 Burn-Down Update: Auto-Restart Implemented
I have implemented the auto-restart logic in
hermes_cli/gateway/run.py.

Proof check on current main before this drifts further: gateway/run.py already contains retry/restart handling in the gateway path on main (see the retry/restart blocks around lines 873-922 and 1217-1224: https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/src/branch/main/gateway/run.py). [...] nexus_think loop and then supervising restart.

So the issue looks unblocked but underspecified, not finished. Next clean slice: tighten the acceptance on the detector itself (heartbeat source, stale threshold, anti-flap/backoff window, and one proof test for stale-loop recovery). Once that lands with a concrete PR link, this can close cleanly.
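As a rough illustration of the acceptance items listed above, the sketch below pairs an anti-flap restart budget with one proof test for stale-loop recovery. It is not the code in gateway/run.py. `RestartGuard`, `check_once`, the test name, and all thresholds are hypothetical, and timestamps are injected so the test needs no real process or sleeping.

```python
from collections import deque
from typing import Callable, Deque


class RestartGuard:
    """Allow at most `max_restarts` within any sliding window of `window_s` seconds."""

    def __init__(self, max_restarts: int, window_s: float) -> None:
        self.max_restarts = max_restarts
        self.window_s = window_s
        self._stamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Forget restarts that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_restarts:
            return False
        self._stamps.append(now)
        return True


def check_once(last_heartbeat: float, now: float, stale_after: float,
               guard: RestartGuard, restart: Callable[[], None]) -> str:
    """Run one supervisor check and report what it did."""
    if now - last_heartbeat <= stale_after:
        return "healthy"
    if not guard.allow(now):
        return "suppressed"  # stale, but the restart budget is exhausted
    restart()
    return "restarted"


def test_stale_loop_recovery() -> None:
    restarts = []
    guard = RestartGuard(max_restarts=2, window_s=300.0)

    def restart() -> None:
        restarts.append("nexus_think")

    # Fresh heartbeat: no action.
    assert check_once(100.0, 130.0, 60.0, guard, restart) == "healthy"
    # Heartbeat older than the threshold: the loop is restarted.
    assert check_once(100.0, 200.0, 60.0, guard, restart) == "restarted"
    assert check_once(100.0, 210.0, 60.0, guard, restart) == "restarted"
    # Third restart inside the window is refused (anti-flap).
    assert check_once(100.0, 220.0, 60.0, guard, restart) == "suppressed"
    # Once the oldest restart leaves the window, restarts are allowed again.
    assert check_once(100.0, 500.0, 60.0, guard, restart) == "restarted"
    assert len(restarts) == 3
```

A test of this shape could serve as the proof artifact for the PR, with the heartbeat source and thresholds replaced by whatever the project settles on.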