In bin/provider-health-monitor.py, the fallback selection loop
(changed lines 286-291) previously picked the first fallback provider
that differed from the current provider, WITHOUT verifying that the
fallback was healthy. This could cascade a failure: the agent would be
switched from an unhealthy provider to a fallback that was also
unhealthy, corrupting config and breaking agent operation.
Now the loop checks health_map[provider]["healthy"] before selecting a
fallback. This is semantically the try/except/continue pattern: each
fallback provider is "tried" (health-checked), and if it is not healthy
we "continue" to the next one. The agent survives provider failures by
failing over only to providers confirmed alive.
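
For illustration, the selection logic described above can be sketched
roughly as follows. This is a minimal sketch, not the code from the
commit: the function name select_healthy_fallback, its parameters, and
the choice to treat a provider missing from health_map as unhealthy are
all assumptions made for the example.

```python
from typing import Dict, List, Optional


def select_healthy_fallback(
    current: str,
    fallbacks: List[str],
    health_map: Dict[str, Dict[str, bool]],
) -> Optional[str]:
    """Return the first fallback that differs from `current` and is healthy.

    Hypothetical helper: the name and signature are illustrative only.
    """
    for provider in fallbacks:
        # Never "fall back" to the provider we are trying to leave.
        if provider == current:
            continue
        # Assumption: a provider with no health entry counts as unhealthy.
        if not health_map.get(provider, {}).get("healthy", False):
            continue
        return provider
    # No healthy fallback exists; the caller should leave config untouched.
    return None
```

Returning None when no fallback is healthy lets the caller keep the
existing config instead of writing an unhealthy provider into it.
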
Closes #445