Fix provider fallback chain: select only healthy fallback providers
In bin/provider-health-monitor.py, the fallback selection loop in check_profiles (changed lines 286-291) previously picked the first fallback provider that differed from the current provider without verifying that it was healthy. This could cascade a failure: an unhealthy current provider could be switched to an equally unhealthy fallback, corrupting the config and breaking agent operation.

The loop now requires health_map.get(provider, {}).get("healthy", False) to be true before selecting a fallback, so a provider missing from health_map is treated as unhealthy. This mirrors a try/except/continue pattern: each fallback provider is "tried" (health-checked), and if it is not healthy the loop continues to the next one. The agent survives provider failures by falling back only to providers confirmed healthy.

Closes #445
Committed by: Alexander Payne
Parent: efc42968e8
Commit: 4a1b99f5af
@@ -283,10 +283,10 @@ def check_profiles(health_map):
         if current_provider in health_map and health_map[current_provider]["healthy"]:
             continue  # Provider is healthy, no action needed
 
-        # Find best fallback
+        # Find best fallback — must be healthy
         best_fallback = None
         for provider in fallback_providers:
-            if provider != current_provider:
+            if provider != current_provider and health_map.get(provider, {}).get("healthy", False):
                 best_fallback = provider
                 break
 
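The selection rule in the hunk above can be exercised in isolation. The following is a minimal sketch, not the code from bin/provider-health-monitor.py: the function name pick_healthy_fallback and the sample provider names are hypothetical, and health_map is assumed to map each provider name to a dict with a boolean "healthy" key, as the diff suggests.

```python
from typing import Dict, List, Optional


def pick_healthy_fallback(
    current_provider: str,
    fallback_providers: List[str],
    health_map: Dict[str, dict],
) -> Optional[str]:
    """Return the first fallback that is not the current provider and is healthy.

    Hypothetical helper mirroring the condition added in this commit.
    """
    for provider in fallback_providers:
        # A provider missing from health_map falls through to the False
        # default, so it is treated as unhealthy and skipped.
        if provider != current_provider and health_map.get(provider, {}).get("healthy", False):
            return provider
    return None  # No healthy fallback is available


# Sample data (assumed shape): "delta" is deliberately absent from the map.
health_map = {
    "alpha": {"healthy": False},
    "beta": {"healthy": False},
    "gamma": {"healthy": True},
}

chosen = pick_healthy_fallback("alpha", ["alpha", "beta", "delta", "gamma"], health_map)
print(chosen)  # gamma
```

With the pre-fix condition (provider != current_provider only), the same inputs would have selected "beta", which is itself unhealthy.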