- Add MultiModalManager with capability detection for vision/audio/tools
- Define fallback chains: vision (llama3.2:3b -> llava:7b -> moondream)
tools (llama3.1:8b-instruct -> qwen2.5:7b)
- Update CascadeRouter to detect content type and select appropriate models
- Add model pulling with automatic fallback in agent creation
- Update providers.yaml with multi-modal model configurations
- Update OllamaAdapter to use model resolution with vision support
Tests: All 96 infrastructure tests pass
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
- YAML-based provider configuration (config/providers.yaml)
- Priority-ordered provider routing
- Circuit breaker pattern for failing providers
- Health check and availability monitoring
- Metrics tracking (latency, errors, success rates)
- Support for Ollama, OpenAI, Anthropic, AirLLM providers
- Automatic failover on rate limits or errors
- REST API endpoints for monitoring and control
- 41 comprehensive tests
API Endpoints:
- POST /api/v1/router/complete - Chat completion with failover
- GET /api/v1/router/status - Provider health status
- GET /api/v1/router/metrics - Detailed metrics
- GET /api/v1/router/providers - List all providers
- POST /api/v1/router/providers/{name}/control - Enable/disable/reset
- POST /api/v1/router/health-check - Run health checks
- GET /api/v1/router/config - View configuration