Add 'ollama' as a recognized inference provider so local models (Gemma4,
Hermes3, Hermes4) can run through the agent harness without falling back
to OpenRouter.
Changes:
- hermes_cli/auth.py: Add ollama to PROVIDER_REGISTRY with
  base_url=http://localhost:11434/v1 and a dummy API key fallback
  (Ollama needs no auth); remove the 'ollama' -> 'custom' alias
- hermes_cli/main.py: Add 'ollama' to --provider choices
- hermes_cli/models.py: Add ollama model catalog (gemma4, hermes3,
hermes4, llama3.1, qwen2.5-coder, etc.), label, and provider order
- hermes_cli/providers.py: Add HermesOverlay for ollama, remove
'ollama' -> 'ollama-cloud' alias
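A minimal sketch of what the new registry entry might look like (the dict shape and field names are illustrative, not the actual schema in hermes_cli/auth.py):

```python
# Hypothetical shape of the ollama entry added to hermes_cli/auth.py.
# Field names here are illustrative; the project's real structure may differ.
PROVIDER_REGISTRY = {
    "ollama": {
        "base_url": "http://localhost:11434/v1",
        "api_key_env": "OLLAMA_API_KEY",
        "default_api_key": "ollama",  # dummy token; Ollama requires no real auth
    },
}
```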
Usage:
hermes chat -m gemma4 --provider ollama
hermes --profile gemma4-local chat -q 'hello'
Ollama exposes an OpenAI-compatible API at localhost:11434/v1.
No API key required (dummy 'ollama' token used for credential checks).
Override with OLLAMA_BASE_URL or OLLAMA_API_KEY env vars.
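The override behavior can be sketched as follows (resolve_ollama_config is a hypothetical helper for illustration, not the project's actual function):

```python
import os

# Defaults per the commit message; env vars take precedence when set.
DEFAULT_BASE_URL = "http://localhost:11434/v1"
DUMMY_API_KEY = "ollama"  # Ollama ignores the key; placeholder for credential checks

def resolve_ollama_config() -> tuple[str, str]:
    # OLLAMA_BASE_URL / OLLAMA_API_KEY override the built-in defaults.
    base_url = os.environ.get("OLLAMA_BASE_URL", DEFAULT_BASE_URL)
    api_key = os.environ.get("OLLAMA_API_KEY", DUMMY_API_KEY)
    return base_url, api_key
```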
Closes #169