Alexander Whitestone
28d1905df4
feat: add vLLM as alternative inference backend (#1281)
...
Adds vLLM (a high-throughput, OpenAI-compatible inference server) as a
selectable backend alongside the existing Ollama and vllm-mlx backends.
vLLM's continuous batching can give 3-10x higher throughput for agentic workloads.
Changes:
- config.py: add `vllm` to timmy_model_backend Literal; add vllm_url /
  vllm_model settings (VLLM_URL / VLLM_MODEL env vars)
- cascade.py: add vllm provider type with _check_provider_available
  (hits /health) and _call_vllm (OpenAI-compatible completions)
- providers.yaml: add disabled-by-default vllm-local provider (priority 3,
  port 8001); bump OpenAI/Anthropic backup priorities to 4/5
- health.py: add _check_vllm/_check_vllm_sync with 30-second TTL cache;
  /health and /health/sovereignty reflect vLLM status when it is the
  active backend
- docker-compose.yml: add vllm service behind 'vllm' profile (GPU
  passthrough commented-out template included); add vllm-cache volume
- CLAUDE.md: add vLLM row to Service Fallback Matrix
- tests: 26 new unit tests covering availability checks, _call_vllm,
  providers.yaml validation, config options, and health helpers
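For illustration, the 30-second TTL cache described for health.py could look roughly like the sketch below. This is not the code from the commit: `CachedHealthCheck`, `probe_vllm`, the `localhost` host, and the 2-second timeout are assumptions made for the example. Only the `/health` path, port 8001, and the 30-second TTL come from the notes above.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

VLLM_URL = "http://localhost:8001"  # port from the notes above; host is an assumption
CACHE_TTL_SECONDS = 30.0            # TTL stated in the notes above


def probe_vllm(base_url: str = VLLM_URL, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or a malformed URL:
        # all are treated as "vLLM is not available".
        return False


class CachedHealthCheck:
    """Hypothetical helper: remember the last probe result for `ttl` seconds."""

    def __init__(
        self,
        probe: Callable[[], bool],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[bool] = None  # None means "never probed"
        self._checked_at = 0.0

    def is_healthy(self) -> bool:
        now = self._clock()
        # Re-probe only on first use or once the cached result has expired.
        if self._value is None or now - self._checked_at >= self._ttl:
            self._value = self._probe()
            self._checked_at = now
        return self._value
```

A caller would build `CachedHealthCheck(probe_vllm)` once and call `is_healthy()` from the health endpoints, so repeated dashboard polls within 30 seconds reuse one probe result. The injectable `clock` is there only to make the expiry logic testable without sleeping.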
Graceful fallback: if vLLM is unavailable, the cascade router automatically
falls back to Ollama, so an unreachable vLLM server does not crash the app.
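The fallback behaviour can be sketched as a priority-ordered loop over providers. This is a minimal illustration under assumptions, not the router in cascade.py: the `Provider` dataclass, `cascade_complete`, and the callables it takes are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Provider:
    """Hypothetical provider record; field names are assumptions."""
    name: str
    priority: int                       # lower number is tried first
    is_available: Callable[[], bool]    # e.g. a cached /health probe
    complete: Callable[[str], str]      # sends the prompt, returns the reply
    enabled: bool = True


def cascade_complete(providers: List[Provider], prompt: str) -> str:
    """Try enabled providers in priority order and return the first success."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.priority):
        if not provider.enabled:
            continue
        try:
            if not provider.is_available():
                errors.append(f"{provider.name}: unavailable")
                continue
            return provider.complete(prompt)
        except Exception as exc:  # any provider failure moves on to the next one
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With a vLLM entry whose `is_available` returns False and an Ollama entry that is reachable, `cascade_complete` skips vLLM and returns Ollama's reply. The priorities used in such a call are illustrative and need not match the values in providers.yaml.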
Fixes #1281
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 21:52:52 -04:00
0436dfd4c4
[claude] Dashboard: Agent Scorecards panel in Mission Control (#929) (#1276)
2026-03-24 01:43:21 +00:00
9eeb49a6f1
[claude] Autonomous research pipeline — orchestrator + SOVEREIGNTY.md ( #972 ) ( #1274 )
2026-03-24 01:40:53 +00:00
2d6bfe6ba1
[claude] Agent Self-Correction Dashboard (#1007) (#1269)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-24 01:40:40 +00:00
ebb2cad552
[claude] feat: Session Sovereignty Report Generator (#957) v3 (#1263)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-24 01:40:24 +00:00
003e3883fb
[claude] Restore self-modification loop (#983) (#1270)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-24 01:40:16 +00:00
6b2e6d9e8c
[claude] feat: Agent Energy Budget Monitoring (#1009) (#1267)
2026-03-24 01:35:49 +00:00
f62220eb61
[claude] Autoresearch H1: Apple Silicon support + M3 Max baseline doc (#905) (#1252)
2026-03-23 23:38:38 +00:00
72992b7cc5
[claude] Fix ImportError: memory_write missing from memory_system (#1249) (#1251)
2026-03-23 23:37:21 +00:00
b5fb6a85cf
[claude] Fix pre-existing ruff lint errors blocking git hooks (#1247) (#1248)
2026-03-23 23:33:37 +00:00
261b7be468
[kimi] Refactor autoresearch.py -> SystemExperiment class (#906) (#1244)
...
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-23 23:28:54 +00:00
6691f4d1f3
[claude] Add timmy learn autoresearch entry point (#907) (#1240)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 23:14:09 +00:00
1e1689f931
[claude] Qwen3 two-model routing via task complexity classifier (#1065) v2 (#1233)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 22:58:21 +00:00
acc0df00cf
[claude] Three-Strike Detector (#962) v2 (#1232)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 22:50:59 +00:00
697575e561
[gemini] Implement semantic index for research outputs (#976) (#1227)
2026-03-23 22:45:29 +00:00
d697c3d93e
[claude] refactor: break up monolithic tools.py into a tools/ package (#1215) (#1221)
2026-03-23 22:43:09 +00:00
3217c32356
[claude] feat: Nexus — persistent conversational awareness space with live memory ( #1208 ) ( #1211 )
2026-03-23 22:34:48 +00:00
25157a71a8
[loop-cycle] fix: remove unused imports and fix formatting (lint) (#1209)
2026-03-23 22:30:03 +00:00
3ed2bbab02
[loop-cycle] refactor: break up git.py::run() into helpers (#538) (#1204)
2026-03-23 22:07:28 +00:00
9121689a41
[claude] refactor: break up produce_system_status() (#1194) (#1196)
2026-03-23 21:55:50 +00:00
8f8061e224
[claude] refactor: break up cascade.py complete() (#1185) (#1190)
2026-03-23 21:52:27 +00:00
c78922ccbc
[kimi] Refactor cli.py::daily_run() — 105 lines → 33 lines ( #1168 ) ( #1189 )
2026-03-23 21:51:47 +00:00
f3093e9dea
[claude] refactor: break up dispatch_issue() into helpers (#1187) (#1188)
2026-03-23 21:49:45 +00:00
b735b553e6
[kimi] Break up dispatch_task() into helper functions (#1137) (#1184)
2026-03-23 21:46:02 +00:00
7aa48b4e22
[kimi] Break up _dispatch_via_gitea() into helper functions (#1136) (#1183)
2026-03-23 21:40:17 +00:00
d796fe7c53
[claude] Refactor thinking.py::_maybe_file_issues() into focused helpers (#1170) (#1173)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 20:47:06 +00:00
ff921da547
[claude] Refactor timmyctl inbox() into helper functions (#1169) (#1174)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 20:47:00 +00:00
61377e3a1e
[gemini] Docs: Acknowledge The Sovereignty Loop governing architecture (#953) (#1167)
...
Co-authored-by: Google Gemini <gemini@hermes.local>
Co-committed-by: Google Gemini <gemini@hermes.local>
2026-03-23 20:14:27 +00:00
de289878d6
[loop-cycle] refactor: add docstrings to 20 undocumented classes (#1130) (#1166)
2026-03-23 20:08:06 +00:00
0d73a4ff7a
[claude] Fix ruff S105/S106/B017/E402 errors in bannerlord (#1161) (#1165)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 19:56:07 +00:00
dec9736679
[claude] Sovereignty metrics emitter + SQLite store (#954) (#1164)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 19:52:20 +00:00
08d337e03d
[claude] Implement three-tier metabolic LLM router (#966) (#1160)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 19:45:56 +00:00
6e65b53f3a
[loop-cycle-5] feat: implement 4 TODO stubs in timmyctl/cli.py (#1128) (#1158)
2026-03-23 19:34:46 +00:00
2b9a55fa6d
[claude] Bannerlord M5: sovereign victory stack (src/bannerlord/) (#1097) (#1155)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 19:26:05 +00:00
495c1ac2bd
[claude] Fix 27 ruff lint errors blocking all pushes (#1149) (#1153)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 19:06:11 +00:00
382dd041d9
[kimi] Refactor scorecards.py — break up oversized functions ( #1127 ) ( #1152 )
...
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-23 18:59:05 +00:00
3a8d9ee380
[claude] Break up _build_gitea_tools() into per-operation helpers (#1134) (#1147)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 18:42:47 +00:00
fd9fbe8a18
[claude] Break up MCPBridge.run() into helper methods (#1135) (#1148)
2026-03-23 18:41:34 +00:00
7e03985368
[claude] feat: Agent Voice Customization UI (#1017) (#1146)
2026-03-23 18:39:47 +00:00
cd1bc2bf6b
[claude] Add agent emotional state simulation (#1013) (#1144)
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 18:36:52 +00:00
1c1bfb6407
[claude] Hermes health monitor — system resources + model management ( #1073 ) ( #1133 )
...
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 18:36:06 +00:00
05e1196ea4
[gemini] feat: add coverage and duration strictness to pytest (#934) (#1140)
...
Co-authored-by: Google Gemini <gemini@hermes.local>
Co-committed-by: Google Gemini <gemini@hermes.local>
2026-03-23 18:36:01 +00:00
ed63877f75
[claude] Qwen3 two-model strategy: 14B primary + 8B fast router (#1063) (#1143)
2026-03-23 18:35:57 +00:00
128aa4427f
[claude] Vassal Protocol — Timmy as autonomous orchestrator ( #1070 ) ( #1142 )
2026-03-23 18:33:15 +00:00
4f8e86348c
[claude] Build Timmy autonomous backlog triage loop (#1071) (#1141)
2026-03-23 18:32:27 +00:00
0c627f175b
[gemini] refactor: Gracefully handle tool registration errors (#938) (#1132)
2026-03-23 18:26:40 +00:00
cf82bb0be4
[claude] Build agent dispatcher — route tasks to Claude Code, Kimi, APIs ( #1072 ) ( #1123 )
2026-03-23 18:25:38 +00:00
276bbcd112
[claude] Bannerlord M1 — GABS Observer Mode (Passive Lord) ( #1093 ) ( #1124 )
2026-03-23 18:23:52 +00:00
e8b3d59041
[gemini] feat: Add Claude API fallback tier to cascade.py (#980) (#1119)
...
Co-authored-by: Google Gemini <gemini@hermes.local>
Co-committed-by: Google Gemini <gemini@hermes.local>
2026-03-23 18:21:18 +00:00
300d9575f1
[claude] Fix Starlette 1.0.0 TemplateResponse API in calm and tools routes (#1112) (#1115)
2026-03-23 18:14:36 +00:00