feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Custom model registry — register, load, and manage model weights.
|
|
|
|
|
|
|
|
|
|
Tracks custom models (GGUF files, HF checkpoints, Ollama modelfiles)
|
|
|
|
|
and their assignment to swarm agents. Models can be registered at
|
|
|
|
|
runtime via the API or pre-configured via providers.yaml.
|
|
|
|
|
|
|
|
|
|
Inspired by OpenClaw-RL's multi-model orchestration where distinct
|
|
|
|
|
model roles (student, teacher, judge/PRM) run on dedicated resources.
|
|
|
|
|
"""
|
|
|
|
|
|
|
|
|
|
import logging
|
|
|
|
|
import sqlite3
|
|
|
|
|
import threading
|
2026-03-15 11:58:43 -04:00
|
|
|
from collections.abc import Generator
|
|
|
|
|
from contextlib import closing, contextmanager
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
from dataclasses import dataclass
|
|
|
|
|
from datetime import UTC, datetime
|
|
|
|
|
from enum import StrEnum
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
from pathlib import Path
|
|
|
|
|
|
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
|
|
|
|
DB_PATH = Path("data/swarm.db")
|
|
|
|
|
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
class ModelFormat(StrEnum):
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Supported model weight formats."""
|
2026-03-08 12:50:44 -04:00
|
|
|
|
|
|
|
|
GGUF = "gguf" # Ollama-compatible quantised weights
|
|
|
|
|
SAFETENSORS = "safetensors" # HuggingFace safetensors
|
|
|
|
|
HF_CHECKPOINT = "hf" # Full HuggingFace checkpoint directory
|
|
|
|
|
OLLAMA = "ollama" # Already loaded in Ollama by name
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
class ModelRole(StrEnum):
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Role a model can play in the system (OpenClaw-RL style)."""
|
2026-03-08 12:50:44 -04:00
|
|
|
|
|
|
|
|
GENERAL = "general" # Default agent inference
|
|
|
|
|
REWARD = "reward" # Process Reward Model (PRM) scoring
|
|
|
|
|
TEACHER = "teacher" # On-policy distillation teacher
|
|
|
|
|
JUDGE = "judge" # Output quality evaluation
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
|
|
|
|
|
|
|
|
|
|
@dataclass
|
|
|
|
|
class CustomModel:
|
|
|
|
|
"""A registered custom model."""
|
2026-03-08 12:50:44 -04:00
|
|
|
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
name: str
|
|
|
|
|
format: ModelFormat
|
2026-03-08 12:50:44 -04:00
|
|
|
path: str # Absolute path or Ollama model name
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
role: ModelRole = ModelRole.GENERAL
|
|
|
|
|
context_window: int = 4096
|
|
|
|
|
description: str = ""
|
|
|
|
|
registered_at: str = ""
|
|
|
|
|
active: bool = True
|
|
|
|
|
# Per-model generation settings
|
|
|
|
|
default_temperature: float = 0.7
|
|
|
|
|
max_tokens: int = 2048
|
|
|
|
|
|
|
|
|
|
def __post_init__(self):
|
|
|
|
|
if not self.registered_at:
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
self.registered_at = datetime.now(UTC).isoformat()
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
|
|
|
|
|
|
2026-03-15 11:58:43 -04:00
|
|
|
@contextmanager
|
|
|
|
|
def _get_conn() -> Generator[sqlite3.Connection, None, None]:
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
DB_PATH.parent.mkdir(parents=True, exist_ok=True)
|
2026-03-15 11:58:43 -04:00
|
|
|
with closing(sqlite3.connect(str(DB_PATH))) as conn:
|
|
|
|
|
conn.row_factory = sqlite3.Row
|
|
|
|
|
conn.execute("PRAGMA journal_mode=WAL")
|
|
|
|
|
conn.execute("PRAGMA busy_timeout=5000")
|
|
|
|
|
conn.execute("""
|
|
|
|
|
CREATE TABLE IF NOT EXISTS custom_models (
|
|
|
|
|
name TEXT PRIMARY KEY,
|
|
|
|
|
format TEXT NOT NULL,
|
|
|
|
|
path TEXT NOT NULL,
|
|
|
|
|
role TEXT NOT NULL DEFAULT 'general',
|
|
|
|
|
context_window INTEGER NOT NULL DEFAULT 4096,
|
|
|
|
|
description TEXT NOT NULL DEFAULT '',
|
|
|
|
|
registered_at TEXT NOT NULL,
|
|
|
|
|
active INTEGER NOT NULL DEFAULT 1,
|
|
|
|
|
default_temperature REAL NOT NULL DEFAULT 0.7,
|
|
|
|
|
max_tokens INTEGER NOT NULL DEFAULT 2048
|
|
|
|
|
)
|
|
|
|
|
""")
|
|
|
|
|
conn.execute("""
|
|
|
|
|
CREATE TABLE IF NOT EXISTS agent_model_assignments (
|
|
|
|
|
agent_id TEXT PRIMARY KEY,
|
|
|
|
|
model_name TEXT NOT NULL,
|
|
|
|
|
assigned_at TEXT NOT NULL,
|
|
|
|
|
FOREIGN KEY (model_name) REFERENCES custom_models(name)
|
|
|
|
|
)
|
|
|
|
|
""")
|
|
|
|
|
conn.commit()
|
|
|
|
|
yield conn
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
|
|
|
|
|
|
|
|
|
|
class ModelRegistry:
|
|
|
|
|
"""Singleton registry for custom models and agent-model assignments."""
|
|
|
|
|
|
|
|
|
|
def __init__(self) -> None:
|
|
|
|
|
self._lock = threading.Lock()
|
|
|
|
|
# In-memory cache for fast lookups
|
|
|
|
|
self._models: dict[str, CustomModel] = {}
|
|
|
|
|
self._agent_assignments: dict[str, str] = {}
|
|
|
|
|
self._load_from_db()
|
|
|
|
|
|
|
|
|
|
def _load_from_db(self) -> None:
|
|
|
|
|
"""Bootstrap cache from SQLite."""
|
|
|
|
|
try:
|
2026-03-15 11:58:43 -04:00
|
|
|
with _get_conn() as conn:
|
2026-03-15 11:05:39 -04:00
|
|
|
for row in conn.execute("SELECT * FROM custom_models WHERE active = 1").fetchall():
|
|
|
|
|
self._models[row["name"]] = CustomModel(
|
|
|
|
|
name=row["name"],
|
|
|
|
|
format=ModelFormat(row["format"]),
|
|
|
|
|
path=row["path"],
|
|
|
|
|
role=ModelRole(row["role"]),
|
|
|
|
|
context_window=row["context_window"],
|
|
|
|
|
description=row["description"],
|
|
|
|
|
registered_at=row["registered_at"],
|
|
|
|
|
active=bool(row["active"]),
|
|
|
|
|
default_temperature=row["default_temperature"],
|
|
|
|
|
max_tokens=row["max_tokens"],
|
|
|
|
|
)
|
|
|
|
|
for row in conn.execute("SELECT * FROM agent_model_assignments").fetchall():
|
|
|
|
|
self._agent_assignments[row["agent_id"]] = row["model_name"]
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
except Exception as exc:
|
|
|
|
|
logger.warning("Failed to load model registry from DB: %s", exc)
|
|
|
|
|
|
|
|
|
|
# ── Model CRUD ─────────────────────────────────────────────────────────
|
|
|
|
|
|
|
|
|
|
def register(self, model: CustomModel) -> CustomModel:
|
|
|
|
|
"""Register a new custom model."""
|
|
|
|
|
with self._lock:
|
2026-03-15 11:58:43 -04:00
|
|
|
with _get_conn() as conn:
|
2026-03-15 11:05:39 -04:00
|
|
|
conn.execute(
|
|
|
|
|
"""
|
|
|
|
|
INSERT OR REPLACE INTO custom_models
|
|
|
|
|
(name, format, path, role, context_window, description,
|
|
|
|
|
registered_at, active, default_temperature, max_tokens)
|
|
|
|
|
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
|
|
|
|
""",
|
|
|
|
|
(
|
|
|
|
|
model.name,
|
|
|
|
|
model.format.value,
|
|
|
|
|
model.path,
|
|
|
|
|
model.role.value,
|
|
|
|
|
model.context_window,
|
|
|
|
|
model.description,
|
|
|
|
|
model.registered_at,
|
|
|
|
|
int(model.active),
|
|
|
|
|
model.default_temperature,
|
|
|
|
|
model.max_tokens,
|
|
|
|
|
),
|
|
|
|
|
)
|
|
|
|
|
conn.commit()
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
self._models[model.name] = model
|
|
|
|
|
logger.info("Registered model: %s (%s)", model.name, model.format.value)
|
|
|
|
|
return model
|
|
|
|
|
|
|
|
|
|
def unregister(self, name: str) -> bool:
|
|
|
|
|
"""Remove a model from the registry."""
|
|
|
|
|
with self._lock:
|
|
|
|
|
if name not in self._models:
|
|
|
|
|
return False
|
2026-03-15 11:58:43 -04:00
|
|
|
with _get_conn() as conn:
|
2026-03-15 11:05:39 -04:00
|
|
|
conn.execute("DELETE FROM custom_models WHERE name = ?", (name,))
|
|
|
|
|
conn.execute("DELETE FROM agent_model_assignments WHERE model_name = ?", (name,))
|
|
|
|
|
conn.commit()
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
del self._models[name]
|
|
|
|
|
# Remove any agent assignments using this model
|
|
|
|
|
self._agent_assignments = {
|
|
|
|
|
k: v for k, v in self._agent_assignments.items() if v != name
|
|
|
|
|
}
|
|
|
|
|
logger.info("Unregistered model: %s", name)
|
|
|
|
|
return True
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
def get(self, name: str) -> CustomModel | None:
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Look up a model by name."""
|
|
|
|
|
return self._models.get(name)
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
def list_models(self, role: ModelRole | None = None) -> list[CustomModel]:
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""List all registered models, optionally filtered by role."""
|
|
|
|
|
models = list(self._models.values())
|
|
|
|
|
if role is not None:
|
|
|
|
|
models = [m for m in models if m.role == role]
|
|
|
|
|
return models
|
|
|
|
|
|
|
|
|
|
def set_active(self, name: str, active: bool) -> bool:
|
|
|
|
|
"""Enable or disable a model without removing it."""
|
|
|
|
|
model = self._models.get(name)
|
|
|
|
|
if not model:
|
|
|
|
|
return False
|
|
|
|
|
with self._lock:
|
|
|
|
|
model.active = active
|
2026-03-15 11:58:43 -04:00
|
|
|
with _get_conn() as conn:
|
2026-03-15 11:05:39 -04:00
|
|
|
conn.execute(
|
|
|
|
|
"UPDATE custom_models SET active = ? WHERE name = ?",
|
|
|
|
|
(int(active), name),
|
|
|
|
|
)
|
|
|
|
|
conn.commit()
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
return True
|
|
|
|
|
|
|
|
|
|
# ── Agent-model assignments ────────────────────────────────────────────
|
|
|
|
|
|
|
|
|
|
def assign_model(self, agent_id: str, model_name: str) -> bool:
|
|
|
|
|
"""Assign a specific model to an agent."""
|
|
|
|
|
if model_name not in self._models:
|
|
|
|
|
return False
|
|
|
|
|
with self._lock:
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
now = datetime.now(UTC).isoformat()
|
2026-03-15 11:58:43 -04:00
|
|
|
with _get_conn() as conn:
|
2026-03-15 11:05:39 -04:00
|
|
|
conn.execute(
|
|
|
|
|
"""
|
|
|
|
|
INSERT OR REPLACE INTO agent_model_assignments
|
|
|
|
|
(agent_id, model_name, assigned_at)
|
|
|
|
|
VALUES (?, ?, ?)
|
|
|
|
|
""",
|
|
|
|
|
(agent_id, model_name, now),
|
|
|
|
|
)
|
|
|
|
|
conn.commit()
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
self._agent_assignments[agent_id] = model_name
|
|
|
|
|
logger.info("Assigned model %s to agent %s", model_name, agent_id)
|
|
|
|
|
return True
|
|
|
|
|
|
|
|
|
|
def unassign_model(self, agent_id: str) -> bool:
|
|
|
|
|
"""Remove model assignment from an agent (falls back to default)."""
|
|
|
|
|
with self._lock:
|
|
|
|
|
if agent_id not in self._agent_assignments:
|
|
|
|
|
return False
|
2026-03-15 11:58:43 -04:00
|
|
|
with _get_conn() as conn:
|
2026-03-15 11:05:39 -04:00
|
|
|
conn.execute(
|
|
|
|
|
"DELETE FROM agent_model_assignments WHERE agent_id = ?",
|
|
|
|
|
(agent_id,),
|
|
|
|
|
)
|
|
|
|
|
conn.commit()
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
del self._agent_assignments[agent_id]
|
|
|
|
|
return True
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
def get_agent_model(self, agent_id: str) -> CustomModel | None:
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Get the model assigned to an agent, or None for default."""
|
|
|
|
|
model_name = self._agent_assignments.get(agent_id)
|
|
|
|
|
if model_name:
|
|
|
|
|
return self._models.get(model_name)
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
def get_agent_assignments(self) -> dict[str, str]:
|
|
|
|
|
"""Return all agent-to-model assignments."""
|
|
|
|
|
return dict(self._agent_assignments)
|
|
|
|
|
|
|
|
|
|
# ── Role-based lookups ─────────────────────────────────────────────────
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
def get_reward_model(self) -> CustomModel | None:
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Get the active reward/PRM model, if any."""
|
|
|
|
|
reward_models = self.list_models(role=ModelRole.REWARD)
|
|
|
|
|
active = [m for m in reward_models if m.active]
|
|
|
|
|
return active[0] if active else None
|
|
|
|
|
|
ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
|
|
|
def get_teacher_model(self) -> CustomModel | None:
|
feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:
1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
registry for GGUF, safetensors, HF checkpoint, and Ollama models with
role-based lookups (general, reward, teacher, judge).
2. Per-agent model assignment — each swarm persona can use a different model
instead of sharing the global default. Resolved via registry assignment >
persona default > global default.
3. Runtime model management API (/api/v1/models) — REST endpoints to register,
list, assign, enable/disable, and remove custom models without restart.
Includes a dashboard page at /models.
4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
agent outputs using a configurable reward model. Scores persist in SQLite
and feed into the swarm learner.
New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.
54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.
https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:08:03 +00:00
|
|
|
"""Get the active teacher model for distillation."""
|
|
|
|
|
teacher_models = self.list_models(role=ModelRole.TEACHER)
|
|
|
|
|
active = [m for m in teacher_models if m.active]
|
|
|
|
|
return active[0] if active else None
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# Module-level singleton
|
|
|
|
|
model_registry = ModelRegistry()
|