Compare commits
11 Commits
feat/bezal...e369727235

| SHA1 |
|---|
| e369727235 |
| 1705a7b802 |
| e0bef949dd |
| dafe8667c5 |
| 4844ce6238 |
| d07305b89c |
| 5c15704c3a |
| ff7ce9a022 |
| d54a218a27 |
| 2e2a646ba8 |
| 0a13347e39 |
141 docs/MEMORY_ARCHITECTURE.md Normal file
@@ -0,0 +1,141 @@
# Memory Architecture

> How Timmy remembers, recalls, and learns — without hallucinating.

Refs: Epic #367 | Sub-issues #368, #369, #370, #371, #372

## Overview

Timmy's memory system uses a **Memory Palace** architecture — a structured, file-backed knowledge store organized into rooms and drawers. When faced with a recall question, the agent checks its palace *before* generating from scratch.

This document defines the retrieval order, storage layers, and data flow that make this work.

## Retrieval Order (L0–L5)

When the agent receives a prompt that looks like a recall question ("what did we do?", "what's the status of X?"), the retrieval enforcer intercepts it and walks through layers in order:

| Layer | Source | Question Answered | Short-circuits? |
|-------|--------|-------------------|-----------------|
| L0 | `identity.txt` | Who am I? What are my mandates? | No (always loaded) |
| L1 | Palace rooms/drawers | What do I know about this topic? | Yes, if hit |
| L2 | Session scratchpad | What have I learned this session? | Yes, if hit |
| L3 | Artifact retrieval (Gitea API) | Can I fetch the actual issue/file/log? | Yes, if hit |
| L4 | Procedures/playbooks | Is there a documented way to do this? | Yes, if hit |
| L5 | Free generation | (Only when L0–L4 are exhausted) | N/A |

**Key principle:** The agent never reaches L5 (free generation) if any prior layer has relevant data. This eliminates hallucination for recall-style queries.
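
The short-circuit walk can be sketched in a few lines of Python. This is an illustrative sketch only, not the shipped enforcer: the layer interface and the example lambdas are assumptions, and L0 is omitted because it is always loaded rather than queried.

```python
# Minimal sketch of the L1-L5 walk. Each layer is a callable that returns
# a string on a hit, or None on a miss; the first hit wins.
def retrieve(prompt, layers):
    """Walk layers in order, short-circuiting on the first hit."""
    for name, layer in layers:
        result = layer(prompt)
        if result is not None:
            return name, result   # grounded answer: stop, do not generate
    return "L5", None             # every layer empty: free generation allowed

# Hypothetical layer stack for one query (L2-L4 empty this session).
layers = [
    ("L1", lambda p: "palace: timmy-config notes" if "timmy-config" in p else None),
    ("L2", lambda p: None),   # empty scratchpad
    ("L3", lambda p: None),   # no artifact match
    ("L4", lambda p: None),   # no playbook
]

hit = retrieve("what's the status of timmy-config?", layers)
miss = retrieve("write me a poem", layers)
```

Because the walk returns at the first non-empty layer, a palace hit at L1 means L2–L4 are never consulted and L5 is never reached.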

## Storage Layout

```
~/.mempalace/
  identity.txt            # L0: Who I am, mandates, personality
  rooms/
    projects/
      timmy-config.md     # What I know about timmy-config
      hermes-agent.md     # What I know about hermes-agent
    people/
      alexander.md        # Working relationship context
    architecture/
      fleet.md            # Fleet system knowledge
      mempalace.md        # Self-knowledge about this system
  config/
    mempalace.yaml        # Palace configuration

~/.hermes/
  scratchpad/
    {session_id}.json     # L2: Ephemeral session context
```

## Components

### 1. Memory Palace Skill (`mempalace.py`) — #368

Core data structures:
- `PalaceRoom`: A named collection of drawers (topics)
- `Mempalace`: The top-level palace with room management
- Factory constructors: `for_issue_analysis()`, `for_health_check()`, `for_code_review()`
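
A minimal sketch of these two structures, assuming drawers map topic names to markdown notes. Field names, the `recall`/`room` methods, and the rooms seeded by the factory are illustrative assumptions, not the actual `mempalace.py` API.

```python
from dataclasses import dataclass, field

@dataclass
class PalaceRoom:
    """A named collection of drawers (topic -> notes)."""
    name: str
    drawers: dict = field(default_factory=dict)

    def recall(self, topic):
        # Return stored notes for a topic, or None on a miss.
        return self.drawers.get(topic)

@dataclass
class Mempalace:
    """Top-level palace with room management."""
    rooms: dict = field(default_factory=dict)

    def room(self, name):
        # Create-on-first-use room lookup.
        if name not in self.rooms:
            self.rooms[name] = PalaceRoom(name)
        return self.rooms[name]

    @classmethod
    def for_issue_analysis(cls):
        # Factory constructor pre-seeding the rooms an issue run would need
        # (hypothetical room list).
        palace = cls()
        for r in ("projects", "people", "architecture"):
            palace.room(r)
        return palace

palace = Mempalace.for_issue_analysis()
palace.room("projects").drawers["timmy-config"] = "What I know about timmy-config"
```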

### 2. Retrieval Enforcer (`retrieval_enforcer.py`) — #369

Middleware that intercepts recall-style prompts:
1. Detects recall patterns ("what did", "status of", "last time we")
2. Walks L0→L4 in order, short-circuiting on first hit
3. Only allows free generation (L5) when all layers return empty
4. Produces an honest fallback: "I don't have this in my memory palace."
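
The detection step (item 1) can be as simple as a case-insensitive pattern scan. This sketch uses only the patterns and the fallback string listed above; the function names and the `(layer, result)` input shape are assumptions for illustration.

```python
import re

# Recall patterns from the list above; matching is case-insensitive.
RECALL_PATTERNS = [r"\bwhat did\b", r"\bstatus of\b", r"\blast time we\b"]

def is_recall(prompt):
    return any(re.search(p, prompt, re.IGNORECASE) for p in RECALL_PATTERNS)

def answer(prompt, layer_results):
    """layer_results: ordered (layer, result-or-None) pairs from L1..L4."""
    if not is_recall(prompt):
        return "normal-flow"                 # non-recall prompts bypass the enforcer
    for layer, result in layer_results:
        if result is not None:
            return f"{layer}: {result}"      # first hit short-circuits
    # All layers empty: honest fallback instead of free generation.
    return "I don't have this in my memory palace."
```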

### 3. Session Scratchpad (`scratchpad.py`) — #370

Ephemeral, session-scoped working memory:
- Write-append only during a session
- Entries have TTL (default: 1 hour)
- Queried at L2 in retrieval chain
- Never auto-promoted to palace
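
The TTL behavior can be sketched like this (the class and method names are illustrative, not the actual `scratchpad.py` interface; the `now` parameter exists only to make expiry testable):

```python
import time

DEFAULT_TTL = 3600  # seconds: the 1-hour default noted above

class Scratchpad:
    """Append-only, session-scoped entries that expire after a TTL."""
    def __init__(self):
        self._entries = []

    def append(self, text, ttl=DEFAULT_TTL, now=None):
        now = time.time() if now is None else now
        self._entries.append({"text": text, "expires": now + ttl})

    def query(self, now=None):
        # L2 lookup: only entries whose TTL has not elapsed survive.
        now = time.time() if now is None else now
        return [e["text"] for e in self._entries if e["expires"] > now]

pad = Scratchpad()
pad.append("learned: PR #374 is in review", now=0)       # expires at t=3600
pad.append("short-lived note", ttl=60, now=0)            # expires at t=60
```

Expired entries simply stop appearing in `query()`; nothing here writes to the palace, matching the "never auto-promoted" rule.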

### 4. Memory Promotion — #371

Explicit promotion from scratchpad to palace:
- Agent must call `promote_to_palace()` with a reason
- Dedup check against target drawer
- Summary required (raw tool output never stored)
- Conflict detection when new memory contradicts existing
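
The guardrails above can be sketched as one function. This is a hypothetical signature, not the shipped #371 implementation, and the conflict check here is a deliberately naive placeholder (a stored literal negation of the new summary):

```python
def promote_to_palace(drawer, summary, reason):
    """Explicit promotion: requires a reason, dedups, flags contradictions.

    drawer: dict of already-stored summaries. Returns a status string.
    """
    if not reason:
        raise ValueError("promotion requires a reason")
    if summary in drawer.values():
        return "duplicate: not stored"              # dedup check
    if any(s == f"not {summary}" or summary == f"not {s}" for s in drawer.values()):
        return "conflict: needs review"             # contradiction flagged
    key = f"memory-{len(drawer) + 1}"
    drawer[key] = summary        # summaries only; raw tool output never stored
    return f"stored as {key}"

drawer = {}
r1 = promote_to_palace(drawer, "fleet has 5 agents", reason="post-run summary")
r2 = promote_to_palace(drawer, "fleet has 5 agents", reason="again")
r3 = promote_to_palace(drawer, "not fleet has 5 agents", reason="contradiction")
```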

### 5. Wake-Up Protocol (`wakeup.py`) — #372

Boot sequence for new sessions:

```
Session Start
│
├─ L0: Load identity.txt
├─ L1: Scan palace rooms for active context
├─ L1.5: Surface promoted memories from last session
├─ L2: Load surviving scratchpad entries
│
└─ Ready: agent knows who it is, what it was doing, what it learned
```

## Data Flow

```
┌──────────────────┐
│   User Prompt    │
└────────┬─────────┘
         │
┌────────┴─────────┐
│ Recall Detector  │
└────┬───────┬─────┘
     │       │
 [recall] [not recall]
     │       │
┌────┴──────┐ ┌──┴──────────┐
│ Retrieval │ │ Normal Flow │
│ Enforcer  │ └─────────────┘
│ L0→L1→L2  │
│ →L3→L4→L5 │
└────┬──────┘
     │
┌────┴───────┐
│  Response  │
│ (grounded) │
└────────────┘
```

## Anti-Patterns

| Don't | Do Instead |
|-------|------------|
| Generate from vibes when palace has data | Check palace first (L1) |
| Auto-promote everything to palace | Require explicit `promote_to_palace()` with reason |
| Store raw API responses as memories | Summarize before storing |
| Hallucinate when palace is empty | Say "I don't have this in my memory palace" |
| Dump entire palace on wake-up | Selective loading based on session context |

## Status

| Component | Issue | PR | Status |
|-----------|-------|----|--------|
| Skill port | #368 | #374 | In Review |
| Retrieval enforcer | #369 | #374 | In Review |
| Session scratchpad | #370 | #374 | In Review |
| Memory promotion | #371 | — | Open |
| Wake-up protocol | #372 | #374 | In Review |
122 fleet/agent_lifecycle.py Normal file
@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""
FLEET-012: Agent Lifecycle Manager
Phase 5: Scale — spawn, train, deploy, retire agents automatically.

Manages the full lifecycle:
1. PROVISION: Clone template, install deps, configure, test
2. DEPLOY: Add to active rotation, start accepting issues
3. MONITOR: Track performance, quality, heartbeat
4. RETIRE: Decommission when idle or underperforming

Usage:
    python3 agent_lifecycle.py provision <name> <vps> [model]
    python3 agent_lifecycle.py deploy <name>
    python3 agent_lifecycle.py retire <name>
    python3 agent_lifecycle.py status
    python3 agent_lifecycle.py monitor
"""

import os, sys, json
from datetime import datetime, timezone

DATA_DIR = os.path.expanduser("~/.local/timmy/fleet-agents")
DB_FILE = os.path.join(DATA_DIR, "agents.json")
LOG_FILE = os.path.join(DATA_DIR, "lifecycle.log")

def ensure():
    os.makedirs(DATA_DIR, exist_ok=True)

def log(msg, level="INFO"):
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    entry = f"[{ts}] [{level}] {msg}"
    with open(LOG_FILE, "a") as f: f.write(entry + "\n")
    print(f" {entry}")

def load():
    if os.path.exists(DB_FILE):
        with open(DB_FILE) as f:
            return json.load(f)
    return {}

def save(db):
    with open(DB_FILE, "w") as f:
        json.dump(db, f, indent=2)

def status():
    agents = load()
    print("\n=== Agent Fleet ===")
    if not agents:
        print(" No agents registered.")
        return
    for name, a in agents.items():
        state = a.get("state", "?")
        vps = a.get("vps", "?")
        model = a.get("model", "?")
        tasks = a.get("tasks_completed", 0)
        hb = a.get("last_heartbeat", "never")
        print(f" {name:15s} state={state:12s} vps={vps:5s} model={model:15s} tasks={tasks} hb={hb}")

def provision(name, vps, model="hermes4:14b"):
    agents = load()
    if name in agents:
        print(f" '{name}' already exists (state={agents[name].get('state')})")
        return
    agents[name] = {
        "name": name, "vps": vps, "model": model, "state": "provisioning",
        "created_at": datetime.now(timezone.utc).isoformat(),
        "tasks_completed": 0, "tasks_failed": 0, "last_heartbeat": None,
    }
    save(agents)
    log(f"Provisioned '{name}' on {vps} with {model}")

def deploy(name):
    agents = load()
    if name not in agents:
        print(f" '{name}' not found")
        return
    agents[name]["state"] = "deployed"
    agents[name]["deployed_at"] = datetime.now(timezone.utc).isoformat()
    save(agents)
    log(f"Deployed '{name}'")

def retire(name):
    agents = load()
    if name not in agents:
        print(f" '{name}' not found")
        return
    agents[name]["state"] = "retired"
    agents[name]["retired_at"] = datetime.now(timezone.utc).isoformat()
    save(agents)
    log(f"Retired '{name}'. Completed {agents[name].get('tasks_completed', 0)} tasks.")

def monitor():
    agents = load()
    now = datetime.now(timezone.utc)
    changes = 0
    for name, a in agents.items():
        if a.get("state") != "deployed": continue
        hb = a.get("last_heartbeat")
        if hb:
            try:
                hb_t = datetime.fromisoformat(hb)
                hours = (now - hb_t).total_seconds() / 3600
                if hours > 24:
                    a["state"] = "idle"
                    a["idle_since"] = now.isoformat()
                    log(f"'{name}' idle for {hours:.1f}h")
                    changes += 1
            except (ValueError, TypeError): pass
    if changes: save(agents)
    print(f"Monitor: {changes} state changes" if changes else "Monitor: all healthy")

if __name__ == "__main__":
    ensure()
    cmd = sys.argv[1] if len(sys.argv) > 1 else "monitor"
    if cmd == "status": status()
    elif cmd == "provision" and len(sys.argv) >= 4:
        model = sys.argv[4] if len(sys.argv) >= 5 else "hermes4:14b"
        provision(sys.argv[2], sys.argv[3], model)
    elif cmd == "deploy" and len(sys.argv) >= 3: deploy(sys.argv[2])
    elif cmd == "retire" and len(sys.argv) >= 3: retire(sys.argv[2])
    elif cmd in ("monitor", "run"): monitor()
    else: print("Usage: agent_lifecycle.py [provision|deploy|retire|status|monitor]")
122 fleet/delegation.py Normal file
@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""
FLEET-010: Cross-Agent Task Delegation Protocol
Phase 3: Orchestration. Agents create issues, assign to other agents, review PRs.

Keyword-based heuristic assigns unassigned issues to the right agent:
- claw-code: small patches, config, docs, repo hygiene
- gemini: research, heavy implementation, architecture, debugging
- ezra: VPS, SSH, deploy, infrastructure, cron, ops
- bezalel: evennia, art, creative, music, visualization
- timmy: orchestration, review, deploy, fleet, pipeline

Usage:
    python3 delegation.py run      # Full cycle: scan, assign, report
    python3 delegation.py status   # Show current delegation state
    python3 delegation.py monitor  # Check agent assignments for stuck items
"""

import os, sys, json, urllib.request, urllib.error
from datetime import datetime, timezone
from pathlib import Path

GITEA_BASE = "https://forge.alexanderwhitestone.com/api/v1"
TOKEN = Path(os.path.expanduser("~/.config/gitea/token")).read_text().strip()
DATA_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-resources"))
LOG_FILE = DATA_DIR / "delegation.log"
HEADERS = {"Authorization": f"token {TOKEN}"}

AGENTS = {
    "claw-code": {"caps": ["patch","config","gitignore","cleanup","format","readme","typo"], "active": True},
    "gemini": {"caps": ["research","investigate","benchmark","survey","evaluate","architecture","implementation"], "active": True},
    "ezra": {"caps": ["vps","ssh","deploy","cron","resurrect","provision","infra","server"], "active": True},
    "bezalel": {"caps": ["evennia","art","creative","music","visual","design","animation"], "active": True},
    "timmy": {"caps": ["orchestrate","review","pipeline","fleet","monitor","health","deploy","ci"], "active": True},
}

MONITORED = [
    "Timmy_Foundation/timmy-home",
    "Timmy_Foundation/timmy-config",
    "Timmy_Foundation/the-nexus",
    "Timmy_Foundation/hermes-agent",
]

def api(path, method="GET", data=None):
    url = f"{GITEA_BASE}{path}"
    body = json.dumps(data).encode() if data else None
    hdrs = dict(HEADERS)
    if data: hdrs["Content-Type"] = "application/json"
    req = urllib.request.Request(url, data=body, headers=hdrs, method=method)
    try:
        resp = urllib.request.urlopen(req, timeout=15)
        raw = resp.read().decode()
        return json.loads(raw) if raw.strip() else {}
    except urllib.error.HTTPError as e:
        err = e.read().decode()
        print(f" API {e.code}: {err[:150]}")
        return None
    except Exception as e:
        print(f" API error: {e}")
        return None

def log(msg):
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    with open(LOG_FILE, "a") as f: f.write(f"[{ts}] {msg}\n")

def suggest_agent(title, body):
    text = (title + " " + body).lower()
    for agent, info in AGENTS.items():
        for kw in info["caps"]:
            if kw in text:
                return agent, f"matched: {kw}"
    return None, None

def assign(repo, num, agent, reason=""):
    result = api(f"/repos/{repo}/issues/{num}", method="PATCH",
                 data={"assignees": {"operation": "set", "usernames": [agent]}})
    if result:
        api(f"/repos/{repo}/issues/{num}/comments", method="POST",
            data={"body": f"[DELEGATION] Assigned to {agent}. {reason}"})
        log(f"Assigned {repo}#{num} to {agent}: {reason}")
    return result

def run_cycle():
    log("--- Delegation cycle start ---")
    count = 0
    for repo in MONITORED:
        issues = api(f"/repos/{repo}/issues?state=open&limit=50")
        if not issues: continue
        for i in issues:
            if i.get("assignees"): continue
            title = i.get("title", "")
            body = i.get("body") or ""  # body can be null in the API response
            if any(w in title.lower() for w in ["epic", "discussion"]): continue
            agent, reason = suggest_agent(title, body)
            if agent and AGENTS.get(agent, {}).get("active"):
                if assign(repo, i["number"], agent, reason): count += 1
    log(f"Cycle complete: {count} new assignments")
    print(f"Delegation cycle: {count} assignments")
    return count

def status():
    print("\n=== Delegation Dashboard ===")
    for agent, info in AGENTS.items():
        count = 0
        for repo in MONITORED:
            issues = api(f"/repos/{repo}/issues?state=open&limit=50")
            if issues:
                for i in issues:
                    for a in (i.get("assignees") or []):
                        if a.get("login") == agent: count += 1
        icon = "ON" if info["active"] else "OFF"
        print(f" {agent:12s}: {count:>3} issues [{icon}]")

if __name__ == "__main__":
    cmd = sys.argv[1] if len(sys.argv) > 1 else "run"
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    if cmd == "status": status()
    elif cmd == "run":
        run_cycle()
        status()
    else: status()
126 fleet/model_pipeline.py Normal file
@@ -0,0 +1,126 @@
#!/usr/bin/env python3
"""
FLEET-011: Local Model Pipeline and Fallback Chain
Phase 4: Sovereignty — all inference runs locally, no cloud dependency.

Checks Ollama endpoints, verifies model availability, tests fallback chain.
Logs results. The chain runs: hermes4:14b -> qwen2.5:7b -> phi3:3.8b -> gemma3:1b

Usage:
    python3 model_pipeline.py         # Show current model status (default)
    python3 model_pipeline.py status  # Show current model status
    python3 model_pipeline.py list    # List all local models
    python3 model_pipeline.py test    # Run the fallback chain test
"""

import os, sys, json, urllib.request
from datetime import datetime, timezone
from pathlib import Path

OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "localhost:11434")
LOG_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-health"))
CHAIN_FILE = Path(os.path.expanduser("~/.local/timmy/fleet-resources/model-chain.json"))

DEFAULT_CHAIN = [
    {"model": "hermes4:14b", "role": "primary"},
    {"model": "qwen2.5:7b", "role": "fallback"},
    {"model": "phi3:3.8b", "role": "emergency"},
    {"model": "gemma3:1b", "role": "minimal"},
]

def log(msg):
    LOG_DIR.mkdir(parents=True, exist_ok=True)
    with open(LOG_DIR / "model-pipeline.log", "a") as f:
        f.write(f"[{datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S')}] {msg}\n")

def check_ollama():
    try:
        resp = urllib.request.urlopen(f"http://{OLLAMA_HOST}/api/tags", timeout=5)
        return json.loads(resp.read())
    except Exception as e:
        return {"error": str(e)}

def list_models():
    data = check_ollama()
    if "error" in data:
        print(f" Ollama not reachable at {OLLAMA_HOST}: {data['error']}")
        return []
    models = data.get("models", [])
    for m in models:
        name = m.get("name", "?")
        size = m.get("size", 0) / (1024**3)
        print(f" {name:<25s} {size:.1f} GB")
    return [m["name"] for m in models]

def test_model(model, prompt="Say 'beacon lit' and nothing else."):
    try:
        body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(f"http://{OLLAMA_HOST}/api/generate", data=body,
                                     headers={"Content-Type": "application/json"})
        resp = urllib.request.urlopen(req, timeout=60)
        result = json.loads(resp.read())
        return True, result.get("response", "").strip()
    except Exception as e:
        return False, str(e)[:100]

def test_chain():
    chain_data = {}
    if CHAIN_FILE.exists():
        chain_data = json.loads(CHAIN_FILE.read_text())
    chain = chain_data.get("chain", DEFAULT_CHAIN)

    available = list_models() or []
    print("\n=== Fallback Chain Test ===")
    first_good = None

    for entry in chain:
        model = entry["model"]
        role = entry.get("role", "unknown")
        if model in available:
            ok, result = test_model(model)
            status = "OK" if ok else "FAIL"
            print(f" [{status}] {model:<25s} ({role}) — {result[:70]}")
            log(f"Fallback test {model}: {status} — {result[:100]}")
            if ok and first_good is None:
                first_good = model
        else:
            print(f" [MISS] {model:<25s} ({role}) — not installed")

    if first_good:
        print(f"\n Primary serving: {first_good}")
    else:
        print("\n WARNING: No chain model responding. Fallback broken.")
        log("FALLBACK CHAIN BROKEN — no models responding")

def status():
    data = check_ollama()
    if "error" in data:
        print(f" Ollama: DOWN — {data['error']}")
    else:
        models = data.get("models", [])
        print(f" Ollama: UP — {len(models)} models loaded")
    print("\n=== Local Models ===")
    list_models()
    print("\n=== Chain Configuration ===")
    if CHAIN_FILE.exists():
        chain = json.loads(CHAIN_FILE.read_text()).get("chain", DEFAULT_CHAIN)
    else:
        chain = DEFAULT_CHAIN
    for e in chain:
        print(f" {e['model']:<25s} {e.get('role','?')}")

if __name__ == "__main__":
    cmd = sys.argv[1] if len(sys.argv) > 1 else "status"
    if cmd == "status": status()
    elif cmd == "list": list_models()
    elif cmd == "test": test_chain()
    else:
        status()
        test_chain()