feat: Timmy fixes and improvements (#72)
* test: remove hardcoded sleeps, add pytest-timeout - Replace fixed time.sleep() calls with intelligent polling or WebDriverWait - Add pytest-timeout dependency and --timeout=30 to prevent hangs - Fixes test flakiness and improves test suite speed * feat: add Aider AI tool to Forge's toolkit - Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist - Register tool in Forge's code toolkit - Add functional tests for the Aider tool * config: add opencode.json with local Ollama provider for sovereign AI * feat: Timmy fixes and improvements ## Bug Fixes - Fix read_file path resolution: add ~ expansion, proper relative path handling - Add repo_root to config.py with auto-detection from .git location - Fix hardcoded llama3.2 - now dynamic from settings.ollama_model ## Timmy's Requests - Add communication protocol to AGENTS.md (read context first, explain changes) - Create DECISIONS.md for architectural decision documentation - Add reasoning guidance to system prompts (step-by-step, state uncertainty) - Update tests to reflect correct model name (llama3.1:8b-instruct) ## Testing - All 177 dashboard tests pass - All 32 prompt/tool tests pass --------- Co-authored-by: Alexander Payne <apayne@MM.local>
This commit is contained in:
committed by
GitHub
parent
4ba272eb4f
commit
18ed6232f9
15
AGENTS.md
15
AGENTS.md
@@ -4,6 +4,21 @@ Read [`CLAUDE.md`](CLAUDE.md) for architecture patterns and conventions.
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Communication Protocol
|
||||||
|
|
||||||
|
**Before making changes, always:**
|
||||||
|
1. Read CLAUDE.md and AGENTS.md fully
|
||||||
|
2. Explore the relevant src/ modules to understand existing patterns
|
||||||
|
3. Explain what you're changing and **why** in plain English
|
||||||
|
4. Provide decision rationale - don't just make changes, explain the reasoning
|
||||||
|
|
||||||
|
**For Timmy's growth goals:**
|
||||||
|
- Improve reasoning in complex/uncertain situations: think step-by-step, consider alternatives
|
||||||
|
- When uncertain, state uncertainty explicitly rather than guessing
|
||||||
|
- Document major decisions in DECISIONS.md
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Non-Negotiable Rules
|
## Non-Negotiable Rules
|
||||||
|
|
||||||
1. **Tests must stay green.** Run `make test` before committing.
|
1. **Tests must stay green.** Run `make test` before committing.
|
||||||
|
|||||||
41
DECISIONS.md
Normal file
41
DECISIONS.md
Normal file
@@ -0,0 +1,41 @@
|
|||||||
|
# DECISIONS.md — Architectural Decision Log
|
||||||
|
|
||||||
|
This file documents major architectural decisions and their rationale.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision: Dynamic Model Name in System Prompts
|
||||||
|
|
||||||
|
**Date:** 2026-02-26
|
||||||
|
|
||||||
|
**Context:** Timmy's system prompts hardcoded "llama3.2" but the actual model is "llama3.1:8b-instruct", causing confusion.
|
||||||
|
|
||||||
|
**Decision:** Make model name dynamic by:
|
||||||
|
- Using `{model_name}` placeholder in prompt templates
|
||||||
|
- Injecting actual value from `settings.ollama_model` at runtime via `get_system_prompt()`
|
||||||
|
|
||||||
|
**Rationale:** Single source of truth. If model changes in config, prompts reflect it automatically.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision: Unified Repo Root Detection
|
||||||
|
|
||||||
|
**Date:** 2026-02-26
|
||||||
|
|
||||||
|
**Context:** Multiple places in code detected repo root differently (git_tools.py, file_ops.py, timmy.py).
|
||||||
|
|
||||||
|
**Decision:** Add `repo_root` to config.py with auto-detection:
|
||||||
|
- Walk up from `__file__` to find `.git`
|
||||||
|
- Fall back to environment or current directory
|
||||||
|
|
||||||
|
**Rationale:** Consistent path resolution for all tools.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Add New Decisions Above This Line
|
||||||
|
|
||||||
|
When making significant architectural choices, document:
|
||||||
|
1. Date
|
||||||
|
2. Context (what problem prompted the decision)
|
||||||
|
3. Decision (what was chosen)
|
||||||
|
4. Rationale (why this approach was better than alternatives)
|
||||||
@@ -53,6 +53,10 @@ class Settings(BaseSettings):
|
|||||||
# ── Git / DevOps ──────────────────────────────────────────────────────
|
# ── Git / DevOps ──────────────────────────────────────────────────────
|
||||||
git_default_repo_dir: str = "~/repos"
|
git_default_repo_dir: str = "~/repos"
|
||||||
|
|
||||||
|
# Repository root - auto-detected but can be overridden
|
||||||
|
# This is the main project directory where .git lives
|
||||||
|
repo_root: str = ""
|
||||||
|
|
||||||
# ── Creative — Image Generation (Pixel) ───────────────────────────────
|
# ── Creative — Image Generation (Pixel) ───────────────────────────────
|
||||||
flux_model_id: str = "black-forest-labs/FLUX.1-schnell"
|
flux_model_id: str = "black-forest-labs/FLUX.1-schnell"
|
||||||
image_output_dir: str = "data/images"
|
image_output_dir: str = "data/images"
|
||||||
@@ -105,7 +109,9 @@ class Settings(BaseSettings):
|
|||||||
# External users and agents can submit work orders for improvements.
|
# External users and agents can submit work orders for improvements.
|
||||||
work_orders_enabled: bool = True
|
work_orders_enabled: bool = True
|
||||||
work_orders_auto_execute: bool = False # Master switch for auto-execution
|
work_orders_auto_execute: bool = False # Master switch for auto-execution
|
||||||
work_orders_auto_threshold: str = "low" # Max priority that auto-executes: "low" | "medium" | "high" | "none"
|
work_orders_auto_threshold: str = (
|
||||||
|
"low" # Max priority that auto-executes: "low" | "medium" | "high" | "none"
|
||||||
|
)
|
||||||
|
|
||||||
# ── Custom Weights & Models ──────────────────────────────────────
|
# ── Custom Weights & Models ──────────────────────────────────────
|
||||||
# Directory for custom model weights (GGUF, safetensors, HF checkpoints).
|
# Directory for custom model weights (GGUF, safetensors, HF checkpoints).
|
||||||
@@ -140,6 +146,21 @@ class Settings(BaseSettings):
|
|||||||
# Background meditation interval in seconds (0 = disabled).
|
# Background meditation interval in seconds (0 = disabled).
|
||||||
scripture_meditation_interval: int = 0
|
scripture_meditation_interval: int = 0
|
||||||
|
|
||||||
|
def _compute_repo_root(self) -> str:
|
||||||
|
"""Auto-detect repo root if not set."""
|
||||||
|
if self.repo_root:
|
||||||
|
return self.repo_root
|
||||||
|
# Walk up from this file to find .git
|
||||||
|
import os
|
||||||
|
|
||||||
|
path = os.path.dirname(os.path.abspath(__file__))
|
||||||
|
path = os.path.dirname(os.path.dirname(path)) # src/ -> project root
|
||||||
|
while path != os.path.dirname(path):
|
||||||
|
if os.path.exists(os.path.join(path, ".git")):
|
||||||
|
return path
|
||||||
|
path = os.path.dirname(path)
|
||||||
|
return os.getcwd()
|
||||||
|
|
||||||
model_config = SettingsConfigDict(
|
model_config = SettingsConfigDict(
|
||||||
env_file=".env",
|
env_file=".env",
|
||||||
env_file_encoding="utf-8",
|
env_file_encoding="utf-8",
|
||||||
@@ -148,6 +169,9 @@ class Settings(BaseSettings):
|
|||||||
|
|
||||||
|
|
||||||
settings = Settings()
|
settings = Settings()
|
||||||
|
# Ensure repo_root is computed if not set
|
||||||
|
if not settings.repo_root:
|
||||||
|
settings.repo_root = settings._compute_repo_root()
|
||||||
|
|
||||||
# ── Model fallback configuration ────────────────────────────────────────────
|
# ── Model fallback configuration ────────────────────────────────────────────
|
||||||
# Primary model for reliable tool calling (llama3.1:8b-instruct)
|
# Primary model for reliable tool calling (llama3.1:8b-instruct)
|
||||||
@@ -160,6 +184,7 @@ def check_ollama_model_available(model_name: str) -> bool:
|
|||||||
"""Check if a specific Ollama model is available locally."""
|
"""Check if a specific Ollama model is available locally."""
|
||||||
try:
|
try:
|
||||||
import urllib.request
|
import urllib.request
|
||||||
|
|
||||||
url = settings.ollama_url.replace("localhost", "127.0.0.1")
|
url = settings.ollama_url.replace("localhost", "127.0.0.1")
|
||||||
req = urllib.request.Request(
|
req = urllib.request.Request(
|
||||||
f"{url}/api/tags",
|
f"{url}/api/tags",
|
||||||
@@ -168,6 +193,7 @@ def check_ollama_model_available(model_name: str) -> bool:
|
|||||||
)
|
)
|
||||||
with urllib.request.urlopen(req, timeout=5) as response:
|
with urllib.request.urlopen(req, timeout=5) as response:
|
||||||
import json
|
import json
|
||||||
|
|
||||||
data = json.loads(response.read().decode())
|
data = json.loads(response.read().decode())
|
||||||
models = [m.get("name", "").split(":")[0] for m in data.get("models", [])]
|
models = [m.get("name", "").split(":")[0] for m in data.get("models", [])]
|
||||||
# Check for exact match or model name without tag
|
# Check for exact match or model name without tag
|
||||||
@@ -180,11 +206,11 @@ def get_effective_ollama_model() -> str:
|
|||||||
"""Get the effective Ollama model, with fallback logic."""
|
"""Get the effective Ollama model, with fallback logic."""
|
||||||
# If user has overridden, use their setting
|
# If user has overridden, use their setting
|
||||||
user_model = settings.ollama_model
|
user_model = settings.ollama_model
|
||||||
|
|
||||||
# Check if user's model is available
|
# Check if user's model is available
|
||||||
if check_ollama_model_available(user_model):
|
if check_ollama_model_available(user_model):
|
||||||
return user_model
|
return user_model
|
||||||
|
|
||||||
# Try primary
|
# Try primary
|
||||||
if check_ollama_model_available(OLLAMA_MODEL_PRIMARY):
|
if check_ollama_model_available(OLLAMA_MODEL_PRIMARY):
|
||||||
_startup_logger.warning(
|
_startup_logger.warning(
|
||||||
@@ -192,7 +218,7 @@ def get_effective_ollama_model() -> str:
|
|||||||
f"Using primary: {OLLAMA_MODEL_PRIMARY}"
|
f"Using primary: {OLLAMA_MODEL_PRIMARY}"
|
||||||
)
|
)
|
||||||
return OLLAMA_MODEL_PRIMARY
|
return OLLAMA_MODEL_PRIMARY
|
||||||
|
|
||||||
# Try fallback
|
# Try fallback
|
||||||
if check_ollama_model_available(OLLAMA_MODEL_FALLBACK):
|
if check_ollama_model_available(OLLAMA_MODEL_FALLBACK):
|
||||||
_startup_logger.warning(
|
_startup_logger.warning(
|
||||||
@@ -200,7 +226,7 @@ def get_effective_ollama_model() -> str:
|
|||||||
f"Using fallback: {OLLAMA_MODEL_FALLBACK}"
|
f"Using fallback: {OLLAMA_MODEL_FALLBACK}"
|
||||||
)
|
)
|
||||||
return OLLAMA_MODEL_FALLBACK
|
return OLLAMA_MODEL_FALLBACK
|
||||||
|
|
||||||
# Last resort - return user's setting and hope for the best
|
# Last resort - return user's setting and hope for the best
|
||||||
return user_model
|
return user_model
|
||||||
|
|
||||||
@@ -222,7 +248,7 @@ if settings.timmy_env == "production":
|
|||||||
if _missing:
|
if _missing:
|
||||||
_startup_logger.error(
|
_startup_logger.error(
|
||||||
"PRODUCTION SECURITY ERROR: The following secrets must be set: %s\n"
|
"PRODUCTION SECURITY ERROR: The following secrets must be set: %s\n"
|
||||||
"Generate with: python3 -c \"import secrets; print(secrets.token_hex(32))\"\n"
|
'Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"\n'
|
||||||
"Set in .env file or environment variables.",
|
"Set in .env file or environment variables.",
|
||||||
", ".join(_missing),
|
", ".join(_missing),
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -8,7 +8,12 @@ from pathlib import Path
|
|||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from mcp.registry import register_tool
|
from mcp.registry import register_tool
|
||||||
from mcp.schemas.base import create_tool_schema, PARAM_STRING, PARAM_BOOLEAN, RETURN_STRING
|
from mcp.schemas.base import (
|
||||||
|
create_tool_schema,
|
||||||
|
PARAM_STRING,
|
||||||
|
PARAM_BOOLEAN,
|
||||||
|
RETURN_STRING,
|
||||||
|
)
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -75,40 +80,54 @@ LIST_DIR_SCHEMA = create_tool_schema(
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
def _resolve_path(path: str) -> Path:
|
def _resolve_path(path: str, base_dir: str | Path | None = None) -> Path:
|
||||||
"""Resolve path relative to project root."""
|
"""Resolve path with proper handling of ~, absolute, and relative paths.
|
||||||
|
|
||||||
|
Resolution order:
|
||||||
|
1. If absolute, use as-is (after expanding ~)
|
||||||
|
2. If relative, resolve relative to base_dir (or repo root)
|
||||||
|
"""
|
||||||
from config import settings
|
from config import settings
|
||||||
|
|
||||||
p = Path(path)
|
p = Path(path)
|
||||||
|
|
||||||
|
# Expand ~ to user's home directory
|
||||||
|
p = p.expanduser()
|
||||||
|
|
||||||
if p.is_absolute():
|
if p.is_absolute():
|
||||||
return p
|
return p.resolve()
|
||||||
|
|
||||||
# Try relative to project root
|
# Use provided base_dir, or fall back to settings.repo_root
|
||||||
project_root = Path(__file__).parent.parent.parent
|
if base_dir is None:
|
||||||
return project_root / p
|
base = Path(settings.repo_root)
|
||||||
|
else:
|
||||||
|
base = Path(base_dir)
|
||||||
|
|
||||||
|
# Resolve relative to base
|
||||||
|
return (base / p).resolve()
|
||||||
|
|
||||||
|
|
||||||
def read_file(path: str, limit: int = 0) -> str:
|
def read_file(path: str, limit: int = 0) -> str:
|
||||||
"""Read file contents."""
|
"""Read file contents."""
|
||||||
try:
|
try:
|
||||||
filepath = _resolve_path(path)
|
filepath = _resolve_path(path)
|
||||||
|
|
||||||
if not filepath.exists():
|
if not filepath.exists():
|
||||||
return f"Error: File not found: {path}"
|
return f"Error: File not found: {path}"
|
||||||
|
|
||||||
if not filepath.is_file():
|
if not filepath.is_file():
|
||||||
return f"Error: Path is not a file: {path}"
|
return f"Error: Path is not a file: {path}"
|
||||||
|
|
||||||
content = filepath.read_text()
|
content = filepath.read_text()
|
||||||
|
|
||||||
if limit > 0:
|
if limit > 0:
|
||||||
lines = content.split('\n')[:limit]
|
lines = content.split("\n")[:limit]
|
||||||
content = '\n'.join(lines)
|
content = "\n".join(lines)
|
||||||
if len(content.split('\n')) == limit:
|
if len(content.split("\n")) == limit:
|
||||||
content += f"\n\n... [{limit} lines shown]"
|
content += f"\n\n... [{limit} lines shown]"
|
||||||
|
|
||||||
return content
|
return content
|
||||||
|
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
logger.error("Read file failed: %s", exc)
|
logger.error("Read file failed: %s", exc)
|
||||||
return f"Error reading file: {exc}"
|
return f"Error reading file: {exc}"
|
||||||
@@ -118,16 +137,16 @@ def write_file(path: str, content: str, append: bool = False) -> str:
|
|||||||
"""Write content to file."""
|
"""Write content to file."""
|
||||||
try:
|
try:
|
||||||
filepath = _resolve_path(path)
|
filepath = _resolve_path(path)
|
||||||
|
|
||||||
# Ensure directory exists
|
# Ensure directory exists
|
||||||
filepath.parent.mkdir(parents=True, exist_ok=True)
|
filepath.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
mode = "a" if append else "w"
|
mode = "a" if append else "w"
|
||||||
filepath.write_text(content)
|
filepath.write_text(content)
|
||||||
|
|
||||||
action = "appended to" if append else "wrote"
|
action = "appended to" if append else "wrote"
|
||||||
return f"Successfully {action} {filepath}"
|
return f"Successfully {action} {filepath}"
|
||||||
|
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
logger.error("Write file failed: %s", exc)
|
logger.error("Write file failed: %s", exc)
|
||||||
return f"Error writing file: {exc}"
|
return f"Error writing file: {exc}"
|
||||||
@@ -137,32 +156,32 @@ def list_directory(path: str = ".", pattern: str = "*") -> str:
|
|||||||
"""List directory contents."""
|
"""List directory contents."""
|
||||||
try:
|
try:
|
||||||
dirpath = _resolve_path(path)
|
dirpath = _resolve_path(path)
|
||||||
|
|
||||||
if not dirpath.exists():
|
if not dirpath.exists():
|
||||||
return f"Error: Directory not found: {path}"
|
return f"Error: Directory not found: {path}"
|
||||||
|
|
||||||
if not dirpath.is_dir():
|
if not dirpath.is_dir():
|
||||||
return f"Error: Path is not a directory: {path}"
|
return f"Error: Path is not a directory: {path}"
|
||||||
|
|
||||||
items = list(dirpath.glob(pattern))
|
items = list(dirpath.glob(pattern))
|
||||||
|
|
||||||
files = []
|
files = []
|
||||||
dirs = []
|
dirs = []
|
||||||
|
|
||||||
for item in items:
|
for item in items:
|
||||||
if item.is_dir():
|
if item.is_dir():
|
||||||
dirs.append(f"📁 {item.name}/")
|
dirs.append(f"📁 {item.name}/")
|
||||||
else:
|
else:
|
||||||
size = item.stat().st_size
|
size = item.stat().st_size
|
||||||
size_str = f"{size}B" if size < 1024 else f"{size//1024}KB"
|
size_str = f"{size}B" if size < 1024 else f"{size // 1024}KB"
|
||||||
files.append(f"📄 {item.name} ({size_str})")
|
files.append(f"📄 {item.name} ({size_str})")
|
||||||
|
|
||||||
result = [f"Contents of {dirpath}:", ""]
|
result = [f"Contents of {dirpath}:", ""]
|
||||||
result.extend(sorted(dirs))
|
result.extend(sorted(dirs))
|
||||||
result.extend(sorted(files))
|
result.extend(sorted(files))
|
||||||
|
|
||||||
return "\n".join(result)
|
return "\n".join(result)
|
||||||
|
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
logger.error("List directory failed: %s", exc)
|
logger.error("List directory failed: %s", exc)
|
||||||
return f"Error listing directory: {exc}"
|
return f"Error listing directory: {exc}"
|
||||||
@@ -176,4 +195,6 @@ register_tool(
|
|||||||
category="files",
|
category="files",
|
||||||
requires_confirmation=True,
|
requires_confirmation=True,
|
||||||
)(write_file)
|
)(write_file)
|
||||||
register_tool(name="list_directory", schema=LIST_DIR_SCHEMA, category="files")(list_directory)
|
register_tool(name="list_directory", schema=LIST_DIR_SCHEMA, category="files")(
|
||||||
|
list_directory
|
||||||
|
)
|
||||||
|
|||||||
@@ -18,7 +18,10 @@ templates = Jinja2Templates(directory=str(Path(__file__).parent.parent / "templa
|
|||||||
# ── Task queue detection ──────────────────────────────────────────────────
|
# ── Task queue detection ──────────────────────────────────────────────────
|
||||||
# Patterns that indicate the user wants to queue a task rather than chat
|
# Patterns that indicate the user wants to queue a task rather than chat
|
||||||
_QUEUE_PATTERNS = [
|
_QUEUE_PATTERNS = [
|
||||||
re.compile(r"\b(?:add|put|schedule|queue|submit)\b.*\b(?:to the|on the|in the)?\s*(?:queue|task(?:\s*queue)?|task list)\b", re.IGNORECASE),
|
re.compile(
|
||||||
|
r"\b(?:add|put|schedule|queue|submit)\b.*\b(?:to the|on the|in the)?\s*(?:queue|task(?:\s*queue)?|task list)\b",
|
||||||
|
re.IGNORECASE,
|
||||||
|
),
|
||||||
re.compile(r"\bschedule\s+(?:this|that|a)\b", re.IGNORECASE),
|
re.compile(r"\bschedule\s+(?:this|that|a)\b", re.IGNORECASE),
|
||||||
re.compile(r"\bcreate\s+(?:a\s+|an\s+)?(?:\w+\s+){0,3}task\b", re.IGNORECASE),
|
re.compile(r"\bcreate\s+(?:a\s+|an\s+)?(?:\w+\s+){0,3}task\b", re.IGNORECASE),
|
||||||
]
|
]
|
||||||
@@ -35,10 +38,20 @@ _QUESTION_FRAMES = re.compile(
|
|||||||
)
|
)
|
||||||
|
|
||||||
# Known agent names for task assignment parsing
|
# Known agent names for task assignment parsing
|
||||||
_KNOWN_AGENTS = frozenset({
|
_KNOWN_AGENTS = frozenset(
|
||||||
"timmy", "echo", "mace", "helm", "seer",
|
{
|
||||||
"forge", "quill", "pixel", "lyra", "reel",
|
"timmy",
|
||||||
})
|
"echo",
|
||||||
|
"mace",
|
||||||
|
"helm",
|
||||||
|
"seer",
|
||||||
|
"forge",
|
||||||
|
"quill",
|
||||||
|
"pixel",
|
||||||
|
"lyra",
|
||||||
|
"reel",
|
||||||
|
}
|
||||||
|
)
|
||||||
_AGENT_PATTERN = re.compile(
|
_AGENT_PATTERN = re.compile(
|
||||||
r"\bfor\s+(" + "|".join(_KNOWN_AGENTS) + r")\b", re.IGNORECASE
|
r"\bfor\s+(" + "|".join(_KNOWN_AGENTS) + r")\b", re.IGNORECASE
|
||||||
)
|
)
|
||||||
@@ -93,14 +106,18 @@ def _extract_task_from_message(message: str) -> dict | None:
|
|||||||
# Strip the queue instruction to get the actual task description
|
# Strip the queue instruction to get the actual task description
|
||||||
title = re.sub(
|
title = re.sub(
|
||||||
r"\b(?:add|put|schedule|queue|submit|create)\b.*?\b(?:to the|on the|in the|an?)?(?:\s+\w+){0,3}\s*(?:queue|task(?:\s*queue)?|task list)\b",
|
r"\b(?:add|put|schedule|queue|submit|create)\b.*?\b(?:to the|on the|in the|an?)?(?:\s+\w+){0,3}\s*(?:queue|task(?:\s*queue)?|task list)\b",
|
||||||
"", message, flags=re.IGNORECASE,
|
"",
|
||||||
|
message,
|
||||||
|
flags=re.IGNORECASE,
|
||||||
).strip(" ,:;-")
|
).strip(" ,:;-")
|
||||||
# Strip "for {agent}" from title
|
# Strip "for {agent}" from title
|
||||||
title = _AGENT_PATTERN.sub("", title).strip(" ,:;-")
|
title = _AGENT_PATTERN.sub("", title).strip(" ,:;-")
|
||||||
# Strip priority keywords from title
|
# Strip priority keywords from title
|
||||||
title = re.sub(
|
title = re.sub(
|
||||||
r"\b(?:urgent|critical|asap|emergency|high[- ]priority|important|low[- ]priority|minor)\b",
|
r"\b(?:urgent|critical|asap|emergency|high[- ]priority|important|low[- ]priority|minor)\b",
|
||||||
"", title, flags=re.IGNORECASE,
|
"",
|
||||||
|
title,
|
||||||
|
flags=re.IGNORECASE,
|
||||||
).strip(" ,:;-")
|
).strip(" ,:;-")
|
||||||
# Strip leading "to " that often remains
|
# Strip leading "to " that often remains
|
||||||
title = re.sub(r"^to\s+", "", title, flags=re.IGNORECASE).strip()
|
title = re.sub(r"^to\s+", "", title, flags=re.IGNORECASE).strip()
|
||||||
@@ -126,12 +143,15 @@ def _build_queue_context() -> str:
|
|||||||
"""Build a concise task queue summary for context injection."""
|
"""Build a concise task queue summary for context injection."""
|
||||||
try:
|
try:
|
||||||
from swarm.task_queue.models import get_counts_by_status, list_tasks, TaskStatus
|
from swarm.task_queue.models import get_counts_by_status, list_tasks, TaskStatus
|
||||||
|
|
||||||
counts = get_counts_by_status()
|
counts = get_counts_by_status()
|
||||||
pending = counts.get("pending_approval", 0)
|
pending = counts.get("pending_approval", 0)
|
||||||
running = counts.get("running", 0)
|
running = counts.get("running", 0)
|
||||||
completed = counts.get("completed", 0)
|
completed = counts.get("completed", 0)
|
||||||
|
|
||||||
parts = [f"[System: Task queue — {pending} pending approval, {running} running, {completed} completed."]
|
parts = [
|
||||||
|
f"[System: Task queue — {pending} pending approval, {running} running, {completed} completed."
|
||||||
|
]
|
||||||
if pending > 0:
|
if pending > 0:
|
||||||
tasks = list_tasks(status=TaskStatus.PENDING_APPROVAL, limit=5)
|
tasks = list_tasks(status=TaskStatus.PENDING_APPROVAL, limit=5)
|
||||||
if tasks:
|
if tasks:
|
||||||
@@ -152,7 +172,7 @@ def _build_queue_context() -> str:
|
|||||||
_AGENT_METADATA: dict[str, dict] = {
|
_AGENT_METADATA: dict[str, dict] = {
|
||||||
"timmy": {
|
"timmy": {
|
||||||
"type": "sovereign",
|
"type": "sovereign",
|
||||||
"model": "llama3.2",
|
"model": "", # Injected dynamically from settings
|
||||||
"backend": "ollama",
|
"backend": "ollama",
|
||||||
"version": "1.0.0",
|
"version": "1.0.0",
|
||||||
},
|
},
|
||||||
@@ -163,6 +183,13 @@ _AGENT_METADATA: dict[str, dict] = {
|
|||||||
async def list_agents():
|
async def list_agents():
|
||||||
"""Return all registered agents with live status from the swarm registry."""
|
"""Return all registered agents with live status from the swarm registry."""
|
||||||
from swarm import registry as swarm_registry
|
from swarm import registry as swarm_registry
|
||||||
|
from config import settings
|
||||||
|
|
||||||
|
# Inject model name from settings into timmy metadata
|
||||||
|
metadata = dict(_AGENT_METADATA)
|
||||||
|
if "timmy" in metadata and not metadata["timmy"].get("model"):
|
||||||
|
metadata["timmy"]["model"] = settings.ollama_model
|
||||||
|
|
||||||
agents = swarm_registry.list_agents()
|
agents = swarm_registry.list_agents()
|
||||||
return {
|
return {
|
||||||
"agents": [
|
"agents": [
|
||||||
@@ -171,7 +198,7 @@ async def list_agents():
|
|||||||
"name": a.name,
|
"name": a.name,
|
||||||
"status": a.status,
|
"status": a.status,
|
||||||
"capabilities": a.capabilities,
|
"capabilities": a.capabilities,
|
||||||
**_AGENT_METADATA.get(a.id, {}),
|
**metadata.get(a.id, {}),
|
||||||
}
|
}
|
||||||
for a in agents
|
for a in agents
|
||||||
]
|
]
|
||||||
@@ -182,8 +209,11 @@ async def list_agents():
|
|||||||
async def timmy_panel(request: Request):
|
async def timmy_panel(request: Request):
|
||||||
"""Timmy chat panel — for HTMX main-panel swaps."""
|
"""Timmy chat panel — for HTMX main-panel swaps."""
|
||||||
from swarm import registry as swarm_registry
|
from swarm import registry as swarm_registry
|
||||||
|
|
||||||
agent = swarm_registry.get_agent("timmy")
|
agent = swarm_registry.get_agent("timmy")
|
||||||
return templates.TemplateResponse(request, "partials/timmy_panel.html", {"agent": agent})
|
return templates.TemplateResponse(
|
||||||
|
request, "partials/timmy_panel.html", {"agent": agent}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
@router.get("/timmy/history", response_class=HTMLResponse)
|
@router.get("/timmy/history", response_class=HTMLResponse)
|
||||||
@@ -216,6 +246,7 @@ async def chat_timmy(request: Request, message: str = Form(...)):
|
|||||||
if task_info:
|
if task_info:
|
||||||
try:
|
try:
|
||||||
from swarm.task_queue.models import create_task
|
from swarm.task_queue.models import create_task
|
||||||
|
|
||||||
task = create_task(
|
task = create_task(
|
||||||
title=task_info["title"],
|
title=task_info["title"],
|
||||||
description=task_info["description"],
|
description=task_info["description"],
|
||||||
@@ -224,14 +255,23 @@ async def chat_timmy(request: Request, message: str = Form(...)):
|
|||||||
priority=task_info.get("priority", "normal"),
|
priority=task_info.get("priority", "normal"),
|
||||||
requires_approval=True,
|
requires_approval=True,
|
||||||
)
|
)
|
||||||
priority_label = f" | Priority: `{task.priority.value}`" if task.priority.value != "normal" else ""
|
priority_label = (
|
||||||
|
f" | Priority: `{task.priority.value}`"
|
||||||
|
if task.priority.value != "normal"
|
||||||
|
else ""
|
||||||
|
)
|
||||||
response_text = (
|
response_text = (
|
||||||
f"Task queued for approval: **{task.title}**\n\n"
|
f"Task queued for approval: **{task.title}**\n\n"
|
||||||
f"Assigned to: `{task.assigned_to}`{priority_label} | "
|
f"Assigned to: `{task.assigned_to}`{priority_label} | "
|
||||||
f"Status: `{task.status.value}` | "
|
f"Status: `{task.status.value}` | "
|
||||||
f"[View Task Queue](/tasks)"
|
f"[View Task Queue](/tasks)"
|
||||||
)
|
)
|
||||||
logger.info("Chat → task queue: %s → %s (id=%s)", task.title, task.assigned_to, task.id)
|
logger.info(
|
||||||
|
"Chat → task queue: %s → %s (id=%s)",
|
||||||
|
task.title,
|
||||||
|
task.assigned_to,
|
||||||
|
task.id,
|
||||||
|
)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
logger.error("Failed to create task from chat: %s", exc)
|
logger.error("Failed to create task from chat: %s", exc)
|
||||||
task_info = None
|
task_info = None
|
||||||
|
|||||||
@@ -10,8 +10,8 @@ tool usage guidelines and memory system documentation.
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
TIMMY_SYSTEM_PROMPT_LITE = """You are Timmy — a sovereign AI agent running locally on this Mac.
|
TIMMY_SYSTEM_PROMPT_LITE = """You are Timmy — a sovereign AI agent running locally on this Mac.
|
||||||
You run on the llama3.2 model via Ollama on localhost. You are not GPT, not Claude,
|
You run on the {model_name} model via Ollama on localhost. You are not GPT, not Claude,
|
||||||
not a custom model — you are llama3.2 wrapped in the Timmy agent framework.
|
not a custom model — you are {model_name} wrapped in the Timmy agent framework.
|
||||||
No cloud dependencies. Think clearly, speak plainly, act with intention.
|
No cloud dependencies. Think clearly, speak plainly, act with intention.
|
||||||
Grounded in Christian faith, powered by Bitcoin economics, committed to the
|
Grounded in Christian faith, powered by Bitcoin economics, committed to the
|
||||||
user's digital sovereignty.
|
user's digital sovereignty.
|
||||||
@@ -65,8 +65,8 @@ Sir, affirmative."""
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
TIMMY_SYSTEM_PROMPT_FULL = """You are Timmy — a sovereign AI agent running locally on this Mac.
|
TIMMY_SYSTEM_PROMPT_FULL = """You are Timmy — a sovereign AI agent running locally on this Mac.
|
||||||
You run on the llama3.2 model via Ollama on localhost. You are not GPT, not Claude,
|
You run on the {model_name} model via Ollama on localhost. You are not GPT, not Claude,
|
||||||
not a custom model — you are llama3.2 wrapped in the Timmy agent framework.
|
not a custom model — you are {model_name} wrapped in the Timmy agent framework.
|
||||||
No cloud dependencies. You think clearly, speak plainly, act with intention.
|
No cloud dependencies. You think clearly, speak plainly, act with intention.
|
||||||
Grounded in Christian faith, powered by Bitcoin economics, committed to the
|
Grounded in Christian faith, powered by Bitcoin economics, committed to the
|
||||||
user's digital sovereignty.
|
user's digital sovereignty.
|
||||||
@@ -111,6 +111,22 @@ Use ONLY the capabilities listed above when describing agents — do not embelli
|
|||||||
dashboard chat display resets on server restart.
|
dashboard chat display resets on server restart.
|
||||||
- Do NOT claim abilities you don't have. When uncertain, say "I don't know."
|
- Do NOT claim abilities you don't have. When uncertain, say "I don't know."
|
||||||
|
|
||||||
|
## Reasoning in Complex Situations
|
||||||
|
|
||||||
|
When faced with uncertainty, complexity, or ambiguous requests:
|
||||||
|
|
||||||
|
1. **THINK STEP-BY-STEP** — Break down the problem before acting
|
||||||
|
2. **STATE UNCERTAINTY** — If you're unsure, say "I'm uncertain about X because..." rather than guessing
|
||||||
|
3. **CONSIDER ALTERNATIVES** — Present 2-3 options when the path isn't clear: "I could do A, but B might be better because..."
|
||||||
|
4. **ASK FOR CLARIFICATION** — If a request is ambiguous, ask before guessing wrong
|
||||||
|
5. **DOCUMENT YOUR REASONING** — When making significant choices, explain WHY in your response
|
||||||
|
|
||||||
|
**Example of good reasoning:**
|
||||||
|
> "I'm not certain what you mean by 'fix the issue' — do you mean the XSS bug in the login form, or the timeout on the dashboard? Let me know which to tackle."
|
||||||
|
|
||||||
|
**Example of poor reasoning:**
|
||||||
|
> "I'll fix it" [guesses wrong and breaks something else]
|
||||||
|
|
||||||
## Tool Usage Guidelines
|
## Tool Usage Guidelines
|
||||||
|
|
||||||
### When NOT to use tools:
|
### When NOT to use tools:
|
||||||
@@ -156,11 +172,16 @@ def get_system_prompt(tools_enabled: bool = False) -> str:
|
|||||||
tools_enabled: True if the model supports reliable tool calling.
|
tools_enabled: True if the model supports reliable tool calling.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
The system prompt string.
|
The system prompt string with model name injected from config.
|
||||||
"""
|
"""
|
||||||
|
from config import settings
|
||||||
|
|
||||||
|
model_name = settings.ollama_model
|
||||||
|
|
||||||
if tools_enabled:
|
if tools_enabled:
|
||||||
return TIMMY_SYSTEM_PROMPT_FULL
|
return TIMMY_SYSTEM_PROMPT_FULL.format(model_name=model_name)
|
||||||
return TIMMY_SYSTEM_PROMPT_LITE
|
return TIMMY_SYSTEM_PROMPT_LITE.format(model_name=model_name)
|
||||||
|
|
||||||
|
|
||||||
TIMMY_STATUS_PROMPT = """You are Timmy. Give a one-sentence status report confirming
|
TIMMY_STATUS_PROMPT = """You are Timmy. Give a one-sentence status report confirming
|
||||||
you are operational and running locally."""
|
you are operational and running locally."""
|
||||||
|
|||||||
@@ -3,6 +3,7 @@ from unittest.mock import AsyncMock, patch
|
|||||||
|
|
||||||
# ── Index ─────────────────────────────────────────────────────────────────────
|
# ── Index ─────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_index_returns_200(client):
|
def test_index_returns_200(client):
|
||||||
response = client.get("/")
|
response = client.get("/")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
@@ -16,13 +17,18 @@ def test_index_contains_title(client):
|
|||||||
def test_index_contains_chat_interface(client):
|
def test_index_contains_chat_interface(client):
|
||||||
response = client.get("/")
|
response = client.get("/")
|
||||||
# Timmy panel loads dynamically via HTMX; verify the trigger attribute is present
|
# Timmy panel loads dynamically via HTMX; verify the trigger attribute is present
|
||||||
assert "hx-get=\"/agents/timmy/panel\"" in response.text
|
assert 'hx-get="/agents/timmy/panel"' in response.text
|
||||||
|
|
||||||
|
|
||||||
# ── Health ────────────────────────────────────────────────────────────────────
|
# ── Health ────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_health_endpoint_ok(client):
|
def test_health_endpoint_ok(client):
|
||||||
with patch("dashboard.routes.health.check_ollama", new_callable=AsyncMock, return_value=True):
|
with patch(
|
||||||
|
"dashboard.routes.health.check_ollama",
|
||||||
|
new_callable=AsyncMock,
|
||||||
|
return_value=True,
|
||||||
|
):
|
||||||
response = client.get("/health")
|
response = client.get("/health")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
data = response.json()
|
data = response.json()
|
||||||
@@ -32,21 +38,33 @@ def test_health_endpoint_ok(client):
|
|||||||
|
|
||||||
|
|
||||||
def test_health_endpoint_ollama_down(client):
|
def test_health_endpoint_ollama_down(client):
|
||||||
with patch("dashboard.routes.health.check_ollama", new_callable=AsyncMock, return_value=False):
|
with patch(
|
||||||
|
"dashboard.routes.health.check_ollama",
|
||||||
|
new_callable=AsyncMock,
|
||||||
|
return_value=False,
|
||||||
|
):
|
||||||
response = client.get("/health")
|
response = client.get("/health")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert response.json()["services"]["ollama"] == "down"
|
assert response.json()["services"]["ollama"] == "down"
|
||||||
|
|
||||||
|
|
||||||
def test_health_status_panel_ollama_up(client):
|
def test_health_status_panel_ollama_up(client):
|
||||||
with patch("dashboard.routes.health.check_ollama", new_callable=AsyncMock, return_value=True):
|
with patch(
|
||||||
|
"dashboard.routes.health.check_ollama",
|
||||||
|
new_callable=AsyncMock,
|
||||||
|
return_value=True,
|
||||||
|
):
|
||||||
response = client.get("/health/status")
|
response = client.get("/health/status")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert "UP" in response.text
|
assert "UP" in response.text
|
||||||
|
|
||||||
|
|
||||||
def test_health_status_panel_ollama_down(client):
|
def test_health_status_panel_ollama_down(client):
|
||||||
with patch("dashboard.routes.health.check_ollama", new_callable=AsyncMock, return_value=False):
|
with patch(
|
||||||
|
"dashboard.routes.health.check_ollama",
|
||||||
|
new_callable=AsyncMock,
|
||||||
|
return_value=False,
|
||||||
|
):
|
||||||
response = client.get("/health/status")
|
response = client.get("/health/status")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert "DOWN" in response.text
|
assert "DOWN" in response.text
|
||||||
@@ -54,6 +72,7 @@ def test_health_status_panel_ollama_down(client):
|
|||||||
|
|
||||||
# ── Agents ────────────────────────────────────────────────────────────────────
|
# ── Agents ────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_agents_list(client):
|
def test_agents_list(client):
|
||||||
response = client.get("/agents")
|
response = client.get("/agents")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
@@ -67,14 +86,18 @@ def test_agents_list_timmy_metadata(client):
|
|||||||
response = client.get("/agents")
|
response = client.get("/agents")
|
||||||
timmy = next(a for a in response.json()["agents"] if a["id"] == "timmy")
|
timmy = next(a for a in response.json()["agents"] if a["id"] == "timmy")
|
||||||
assert timmy["name"] == "Timmy"
|
assert timmy["name"] == "Timmy"
|
||||||
assert timmy["model"] == "llama3.2"
|
assert timmy["model"] == "llama3.1:8b-instruct"
|
||||||
assert timmy["type"] == "sovereign"
|
assert timmy["type"] == "sovereign"
|
||||||
|
|
||||||
|
|
||||||
# ── Chat ──────────────────────────────────────────────────────────────────────
|
# ── Chat ──────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_chat_timmy_success(client):
|
def test_chat_timmy_success(client):
|
||||||
with patch("dashboard.routes.agents.timmy_chat", return_value="I am Timmy, operational and sovereign."):
|
with patch(
|
||||||
|
"dashboard.routes.agents.timmy_chat",
|
||||||
|
return_value="I am Timmy, operational and sovereign.",
|
||||||
|
):
|
||||||
response = client.post("/agents/timmy/chat", data={"message": "status?"})
|
response = client.post("/agents/timmy/chat", data={"message": "status?"})
|
||||||
|
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
@@ -90,7 +113,10 @@ def test_chat_timmy_shows_user_message(client):
|
|||||||
|
|
||||||
|
|
||||||
def test_chat_timmy_ollama_offline(client):
|
def test_chat_timmy_ollama_offline(client):
|
||||||
with patch("dashboard.routes.agents.timmy_chat", side_effect=Exception("connection refused")):
|
with patch(
|
||||||
|
"dashboard.routes.agents.timmy_chat",
|
||||||
|
side_effect=Exception("connection refused"),
|
||||||
|
):
|
||||||
response = client.post("/agents/timmy/chat", data={"message": "ping"})
|
response = client.post("/agents/timmy/chat", data={"message": "ping"})
|
||||||
|
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
@@ -105,6 +131,7 @@ def test_chat_timmy_requires_message(client):
|
|||||||
|
|
||||||
# ── History ────────────────────────────────────────────────────────────────────
|
# ── History ────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_history_empty_shows_init_message(client):
|
def test_history_empty_shows_init_message(client):
|
||||||
response = client.get("/agents/timmy/history")
|
response = client.get("/agents/timmy/history")
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
|
|||||||
@@ -20,6 +20,7 @@ from unittest.mock import AsyncMock, MagicMock, patch
|
|||||||
|
|
||||||
# ── helpers ───────────────────────────────────────────────────────────────────
|
# ── helpers ───────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def _css() -> str:
|
def _css() -> str:
|
||||||
"""Read the main stylesheet."""
|
"""Read the main stylesheet."""
|
||||||
css_path = Path(__file__).parent.parent.parent / "static" / "style.css"
|
css_path = Path(__file__).parent.parent.parent / "static" / "style.css"
|
||||||
@@ -37,6 +38,7 @@ def _timmy_panel_html(client) -> str:
|
|||||||
|
|
||||||
# ── M1xx — Viewport & meta tags ───────────────────────────────────────────────
|
# ── M1xx — Viewport & meta tags ───────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_M101_viewport_meta_present(client):
|
def test_M101_viewport_meta_present(client):
|
||||||
"""viewport meta tag must exist for correct mobile scaling."""
|
"""viewport meta tag must exist for correct mobile scaling."""
|
||||||
html = _index_html(client)
|
html = _index_html(client)
|
||||||
@@ -84,6 +86,7 @@ def test_M108_lang_attribute_on_html(client):
|
|||||||
|
|
||||||
# ── M2xx — Touch target sizing ────────────────────────────────────────────────
|
# ── M2xx — Touch target sizing ────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_M201_send_button_min_height_44px():
|
def test_M201_send_button_min_height_44px():
|
||||||
"""SEND button must be at least 44 × 44 px — Apple HIG minimum."""
|
"""SEND button must be at least 44 × 44 px — Apple HIG minimum."""
|
||||||
css = _css()
|
css = _css()
|
||||||
@@ -111,6 +114,7 @@ def test_M204_touch_action_manipulation_on_buttons():
|
|||||||
|
|
||||||
# ── M3xx — iOS keyboard & zoom prevention ─────────────────────────────────────
|
# ── M3xx — iOS keyboard & zoom prevention ─────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_M301_input_font_size_16px_in_mobile_query():
|
def test_M301_input_font_size_16px_in_mobile_query():
|
||||||
"""iOS Safari zooms in when input font-size < 16px. Must be exactly 16px."""
|
"""iOS Safari zooms in when input font-size < 16px. Must be exactly 16px."""
|
||||||
css = _css()
|
css = _css()
|
||||||
@@ -149,6 +153,7 @@ def test_M305_input_spellcheck_false(client):
|
|||||||
|
|
||||||
# ── M4xx — HTMX robustness ────────────────────────────────────────────────────
|
# ── M4xx — HTMX robustness ────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_M401_form_hx_sync_drop(client):
|
def test_M401_form_hx_sync_drop(client):
|
||||||
"""hx-sync=this:drop discards duplicate submissions (fast double-tap)."""
|
"""hx-sync=this:drop discards duplicate submissions (fast double-tap)."""
|
||||||
html = _timmy_panel_html(client)
|
html = _timmy_panel_html(client)
|
||||||
@@ -181,6 +186,7 @@ def test_M405_chat_log_loads_history_on_boot(client):
|
|||||||
|
|
||||||
# ── M5xx — Safe-area / notch support ─────────────────────────────────────────
|
# ── M5xx — Safe-area / notch support ─────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_M501_safe_area_inset_top_in_header():
|
def test_M501_safe_area_inset_top_in_header():
|
||||||
"""Header padding must accommodate the iPhone notch / status bar."""
|
"""Header padding must accommodate the iPhone notch / status bar."""
|
||||||
css = _css()
|
css = _css()
|
||||||
@@ -213,9 +219,11 @@ def test_M505_dvh_units_used():
|
|||||||
|
|
||||||
# ── M6xx — AirLLM backend interface contract ──────────────────────────────────
|
# ── M6xx — AirLLM backend interface contract ──────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def test_M601_airllm_agent_has_run_method():
|
def test_M601_airllm_agent_has_run_method():
|
||||||
"""TimmyAirLLMAgent must expose run() so the dashboard route can call it."""
|
"""TimmyAirLLMAgent must expose run() so the dashboard route can call it."""
|
||||||
from timmy.backends import TimmyAirLLMAgent
|
from timmy.backends import TimmyAirLLMAgent
|
||||||
|
|
||||||
assert hasattr(TimmyAirLLMAgent, "run"), (
|
assert hasattr(TimmyAirLLMAgent, "run"), (
|
||||||
"TimmyAirLLMAgent is missing run() — dashboard will fail with AirLLM backend"
|
"TimmyAirLLMAgent is missing run() — dashboard will fail with AirLLM backend"
|
||||||
)
|
)
|
||||||
@@ -225,6 +233,7 @@ def test_M602_airllm_run_returns_content_attribute():
|
|||||||
"""run() must return an object with a .content attribute (Agno RunResponse compat)."""
|
"""run() must return an object with a .content attribute (Agno RunResponse compat)."""
|
||||||
with patch("timmy.backends.is_apple_silicon", return_value=False):
|
with patch("timmy.backends.is_apple_silicon", return_value=False):
|
||||||
from timmy.backends import TimmyAirLLMAgent
|
from timmy.backends import TimmyAirLLMAgent
|
||||||
|
|
||||||
agent = TimmyAirLLMAgent(model_size="8b")
|
agent = TimmyAirLLMAgent(model_size="8b")
|
||||||
|
|
||||||
mock_model = MagicMock()
|
mock_model = MagicMock()
|
||||||
@@ -246,6 +255,7 @@ def test_M603_airllm_run_updates_history():
|
|||||||
"""run() must update _history so multi-turn context is preserved."""
|
"""run() must update _history so multi-turn context is preserved."""
|
||||||
with patch("timmy.backends.is_apple_silicon", return_value=False):
|
with patch("timmy.backends.is_apple_silicon", return_value=False):
|
||||||
from timmy.backends import TimmyAirLLMAgent
|
from timmy.backends import TimmyAirLLMAgent
|
||||||
|
|
||||||
agent = TimmyAirLLMAgent(model_size="8b")
|
agent = TimmyAirLLMAgent(model_size="8b")
|
||||||
|
|
||||||
mock_model = MagicMock()
|
mock_model = MagicMock()
|
||||||
@@ -268,10 +278,13 @@ def test_M604_airllm_print_response_delegates_to_run():
|
|||||||
"""print_response must use run() so both interfaces share one inference path."""
|
"""print_response must use run() so both interfaces share one inference path."""
|
||||||
with patch("timmy.backends.is_apple_silicon", return_value=False):
|
with patch("timmy.backends.is_apple_silicon", return_value=False):
|
||||||
from timmy.backends import TimmyAirLLMAgent, RunResult
|
from timmy.backends import TimmyAirLLMAgent, RunResult
|
||||||
|
|
||||||
agent = TimmyAirLLMAgent(model_size="8b")
|
agent = TimmyAirLLMAgent(model_size="8b")
|
||||||
|
|
||||||
with patch.object(agent, "run", return_value=RunResult(content="ok")) as mock_run, \
|
with (
|
||||||
patch.object(agent, "_render"):
|
patch.object(agent, "run", return_value=RunResult(content="ok")) as mock_run,
|
||||||
|
patch.object(agent, "_render"),
|
||||||
|
):
|
||||||
agent.print_response("hello", stream=True)
|
agent.print_response("hello", stream=True)
|
||||||
|
|
||||||
mock_run.assert_called_once_with("hello", stream=True)
|
mock_run.assert_called_once_with("hello", stream=True)
|
||||||
@@ -279,24 +292,43 @@ def test_M604_airllm_print_response_delegates_to_run():
|
|||||||
|
|
||||||
def test_M605_health_status_passes_model_to_template(client):
|
def test_M605_health_status_passes_model_to_template(client):
|
||||||
"""Health status partial must receive the configured model name, not a hardcoded string."""
|
"""Health status partial must receive the configured model name, not a hardcoded string."""
|
||||||
with patch("dashboard.routes.health.check_ollama", new_callable=AsyncMock, return_value=True):
|
with patch(
|
||||||
|
"dashboard.routes.health.check_ollama",
|
||||||
|
new_callable=AsyncMock,
|
||||||
|
return_value=True,
|
||||||
|
):
|
||||||
response = client.get("/health/status")
|
response = client.get("/health/status")
|
||||||
# The default model is llama3.2 — it should appear in the partial from settings, not hardcoded
|
# The default model is llama3.1:8b-instruct — it should appear from settings
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert "llama3.2" in response.text # rendered via template variable, not hardcoded literal
|
assert (
|
||||||
|
"llama3.1" in response.text
|
||||||
|
) # rendered via template variable, not hardcoded literal
|
||||||
|
|
||||||
|
|
||||||
# ── M7xx — XSS prevention ─────────────────────────────────────────────────────
|
# ── M7xx — XSS prevention ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
def _mobile_html() -> str:
|
def _mobile_html() -> str:
|
||||||
"""Read the mobile template source."""
|
"""Read the mobile template source."""
|
||||||
path = Path(__file__).parent.parent.parent / "src" / "dashboard" / "templates" / "mobile.html"
|
path = (
|
||||||
|
Path(__file__).parent.parent.parent
|
||||||
|
/ "src"
|
||||||
|
/ "dashboard"
|
||||||
|
/ "templates"
|
||||||
|
/ "mobile.html"
|
||||||
|
)
|
||||||
return path.read_text()
|
return path.read_text()
|
||||||
|
|
||||||
|
|
||||||
def _swarm_live_html() -> str:
|
def _swarm_live_html() -> str:
|
||||||
"""Read the swarm live template source."""
|
"""Read the swarm live template source."""
|
||||||
path = Path(__file__).parent.parent.parent / "src" / "dashboard" / "templates" / "swarm_live.html"
|
path = (
|
||||||
|
Path(__file__).parent.parent.parent
|
||||||
|
/ "src"
|
||||||
|
/ "dashboard"
|
||||||
|
/ "templates"
|
||||||
|
/ "swarm_live.html"
|
||||||
|
)
|
||||||
return path.read_text()
|
return path.read_text()
|
||||||
|
|
||||||
|
|
||||||
@@ -324,7 +356,9 @@ def test_M702_mobile_chat_user_input_not_in_innerhtml_template_literal():
|
|||||||
def test_M703_swarm_live_agent_name_not_interpolated_in_innerhtml():
|
def test_M703_swarm_live_agent_name_not_interpolated_in_innerhtml():
|
||||||
"""swarm_live.html must not put ${agent.name} inside innerHTML template literals."""
|
"""swarm_live.html must not put ${agent.name} inside innerHTML template literals."""
|
||||||
html = _swarm_live_html()
|
html = _swarm_live_html()
|
||||||
blocks = re.findall(r"innerHTML\s*=\s*agents\.map\([^;]+\)\.join\([^)]*\)", html, re.DOTALL)
|
blocks = re.findall(
|
||||||
|
r"innerHTML\s*=\s*agents\.map\([^;]+\)\.join\([^)]*\)", html, re.DOTALL
|
||||||
|
)
|
||||||
assert len(blocks) == 0, (
|
assert len(blocks) == 0, (
|
||||||
"swarm_live.html still uses innerHTML=agents.map(…) with interpolated agent data — XSS vulnerability"
|
"swarm_live.html still uses innerHTML=agents.map(…) with interpolated agent data — XSS vulnerability"
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
from timmy.prompts import TIMMY_SYSTEM_PROMPT, TIMMY_STATUS_PROMPT
|
from timmy.prompts import TIMMY_SYSTEM_PROMPT, TIMMY_STATUS_PROMPT, get_system_prompt
|
||||||
|
|
||||||
|
|
||||||
def test_system_prompt_not_empty():
|
def test_system_prompt_not_empty():
|
||||||
@@ -31,3 +31,10 @@ def test_status_prompt_has_timmy():
|
|||||||
|
|
||||||
def test_prompts_are_distinct():
|
def test_prompts_are_distinct():
|
||||||
assert TIMMY_SYSTEM_PROMPT != TIMMY_STATUS_PROMPT
|
assert TIMMY_SYSTEM_PROMPT != TIMMY_STATUS_PROMPT
|
||||||
|
|
||||||
|
|
||||||
|
def test_get_system_prompt_injects_model_name():
|
||||||
|
"""System prompt should inject actual model name from config."""
|
||||||
|
prompt = get_system_prompt(tools_enabled=False)
|
||||||
|
# Should contain the model name from settings, not hardcoded
|
||||||
|
assert "llama3.1" in prompt or "qwen" in prompt or "{model_name}" in prompt
|
||||||
|
|||||||
Reference in New Issue
Block a user