Compare commits

..

1 Commit

Author SHA1 Message Date
Alexander Whitestone
845070f70b test: add unit tests for chat_store.py (#1192)
Some checks failed
Tests / lint (pull_request) Failing after 15s
Tests / test (pull_request) Has been skipped
2026-03-23 18:00:08 -04:00
32 changed files with 761 additions and 4564 deletions

View File

@@ -1,105 +0,0 @@
# Nexus — Scope & Acceptance Criteria
**Issue:** #1208
**Date:** 2026-03-23
**Status:** Initial implementation complete; teaching/RL harness deferred
---
## Summary
The **Nexus** is a persistent conversational space where Timmy lives with full
access to his live memory. Unlike the main dashboard chat (which uses tools and
has a transient feel), the Nexus is:
- **Conversational only** — no tool approval flow; pure dialogue
- **Memory-aware** — semantically relevant memories surface alongside each exchange
- **Teachable** — the operator can inject facts directly into Timmy's live memory
- **Persistent** — the session survives page refreshes; history accumulates over time
- **Local** — always backed by Ollama; no cloud inference required
This is the foundation for future LoRA fine-tuning, RL training harnesses, and
eventually real-time self-improvement loops.
---
## Scope (v1 — this PR)
| Area | Included | Deferred |
|------|----------|----------|
| Conversational UI | ✅ Chat panel via HTMX | Streaming tokens |
| Live memory sidebar | ✅ Semantic search on each turn | Auto-refresh on teach |
| Teaching panel | ✅ Inject personal facts | Bulk import, LoRA trigger |
| Session isolation | ✅ Dedicated `nexus` session ID | Per-operator sessions |
| Nav integration | ✅ NEXUS link in INTEL dropdown | Mobile nav |
| CSS/styling | ✅ Two-column responsive layout | Dark/light theme toggle |
| Tests | ✅ 9 unit tests, all green | E2E with real Ollama |
| LoRA / RL harness | ❌ deferred to future issue | |
| Auto-falsework | ❌ deferred | |
| Bannerlord interface | ❌ separate track | |
---
## Acceptance Criteria
### AC-1: Nexus page loads
- **Given** the dashboard is running
- **When** I navigate to `/nexus`
- **Then** I see a two-panel layout: conversation on the left, memory sidebar on the right
- **And** the page title reads "// NEXUS"
- **And** the page is accessible from the nav (INTEL → NEXUS)
### AC-2: Conversation-only chat
- **Given** I am on the Nexus page
- **When** I type a message and submit
- **Then** Timmy responds using the `nexus` session (isolated from dashboard history)
- **And** no tool-approval cards appear — responses are pure text
- **And** my message and Timmy's reply are appended to the chat log
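A minimal test sketch for this criterion, assuming the FastAPI app is exposed as `dashboard.main:app` and that pytest's `monkeypatch` can stub the route module's `chat` and `search_memories` imports (module paths are assumptions, not confirmed by this PR):

```python
from fastapi.testclient import TestClient

from dashboard.main import app  # hypothetical app module path


def test_nexus_chat_is_conversational(monkeypatch):
    async def fake_chat(message, session_id=None):
        assert session_id == "nexus"  # AC-2: isolated nexus session
        return "Hello from Timmy"

    # Stub the LLM call and memory search so no Ollama or vector store is needed.
    monkeypatch.setattr("dashboard.routes.nexus.chat", fake_chat)
    monkeypatch.setattr("dashboard.routes.nexus.search_memories", lambda **kw: [])

    resp = TestClient(app).post("/nexus/chat", data={"message": "hi"})
    assert resp.status_code == 200
    assert "Hello from Timmy" in resp.text  # pure text reply, no tool-approval card
```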
### AC-3: Memory context surfaces automatically
- **Given** I send a message
- **When** the response arrives
- **Then** the "LIVE MEMORY CONTEXT" panel shows up to 4 semantically relevant memories
- **And** each memory entry shows its type and content
### AC-4: Teaching panel stores facts
- **Given** I type a fact into the "TEACH TIMMY" input and submit
- **When** the request completes
- **Then** I see a green confirmation "✓ Taught: <fact>"
- **And** the fact appears in the "KNOWN FACTS" list
- **And** the fact is stored in Timmy's live memory (`store_personal_fact`)
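A comparable sketch for the teaching flow, again with the memory layer stubbed (same assumed module paths as above):

```python
from fastapi.testclient import TestClient

from dashboard.main import app  # hypothetical app module path


def test_nexus_teach_stores_fact(monkeypatch):
    stored = []
    monkeypatch.setattr("dashboard.routes.nexus.store_personal_fact", stored.append)
    monkeypatch.setattr(
        "dashboard.routes.nexus.recall_personal_facts_with_ids",
        lambda: [{"content": "I prefer dark themes"}],
    )

    resp = TestClient(app).post("/nexus/teach", data={"fact": "I prefer dark themes"})
    assert resp.status_code == 200
    assert stored == ["I prefer dark themes"]  # persisted via store_personal_fact
    assert "Taught" in resp.text               # confirmation rendered by the partial
```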
### AC-5: Empty / invalid input is rejected gracefully
- **Given** I submit a blank message or fact
- **Then** no request is made and the log is unchanged
- **Given** I submit a message over 10 000 characters
- **Then** an inline error is shown without crashing the server
### AC-6: Conversation can be cleared
- **Given** the Nexus has conversation history
- **When** I click CLEAR and confirm
- **Then** the chat log shows only a "cleared" confirmation
- **And** the Agno session for `nexus` is reset
### AC-7: Graceful degradation when Ollama is down
- **Given** Ollama is unavailable
- **When** I send a message
- **Then** an error message is shown inline (not a 500 page)
- **And** the app continues to function
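A sketch of the degradation path, forcing the stubbed `chat` call to fail (assumptions as above):

```python
from fastapi.testclient import TestClient

from dashboard.main import app  # hypothetical app module path


def test_nexus_chat_degrades_when_ollama_down(monkeypatch):
    async def broken_chat(message, session_id=None):
        raise ConnectionError("Ollama unreachable")

    monkeypatch.setattr("dashboard.routes.nexus.chat", broken_chat)
    monkeypatch.setattr("dashboard.routes.nexus.search_memories", lambda **kw: [])

    resp = TestClient(app).post("/nexus/chat", data={"message": "hi"})
    assert resp.status_code == 200                 # inline error, not a 500 page
    assert "unavailable" in resp.text.lower()
```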
### AC-8: No regression on existing tests
- **Given** the nexus route is registered
- **When** `tox -e unit` runs
- **Then** all 343+ existing tests remain green
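A lightweight registration guard that could sit alongside the existing suite (the app import path is an assumption):

```python
from dashboard.main import app  # hypothetical app module path


def test_nexus_routes_registered():
    paths = {getattr(route, "path", "") for route in app.routes}
    assert "/nexus" in paths
    assert "/nexus/chat" in paths
```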
---
## Future Work (separate issues)
1. **LoRA trigger** — button in the teaching panel to queue a fine-tuning run
using the current Nexus conversation as training data
2. **RL harness** — reward signal collection during conversation for RLHF
3. **Auto-falsework pipeline** — scaffold harness generation from conversation
4. **Bannerlord interface** — Nexus as the live-memory bridge for in-game Timmy
5. **Streaming responses** — token-by-token display via WebSocket
6. **Per-operator sessions** — isolate Nexus history by logged-in user

View File

@@ -300,14 +300,6 @@ class Settings(BaseSettings):
thinking_memory_check_every: int = 50 # check memory status every Nth thought
thinking_idle_timeout_minutes: int = 60 # pause thoughts after N minutes without user input
# ── Dreaming Mode ─────────────────────────────────────────────────
# When enabled, the agent replays past sessions during idle time to
# simulate alternative actions and propose behavioural rules.
dreaming_enabled: bool = True
dreaming_idle_threshold_minutes: int = 10 # idle minutes before dreaming starts
dreaming_cycle_seconds: int = 600 # seconds between dream attempts
dreaming_timeout_seconds: int = 60 # max LLM call time per dream cycle
# ── Gitea Integration ─────────────────────────────────────────────
# Local Gitea instance for issue tracking and self-improvement.
# These values are passed as env vars to the gitea-mcp server process.
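For reference, a sketch of how the dreaming settings shown in the hunk above could be overridden per environment, assuming standard pydantic BaseSettings behaviour (field name maps to env var, no env prefix configured):

import os

# Must be set before `from config import settings` runs (e.g. in the shell or a
# test conftest), since the settings object is built at import time.
os.environ["DREAMING_ENABLED"] = "false"              # turn idle-time dreaming off
os.environ["DREAMING_IDLE_THRESHOLD_MINUTES"] = "30"  # wait longer before dreaming starts
os.environ["DREAMING_CYCLE_SECONDS"] = "1200"         # attempt a dream at most every 20 min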

View File

@@ -42,7 +42,6 @@ from dashboard.routes.hermes import router as hermes_router
from dashboard.routes.loop_qa import router as loop_qa_router
from dashboard.routes.memory import router as memory_router
from dashboard.routes.mobile import router as mobile_router
from dashboard.routes.nexus import router as nexus_router
from dashboard.routes.models import api_router as models_api_router
from dashboard.routes.models import router as models_router
from dashboard.routes.quests import router as quests_router
@@ -57,7 +56,6 @@ from dashboard.routes.thinking import router as thinking_router
from dashboard.routes.tools import router as tools_router
from dashboard.routes.tower import router as tower_router
from dashboard.routes.voice import router as voice_router
from dashboard.routes.dreaming import router as dreaming_router
from dashboard.routes.work_orders import router as work_orders_router
from dashboard.routes.world import matrix_router
from dashboard.routes.world import router as world_router
@@ -250,36 +248,6 @@ async def _loop_qa_scheduler() -> None:
await asyncio.sleep(interval)
async def _dreaming_scheduler() -> None:
"""Background task: run idle-time dreaming cycles.
When the system has been idle for ``dreaming_idle_threshold_minutes``,
the dreaming engine replays a past session and simulates alternatives.
"""
from timmy.dreaming import dreaming_engine
await asyncio.sleep(15) # Stagger after loop QA scheduler
while True:
try:
if settings.dreaming_enabled:
await asyncio.wait_for(
dreaming_engine.dream_once(),
timeout=settings.dreaming_timeout_seconds + 10,
)
except TimeoutError:
logger.warning(
"Dreaming cycle timed out after %ds",
settings.dreaming_timeout_seconds,
)
except asyncio.CancelledError:
raise
except Exception as exc:
logger.error("Dreaming scheduler error: %s", exc)
await asyncio.sleep(settings.dreaming_cycle_seconds)
_PRESENCE_POLL_SECONDS = 30
_PRESENCE_INITIAL_DELAY = 3
@@ -440,7 +408,6 @@ def _startup_background_tasks() -> list[asyncio.Task]:
asyncio.create_task(_briefing_scheduler()),
asyncio.create_task(_thinking_scheduler()),
asyncio.create_task(_loop_qa_scheduler()),
asyncio.create_task(_dreaming_scheduler()),
asyncio.create_task(_presence_watcher()),
asyncio.create_task(_start_chat_integrations_background()),
asyncio.create_task(_hermes_scheduler()),
@@ -685,7 +652,6 @@ app.include_router(tools_router)
app.include_router(spark_router)
app.include_router(discord_router)
app.include_router(memory_router)
app.include_router(nexus_router)
app.include_router(grok_router)
app.include_router(models_router)
app.include_router(models_api_router)
@@ -708,7 +674,6 @@ app.include_router(quests_router)
app.include_router(scorecards_router)
app.include_router(sovereignty_metrics_router)
app.include_router(sovereignty_ws_router)
app.include_router(dreaming_router)
@app.websocket("/ws")

View File

@@ -1,84 +0,0 @@
"""Dreaming mode dashboard routes.
GET /dreaming/api/status — JSON status of the dreaming engine
GET /dreaming/api/recent — JSON list of recent dream records
POST /dreaming/api/trigger — Manually trigger a dream cycle (for testing)
GET /dreaming/partial — HTMX partial: dreaming status panel
"""
import logging
from fastapi import APIRouter, Request
from fastapi.responses import HTMLResponse, JSONResponse
from dashboard.templating import templates
from timmy.dreaming import dreaming_engine
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/dreaming", tags=["dreaming"])
@router.get("/api/status", response_class=JSONResponse)
async def dreaming_status():
"""Return current dreaming engine status as JSON."""
return dreaming_engine.get_status()
@router.get("/api/recent", response_class=JSONResponse)
async def dreaming_recent(limit: int = 10):
"""Return recent dream records as JSON."""
dreams = dreaming_engine.get_recent_dreams(limit=limit)
return [
{
"id": d.id,
"session_excerpt": d.session_excerpt[:200],
"decision_point": d.decision_point[:200],
"simulation": d.simulation,
"proposed_rule": d.proposed_rule,
"created_at": d.created_at,
}
for d in dreams
]
@router.post("/api/trigger", response_class=JSONResponse)
async def dreaming_trigger():
"""Manually trigger a dream cycle (bypasses idle check).
Useful for testing and manual inspection. Forces idle state temporarily.
"""
from datetime import UTC, datetime, timedelta
from config import settings
# Temporarily back-date last activity to appear idle
original_time = dreaming_engine._last_activity_time
dreaming_engine._last_activity_time = datetime.now(UTC) - timedelta(
minutes=settings.dreaming_idle_threshold_minutes + 1
)
try:
dream = await dreaming_engine.dream_once()
finally:
dreaming_engine._last_activity_time = original_time
if dream:
return {
"status": "ok",
"dream_id": dream.id,
"proposed_rule": dream.proposed_rule,
"simulation": dream.simulation[:200],
}
return {"status": "skipped", "reason": "No dream produced (no sessions or LLM unavailable)"}
@router.get("/partial", response_class=HTMLResponse)
async def dreaming_partial(request: Request):
"""HTMX partial: dreaming status panel for the dashboard."""
status = dreaming_engine.get_status()
recent = dreaming_engine.get_recent_dreams(limit=5)
return templates.TemplateResponse(
request,
"partials/dreaming_status.html",
{"status": status, "recent_dreams": recent},
)
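For manual inspection, the endpoints defined above can be exercised directly; a sketch assuming the dashboard is reachable at http://localhost:8000 (host and port are assumptions) and that httpx is installed:

import httpx

BASE = "http://localhost:8000"  # assumed local dashboard address

# Current engine state: enabled / idle / dreaming flags and counters
print(httpx.get(f"{BASE}/dreaming/api/status").json())

# Force a dream cycle (the route temporarily back-dates the idle timer)
result = httpx.post(f"{BASE}/dreaming/api/trigger", timeout=120).json()
if result["status"] == "ok":
    print("proposed rule:", result["proposed_rule"])

# Most recent dream records
for dream in httpx.get(f"{BASE}/dreaming/api/recent", params={"limit": 5}).json():
    print(dream["created_at"], dream["proposed_rule"])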

View File

@@ -1,168 +0,0 @@
"""Nexus — Timmy's persistent conversational awareness space.
A conversational-only interface where Timmy maintains live memory context.
No tool use; pure conversation with memory integration and a teaching panel.
Routes:
GET /nexus — render nexus page with live memory sidebar
POST /nexus/chat — send a message; returns HTMX partial
POST /nexus/teach — inject a fact into Timmy's live memory
DELETE /nexus/history — clear the nexus conversation history
"""
import asyncio
import logging
from datetime import datetime, timezone
from fastapi import APIRouter, Form, Request
from fastapi.responses import HTMLResponse
from dashboard.templating import templates
from timmy.memory_system import (
get_memory_stats,
recall_personal_facts_with_ids,
search_memories,
store_personal_fact,
)
from timmy.session import _clean_response, chat, reset_session
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/nexus", tags=["nexus"])
_NEXUS_SESSION_ID = "nexus"
_MAX_MESSAGE_LENGTH = 10_000
# In-memory conversation log for the Nexus session (mirrors chat store pattern
# but is scoped to the Nexus so it won't pollute the main dashboard history).
_nexus_log: list[dict] = []
def _ts() -> str:
return datetime.now(timezone.utc).strftime("%H:%M:%S")
def _append_log(role: str, content: str) -> None:
_nexus_log.append({"role": role, "content": content, "timestamp": _ts()})
# Keep last 200 exchanges to bound memory usage
if len(_nexus_log) > 200:
del _nexus_log[:-200]
@router.get("", response_class=HTMLResponse)
async def nexus_page(request: Request):
"""Render the Nexus page with live memory context."""
stats = get_memory_stats()
facts = recall_personal_facts_with_ids()[:8]
return templates.TemplateResponse(
request,
"nexus.html",
{
"page_title": "Nexus",
"messages": list(_nexus_log),
"stats": stats,
"facts": facts,
},
)
@router.post("/chat", response_class=HTMLResponse)
async def nexus_chat(request: Request, message: str = Form(...)):
"""Conversational-only chat routed through the Nexus session.
Does not invoke tool-use approval flow — pure conversation with memory
context injected from Timmy's live memory store.
"""
message = message.strip()
if not message:
return HTMLResponse("")
if len(message) > _MAX_MESSAGE_LENGTH:
return templates.TemplateResponse(
request,
"partials/nexus_message.html",
{
"user_message": message[:80] + "",
"response": None,
"error": "Message too long (max 10 000 chars).",
"timestamp": _ts(),
"memory_hits": [],
},
)
ts = _ts()
# Fetch semantically relevant memories to surface in the sidebar
try:
memory_hits = await asyncio.to_thread(
search_memories, query=message, limit=4
)
except Exception as exc:
logger.warning("Nexus memory search failed: %s", exc)
memory_hits = []
# Conversational response — no tool approval flow
response_text: str | None = None
error_text: str | None = None
try:
raw = await chat(message, session_id=_NEXUS_SESSION_ID)
response_text = _clean_response(raw)
except Exception as exc:
logger.error("Nexus chat error: %s", exc)
error_text = "Timmy is unavailable right now. Check that Ollama is running."
_append_log("user", message)
if response_text:
_append_log("assistant", response_text)
return templates.TemplateResponse(
request,
"partials/nexus_message.html",
{
"user_message": message,
"response": response_text,
"error": error_text,
"timestamp": ts,
"memory_hits": memory_hits,
},
)
@router.post("/teach", response_class=HTMLResponse)
async def nexus_teach(request: Request, fact: str = Form(...)):
"""Inject a fact into Timmy's live memory from the Nexus teaching panel."""
fact = fact.strip()
if not fact:
return HTMLResponse("")
try:
await asyncio.to_thread(store_personal_fact, fact)
facts = await asyncio.to_thread(recall_personal_facts_with_ids)
facts = facts[:8]
except Exception as exc:
logger.error("Nexus teach error: %s", exc)
facts = []
return templates.TemplateResponse(
request,
"partials/nexus_facts.html",
{"facts": facts, "taught": fact},
)
@router.delete("/history", response_class=HTMLResponse)
async def nexus_clear_history(request: Request):
"""Clear the Nexus conversation history."""
_nexus_log.clear()
reset_session(session_id=_NEXUS_SESSION_ID)
return templates.TemplateResponse(
request,
"partials/nexus_message.html",
{
"user_message": None,
"response": "Nexus conversation cleared.",
"error": None,
"timestamp": _ts(),
"memory_hits": [],
},
)

View File

@@ -67,7 +67,6 @@
<div class="mc-nav-dropdown">
<button class="mc-test-link mc-dropdown-toggle" aria-expanded="false">INTEL &#x25BE;</button>
<div class="mc-dropdown-menu">
<a href="/nexus" class="mc-test-link">NEXUS</a>
<a href="/spark/ui" class="mc-test-link">SPARK</a>
<a href="/memory" class="mc-test-link">MEMORY</a>
<a href="/marketplace/ui" class="mc-test-link">MARKET</a>

View File

@@ -1,122 +0,0 @@
{% extends "base.html" %}
{% block title %}Nexus{% endblock %}
{% block extra_styles %}{% endblock %}
{% block content %}
<div class="container-fluid nexus-layout py-3">
<div class="nexus-header mb-3">
<div class="nexus-title">// NEXUS</div>
<div class="nexus-subtitle">
Persistent conversational awareness &mdash; always present, always learning.
</div>
</div>
<div class="nexus-grid">
<!-- ── LEFT: Conversation ────────────────────────────────── -->
<div class="nexus-chat-col">
<div class="card mc-panel nexus-chat-panel">
<div class="card-header mc-panel-header d-flex justify-content-between align-items-center">
<span>// CONVERSATION</span>
<button class="mc-btn mc-btn-sm"
hx-delete="/nexus/history"
hx-target="#nexus-chat-log"
hx-swap="beforeend"
hx-confirm="Clear nexus conversation?">
CLEAR
</button>
</div>
<div class="card-body p-2" id="nexus-chat-log">
{% for msg in messages %}
<div class="chat-message {{ 'user' if msg.role == 'user' else 'agent' }}">
<div class="msg-meta">
{{ 'YOU' if msg.role == 'user' else 'TIMMY' }} // {{ msg.timestamp }}
</div>
<div class="msg-body {% if msg.role == 'assistant' %}timmy-md{% endif %}">
{{ msg.content | e }}
</div>
</div>
{% else %}
<div class="nexus-empty-state">
Nexus is ready. Start a conversation — memories will surface in real time.
</div>
{% endfor %}
</div>
<div class="card-footer p-2">
<form hx-post="/nexus/chat"
hx-target="#nexus-chat-log"
hx-swap="beforeend"
hx-on::after-request="this.reset(); document.getElementById('nexus-chat-log').scrollTop = 999999;">
<div class="d-flex gap-2">
<input type="text"
name="message"
id="nexus-input"
class="mc-search-input flex-grow-1"
placeholder="Talk to Timmy..."
autocomplete="off"
required>
<button type="submit" class="mc-btn mc-btn-primary">SEND</button>
</div>
</form>
</div>
</div>
</div>
<!-- ── RIGHT: Memory sidebar ─────────────────────────────── -->
<div class="nexus-sidebar-col">
<!-- Live memory context (updated with each response) -->
<div class="card mc-panel nexus-memory-panel mb-3">
<div class="card-header mc-panel-header">
<span>// LIVE MEMORY</span>
<span class="badge ms-2" style="background:var(--purple-dim); color:var(--purple);">
{{ stats.total_entries }} stored
</span>
</div>
<div class="card-body p-2">
<div id="nexus-memory-panel" class="nexus-memory-hits">
<div class="nexus-memory-label">Relevant memories appear here as you chat.</div>
</div>
</div>
</div>
<!-- Teaching panel -->
<div class="card mc-panel nexus-teach-panel">
<div class="card-header mc-panel-header">// TEACH TIMMY</div>
<div class="card-body p-2">
<form hx-post="/nexus/teach"
hx-target="#nexus-teach-response"
hx-swap="innerHTML"
hx-on::after-request="this.reset()">
<div class="d-flex gap-2 mb-2">
<input type="text"
name="fact"
class="mc-search-input flex-grow-1"
placeholder="e.g. I prefer dark themes"
required>
<button type="submit" class="mc-btn mc-btn-primary">TEACH</button>
</div>
</form>
<div id="nexus-teach-response"></div>
<div class="nexus-facts-header mt-3">// KNOWN FACTS</div>
<ul class="nexus-facts-list" id="nexus-facts-list">
{% for fact in facts %}
<li class="nexus-fact-item">{{ fact.content | e }}</li>
{% else %}
<li class="nexus-fact-empty">No personal facts stored yet.</li>
{% endfor %}
</ul>
</div>
</div>
</div><!-- /sidebar -->
</div><!-- /nexus-grid -->
</div>
{% endblock %}

View File

@@ -1,32 +0,0 @@
{% if not status.enabled %}
<div class="dream-disabled text-muted small">Dreaming mode disabled</div>
{% elif status.dreaming %}
<div class="dream-active">
<span class="dream-pulse"></span>
<span class="dream-label">DREAMING</span>
<div class="dream-summary">{{ status.current_summary }}</div>
</div>
{% elif status.idle %}
<div class="dream-idle">
<span class="dream-dot dream-dot-idle"></span>
<span class="dream-label-idle">IDLE</span>
<span class="dream-idle-meta">{{ status.idle_minutes }}m — dream cycle pending</span>
</div>
{% else %}
<div class="dream-standby">
<span class="dream-dot dream-dot-standby"></span>
<span class="dream-label-standby">STANDBY</span>
<span class="dream-idle-meta">idle in {{ status.idle_threshold_minutes - status.idle_minutes }}m</span>
</div>
{% endif %}
{% if recent_dreams %}
<div class="dream-history mt-2">
{% for d in recent_dreams %}
<div class="dream-record">
<div class="dream-rule">{{ d.proposed_rule if d.proposed_rule else "No rule extracted" }}</div>
<div class="dream-meta">{{ d.created_at[:16] | replace("T", " ") }}</div>
</div>
{% endfor %}
</div>
{% endif %}

View File

@@ -1,12 +0,0 @@
{% if taught %}
<div class="nexus-taught-confirm">
✓ Taught: <em>{{ taught | e }}</em>
</div>
{% endif %}
<ul class="nexus-facts-list" id="nexus-facts-list" hx-swap-oob="true">
{% for fact in facts %}
<li class="nexus-fact-item">{{ fact.content | e }}</li>
{% else %}
<li class="nexus-fact-empty">No facts stored yet.</li>
{% endfor %}
</ul>

View File

@@ -1,36 +0,0 @@
{% if user_message %}
<div class="chat-message user">
<div class="msg-meta">YOU // {{ timestamp }}</div>
<div class="msg-body">{{ user_message | e }}</div>
</div>
{% endif %}
{% if response %}
<div class="chat-message agent">
<div class="msg-meta">TIMMY // {{ timestamp }}</div>
<div class="msg-body timmy-md">{{ response | e }}</div>
</div>
<script>
(function() {
var el = document.currentScript.previousElementSibling.querySelector('.timmy-md');
if (el && typeof marked !== 'undefined' && typeof DOMPurify !== 'undefined') {
el.innerHTML = DOMPurify.sanitize(marked.parse(el.textContent));
}
})();
</script>
{% elif error %}
<div class="chat-message error-msg">
<div class="msg-meta">SYSTEM // {{ timestamp }}</div>
<div class="msg-body">{{ error | e }}</div>
</div>
{% endif %}
{% if memory_hits %}
<div class="nexus-memory-hits" id="nexus-memory-panel" hx-swap-oob="true">
<div class="nexus-memory-label">// LIVE MEMORY CONTEXT</div>
{% for hit in memory_hits %}
<div class="nexus-memory-hit">
<span class="nexus-memory-type">{{ hit.memory_type }}</span>
<span class="nexus-memory-content">{{ hit.content | e }}</span>
</div>
{% endfor %}
</div>
{% endif %}

View File

@@ -71,53 +71,6 @@ class GitHand:
return True
return False
async def _exec_subprocess(
self,
args: str,
timeout: int,
) -> tuple[bytes, bytes, int]:
"""Run git as a subprocess, return (stdout, stderr, returncode).
Raises TimeoutError if the process exceeds *timeout* seconds.
"""
proc = await asyncio.create_subprocess_exec(
"git",
*args.split(),
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=self._repo_dir,
)
try:
stdout, stderr = await asyncio.wait_for(
proc.communicate(),
timeout=timeout,
)
except TimeoutError:
proc.kill()
await proc.wait()
raise
return stdout, stderr, proc.returncode or 0
@staticmethod
def _parse_output(
command: str,
stdout_bytes: bytes,
stderr_bytes: bytes,
returncode: int | None,
latency_ms: float,
) -> GitResult:
"""Decode subprocess output into a GitResult."""
exit_code = returncode or 0
stdout = stdout_bytes.decode("utf-8", errors="replace").strip()
stderr = stderr_bytes.decode("utf-8", errors="replace").strip()
return GitResult(
operation=command,
success=exit_code == 0,
output=stdout,
error=stderr if exit_code != 0 else "",
latency_ms=latency_ms,
)
async def run(
self,
args: str,
@@ -135,15 +88,14 @@ class GitHand:
GitResult with output or error details.
"""
start = time.time()
command = f"git {args}"
# Gate destructive operations
if self._is_destructive(args) and not allow_destructive:
return GitResult(
operation=command,
operation=f"git {args}",
success=False,
error=(
f"Destructive operation blocked: '{command}'. "
f"Destructive operation blocked: 'git {args}'. "
"Set allow_destructive=True to override."
),
requires_confirmation=True,
@@ -151,21 +103,46 @@ class GitHand:
)
effective_timeout = timeout or self._timeout
command = f"git {args}"
try:
stdout_bytes, stderr_bytes, returncode = await self._exec_subprocess(
args,
effective_timeout,
proc = await asyncio.create_subprocess_exec(
"git",
*args.split(),
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=self._repo_dir,
)
except TimeoutError:
try:
stdout_bytes, stderr_bytes = await asyncio.wait_for(
proc.communicate(), timeout=effective_timeout
)
except TimeoutError:
proc.kill()
await proc.wait()
latency = (time.time() - start) * 1000
logger.warning("Git command timed out after %ds: %s", effective_timeout, command)
return GitResult(
operation=command,
success=False,
error=f"Command timed out after {effective_timeout}s",
latency_ms=latency,
)
latency = (time.time() - start) * 1000
logger.warning("Git command timed out after %ds: %s", effective_timeout, command)
exit_code = proc.returncode or 0
stdout = stdout_bytes.decode("utf-8", errors="replace").strip()
stderr = stderr_bytes.decode("utf-8", errors="replace").strip()
return GitResult(
operation=command,
success=False,
error=f"Command timed out after {effective_timeout}s",
success=exit_code == 0,
output=stdout,
error=stderr if exit_code != 0 else "",
latency_ms=latency,
)
except FileNotFoundError:
latency = (time.time() - start) * 1000
logger.warning("git binary not found")
@@ -185,14 +162,6 @@ class GitHand:
latency_ms=latency,
)
return self._parse_output(
command,
stdout_bytes,
stderr_bytes,
returncode=returncode,
latency_ms=(time.time() - start) * 1000,
)
# ── Convenience wrappers ─────────────────────────────────────────────────
async def status(self) -> GitResult:

View File

@@ -1,435 +0,0 @@
"""Dreaming Mode — idle-time session replay and counterfactual simulation.
When the dashboard has been idle for a configurable period, this engine
selects a past chat session, identifies key agent response points, and
asks the LLM to simulate alternative approaches. Insights are stored as
proposed rules that can feed the auto-crystallizer or memory system.
Usage::
from timmy.dreaming import dreaming_engine
# Run one dream cycle (called by the background scheduler)
await dreaming_engine.dream_once()
# Query recent dreams
dreams = dreaming_engine.get_recent_dreams(limit=10)
# Get current status dict for API/dashboard
status = dreaming_engine.get_status()
"""
import json
import logging
import re
import sqlite3
import uuid
from collections.abc import Generator
from contextlib import closing, contextmanager
from dataclasses import dataclass
from datetime import UTC, datetime, timedelta
from pathlib import Path
from typing import Any
from config import settings
logger = logging.getLogger(__name__)
_DEFAULT_DB = Path("data/dreams.db")
# Strip <think> tags from reasoning model output
_THINK_TAG_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
# Minimum messages in a session to be worth replaying
_MIN_SESSION_MESSAGES = 3
# Gap in seconds between messages that signals a new session
_SESSION_GAP_SECONDS = 1800 # 30 minutes
@dataclass
class DreamRecord:
"""A single completed dream cycle."""
id: str
session_excerpt: str # Short excerpt from the replayed session
decision_point: str # The agent message that was re-simulated
simulation: str # The alternative response generated
proposed_rule: str # Rule extracted from the simulation
created_at: str
@contextmanager
def _get_conn(db_path: Path = _DEFAULT_DB) -> Generator[sqlite3.Connection, None, None]:
db_path.parent.mkdir(parents=True, exist_ok=True)
with closing(sqlite3.connect(str(db_path))) as conn:
conn.row_factory = sqlite3.Row
conn.execute("""
CREATE TABLE IF NOT EXISTS dreams (
id TEXT PRIMARY KEY,
session_excerpt TEXT NOT NULL,
decision_point TEXT NOT NULL,
simulation TEXT NOT NULL,
proposed_rule TEXT NOT NULL DEFAULT '',
created_at TEXT NOT NULL
)
""")
conn.execute("CREATE INDEX IF NOT EXISTS idx_dreams_time ON dreams(created_at)")
conn.commit()
yield conn
def _row_to_dream(row: sqlite3.Row) -> DreamRecord:
return DreamRecord(
id=row["id"],
session_excerpt=row["session_excerpt"],
decision_point=row["decision_point"],
simulation=row["simulation"],
proposed_rule=row["proposed_rule"],
created_at=row["created_at"],
)
class DreamingEngine:
"""Idle-time dreaming engine — replays sessions and simulates alternatives."""
def __init__(self, db_path: Path = _DEFAULT_DB) -> None:
self._db_path = db_path
self._last_activity_time: datetime = datetime.now(UTC)
self._is_dreaming: bool = False
self._current_dream_summary: str = ""
self._dreaming_agent = None # Lazy-initialised
# ── Public API ────────────────────────────────────────────────────────
def record_activity(self) -> None:
"""Reset the idle timer — call this on every user/agent interaction."""
self._last_activity_time = datetime.now(UTC)
def is_idle(self) -> bool:
"""Return True if the system has been idle long enough to start dreaming."""
threshold = settings.dreaming_idle_threshold_minutes
if threshold <= 0:
return False
return datetime.now(UTC) - self._last_activity_time > timedelta(minutes=threshold)
def get_status(self) -> dict[str, Any]:
"""Return a status dict suitable for API/dashboard consumption."""
return {
"enabled": settings.dreaming_enabled,
"dreaming": self._is_dreaming,
"idle": self.is_idle(),
"current_summary": self._current_dream_summary,
"idle_minutes": int(
(datetime.now(UTC) - self._last_activity_time).total_seconds() / 60
),
"idle_threshold_minutes": settings.dreaming_idle_threshold_minutes,
"dream_count": self.count_dreams(),
}
async def dream_once(self) -> DreamRecord | None:
"""Execute one dream cycle.
Returns the stored DreamRecord, or None if the cycle was skipped
(not idle, dreaming disabled, no suitable session, or LLM error).
"""
if not settings.dreaming_enabled:
return None
if not self.is_idle():
logger.debug(
"Dreaming skipped — system active (idle for %d min, threshold %d min)",
int((datetime.now(UTC) - self._last_activity_time).total_seconds() / 60),
settings.dreaming_idle_threshold_minutes,
)
return None
if self._is_dreaming:
logger.debug("Dreaming skipped — cycle already in progress")
return None
self._is_dreaming = True
self._current_dream_summary = "Selecting a past session…"
await self._broadcast_status()
try:
return await self._run_dream_cycle()
except Exception as exc:
logger.warning("Dream cycle failed: %s", exc)
return None
finally:
self._is_dreaming = False
self._current_dream_summary = ""
await self._broadcast_status()
def get_recent_dreams(self, limit: int = 20) -> list[DreamRecord]:
"""Retrieve the most recent dream records."""
with _get_conn(self._db_path) as conn:
rows = conn.execute(
"SELECT * FROM dreams ORDER BY created_at DESC LIMIT ?",
(limit,),
).fetchall()
return [_row_to_dream(r) for r in rows]
def count_dreams(self) -> int:
"""Return total number of stored dream records."""
with _get_conn(self._db_path) as conn:
row = conn.execute("SELECT COUNT(*) AS c FROM dreams").fetchone()
return row["c"] if row else 0
# ── Private helpers ───────────────────────────────────────────────────
async def _run_dream_cycle(self) -> DreamRecord | None:
"""Core dream logic: select → simulate → store."""
# 1. Select a past session from the chat log
session = await self._select_session()
if not session:
logger.debug("No suitable chat session found for dreaming")
self._current_dream_summary = "No past sessions to replay"
return None
decision_point, session_excerpt = session
self._current_dream_summary = f"Simulating alternative for: {decision_point[:60]}"
await self._broadcast_status()
# 2. Simulate an alternative response
simulation = await self._simulate_alternative(decision_point, session_excerpt)
if not simulation:
logger.debug("Dream simulation produced no output")
return None
# 3. Extract a proposed rule
proposed_rule = await self._extract_rule(decision_point, simulation)
# 4. Store and broadcast
dream = self._store_dream(
session_excerpt=session_excerpt,
decision_point=decision_point,
simulation=simulation,
proposed_rule=proposed_rule,
)
self._current_dream_summary = f"Dream complete: {proposed_rule[:80]}" if proposed_rule else "Dream complete"
logger.info(
"Dream [%s]: replayed session, proposed rule: %s",
dream.id[:8],
proposed_rule[:80] if proposed_rule else "(none)",
)
await self._broadcast_status()
await self._broadcast_dream(dream)
return dream
async def _select_session(self) -> tuple[str, str] | None:
"""Select a past chat session and return (decision_point, session_excerpt).
Uses the SQLite chat store. Groups messages into sessions by time
gap. Picks a random session with enough messages, then selects one
agent response as the decision point.
"""
try:
from infrastructure.chat_store import DB_PATH
if not DB_PATH.exists():
return None
import asyncio
rows = await asyncio.to_thread(self._load_chat_rows)
if not rows:
return None
sessions = self._group_into_sessions(rows)
if not sessions:
return None
# Filter sessions with enough messages
valid = [s for s in sessions if len(s) >= _MIN_SESSION_MESSAGES]
if not valid:
return None
import random
session = random.choice(valid) # noqa: S311 (not cryptographic)
# Build a short text excerpt (last N messages)
excerpt_msgs = session[-6:]
excerpt = "\n".join(
f"{m['role'].upper()}: {m['content'][:200]}" for m in excerpt_msgs
)
# Find agent responses as candidate decision points
agent_msgs = [m for m in session if m["role"] in ("agent", "assistant")]
if not agent_msgs:
return None
decision = random.choice(agent_msgs) # noqa: S311
return decision["content"], excerpt
except Exception as exc:
logger.warning("Session selection failed: %s", exc)
return None
def _load_chat_rows(self) -> list[dict]:
"""Synchronously load chat messages from SQLite."""
from infrastructure.chat_store import DB_PATH
with closing(sqlite3.connect(str(DB_PATH))) as conn:
conn.row_factory = sqlite3.Row
rows = conn.execute(
"SELECT role, content, timestamp FROM chat_messages "
"ORDER BY timestamp ASC"
).fetchall()
return [dict(r) for r in rows]
def _group_into_sessions(self, rows: list[dict]) -> list[list[dict]]:
"""Group chat rows into sessions based on time gaps."""
if not rows:
return []
sessions: list[list[dict]] = []
current: list[dict] = [rows[0]]
for prev, curr in zip(rows, rows[1:]):
try:
t_prev = datetime.fromisoformat(prev["timestamp"].replace("Z", "+00:00"))
t_curr = datetime.fromisoformat(curr["timestamp"].replace("Z", "+00:00"))
gap = (t_curr - t_prev).total_seconds()
except Exception:
gap = 0
if gap > _SESSION_GAP_SECONDS:
sessions.append(current)
current = [curr]
else:
current.append(curr)
sessions.append(current)
return sessions
async def _simulate_alternative(
self, decision_point: str, session_excerpt: str
) -> str:
"""Ask the LLM to simulate an alternative response."""
prompt = (
"You are Timmy, a sovereign AI agent in a dreaming state.\n"
"You are replaying a past conversation and exploring what you could "
"have done differently at a key decision point.\n\n"
"PAST SESSION EXCERPT:\n"
f"{session_excerpt}\n\n"
"KEY DECISION POINT (your past response):\n"
f"{decision_point[:500]}\n\n"
"TASK: In 2-3 sentences, describe ONE concrete alternative approach "
"you could have taken at this decision point that would have been "
"more helpful, more accurate, or more efficient.\n"
"Be specific — reference the actual content of the conversation.\n"
"Do NOT include meta-commentary about dreaming or this exercise.\n\n"
"Alternative approach:"
)
raw = await self._call_agent(prompt)
return _THINK_TAG_RE.sub("", raw).strip() if raw else ""
async def _extract_rule(self, decision_point: str, simulation: str) -> str:
"""Extract a proposed behaviour rule from the simulation."""
prompt = (
"Given this pair of agent responses:\n\n"
f"ORIGINAL: {decision_point[:300]}\n\n"
f"IMPROVED ALTERNATIVE: {simulation[:400]}\n\n"
"Extract ONE concise rule (max 20 words) that captures what to do "
"differently next time. Format: 'When X, do Y instead of Z.'\n"
"Rule:"
)
raw = await self._call_agent(prompt)
rule = _THINK_TAG_RE.sub("", raw).strip() if raw else ""
# Keep only the first sentence/line
rule = rule.split("\n")[0].strip().rstrip(".")
return rule[:200] # Safety cap
async def _call_agent(self, prompt: str) -> str:
"""Call the Timmy agent for a dreaming prompt (skip MCP, 60 s timeout)."""
import asyncio
if self._dreaming_agent is None:
from timmy.agent import create_timmy
self._dreaming_agent = create_timmy(skip_mcp=True)
try:
async with asyncio.timeout(settings.dreaming_timeout_seconds):
run = await self._dreaming_agent.arun(prompt, stream=False)
except TimeoutError:
logger.warning("Dreaming LLM call timed out after %ds", settings.dreaming_timeout_seconds)
return ""
except Exception as exc:
logger.warning("Dreaming LLM call failed: %s", exc)
return ""
raw = run.content if hasattr(run, "content") else str(run)
return raw or ""
def _store_dream(
self,
*,
session_excerpt: str,
decision_point: str,
simulation: str,
proposed_rule: str,
) -> DreamRecord:
dream = DreamRecord(
id=str(uuid.uuid4()),
session_excerpt=session_excerpt,
decision_point=decision_point,
simulation=simulation,
proposed_rule=proposed_rule,
created_at=datetime.now(UTC).isoformat(),
)
with _get_conn(self._db_path) as conn:
conn.execute(
"""
INSERT INTO dreams
(id, session_excerpt, decision_point, simulation, proposed_rule, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""",
(
dream.id,
dream.session_excerpt,
dream.decision_point,
dream.simulation,
dream.proposed_rule,
dream.created_at,
),
)
conn.commit()
return dream
async def _broadcast_status(self) -> None:
"""Push current dreaming status via WebSocket."""
try:
from infrastructure.ws_manager.handler import ws_manager
await ws_manager.broadcast("dreaming_state", self.get_status())
except Exception as exc:
logger.debug("Dreaming status broadcast failed: %s", exc)
async def _broadcast_dream(self, dream: DreamRecord) -> None:
"""Push a completed dream record via WebSocket."""
try:
from infrastructure.ws_manager.handler import ws_manager
await ws_manager.broadcast(
"dreaming_complete",
{
"id": dream.id,
"proposed_rule": dream.proposed_rule,
"simulation": dream.simulation[:200],
"created_at": dream.created_at,
},
)
except Exception as exc:
logger.debug("Dreaming complete broadcast failed: %s", exc)
# Module-level singleton
dreaming_engine = DreamingEngine()

View File

@@ -1,48 +1,532 @@
"""Tool registry, full toolkit construction, and tool catalog.
"""Tool integration for the agent swarm.
Provides:
- Internal _register_* helpers for wiring tools into toolkits
- create_full_toolkit (orchestrator toolkit)
- create_experiment_tools (Lab agent toolkit)
- AGENT_TOOLKITS / get_tools_for_agent registry
- get_all_available_tools catalog
Provides agents with capabilities for:
- File read/write (local filesystem)
- Shell command execution (sandboxed)
- Python code execution
- Git operations
- Image / Music / Video generation (creative pipeline)
Tools are assigned to agents based on their specialties.
"""
from __future__ import annotations
import ast
import logging
import math
from collections.abc import Callable
from dataclasses import dataclass, field
from datetime import UTC, datetime
from pathlib import Path
from timmy.tools._base import (
_AGNO_TOOLS_AVAILABLE,
_ImportError,
FileTools,
PythonTools,
ShellTools,
Toolkit,
)
from timmy.tools.file_tools import (
_make_smart_read_file,
create_data_tools,
create_research_tools,
create_writing_tools,
)
from timmy.tools.system_tools import (
calculator,
consult_grok,
create_code_tools,
create_devops_tools,
create_security_tools,
web_fetch,
)
from config import settings
logger = logging.getLogger(__name__)
# Max characters of user query included in Lightning invoice memo
_INVOICE_MEMO_MAX_LEN = 50
# ---------------------------------------------------------------------------
# Internal _register_* helpers
# ---------------------------------------------------------------------------
# Lazy imports to handle test mocking
_ImportError = None
try:
from agno.tools import Toolkit
from agno.tools.file import FileTools
from agno.tools.python import PythonTools
from agno.tools.shell import ShellTools
_AGNO_TOOLS_AVAILABLE = True
except ImportError as e:
_AGNO_TOOLS_AVAILABLE = False
_ImportError = e
# Track tool usage stats
_TOOL_USAGE: dict[str, list[dict]] = {}
@dataclass
class ToolStats:
"""Statistics for a single tool."""
tool_name: str
call_count: int = 0
last_used: str | None = None
errors: int = 0
@dataclass
class AgentTools:
"""Tools assigned to an agent."""
agent_id: str
agent_name: str
toolkit: Toolkit
available_tools: list[str] = field(default_factory=list)
# Backward-compat alias
PersonaTools = AgentTools
def _track_tool_usage(agent_id: str, tool_name: str, success: bool = True) -> None:
"""Track tool usage for analytics."""
if agent_id not in _TOOL_USAGE:
_TOOL_USAGE[agent_id] = []
_TOOL_USAGE[agent_id].append(
{
"tool": tool_name,
"timestamp": datetime.now(UTC).isoformat(),
"success": success,
}
)
def get_tool_stats(agent_id: str | None = None) -> dict:
"""Get tool usage statistics.
Args:
agent_id: Optional agent ID to filter by. If None, returns stats for all agents.
Returns:
Dict with tool usage statistics.
"""
if agent_id:
usage = _TOOL_USAGE.get(agent_id, [])
return {
"agent_id": agent_id,
"total_calls": len(usage),
"tools_used": list(set(u["tool"] for u in usage)),
"recent_calls": usage[-10:] if usage else [],
}
# Return stats for all agents
all_stats = {}
for aid, usage in _TOOL_USAGE.items():
all_stats[aid] = {
"total_calls": len(usage),
"tools_used": list(set(u["tool"] for u in usage)),
}
return all_stats
def _safe_eval(node, allowed_names: dict):
"""Walk an AST and evaluate only safe numeric operations."""
if isinstance(node, ast.Expression):
return _safe_eval(node.body, allowed_names)
if isinstance(node, ast.Constant):
if isinstance(node.value, (int, float, complex)):
return node.value
raise ValueError(f"Unsupported constant: {node.value!r}")
if isinstance(node, ast.UnaryOp):
operand = _safe_eval(node.operand, allowed_names)
if isinstance(node.op, ast.UAdd):
return +operand
if isinstance(node.op, ast.USub):
return -operand
raise ValueError(f"Unsupported unary op: {type(node.op).__name__}")
if isinstance(node, ast.BinOp):
left = _safe_eval(node.left, allowed_names)
right = _safe_eval(node.right, allowed_names)
ops = {
ast.Add: lambda a, b: a + b,
ast.Sub: lambda a, b: a - b,
ast.Mult: lambda a, b: a * b,
ast.Div: lambda a, b: a / b,
ast.FloorDiv: lambda a, b: a // b,
ast.Mod: lambda a, b: a % b,
ast.Pow: lambda a, b: a**b,
}
op_fn = ops.get(type(node.op))
if op_fn is None:
raise ValueError(f"Unsupported binary op: {type(node.op).__name__}")
return op_fn(left, right)
if isinstance(node, ast.Name):
if node.id in allowed_names:
return allowed_names[node.id]
raise ValueError(f"Unknown name: {node.id!r}")
if isinstance(node, ast.Attribute):
value = _safe_eval(node.value, allowed_names)
# Only allow attribute access on the math module
if value is math:
attr = getattr(math, node.attr, None)
if attr is not None:
return attr
raise ValueError(f"Attribute access not allowed: .{node.attr}")
if isinstance(node, ast.Call):
func = _safe_eval(node.func, allowed_names)
if not callable(func):
raise ValueError(f"Not callable: {func!r}")
args = [_safe_eval(a, allowed_names) for a in node.args]
kwargs = {kw.arg: _safe_eval(kw.value, allowed_names) for kw in node.keywords}
return func(*args, **kwargs)
raise ValueError(f"Unsupported syntax: {type(node).__name__}")
def calculator(expression: str) -> str:
"""Evaluate a mathematical expression and return the exact result.
Use this tool for ANY arithmetic: multiplication, division, square roots,
exponents, percentages, logarithms, trigonometry, etc.
Args:
expression: A valid Python math expression, e.g. '347 * 829',
'math.sqrt(17161)', '2**10', 'math.log(100, 10)'.
Returns:
The exact result as a string.
"""
allowed_names = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
allowed_names["math"] = math
allowed_names["abs"] = abs
allowed_names["round"] = round
allowed_names["min"] = min
allowed_names["max"] = max
try:
tree = ast.parse(expression, mode="eval")
result = _safe_eval(tree, allowed_names)
return str(result)
except Exception as e: # broad catch intentional: arbitrary code execution
return f"Error evaluating '{expression}': {e}"
def _make_smart_read_file(file_tools: FileTools) -> Callable:
"""Wrap FileTools.read_file so directories auto-list their contents.
When the user (or the LLM) passes a directory path to read_file,
the raw Agno implementation throws an IsADirectoryError. This
wrapper detects that case, lists the directory entries, and returns
a helpful message so the model can pick the right file on its own.
"""
original_read = file_tools.read_file
def smart_read_file(file_name: str = "", encoding: str = "utf-8", **kwargs) -> str:
"""Reads the contents of the file `file_name` and returns the contents if successful."""
# LLMs often call read_file(path=...) instead of read_file(file_name=...)
if not file_name:
file_name = kwargs.get("path", "")
if not file_name:
return "Error: no file_name or path provided."
# Resolve the path the same way FileTools does
_safe, resolved = file_tools.check_escape(file_name)
if _safe and resolved.is_dir():
entries = sorted(p.name for p in resolved.iterdir() if not p.name.startswith("."))
listing = "\n".join(f" - {e}" for e in entries) if entries else " (empty directory)"
return (
f"'{file_name}' is a directory, not a file. "
f"Files inside:\n{listing}\n\n"
"Please call read_file with one of the files listed above."
)
return original_read(file_name, encoding=encoding)
# Preserve the original docstring for Agno tool schema generation
smart_read_file.__doc__ = original_read.__doc__
return smart_read_file
def create_research_tools(base_dir: str | Path | None = None):
"""Create tools for the research agent (Echo).
Includes: file reading
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="research")
# File reading
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_code_tools(base_dir: str | Path | None = None):
"""Create tools for the code agent (Forge).
Includes: shell commands, python execution, file read/write, Aider AI assist
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="code")
# Shell commands (sandboxed)
shell_tools = ShellTools()
toolkit.register(shell_tools.run_shell_command, name="shell")
# Python execution
python_tools = PythonTools()
toolkit.register(python_tools.run_python_code, name="python")
# File operations
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.save_file, name="write_file")
toolkit.register(file_tools.list_files, name="list_files")
# Aider AI coding assistant (local with Ollama)
aider_tool = create_aider_tool(base_path)
toolkit.register(aider_tool.run_aider, name="aider")
return toolkit
def create_aider_tool(base_path: Path):
"""Create an Aider tool for AI-assisted coding."""
import subprocess
class AiderTool:
"""Tool that calls Aider (local AI coding assistant) for code generation."""
def __init__(self, base_dir: Path):
self.base_dir = base_dir
def run_aider(self, prompt: str, model: str = "qwen3:30b") -> str:
"""Run Aider to generate code changes.
Args:
prompt: What you want Aider to do (e.g., "add a fibonacci function")
model: Ollama model to use (default: qwen3:30b)
Returns:
Aider's response with the code changes made
"""
try:
# Run aider with the prompt
result = subprocess.run(
[
"aider",
"--no-git",
"--model",
f"ollama/{model}",
"--quiet",
prompt,
],
capture_output=True,
text=True,
timeout=120,
cwd=str(self.base_dir),
)
if result.returncode == 0:
return result.stdout if result.stdout else "Code changes applied successfully"
else:
return f"Aider error: {result.stderr}"
except FileNotFoundError:
return "Error: Aider not installed. Run: pip install aider"
except subprocess.TimeoutExpired:
return "Error: Aider timed out after 120 seconds"
except (OSError, subprocess.SubprocessError) as e:
return f"Error running Aider: {str(e)}"
return AiderTool(base_path)
def create_data_tools(base_dir: str | Path | None = None):
"""Create tools for the data agent (Seer).
Includes: python execution, file reading, web search for data sources
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="data")
# Python execution for analysis
python_tools = PythonTools()
toolkit.register(python_tools.run_python_code, name="python")
# File reading
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_writing_tools(base_dir: str | Path | None = None):
"""Create tools for the writing agent (Quill).
Includes: file read/write
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="writing")
# File operations
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.save_file, name="write_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_security_tools(base_dir: str | Path | None = None):
"""Create tools for the security agent (Mace).
Includes: shell commands (for scanning), file read
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="security")
# Shell for running security scans
shell_tools = ShellTools()
toolkit.register(shell_tools.run_shell_command, name="shell")
# File reading for logs/configs
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_devops_tools(base_dir: str | Path | None = None):
"""Create tools for the DevOps agent (Helm).
Includes: shell commands, file read/write
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="devops")
# Shell for deployment commands
shell_tools = ShellTools()
toolkit.register(shell_tools.run_shell_command, name="shell")
# File operations for config management
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.save_file, name="write_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def consult_grok(query: str) -> str:
"""Consult Grok (xAI) for frontier reasoning on complex questions.
Use this tool when a question requires advanced reasoning, real-time
knowledge, or capabilities beyond the local model. Grok is a premium
cloud backend use sparingly and only for high-complexity queries.
Args:
query: The question or reasoning task to send to Grok.
Returns:
Grok's response text, or an error/status message.
"""
from config import settings
from timmy.backends import get_grok_backend, grok_available
if not grok_available():
return (
"Grok is not available. Enable with GROK_ENABLED=true "
"and set XAI_API_KEY in your .env file."
)
backend = get_grok_backend()
# Log to Spark if available
try:
from spark.engine import spark_engine
spark_engine.on_tool_executed(
agent_id="default",
tool_name="consult_grok",
success=True,
)
except (ImportError, AttributeError) as exc:
logger.warning("Tool execution failed (consult_grok logging): %s", exc)
# Generate Lightning invoice for monetization (unless free mode)
invoice_info = ""
if not settings.grok_free:
try:
from lightning.factory import get_backend as get_ln_backend
ln = get_ln_backend()
sats = min(settings.grok_max_sats_per_query, settings.grok_sats_hard_cap)
inv = ln.create_invoice(sats, f"Grok query: {query[:_INVOICE_MEMO_MAX_LEN]}")
invoice_info = f"\n[Lightning invoice: {sats} sats — {inv.payment_request[:40]}...]"
except (ImportError, OSError, ValueError) as exc:
logger.error("Lightning invoice creation failed: %s", exc)
return "Error: Failed to create Lightning invoice. Please check logs."
result = backend.run(query)
response = result.content
if invoice_info:
response += invoice_info
return response
def web_fetch(url: str, max_tokens: int = 4000) -> str:
"""Fetch a web page and return its main text content.
Downloads the URL, extracts readable text using trafilatura, and
truncates to a token budget. Use this to read full articles, docs,
or blog posts that web_search only returns snippets for.
Args:
url: The URL to fetch (must start with http:// or https://).
max_tokens: Maximum approximate token budget (default 4000).
Text is truncated to max_tokens * 4 characters.
Returns:
Extracted text content, or an error message on failure.
"""
if not url or not url.startswith(("http://", "https://")):
return f"Error: invalid URL — must start with http:// or https://: {url!r}"
try:
import requests as _requests
except ImportError:
return "Error: 'requests' package is not installed. Install with: pip install requests"
try:
import trafilatura
except ImportError:
return (
"Error: 'trafilatura' package is not installed. Install with: pip install trafilatura"
)
try:
resp = _requests.get(
url,
timeout=15,
headers={"User-Agent": "TimmyResearchBot/1.0"},
)
resp.raise_for_status()
except _requests.exceptions.Timeout:
return f"Error: request timed out after 15 seconds for {url}"
except _requests.exceptions.HTTPError as exc:
return f"Error: HTTP {exc.response.status_code} for {url}"
except _requests.exceptions.RequestException as exc:
return f"Error: failed to fetch {url}{exc}"
text = trafilatura.extract(resp.text, include_tables=True, include_links=True)
if not text:
return f"Error: could not extract readable content from {url}"
char_budget = max_tokens * 4
if len(text) > char_budget:
text = text[:char_budget] + f"\n\n[…truncated to ~{max_tokens} tokens]"
return text
def _register_web_fetch_tool(toolkit: Toolkit) -> None:
@@ -233,11 +717,6 @@ def _register_thinking_tools(toolkit: Toolkit) -> None:
raise
# ---------------------------------------------------------------------------
# Full toolkit factories
# ---------------------------------------------------------------------------
def create_full_toolkit(base_dir: str | Path | None = None):
"""Create a full toolkit with all available tools (for the orchestrator).
@@ -248,7 +727,6 @@ def create_full_toolkit(base_dir: str | Path | None = None):
# Return None when tools aren't available (tests)
return None
from config import settings
from timmy.tool_safety import DANGEROUS_TOOLS
toolkit = Toolkit(name="full")
@@ -330,24 +808,6 @@ def create_experiment_tools(base_dir: str | Path | None = None):
return toolkit
# ---------------------------------------------------------------------------
# Agent toolkit registry
# ---------------------------------------------------------------------------
def _create_stub_toolkit(name: str):
"""Create a minimal Agno toolkit for creative agents.
Creative agents use their own dedicated tool modules rather than
Agno-wrapped functions. This stub ensures AGENT_TOOLKITS has an
entry so ToolExecutor doesn't fall back to the full toolkit.
"""
if not _AGNO_TOOLS_AVAILABLE:
return None
toolkit = Toolkit(name=name)
return toolkit
# Mapping of agent IDs to their toolkits
AGENT_TOOLKITS: dict[str, Callable[[], Toolkit]] = {
"echo": create_research_tools,
@@ -363,7 +823,20 @@ AGENT_TOOLKITS: dict[str, Callable[[], Toolkit]] = {
}
def get_tools_for_agent(agent_id: str, base_dir: str | Path | None = None) -> "Toolkit | None":
def _create_stub_toolkit(name: str):
"""Create a minimal Agno toolkit for creative agents.
Creative agents use their own dedicated tool modules rather than
Agno-wrapped functions. This stub ensures AGENT_TOOLKITS has an
entry so ToolExecutor doesn't fall back to the full toolkit.
"""
if not _AGNO_TOOLS_AVAILABLE:
return None
toolkit = Toolkit(name=name)
return toolkit
def get_tools_for_agent(agent_id: str, base_dir: str | Path | None = None) -> Toolkit | None:
"""Get the appropriate toolkit for an agent.
Args:
@@ -379,16 +852,11 @@ def get_tools_for_agent(agent_id: str, base_dir: str | Path | None = None) -> "T
return None
# Backward-compat aliases
# Backward-compat alias
get_tools_for_persona = get_tools_for_agent
PERSONA_TOOLKITS = AGENT_TOOLKITS
# ---------------------------------------------------------------------------
# Tool catalog
# ---------------------------------------------------------------------------
def _core_tool_catalog() -> dict:
"""Return core file and execution tools catalog entries."""
return {

View File

@@ -1,94 +0,0 @@
"""Tool integration for the agent swarm.
Provides agents with capabilities for:
- File read/write (local filesystem)
- Shell command execution (sandboxed)
- Python code execution
- Git operations
- Image / Music / Video generation (creative pipeline)
Tools are assigned to agents based on their specialties.
Sub-modules:
- _base: shared types, tracking state
- file_tools: file-operation toolkit factories (Echo, Quill, Seer)
- system_tools: calculator, AI tools, code/devops toolkit factories
- _registry: full toolkit construction, agent registry, tool catalog
"""
# Re-export everything for backward compatibility — callers that do
# ``from timmy.tools import <symbol>`` continue to work unchanged.
from timmy.tools._base import (
AgentTools,
PersonaTools,
ToolStats,
_AGNO_TOOLS_AVAILABLE,
_ImportError,
_TOOL_USAGE,
_track_tool_usage,
get_tool_stats,
)
from timmy.tools._registry import (
AGENT_TOOLKITS,
PERSONA_TOOLKITS,
_create_stub_toolkit,
_merge_catalog,
create_experiment_tools,
create_full_toolkit,
get_all_available_tools,
get_tools_for_agent,
get_tools_for_persona,
)
from timmy.tools.file_tools import (
_make_smart_read_file,
create_data_tools,
create_research_tools,
create_writing_tools,
)
from timmy.tools.system_tools import (
_safe_eval,
calculator,
consult_grok,
create_aider_tool,
create_code_tools,
create_devops_tools,
create_security_tools,
web_fetch,
)
__all__ = [
# _base
"AgentTools",
"PersonaTools",
"ToolStats",
"_AGNO_TOOLS_AVAILABLE",
"_ImportError",
"_TOOL_USAGE",
"_track_tool_usage",
"get_tool_stats",
# file_tools
"_make_smart_read_file",
"create_data_tools",
"create_research_tools",
"create_writing_tools",
# system_tools
"_safe_eval",
"calculator",
"consult_grok",
"create_aider_tool",
"create_code_tools",
"create_devops_tools",
"create_security_tools",
"web_fetch",
# _registry
"AGENT_TOOLKITS",
"PERSONA_TOOLKITS",
"_create_stub_toolkit",
"_merge_catalog",
"create_experiment_tools",
"create_full_toolkit",
"get_all_available_tools",
"get_tools_for_agent",
"get_tools_for_persona",
]

View File

@@ -1,90 +0,0 @@
"""Base types, shared state, and tracking for the Timmy tool system."""
from __future__ import annotations
import logging
from dataclasses import dataclass, field
from datetime import UTC, datetime
logger = logging.getLogger(__name__)
# Lazy imports to handle test mocking
_ImportError = None
try:
from agno.tools import Toolkit
from agno.tools.file import FileTools
from agno.tools.python import PythonTools
from agno.tools.shell import ShellTools
_AGNO_TOOLS_AVAILABLE = True
except ImportError as e:
_AGNO_TOOLS_AVAILABLE = False
_ImportError = e
# Track tool usage stats
_TOOL_USAGE: dict[str, list[dict]] = {}
@dataclass
class ToolStats:
"""Statistics for a single tool."""
tool_name: str
call_count: int = 0
last_used: str | None = None
errors: int = 0
@dataclass
class AgentTools:
"""Tools assigned to an agent."""
agent_id: str
agent_name: str
toolkit: "Toolkit"
available_tools: list[str] = field(default_factory=list)
# Backward-compat alias
PersonaTools = AgentTools
def _track_tool_usage(agent_id: str, tool_name: str, success: bool = True) -> None:
"""Track tool usage for analytics."""
if agent_id not in _TOOL_USAGE:
_TOOL_USAGE[agent_id] = []
_TOOL_USAGE[agent_id].append(
{
"tool": tool_name,
"timestamp": datetime.now(UTC).isoformat(),
"success": success,
}
)
def get_tool_stats(agent_id: str | None = None) -> dict:
"""Get tool usage statistics.
Args:
agent_id: Optional agent ID to filter by. If None, returns stats for all agents.
Returns:
Dict with tool usage statistics.
"""
if agent_id:
usage = _TOOL_USAGE.get(agent_id, [])
return {
"agent_id": agent_id,
"total_calls": len(usage),
"tools_used": list(set(u["tool"] for u in usage)),
"recent_calls": usage[-10:] if usage else [],
}
# Return stats for all agents
all_stats = {}
for aid, usage in _TOOL_USAGE.items():
all_stats[aid] = {
"total_calls": len(usage),
"tools_used": list(set(u["tool"] for u in usage)),
}
return all_stats
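# Illustrative sketch (not part of the original module): record two calls for a
# hypothetical agent, then read its stats back. The agent and tool names are
# made up; "tools_used" is built from a set, so its ordering can vary.
if __name__ == "__main__":
    _track_tool_usage("echo", "read_file", success=True)
    _track_tool_usage("echo", "list_files", success=True)
    stats = get_tool_stats("echo")
    print(stats["total_calls"])         # 2
    print(sorted(stats["tools_used"]))  # ['list_files', 'read_file']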

View File

@@ -1,121 +0,0 @@
"""File operation tools and agent toolkit factories for file-heavy agents.
Provides:
- Smart read_file wrapper (auto-lists directories)
- Toolkit factories for Echo (research), Quill (writing), Seer (data)
"""
from __future__ import annotations
import logging
from collections.abc import Callable
from pathlib import Path
from timmy.tools._base import (
_AGNO_TOOLS_AVAILABLE,
_ImportError,
FileTools,
PythonTools,
Toolkit,
)
logger = logging.getLogger(__name__)
def _make_smart_read_file(file_tools: "FileTools") -> Callable:
"""Wrap FileTools.read_file so directories auto-list their contents.
When the user (or the LLM) passes a directory path to read_file,
the raw Agno implementation throws an IsADirectoryError. This
wrapper detects that case, lists the directory entries, and returns
a helpful message so the model can pick the right file on its own.
"""
original_read = file_tools.read_file
def smart_read_file(file_name: str = "", encoding: str = "utf-8", **kwargs) -> str:
"""Reads the contents of the file `file_name` and returns the contents if successful."""
# LLMs often call read_file(path=...) instead of read_file(file_name=...)
if not file_name:
file_name = kwargs.get("path", "")
if not file_name:
return "Error: no file_name or path provided."
# Resolve the path the same way FileTools does
_safe, resolved = file_tools.check_escape(file_name)
if _safe and resolved.is_dir():
entries = sorted(p.name for p in resolved.iterdir() if not p.name.startswith("."))
listing = "\n".join(f" - {e}" for e in entries) if entries else " (empty directory)"
return (
f"'{file_name}' is a directory, not a file. "
f"Files inside:\n{listing}\n\n"
"Please call read_file with one of the files listed above."
)
return original_read(file_name, encoding=encoding)
# Preserve the original docstring for Agno tool schema generation
smart_read_file.__doc__ = original_read.__doc__
return smart_read_file
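# Minimal sketch (not part of the original module) of the wrapper's behaviour,
# assuming Agno's FileTools is importable. The demo helper and the paths below
# are hypothetical.
def _smart_read_file_demo() -> None:
    demo_tools = FileTools(base_dir=Path("."))
    read = _make_smart_read_file(demo_tools)
    # A directory returns a listing message instead of raising IsADirectoryError:
    print(read("."))
    # A regular file path falls through to the original FileTools.read_file:
    print(read(file_name="pyproject.toml"))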
def create_research_tools(base_dir: str | Path | None = None):
"""Create tools for the research agent (Echo).
Includes: file reading
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="research")
# File reading
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_writing_tools(base_dir: str | Path | None = None):
"""Create tools for the writing agent (Quill).
Includes: file read/write
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="writing")
# File operations
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.save_file, name="write_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_data_tools(base_dir: str | Path | None = None):
"""Create tools for the data agent (Seer).
Includes: python execution, file reading, web search for data sources
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="data")
# Python execution for analysis
python_tools = PythonTools()
toolkit.register(python_tools.run_python_code, name="python")
# File reading
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit

View File

@@ -1,357 +0,0 @@
"""System, calculation, and AI consultation tools for Timmy agents.
Provides:
- Safe AST-based calculator
- consult_grok (xAI frontier reasoning)
- web_fetch (content extraction)
- Toolkit factories for Forge (code), Mace (security), Helm (devops)
"""
from __future__ import annotations
import ast
import logging
import math
import subprocess
from pathlib import Path
from timmy.tools._base import (
_AGNO_TOOLS_AVAILABLE,
_ImportError,
FileTools,
PythonTools,
ShellTools,
Toolkit,
)
from timmy.tools.file_tools import _make_smart_read_file
logger = logging.getLogger(__name__)
# Max characters of user query included in Lightning invoice memo
_INVOICE_MEMO_MAX_LEN = 50
def _safe_eval(node, allowed_names: dict):
"""Walk an AST and evaluate only safe numeric operations."""
if isinstance(node, ast.Expression):
return _safe_eval(node.body, allowed_names)
if isinstance(node, ast.Constant):
if isinstance(node.value, (int, float, complex)):
return node.value
raise ValueError(f"Unsupported constant: {node.value!r}")
if isinstance(node, ast.UnaryOp):
operand = _safe_eval(node.operand, allowed_names)
if isinstance(node.op, ast.UAdd):
return +operand
if isinstance(node.op, ast.USub):
return -operand
raise ValueError(f"Unsupported unary op: {type(node.op).__name__}")
if isinstance(node, ast.BinOp):
left = _safe_eval(node.left, allowed_names)
right = _safe_eval(node.right, allowed_names)
ops = {
ast.Add: lambda a, b: a + b,
ast.Sub: lambda a, b: a - b,
ast.Mult: lambda a, b: a * b,
ast.Div: lambda a, b: a / b,
ast.FloorDiv: lambda a, b: a // b,
ast.Mod: lambda a, b: a % b,
ast.Pow: lambda a, b: a**b,
}
op_fn = ops.get(type(node.op))
if op_fn is None:
raise ValueError(f"Unsupported binary op: {type(node.op).__name__}")
return op_fn(left, right)
if isinstance(node, ast.Name):
if node.id in allowed_names:
return allowed_names[node.id]
raise ValueError(f"Unknown name: {node.id!r}")
if isinstance(node, ast.Attribute):
value = _safe_eval(node.value, allowed_names)
# Only allow attribute access on the math module
if value is math:
attr = getattr(math, node.attr, None)
if attr is not None:
return attr
raise ValueError(f"Attribute access not allowed: .{node.attr}")
if isinstance(node, ast.Call):
func = _safe_eval(node.func, allowed_names)
if not callable(func):
raise ValueError(f"Not callable: {func!r}")
args = [_safe_eval(a, allowed_names) for a in node.args]
kwargs = {kw.arg: _safe_eval(kw.value, allowed_names) for kw in node.keywords}
return func(*args, **kwargs)
raise ValueError(f"Unsupported syntax: {type(node).__name__}")
def calculator(expression: str) -> str:
"""Evaluate a mathematical expression and return the exact result.
Use this tool for ANY arithmetic: multiplication, division, square roots,
exponents, percentages, logarithms, trigonometry, etc.
Args:
expression: A valid Python math expression, e.g. '347 * 829',
'math.sqrt(17161)', '2**10', 'math.log(100, 10)'.
Returns:
The exact result as a string.
"""
allowed_names = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
allowed_names["math"] = math
allowed_names["abs"] = abs
allowed_names["round"] = round
allowed_names["min"] = min
allowed_names["max"] = max
try:
tree = ast.parse(expression, mode="eval")
result = _safe_eval(tree, allowed_names)
return str(result)
except Exception as e: # broad catch intentional: arbitrary code execution
return f"Error evaluating '{expression}': {e}"
def consult_grok(query: str) -> str:
"""Consult Grok (xAI) for frontier reasoning on complex questions.
Use this tool when a question requires advanced reasoning, real-time
knowledge, or capabilities beyond the local model. Grok is a premium
cloud backend — use sparingly and only for high-complexity queries.
Args:
query: The question or reasoning task to send to Grok.
Returns:
Grok's response text, or an error/status message.
"""
from config import settings
from timmy.backends import get_grok_backend, grok_available
if not grok_available():
return (
"Grok is not available. Enable with GROK_ENABLED=true "
"and set XAI_API_KEY in your .env file."
)
backend = get_grok_backend()
# Log to Spark if available
try:
from spark.engine import spark_engine
spark_engine.on_tool_executed(
agent_id="default",
tool_name="consult_grok",
success=True,
)
except (ImportError, AttributeError) as exc:
logger.warning("Tool execution failed (consult_grok logging): %s", exc)
# Generate Lightning invoice for monetization (unless free mode)
invoice_info = ""
if not settings.grok_free:
try:
from lightning.factory import get_backend as get_ln_backend
ln = get_ln_backend()
sats = min(settings.grok_max_sats_per_query, settings.grok_sats_hard_cap)
inv = ln.create_invoice(sats, f"Grok query: {query[:_INVOICE_MEMO_MAX_LEN]}")
invoice_info = f"\n[Lightning invoice: {sats} sats — {inv.payment_request[:40]}...]"
except (ImportError, OSError, ValueError) as exc:
logger.error("Lightning invoice creation failed: %s", exc)
return "Error: Failed to create Lightning invoice. Please check logs."
result = backend.run(query)
response = result.content
if invoice_info:
response += invoice_info
return response
def web_fetch(url: str, max_tokens: int = 4000) -> str:
"""Fetch a web page and return its main text content.
Downloads the URL, extracts readable text using trafilatura, and
truncates to a token budget. Use this to read full articles, docs,
or blog posts that web_search only returns snippets for.
Args:
url: The URL to fetch (must start with http:// or https://).
max_tokens: Maximum approximate token budget (default 4000).
Text is truncated to max_tokens * 4 characters.
Returns:
Extracted text content, or an error message on failure.
"""
if not url or not url.startswith(("http://", "https://")):
return f"Error: invalid URL — must start with http:// or https://: {url!r}"
try:
import requests as _requests
except ImportError:
return "Error: 'requests' package is not installed. Install with: pip install requests"
try:
import trafilatura
except ImportError:
return (
"Error: 'trafilatura' package is not installed. Install with: pip install trafilatura"
)
try:
resp = _requests.get(
url,
timeout=15,
headers={"User-Agent": "TimmyResearchBot/1.0"},
)
resp.raise_for_status()
except _requests.exceptions.Timeout:
return f"Error: request timed out after 15 seconds for {url}"
except _requests.exceptions.HTTPError as exc:
return f"Error: HTTP {exc.response.status_code} for {url}"
except _requests.exceptions.RequestException as exc:
return f"Error: failed to fetch {url}{exc}"
text = trafilatura.extract(resp.text, include_tables=True, include_links=True)
if not text:
return f"Error: could not extract readable content from {url}"
char_budget = max_tokens * 4
if len(text) > char_budget:
text = text[:char_budget] + f"\n\n[…truncated to ~{max_tokens} tokens]"
return text
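# Hedged usage sketch; the demo helper and URL below are placeholders, not part
# of the original code.
def _web_fetch_demo() -> None:
    # Cap the extracted text at roughly 1000 tokens (about 4000 characters).
    article = web_fetch("https://example.com/some-article", max_tokens=1000)
    print(article[:200])
    # Non-HTTP(S) schemes are rejected before any network call is made:
    print(web_fetch("ftp://example.com/file.txt"))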
def create_aider_tool(base_path: Path):
"""Create an Aider tool for AI-assisted coding."""
class AiderTool:
"""Tool that calls Aider (local AI coding assistant) for code generation."""
def __init__(self, base_dir: Path):
self.base_dir = base_dir
def run_aider(self, prompt: str, model: str = "qwen3:30b") -> str:
"""Run Aider to generate code changes.
Args:
prompt: What you want Aider to do (e.g., "add a fibonacci function")
model: Ollama model to use (default: qwen3:30b)
Returns:
Aider's response with the code changes made
"""
try:
# Run aider with the prompt
result = subprocess.run(
[
"aider",
"--no-git",
"--model",
f"ollama/{model}",
"--quiet",
prompt,
],
capture_output=True,
text=True,
timeout=120,
cwd=str(self.base_dir),
)
if result.returncode == 0:
return result.stdout if result.stdout else "Code changes applied successfully"
else:
return f"Aider error: {result.stderr}"
except FileNotFoundError:
return "Error: Aider not installed. Run: pip install aider"
except subprocess.TimeoutExpired:
return "Error: Aider timed out after 120 seconds"
except (OSError, subprocess.SubprocessError) as e:
return f"Error running Aider: {str(e)}"
return AiderTool(base_path)
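# Illustrative call (assumes the aider CLI and a local Ollama model are
# installed; the demo helper, prompt, and path are made up):
def _aider_demo() -> None:
    aider = create_aider_tool(Path("."))
    print(aider.run_aider("add a fibonacci function to utils.py"))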
def create_code_tools(base_dir: str | Path | None = None):
"""Create tools for the code agent (Forge).
Includes: shell commands, python execution, file read/write, Aider AI assist
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="code")
# Shell commands (sandboxed)
shell_tools = ShellTools()
toolkit.register(shell_tools.run_shell_command, name="shell")
# Python execution
python_tools = PythonTools()
toolkit.register(python_tools.run_python_code, name="python")
# File operations
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.save_file, name="write_file")
toolkit.register(file_tools.list_files, name="list_files")
# Aider AI coding assistant (local with Ollama)
aider_tool = create_aider_tool(base_path)
toolkit.register(aider_tool.run_aider, name="aider")
return toolkit
def create_security_tools(base_dir: str | Path | None = None):
"""Create tools for the security agent (Mace).
Includes: shell commands (for scanning), file read
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="security")
# Shell for running security scans
shell_tools = ShellTools()
toolkit.register(shell_tools.run_shell_command, name="shell")
# File reading for logs/configs
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit
def create_devops_tools(base_dir: str | Path | None = None):
"""Create tools for the DevOps agent (Helm).
Includes: shell commands, file read/write
"""
if not _AGNO_TOOLS_AVAILABLE:
raise ImportError(f"Agno tools not available: {_ImportError}")
toolkit = Toolkit(name="devops")
# Shell for deployment commands
shell_tools = ShellTools()
toolkit.register(shell_tools.run_shell_command, name="shell")
# File operations for config management
from config import settings
base_path = Path(base_dir) if base_dir else Path(settings.repo_root)
file_tools = FileTools(base_dir=base_path)
toolkit.register(_make_smart_read_file(file_tools), name="read_file")
toolkit.register(file_tools.save_file, name="write_file")
toolkit.register(file_tools.list_files, name="list_files")
return toolkit

View File

@@ -2549,7 +2549,6 @@
.tower-adv-action { font-size: 0.75rem; color: var(--green); margin-top: 4px; font-style: italic; }
/* ── Voice settings ───────────────────────────────────────── */
.voice-settings-page { max-width: 600px; margin: 0 auto; }
@@ -2665,95 +2664,3 @@
color: var(--bg-deep);
}
.vs-btn-save:hover { opacity: 0.85; }
/* ── Nexus ────────────────────────────────────────────────── */
.nexus-layout { max-width: 1400px; margin: 0 auto; }
.nexus-header { border-bottom: 1px solid var(--border); padding-bottom: 0.5rem; }
.nexus-title { font-size: 1.4rem; font-weight: 700; color: var(--purple); letter-spacing: 0.1em; }
.nexus-subtitle { font-size: 0.8rem; color: var(--text-dim); margin-top: 0.2rem; }
.nexus-grid {
display: grid;
grid-template-columns: 1fr 320px;
gap: 1rem;
align-items: start;
}
@media (max-width: 900px) {
.nexus-grid { grid-template-columns: 1fr; }
}
.nexus-chat-panel { height: calc(100vh - 180px); display: flex; flex-direction: column; }
.nexus-chat-panel .card-body { overflow-y: auto; flex: 1; }
.nexus-empty-state {
color: var(--text-dim);
font-size: 0.85rem;
font-style: italic;
padding: 1rem 0;
text-align: center;
}
/* Memory sidebar */
.nexus-memory-hits { font-size: 0.78rem; }
.nexus-memory-label { color: var(--text-dim); font-size: 0.72rem; margin-bottom: 0.4rem; letter-spacing: 0.05em; }
.nexus-memory-hit { display: flex; gap: 0.4rem; margin-bottom: 0.35rem; align-items: flex-start; }
.nexus-memory-type { color: var(--purple); font-size: 0.68rem; white-space: nowrap; padding-top: 0.1rem; min-width: 60px; }
.nexus-memory-content { color: var(--text); line-height: 1.4; }
/* Teaching panel */
.nexus-facts-header { font-size: 0.7rem; color: var(--text-dim); letter-spacing: 0.08em; margin-bottom: 0.4rem; }
.nexus-facts-list { list-style: none; padding: 0; margin: 0; font-size: 0.8rem; }
.nexus-fact-item { color: var(--text); border-bottom: 1px solid var(--border); padding: 0.3rem 0; }
.nexus-fact-empty { color: var(--text-dim); font-style: italic; }
.nexus-taught-confirm {
font-size: 0.8rem;
color: var(--green);
background: rgba(0,255,136,0.06);
border: 1px solid var(--green);
border-radius: 4px;
padding: 0.3rem 0.6rem;
margin-bottom: 0.5rem;
}
/* ═══════════════════════════════════════════════════════════════
Dreaming Mode
═══════════════════════════════════════════════════════════════ */
.dream-active {
display: flex; align-items: center; gap: 8px;
padding: 6px 0;
}
.dream-label { font-size: 0.75rem; font-weight: 700; color: var(--purple); letter-spacing: 0.12em; }
.dream-summary { font-size: 0.75rem; color: var(--text-dim); font-style: italic; flex: 1; }
.dream-pulse {
display: inline-block; width: 8px; height: 8px; border-radius: 50%;
background: var(--purple);
animation: dream-pulse 1.8s ease-in-out infinite;
}
@keyframes dream-pulse {
0%, 100% { opacity: 1; transform: scale(1); }
50% { opacity: 0.4; transform: scale(0.7); }
}
.dream-dot {
display: inline-block; width: 7px; height: 7px; border-radius: 50%;
}
.dream-dot-idle { background: var(--amber); }
.dream-dot-standby { background: var(--text-dim); }
.dream-idle, .dream-standby {
display: flex; align-items: center; gap: 6px; padding: 4px 0;
}
.dream-label-idle { font-size: 0.7rem; font-weight: 700; color: var(--amber); letter-spacing: 0.1em; }
.dream-label-standby { font-size: 0.7rem; font-weight: 700; color: var(--text-dim); letter-spacing: 0.1em; }
.dream-idle-meta { font-size: 0.7rem; color: var(--text-dim); }
.dream-history { border-top: 1px solid var(--border); padding-top: 6px; }
.dream-record { padding: 4px 0; border-bottom: 1px solid var(--border); }
.dream-record:last-child { border-bottom: none; }
.dream-rule { font-size: 0.75rem; color: var(--text); font-style: italic; }
.dream-meta { font-size: 0.65rem; color: var(--text-dim); margin-top: 2px; }

View File

@@ -86,19 +86,6 @@
<p>Your task has been added to the queue. Timmy will review it shortly.</p>
<button type="button" id="submit-another-btn" class="btn-primary">Submit Another</button>
</div>
<div id="submit-job-queued" class="submit-job-queued hidden">
<div class="queued-icon">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="10"></circle>
<polyline points="12 6 12 12 16 14"></polyline>
</svg>
</div>
<h3>Job Queued</h3>
<p>The server is unreachable right now. Your job has been saved locally and will be submitted automatically when the connection is restored.</p>
<div id="queue-count-display" class="queue-count-display"></div>
<button type="button" id="submit-another-queued-btn" class="btn-primary">Submit Another</button>
</div>
</div>
<div id="submit-job-backdrop" class="submit-job-backdrop"></div>
</div>
@@ -155,7 +142,6 @@
import { createFamiliar } from "./familiar.js";
import { setupControls } from "./controls.js";
import { StateReader } from "./state.js";
import { messageQueue } from "./queue.js";
// --- Renderer ---
const renderer = new THREE.WebGLRenderer({ antialias: true });
@@ -196,60 +182,8 @@
moodEl.textContent = state.timmyState.mood;
}
});
// Replay queued jobs whenever the server comes back online.
stateReader.onConnectionChange(async (online) => {
if (!online) return;
const pending = messageQueue.getPending();
if (pending.length === 0) return;
console.log(`[queue] Online — replaying ${pending.length} queued job(s)`);
for (const item of pending) {
try {
const response = await fetch("/api/tasks", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(item.payload),
});
if (response.ok) {
messageQueue.markDelivered(item.id);
console.log(`[queue] Delivered queued job ${item.id}`);
} else {
messageQueue.markFailed(item.id);
console.warn(`[queue] Failed to deliver job ${item.id}: ${response.status}`);
}
} catch (err) {
// Still offline — leave as QUEUED, will retry next cycle.
console.warn(`[queue] Replay aborted (still offline): ${err}`);
break;
}
}
messageQueue.prune();
_updateQueueBadge();
});
stateReader.connect();
// --- Queue badge (top-right indicator for pending jobs) ---
function _updateQueueBadge() {
const count = messageQueue.pendingCount();
let badge = document.getElementById("queue-badge");
if (count === 0) {
if (badge) badge.remove();
return;
}
if (!badge) {
badge = document.createElement("div");
badge.id = "queue-badge";
badge.className = "queue-badge";
badge.title = "Jobs queued offline — will submit on reconnect";
document.getElementById("overlay").appendChild(badge);
}
badge.textContent = `${count} queued`;
}
// Show badge on load if there are already queued messages.
messageQueue.prune();
_updateQueueBadge();
// --- About Panel ---
const infoBtn = document.getElementById("info-btn");
const aboutPanel = document.getElementById("about-panel");
@@ -294,9 +228,6 @@
const descWarning = document.getElementById("desc-warning");
const submitJobSuccess = document.getElementById("submit-job-success");
const submitAnotherBtn = document.getElementById("submit-another-btn");
const submitJobQueued = document.getElementById("submit-job-queued");
const submitAnotherQueuedBtn = document.getElementById("submit-another-queued-btn");
const queueCountDisplay = document.getElementById("queue-count-display");
// Constants
const MAX_TITLE_LENGTH = 200;
@@ -324,7 +255,6 @@
submitJobForm.reset();
submitJobForm.classList.remove("hidden");
submitJobSuccess.classList.add("hidden");
submitJobQueued.classList.add("hidden");
updateCharCounts();
clearErrors();
validateForm();
@@ -433,7 +363,6 @@
submitJobBackdrop.addEventListener("click", closeSubmitJobModal);
cancelJobBtn.addEventListener("click", closeSubmitJobModal);
submitAnotherBtn.addEventListener("click", resetForm);
submitAnotherQueuedBtn.addEventListener("click", resetForm);
// Input event listeners for real-time validation
jobTitle.addEventListener("input", () => {
@@ -491,10 +420,9 @@
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(formData),
signal: AbortSignal.timeout(8000),
body: JSON.stringify(formData)
});
if (response.ok) {
// Show success state
submitJobForm.classList.add("hidden");
@@ -505,14 +433,9 @@
descError.classList.add("visible");
}
} catch (error) {
// Server unreachable — persist to localStorage queue.
messageQueue.enqueue(formData);
const count = messageQueue.pendingCount();
// For demo/development, show success even if API fails
submitJobForm.classList.add("hidden");
submitJobQueued.classList.remove("hidden");
queueCountDisplay.textContent =
count > 1 ? `${count} jobs queued` : "1 job queued";
_updateQueueBadge();
submitJobSuccess.classList.remove("hidden");
} finally {
submitJobSubmit.disabled = false;
submitJobSubmit.textContent = "Submit Job";

View File

@@ -1,90 +0,0 @@
/**
* Offline message queue for Workshop panel.
*
* Persists undelivered job submissions to localStorage so they survive
* page refreshes and are replayed when the server comes back online.
*/
const _QUEUE_KEY = "timmy_workshop_queue";
const _MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours — auto-expire old items
export const STATUS = {
QUEUED: "queued",
DELIVERED: "delivered",
FAILED: "failed",
};
function _load() {
try {
const raw = localStorage.getItem(_QUEUE_KEY);
return raw ? JSON.parse(raw) : [];
} catch {
return [];
}
}
function _save(items) {
try {
localStorage.setItem(_QUEUE_KEY, JSON.stringify(items));
} catch {
/* localStorage unavailable — degrade silently */
}
}
function _uid() {
return `msg_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`;
}
/** LocalStorage-backed message queue for Workshop job submissions. */
export const messageQueue = {
/** Add a payload. Returns the created item (with id and status). */
enqueue(payload) {
const item = {
id: _uid(),
payload,
queuedAt: new Date().toISOString(),
status: STATUS.QUEUED,
};
const items = _load();
items.push(item);
_save(items);
return item;
},
/** Mark a message as delivered and remove it from storage. */
markDelivered(id) {
_save(_load().filter((i) => i.id !== id));
},
/** Mark a message as permanently failed (kept for 24h for visibility). */
markFailed(id) {
_save(
_load().map((i) =>
i.id === id ? { ...i, status: STATUS.FAILED } : i
)
);
},
/** All messages waiting to be delivered. */
getPending() {
return _load().filter((i) => i.status === STATUS.QUEUED);
},
/** Total queued (QUEUED status only) count. */
pendingCount() {
return this.getPending().length;
},
/** Drop expired failed items (> 24h old). */
prune() {
const cutoff = Date.now() - _MAX_AGE_MS;
_save(
_load().filter(
(i) =>
i.status === STATUS.QUEUED ||
(i.status === STATUS.FAILED &&
new Date(i.queuedAt).getTime() > cutoff)
)
);
},
};
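// Illustrative usage sketch (not part of the original module); the payload
// shape and demo function are hypothetical. Queue a job while the server is
// unreachable, then acknowledge it once the replayed POST succeeds.
export function _queueDemo() {
  const item = messageQueue.enqueue({ title: "Refactor parser", description: "..." });
  console.log(messageQueue.pendingCount()); // 1 (still waiting for the server)
  messageQueue.markDelivered(item.id);      // the replayed POST succeeded
  console.log(messageQueue.pendingCount()); // 0
}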

View File

@@ -3,10 +3,6 @@
*
* Provides Timmy's current state to the scene. In Phase 2 this is a
* static default; the WebSocket path is stubbed for future use.
*
* Also manages connection health monitoring: pings /api/matrix/health
* every 30 seconds and notifies listeners when online/offline state
* changes so the Workshop can replay any queued messages.
*/
const DEFAULTS = {
@@ -24,19 +20,11 @@ const DEFAULTS = {
version: 1,
};
const _HEALTH_URL = "/api/matrix/health";
const _PING_INTERVAL_MS = 30_000;
const _WS_RECONNECT_DELAY_MS = 5_000;
export class StateReader {
constructor() {
this.state = { ...DEFAULTS };
this.listeners = [];
this.connectionListeners = [];
this._ws = null;
this._online = false;
this._pingTimer = null;
this._reconnectTimer = null;
}
/** Subscribe to state changes. */
@@ -44,12 +32,7 @@ export class StateReader {
this.listeners.push(fn);
}
/** Subscribe to online/offline transitions. Called with (isOnline: bool). */
onConnectionChange(fn) {
this.connectionListeners.push(fn);
}
/** Notify all state listeners. */
/** Notify all listeners. */
_notify() {
for (const fn of this.listeners) {
try {
@@ -60,48 +43,8 @@ export class StateReader {
}
}
/** Fire connection listeners only when state actually changes. */
_notifyConnection(online) {
if (online === this._online) return;
this._online = online;
for (const fn of this.connectionListeners) {
try {
fn(online);
} catch (e) {
console.warn("Connection listener error:", e);
}
}
}
/** Ping the health endpoint once and update connection state. */
async _ping() {
try {
const r = await fetch(_HEALTH_URL, {
signal: AbortSignal.timeout(5000),
});
this._notifyConnection(r.ok);
} catch {
this._notifyConnection(false);
}
}
/** Start 30-second health-check loop (idempotent). */
_startHealthCheck() {
if (this._pingTimer) return;
this._pingTimer = setInterval(() => this._ping(), _PING_INTERVAL_MS);
}
/** Schedule a WebSocket reconnect attempt after a delay (idempotent). */
_scheduleReconnect() {
if (this._reconnectTimer) return;
this._reconnectTimer = setTimeout(() => {
this._reconnectTimer = null;
this._connectWS();
}, _WS_RECONNECT_DELAY_MS);
}
/** Open (or re-open) the WebSocket connection. */
_connectWS() {
/** Try to connect to the world WebSocket for live updates. */
connect() {
const proto = location.protocol === "https:" ? "wss:" : "ws:";
const url = `${proto}//${location.host}/api/world/ws`;
try {
@@ -109,13 +52,10 @@ export class StateReader {
this._ws.onopen = () => {
const dot = document.getElementById("connection-dot");
if (dot) dot.classList.add("connected");
this._notifyConnection(true);
};
this._ws.onclose = () => {
const dot = document.getElementById("connection-dot");
if (dot) dot.classList.remove("connected");
this._notifyConnection(false);
this._scheduleReconnect();
};
this._ws.onmessage = (ev) => {
try {
@@ -135,18 +75,9 @@ export class StateReader {
};
} catch (e) {
console.warn("WebSocket unavailable — using static state");
this._scheduleReconnect();
}
}
/** Connect to the world WebSocket and start health-check polling. */
connect() {
this._connectWS();
this._startHealthCheck();
// Immediate ping so connection status is known before the first interval.
this._ping();
}
/** Current mood string. */
get mood() {
return this.state.timmyState.mood;
@@ -161,9 +92,4 @@ export class StateReader {
get energy() {
return this.state.timmyState.energy;
}
/** Whether the server is currently reachable. */
get isOnline() {
return this._online;
}
}

View File

@@ -604,68 +604,6 @@ canvas {
opacity: 1;
}
/* Queued State (offline buffer) */
.submit-job-queued {
text-align: center;
padding: 32px 16px;
}
.submit-job-queued.hidden {
display: none;
}
.queued-icon {
width: 64px;
height: 64px;
margin: 0 auto 20px;
color: #ffaa33;
}
.queued-icon svg {
width: 100%;
height: 100%;
}
.submit-job-queued h3 {
font-size: 20px;
color: #ffaa33;
margin: 0 0 12px 0;
}
.submit-job-queued p {
font-size: 14px;
color: #888;
margin: 0 0 16px 0;
line-height: 1.5;
}
.queue-count-display {
font-size: 12px;
color: #ffaa33;
margin-bottom: 24px;
opacity: 0.8;
}
/* Queue badge — shown in overlay corner when offline jobs are pending */
.queue-badge {
position: absolute;
bottom: 16px;
right: 16px;
padding: 4px 10px;
background: rgba(10, 10, 20, 0.85);
border: 1px solid rgba(255, 170, 51, 0.6);
border-radius: 12px;
color: #ffaa33;
font-size: 11px;
pointer-events: none;
animation: queue-pulse 2s ease-in-out infinite;
}
@keyframes queue-pulse {
0%, 100% { opacity: 0.8; }
50% { opacity: 1; }
}
/* Mobile adjustments */
@media (max-width: 480px) {
.about-panel-content {

View File

@@ -1,527 +0,0 @@
"""Unit tests for dashboard/routes/daily_run.py."""
from __future__ import annotations
import json
from datetime import UTC, datetime, timedelta
from unittest.mock import MagicMock, patch
from urllib.error import URLError
from dashboard.routes.daily_run import (
DEFAULT_CONFIG,
LAYER_LABELS,
DailyRunMetrics,
GiteaClient,
LayerMetrics,
_extract_layer,
_fetch_layer_metrics,
_get_metrics,
_get_token,
_load_config,
_load_cycle_data,
)
# ---------------------------------------------------------------------------
# _load_config
# ---------------------------------------------------------------------------
def test_load_config_returns_defaults():
with patch("dashboard.routes.daily_run.CONFIG_PATH") as mock_path:
mock_path.exists.return_value = False
config = _load_config()
assert config["gitea_api"] == DEFAULT_CONFIG["gitea_api"]
assert config["repo_slug"] == DEFAULT_CONFIG["repo_slug"]
def test_load_config_merges_file_orchestrator_section(tmp_path):
config_file = tmp_path / "daily_run.json"
config_file.write_text(
json.dumps(
{"orchestrator": {"repo_slug": "custom/repo", "gitea_api": "http://custom:3000/api/v1"}}
)
)
with patch("dashboard.routes.daily_run.CONFIG_PATH", config_file):
config = _load_config()
assert config["repo_slug"] == "custom/repo"
assert config["gitea_api"] == "http://custom:3000/api/v1"
def test_load_config_ignores_invalid_json(tmp_path):
config_file = tmp_path / "daily_run.json"
config_file.write_text("not valid json{{")
with patch("dashboard.routes.daily_run.CONFIG_PATH", config_file):
config = _load_config()
assert config["repo_slug"] == DEFAULT_CONFIG["repo_slug"]
def test_load_config_env_overrides(monkeypatch):
monkeypatch.setenv("TIMMY_GITEA_API", "http://envapi:3000/api/v1")
monkeypatch.setenv("TIMMY_REPO_SLUG", "env/repo")
monkeypatch.setenv("TIMMY_GITEA_TOKEN", "env-token-123")
with patch("dashboard.routes.daily_run.CONFIG_PATH") as mock_path:
mock_path.exists.return_value = False
config = _load_config()
assert config["gitea_api"] == "http://envapi:3000/api/v1"
assert config["repo_slug"] == "env/repo"
assert config["token"] == "env-token-123"
def test_load_config_no_env_overrides_without_vars(monkeypatch):
monkeypatch.delenv("TIMMY_GITEA_API", raising=False)
monkeypatch.delenv("TIMMY_REPO_SLUG", raising=False)
monkeypatch.delenv("TIMMY_GITEA_TOKEN", raising=False)
with patch("dashboard.routes.daily_run.CONFIG_PATH") as mock_path:
mock_path.exists.return_value = False
config = _load_config()
assert "token" not in config
# ---------------------------------------------------------------------------
# _get_token
# ---------------------------------------------------------------------------
def test_get_token_from_config_dict():
config = {"token": "direct-token", "token_file": "~/.hermes/gitea_token"}
assert _get_token(config) == "direct-token"
def test_get_token_from_file(tmp_path):
token_file = tmp_path / "token.txt"
token_file.write_text(" file-token \n")
config = {"token_file": str(token_file)}
assert _get_token(config) == "file-token"
def test_get_token_returns_none_when_file_missing(tmp_path):
config = {"token_file": str(tmp_path / "nonexistent_token")}
assert _get_token(config) is None
# ---------------------------------------------------------------------------
# GiteaClient
# ---------------------------------------------------------------------------
def _make_client(**kwargs) -> GiteaClient:
config = {**DEFAULT_CONFIG, **kwargs}
return GiteaClient(config, token="test-token")
def test_gitea_client_headers_include_auth():
client = _make_client()
headers = client._headers()
assert headers["Authorization"] == "token test-token"
assert headers["Accept"] == "application/json"
def test_gitea_client_headers_no_token():
config = {**DEFAULT_CONFIG}
client = GiteaClient(config, token=None)
headers = client._headers()
assert "Authorization" not in headers
def test_gitea_client_api_url():
client = _make_client()
url = client._api_url("issues")
assert url == f"{DEFAULT_CONFIG['gitea_api']}/repos/{DEFAULT_CONFIG['repo_slug']}/issues"
def test_gitea_client_api_url_strips_trailing_slash():
config = {**DEFAULT_CONFIG, "gitea_api": "http://localhost:3000/api/v1/"}
client = GiteaClient(config, token=None)
url = client._api_url("issues")
assert "//" not in url.replace("http://", "")
def test_gitea_client_is_available_true():
client = _make_client()
mock_resp = MagicMock()
mock_resp.status = 200
mock_resp.__enter__ = lambda s: mock_resp
mock_resp.__exit__ = MagicMock(return_value=False)
with patch("dashboard.routes.daily_run.urlopen", return_value=mock_resp):
assert client.is_available() is True
def test_gitea_client_is_available_cached():
client = _make_client()
client._available = True
# Should not call urlopen at all
with patch("dashboard.routes.daily_run.urlopen") as mock_urlopen:
assert client.is_available() is True
mock_urlopen.assert_not_called()
def test_gitea_client_is_available_false_on_url_error():
client = _make_client()
with patch("dashboard.routes.daily_run.urlopen", side_effect=URLError("refused")):
assert client.is_available() is False
def test_gitea_client_is_available_false_on_timeout():
client = _make_client()
with patch("dashboard.routes.daily_run.urlopen", side_effect=TimeoutError()):
assert client.is_available() is False
def test_gitea_client_get_paginated_single_page():
client = _make_client()
mock_resp = MagicMock()
mock_resp.read.return_value = json.dumps([{"id": 1}, {"id": 2}]).encode()
mock_resp.__enter__ = lambda s: mock_resp
mock_resp.__exit__ = MagicMock(return_value=False)
with patch("dashboard.routes.daily_run.urlopen", return_value=mock_resp):
result = client.get_paginated("issues")
assert len(result) == 2
assert result[0]["id"] == 1
def test_gitea_client_get_paginated_empty():
client = _make_client()
mock_resp = MagicMock()
mock_resp.read.return_value = b"[]"
mock_resp.__enter__ = lambda s: mock_resp
mock_resp.__exit__ = MagicMock(return_value=False)
with patch("dashboard.routes.daily_run.urlopen", return_value=mock_resp):
result = client.get_paginated("issues")
assert result == []
# ---------------------------------------------------------------------------
# LayerMetrics.trend
# ---------------------------------------------------------------------------
def test_layer_metrics_trend_no_previous_no_current():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=0, previous_count=0)
assert lm.trend == ""
def test_layer_metrics_trend_no_previous_with_current():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=5, previous_count=0)
assert lm.trend == ""
def test_layer_metrics_trend_big_increase():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=130, previous_count=100)
assert lm.trend == "↑↑"
def test_layer_metrics_trend_small_increase():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=108, previous_count=100)
assert lm.trend == ""
def test_layer_metrics_trend_stable():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=100, previous_count=100)
assert lm.trend == ""
def test_layer_metrics_trend_small_decrease():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=92, previous_count=100)
assert lm.trend == ""
def test_layer_metrics_trend_big_decrease():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=70, previous_count=100)
assert lm.trend == "↓↓"
def test_layer_metrics_trend_color_up():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=200, previous_count=100)
assert lm.trend_color == "var(--green)"
def test_layer_metrics_trend_color_down():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=50, previous_count=100)
assert lm.trend_color == "var(--amber)"
def test_layer_metrics_trend_color_stable():
lm = LayerMetrics(name="triage", label="layer:triage", current_count=100, previous_count=100)
assert lm.trend_color == "var(--text-dim)"
# ---------------------------------------------------------------------------
# DailyRunMetrics.sessions_trend
# ---------------------------------------------------------------------------
def _make_daily_metrics(**kwargs) -> DailyRunMetrics:
defaults = dict(
sessions_completed=10,
sessions_previous=8,
layers=[],
total_touched_current=20,
total_touched_previous=15,
lookback_days=7,
generated_at=datetime.now(UTC).isoformat(),
)
defaults.update(kwargs)
return DailyRunMetrics(**defaults)
def test_daily_metrics_sessions_trend_big_increase():
m = _make_daily_metrics(sessions_completed=130, sessions_previous=100)
assert m.sessions_trend == "↑↑"
def test_daily_metrics_sessions_trend_stable():
m = _make_daily_metrics(sessions_completed=100, sessions_previous=100)
assert m.sessions_trend == ""
def test_daily_metrics_sessions_trend_no_previous_zero_completed():
m = _make_daily_metrics(sessions_completed=0, sessions_previous=0)
assert m.sessions_trend == ""
def test_daily_metrics_sessions_trend_no_previous_with_completed():
m = _make_daily_metrics(sessions_completed=5, sessions_previous=0)
assert m.sessions_trend == ""
def test_daily_metrics_sessions_trend_color_green():
m = _make_daily_metrics(sessions_completed=200, sessions_previous=100)
assert m.sessions_trend_color == "var(--green)"
def test_daily_metrics_sessions_trend_color_amber():
m = _make_daily_metrics(sessions_completed=50, sessions_previous=100)
assert m.sessions_trend_color == "var(--amber)"
# ---------------------------------------------------------------------------
# _extract_layer
# ---------------------------------------------------------------------------
def test_extract_layer_finds_layer_label():
labels = [{"name": "bug"}, {"name": "layer:triage"}, {"name": "urgent"}]
assert _extract_layer(labels) == "triage"
def test_extract_layer_returns_none_when_no_layer():
labels = [{"name": "bug"}, {"name": "feature"}]
assert _extract_layer(labels) is None
def test_extract_layer_empty_labels():
assert _extract_layer([]) is None
def test_extract_layer_first_match_wins():
labels = [{"name": "layer:micro-fix"}, {"name": "layer:tests"}]
assert _extract_layer(labels) == "micro-fix"
# ---------------------------------------------------------------------------
# _load_cycle_data
# ---------------------------------------------------------------------------
def test_load_cycle_data_missing_file(tmp_path):
with patch("dashboard.routes.daily_run.REPO_ROOT", tmp_path):
result = _load_cycle_data(days=14)
assert result == {"current": 0, "previous": 0}
def test_load_cycle_data_counts_successful_sessions(tmp_path):
retro_dir = tmp_path / ".loop" / "retro"
retro_dir.mkdir(parents=True)
retro_file = retro_dir / "cycles.jsonl"
now = datetime.now(UTC)
recent_ts = (now - timedelta(days=3)).isoformat()
older_ts = (now - timedelta(days=10)).isoformat()
old_ts = (now - timedelta(days=20)).isoformat()
lines = [
json.dumps({"timestamp": recent_ts, "success": True}),
json.dumps({"timestamp": recent_ts, "success": False}), # not counted
json.dumps({"timestamp": older_ts, "success": True}),
json.dumps({"timestamp": old_ts, "success": True}), # outside window
]
retro_file.write_text("\n".join(lines))
with patch("dashboard.routes.daily_run.REPO_ROOT", tmp_path):
result = _load_cycle_data(days=7)
assert result["current"] == 1
assert result["previous"] == 1
def test_load_cycle_data_skips_invalid_json_lines(tmp_path):
retro_dir = tmp_path / ".loop" / "retro"
retro_dir.mkdir(parents=True)
retro_file = retro_dir / "cycles.jsonl"
now = datetime.now(UTC)
recent_ts = (now - timedelta(days=1)).isoformat()
retro_file.write_text(
f"not valid json\n{json.dumps({'timestamp': recent_ts, 'success': True})}\n"
)
with patch("dashboard.routes.daily_run.REPO_ROOT", tmp_path):
result = _load_cycle_data(days=7)
assert result["current"] == 1
def test_load_cycle_data_skips_entries_with_no_timestamp(tmp_path):
retro_dir = tmp_path / ".loop" / "retro"
retro_dir.mkdir(parents=True)
retro_file = retro_dir / "cycles.jsonl"
retro_file.write_text(json.dumps({"success": True}))
with patch("dashboard.routes.daily_run.REPO_ROOT", tmp_path):
result = _load_cycle_data(days=7)
assert result == {"current": 0, "previous": 0}
# ---------------------------------------------------------------------------
# _fetch_layer_metrics
# ---------------------------------------------------------------------------
def _make_issue(updated_offset_days: int) -> dict:
ts = (datetime.now(UTC) - timedelta(days=updated_offset_days)).isoformat()
return {"updated_at": ts, "labels": [{"name": "layer:triage"}]}
def test_fetch_layer_metrics_counts_current_and_previous():
client = _make_client()
client._available = True
recent_issue = _make_issue(updated_offset_days=3)
older_issue = _make_issue(updated_offset_days=10)
with patch.object(client, "get_paginated", return_value=[recent_issue, older_issue]):
layers, total_current, total_previous = _fetch_layer_metrics(client, lookback_days=7)
# Should have one entry per LAYER_LABELS
assert len(layers) == len(LAYER_LABELS)
triage = next(lm for lm in layers if lm.name == "triage")
assert triage.current_count == 1
assert triage.previous_count == 1
def test_fetch_layer_metrics_degrades_on_http_error():
client = _make_client()
client._available = True
with patch.object(client, "get_paginated", side_effect=URLError("network")):
layers, total_current, total_previous = _fetch_layer_metrics(client, lookback_days=7)
assert len(layers) == len(LAYER_LABELS)
for lm in layers:
assert lm.current_count == 0
assert lm.previous_count == 0
assert total_current == 0
assert total_previous == 0
# ---------------------------------------------------------------------------
# _get_metrics
# ---------------------------------------------------------------------------
def test_get_metrics_returns_none_when_gitea_unavailable():
with patch("dashboard.routes.daily_run._load_config", return_value=DEFAULT_CONFIG):
with patch("dashboard.routes.daily_run._get_token", return_value=None):
with patch.object(GiteaClient, "is_available", return_value=False):
result = _get_metrics()
assert result is None
def test_get_metrics_returns_daily_run_metrics():
mock_layers = [
LayerMetrics(name="triage", label="layer:triage", current_count=5, previous_count=3)
]
with patch("dashboard.routes.daily_run._load_config", return_value=DEFAULT_CONFIG):
with patch("dashboard.routes.daily_run._get_token", return_value="tok"):
with patch.object(GiteaClient, "is_available", return_value=True):
with patch(
"dashboard.routes.daily_run._fetch_layer_metrics",
return_value=(mock_layers, 5, 3),
):
with patch(
"dashboard.routes.daily_run._load_cycle_data",
return_value={"current": 10, "previous": 8},
):
result = _get_metrics(lookback_days=7)
assert result is not None
assert result.sessions_completed == 10
assert result.sessions_previous == 8
assert result.lookback_days == 7
assert result.layers == mock_layers
def test_get_metrics_returns_none_on_exception():
with patch("dashboard.routes.daily_run._load_config", return_value=DEFAULT_CONFIG):
with patch("dashboard.routes.daily_run._get_token", return_value="tok"):
with patch.object(GiteaClient, "is_available", return_value=True):
with patch(
"dashboard.routes.daily_run._fetch_layer_metrics",
side_effect=Exception("unexpected"),
):
result = _get_metrics()
assert result is None
# ---------------------------------------------------------------------------
# Route handlers (FastAPI)
# ---------------------------------------------------------------------------
def test_daily_run_metrics_api_unavailable(client):
with patch("dashboard.routes.daily_run._get_metrics", return_value=None):
resp = client.get("/daily-run/metrics")
assert resp.status_code == 503
data = resp.json()
assert data["status"] == "unavailable"
def test_daily_run_metrics_api_returns_json(client):
mock_metrics = _make_daily_metrics(
layers=[
LayerMetrics(name="triage", label="layer:triage", current_count=3, previous_count=2)
]
)
with patch("dashboard.routes.daily_run._get_metrics", return_value=mock_metrics):
with patch(
"dashboard.routes.quests.check_daily_run_quests",
return_value=[],
create=True,
):
resp = client.get("/daily-run/metrics?lookback_days=7")
assert resp.status_code == 200
data = resp.json()
assert data["status"] == "ok"
assert data["lookback_days"] == 7
assert "sessions" in data
assert "layers" in data
assert "totals" in data
assert len(data["layers"]) == 1
assert data["layers"][0]["name"] == "triage"
def test_daily_run_panel_returns_html(client):
mock_metrics = _make_daily_metrics()
with patch("dashboard.routes.daily_run._get_metrics", return_value=mock_metrics):
with patch("dashboard.routes.daily_run._load_config", return_value=DEFAULT_CONFIG):
resp = client.get("/daily-run/panel")
assert resp.status_code == 200
assert "text/html" in resp.headers["content-type"]
def test_daily_run_panel_when_unavailable(client):
with patch("dashboard.routes.daily_run._get_metrics", return_value=None):
with patch("dashboard.routes.daily_run._load_config", return_value=DEFAULT_CONFIG):
resp = client.get("/daily-run/panel")
assert resp.status_code == 200

View File

@@ -1,72 +0,0 @@
"""Tests for the Nexus conversational awareness routes."""
from unittest.mock import patch
def test_nexus_page_returns_200(client):
"""GET /nexus should render without error."""
response = client.get("/nexus")
assert response.status_code == 200
assert "NEXUS" in response.text
def test_nexus_page_contains_chat_form(client):
"""Nexus page must include the conversational chat form."""
response = client.get("/nexus")
assert response.status_code == 200
assert "/nexus/chat" in response.text
def test_nexus_page_contains_teach_form(client):
"""Nexus page must include the teaching panel form."""
response = client.get("/nexus")
assert response.status_code == 200
assert "/nexus/teach" in response.text
def test_nexus_chat_empty_message_returns_empty(client):
"""POST /nexus/chat with blank message returns empty response."""
response = client.post("/nexus/chat", data={"message": " "})
assert response.status_code == 200
assert response.text == ""
def test_nexus_chat_too_long_returns_error(client):
"""POST /nexus/chat with overlong message returns error partial."""
long_msg = "x" * 10_001
response = client.post("/nexus/chat", data={"message": long_msg})
assert response.status_code == 200
assert "too long" in response.text.lower()
def test_nexus_chat_posts_message(client):
"""POST /nexus/chat calls the session chat function and returns a partial."""
with patch("dashboard.routes.nexus.chat", return_value="Hello from Timmy"):
response = client.post("/nexus/chat", data={"message": "hello"})
assert response.status_code == 200
assert "hello" in response.text.lower() or "timmy" in response.text.lower()
def test_nexus_teach_stores_fact(client):
"""POST /nexus/teach should persist a fact and return confirmation."""
with patch("dashboard.routes.nexus.store_personal_fact") as mock_store, \
patch("dashboard.routes.nexus.recall_personal_facts_with_ids", return_value=[]):
mock_store.return_value = None
response = client.post("/nexus/teach", data={"fact": "Timmy loves Python"})
assert response.status_code == 200
assert "Timmy loves Python" in response.text
def test_nexus_teach_empty_fact_returns_empty(client):
"""POST /nexus/teach with blank fact returns empty response."""
response = client.post("/nexus/teach", data={"fact": " "})
assert response.status_code == 200
assert response.text == ""
def test_nexus_clear_history(client):
"""DELETE /nexus/history should clear the conversation log."""
with patch("dashboard.routes.nexus.reset_session"):
response = client.request("DELETE", "/nexus/history")
assert response.status_code == 200
assert "cleared" in response.text.lower()

View File

@@ -1,509 +1,247 @@
"""Unit tests for infrastructure.chat_store module."""
"""Unit tests for src/infrastructure/chat_store.py."""
import sqlite3
import threading
from pathlib import Path
from unittest.mock import patch
from infrastructure.chat_store import Message, MessageLog, _get_conn
import pytest
# ---------------------------------------------------------------------------
# Message dataclass
# ---------------------------------------------------------------------------
from src.infrastructure.chat_store import MAX_MESSAGES, Message, MessageLog, _get_conn
pytestmark = pytest.mark.unit
class TestMessageDataclass:
"""Tests for the Message dataclass."""
def test_message_required_fields(self):
"""Message can be created with required fields only."""
msg = Message(role="user", content="hello", timestamp="2024-01-01T00:00:00")
assert msg.role == "user"
assert msg.content == "hello"
assert msg.timestamp == "2024-01-01T00:00:00"
def test_message_default_source(self):
"""Message source defaults to 'browser'."""
msg = Message(role="user", content="hi", timestamp="2024-01-01T00:00:00")
assert msg.source == "browser"
def test_message_custom_source(self):
"""Message source can be overridden."""
msg = Message(role="agent", content="reply", timestamp="2024-01-01T00:00:00", source="api")
assert msg.source == "api"
def test_message_equality(self):
"""Two Messages with the same fields are equal (dataclass default)."""
m1 = Message(role="user", content="x", timestamp="t")
m2 = Message(role="user", content="x", timestamp="t")
assert m1 == m2
def test_message_inequality(self):
"""Messages with different content are not equal."""
m1 = Message(role="user", content="x", timestamp="t")
m2 = Message(role="user", content="y", timestamp="t")
assert m1 != m2
@pytest.fixture()
def tmp_db(tmp_path: Path) -> Path:
"""Return a temporary database path."""
return tmp_path / "test_chat.db"
# ---------------------------------------------------------------------------
# _get_conn context manager
# ---------------------------------------------------------------------------
@pytest.fixture()
def log(tmp_db: Path) -> MessageLog:
"""Return a MessageLog backed by a temp database."""
ml = MessageLog(db_path=tmp_db)
yield ml
ml.close()
class TestGetConnContextManager:
"""Tests for the _get_conn context manager."""
# ── Message dataclass ──────────────────────────────────────────────────
def test_creates_db_file(self, tmp_path):
"""_get_conn creates the database file on first use."""
db = tmp_path / "chat.db"
assert not db.exists()
with _get_conn(db) as conn:
assert conn is not None
assert db.exists()
def test_creates_parent_directories(self, tmp_path):
"""_get_conn creates any missing parent directories."""
db = tmp_path / "nested" / "deep" / "chat.db"
with _get_conn(db):
pass
assert db.exists()
class TestMessage:
def test_default_source(self):
m = Message(role="user", content="hi", timestamp="2026-01-01T00:00:00")
assert m.source == "browser"
def test_creates_schema(self, tmp_path):
"""_get_conn creates the chat_messages table."""
db = tmp_path / "chat.db"
with _get_conn(db) as conn:
def test_custom_source(self):
m = Message(role="agent", content="ok", timestamp="t1", source="telegram")
assert m.source == "telegram"
def test_fields(self):
m = Message(role="error", content="boom", timestamp="t2", source="api")
assert m.role == "error"
assert m.content == "boom"
assert m.timestamp == "t2"
# ── _get_conn context manager ──────────────────────────────────────────
class TestGetConn:
def test_creates_db_and_table(self, tmp_db: Path):
with _get_conn(tmp_db) as conn:
tables = conn.execute(
"SELECT name FROM sqlite_master WHERE type='table' AND name='chat_messages'"
"SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
assert len(tables) == 1
names = [t["name"] for t in tables]
assert "chat_messages" in names
def test_schema_has_expected_columns(self, tmp_path):
"""chat_messages table has the expected columns."""
db = tmp_path / "chat.db"
with _get_conn(db) as conn:
info = conn.execute("PRAGMA table_info(chat_messages)").fetchall()
col_names = [row["name"] for row in info]
assert set(col_names) == {"id", "role", "content", "timestamp", "source"}
def test_creates_parent_dirs(self, tmp_path: Path):
deep = tmp_path / "a" / "b" / "c" / "chat.db"
with _get_conn(deep) as conn:
assert deep.parent.exists()
def test_idempotent_schema_creation(self, tmp_path):
"""Calling _get_conn twice does not fail (CREATE TABLE IF NOT EXISTS)."""
db = tmp_path / "chat.db"
with _get_conn(db):
pass
with _get_conn(db) as conn:
# Table still exists and is usable
conn.execute("SELECT COUNT(*) FROM chat_messages")
def test_connection_closed_after_context(self, tmp_db: Path):
with _get_conn(tmp_db) as conn:
conn.execute("SELECT 1")
# Connection should be closed — operations should fail
with pytest.raises(Exception):
conn.execute("SELECT 1")
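# Illustrative sketch — NOT the module under test. A _get_conn-style context
# manager consistent with the behaviours pinned down above: it creates missing
# parent directories and the chat_messages schema on first use, and always
# closes the connection on exit. The name _get_conn_sketch and the exact
# column definitions are assumptions for illustration only.
import sqlite3
from contextlib import contextmanager
from pathlib import Path
@contextmanager
def _get_conn_sketch(db_path: Path):
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    conn.row_factory = sqlite3.Row  # rows accessible by column name, as the tests assume
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " role TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " timestamp TEXT NOT NULL,"
        " source TEXT NOT NULL DEFAULT 'browser')"
    )
    try:
        yield conn
    finally:
        conn.close()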
# ---------------------------------------------------------------------------
# MessageLog — basic operations
# ---------------------------------------------------------------------------
class TestMessageLogAppend:
    """Tests for MessageLog.append()."""
def test_append_and_all(self, log: MessageLog):
log.append("user", "hello", "t1")
log.append("agent", "hi back", "t2", source="api")
msgs = log.all()
assert len(msgs) == 2
assert msgs[0].role == "user"
assert msgs[0].content == "hello"
assert msgs[0].source == "browser"
assert msgs[1].role == "agent"
assert msgs[1].source == "api"
def test_append_single_message(self, tmp_path):
"""append() stores a message that can be retrieved."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "hello", "2024-01-01T00:00:00")
messages = log.all()
assert len(messages) == 1
assert messages[0].role == "user"
assert messages[0].content == "hello"
assert messages[0].timestamp == "2024-01-01T00:00:00"
assert messages[0].source == "browser"
log.close()
def test_append_custom_source(self, tmp_path):
"""append() stores the source field correctly."""
log = MessageLog(tmp_path / "chat.db")
log.append("agent", "reply", "2024-01-01T00:00:01", source="api")
msg = log.all()[0]
assert msg.source == "api"
log.close()
def test_append_multiple_messages_preserves_order(self, tmp_path):
"""append() preserves insertion order."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "first", "2024-01-01T00:00:00")
log.append("agent", "second", "2024-01-01T00:00:01")
log.append("user", "third", "2024-01-01T00:00:02")
messages = log.all()
assert [m.content for m in messages] == ["first", "second", "third"]
log.close()
def test_append_persists_across_instances(self, tmp_path):
"""Messages appended by one instance are readable by another."""
db = tmp_path / "chat.db"
log1 = MessageLog(db)
log1.append("user", "persisted", "2024-01-01T00:00:00")
log1.close()
log2 = MessageLog(db)
messages = log2.all()
assert len(messages) == 1
assert messages[0].content == "persisted"
log2.close()
class TestMessageLogAll:
"""Tests for MessageLog.all()."""
def test_all_on_empty_store_returns_empty_list(self, tmp_path):
"""all() returns [] when there are no messages."""
log = MessageLog(tmp_path / "chat.db")
assert log.all() == []
log.close()
def test_all_returns_message_objects(self, tmp_path):
"""all() returns a list of Message dataclass instances."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "hi", "2024-01-01T00:00:00")
messages = log.all()
assert all(isinstance(m, Message) for m in messages)
log.close()
    def test_all_returns_all_messages(self, tmp_path):
        """all() returns every stored message."""
        log = MessageLog(tmp_path / "chat.db")
        for i in range(5):
            log.append("user", f"msg{i}", f"2024-01-01T00:00:0{i}")
        assert len(log.all()) == 5
        log.close()
    def test_all_returns_ordered_by_id(self, log: MessageLog):
        for i in range(5):
            log.append("user", f"msg{i}", f"t{i}")
        msgs = log.all()
        assert [m.content for m in msgs] == [f"msg{i}" for i in range(5)]
    def test_all_empty_store(self, log: MessageLog):
        assert log.all() == []
class TestMessageLogRecent:
"""Tests for MessageLog.recent()."""
def test_recent_returns_newest(self, log: MessageLog):
for i in range(10):
log.append("user", f"msg{i}", f"t{i}")
recent = log.recent(limit=3)
assert len(recent) == 3
assert recent[0].content == "msg7"
assert recent[2].content == "msg9"
    def test_recent_on_empty_store_returns_empty_list(self, tmp_path):
        """recent() returns [] when there are no messages."""
        log = MessageLog(tmp_path / "chat.db")
        assert log.recent() == []
        log.close()
    def test_recent_oldest_first(self, log: MessageLog):
        for i in range(5):
            log.append("user", f"msg{i}", f"t{i}")
        recent = log.recent(limit=3)
        # Should be oldest-first within the window
        assert recent[0].content == "msg2"
        assert recent[1].content == "msg3"
        assert recent[2].content == "msg4"
    def test_recent_more_than_exists(self, log: MessageLog):
        log.append("user", "only", "t0")
        recent = log.recent(limit=100)
        assert len(recent) == 1
    def test_recent_empty_store(self, log: MessageLog):
        assert log.recent() == []
def test_recent_default_limit(self, tmp_path):
"""recent() with default limit returns up to 50 messages."""
log = MessageLog(tmp_path / "chat.db")
for i in range(60):
log.append("user", f"msg{i}", f"2024-01-01T00:00:{i:02d}")
msgs = log.recent()
assert len(msgs) == 50
log.close()
def test_recent_custom_limit(self, tmp_path):
"""recent() respects a custom limit."""
log = MessageLog(tmp_path / "chat.db")
for i in range(10):
log.append("user", f"msg{i}", f"2024-01-01T00:00:0{i}")
msgs = log.recent(limit=3)
assert len(msgs) == 3
log.close()
def test_recent_returns_newest_messages(self, tmp_path):
"""recent() returns the most-recently-inserted messages."""
log = MessageLog(tmp_path / "chat.db")
for i in range(10):
log.append("user", f"msg{i}", f"2024-01-01T00:00:0{i}")
msgs = log.recent(limit=3)
# Should be the last 3 inserted, in oldest-first order
assert [m.content for m in msgs] == ["msg7", "msg8", "msg9"]
log.close()
def test_recent_fewer_than_limit_returns_all(self, tmp_path):
"""recent() returns all messages when count < limit."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "only", "2024-01-01T00:00:00")
msgs = log.recent(limit=10)
assert len(msgs) == 1
log.close()
def test_recent_returns_oldest_first(self, tmp_path):
"""recent() returns messages in oldest-first order."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "a", "2024-01-01T00:00:00")
log.append("user", "b", "2024-01-01T00:00:01")
log.append("user", "c", "2024-01-01T00:00:02")
msgs = log.recent(limit=2)
assert [m.content for m in msgs] == ["b", "c"]
log.close()
class TestMessageLogClear:
"""Tests for MessageLog.clear()."""
def test_clear_empties_the_store(self, tmp_path):
"""clear() removes all messages."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "hello", "2024-01-01T00:00:00")
log.clear()
assert log.all() == []
log.close()
def test_clear_on_empty_store_is_safe(self, tmp_path):
"""clear() on an empty store does not raise."""
log = MessageLog(tmp_path / "chat.db")
log.clear() # should not raise
assert log.all() == []
log.close()
def test_clear_allows_new_appends(self, tmp_path):
"""After clear(), new messages can be appended."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "old", "2024-01-01T00:00:00")
log.clear()
log.append("user", "new", "2024-01-01T00:00:01")
messages = log.all()
assert len(messages) == 1
assert messages[0].content == "new"
log.close()
    def test_clear_resets_len_to_zero(self, tmp_path):
        """After clear(), __len__ returns 0."""
        log = MessageLog(tmp_path / "chat.db")
        log.append("user", "a", "t")
        log.append("user", "b", "t")
        log.clear()
        assert len(log) == 0
        log.close()
    def test_clear_removes_all(self, log: MessageLog):
        for i in range(5):
            log.append("user", f"msg{i}", f"t{i}")
        assert len(log) == 5
        log.clear()
        assert len(log) == 0
        assert log.all() == []
    def test_clear_empty_store(self, log: MessageLog):
        log.clear()  # Should not raise
        assert len(log) == 0
# ---------------------------------------------------------------------------
# MessageLog — __len__
# ---------------------------------------------------------------------------
class TestMessageLogLen:
    """Tests for MessageLog.__len__()."""
    def test_len_empty_store(self, tmp_path):
        """__len__ returns 0 for an empty store."""
        log = MessageLog(tmp_path / "chat.db")
        assert len(log) == 0
        log.close()
    def test_len_empty(self, log: MessageLog):
        assert len(log) == 0
    def test_len_after_appends(self, log: MessageLog):
        """__len__ reflects the number of stored messages."""
        for i in range(7):
            log.append("user", f"msg{i}", f"t{i}")
        assert len(log) == 7
    def test_len_after_clear(self, tmp_path):
        """__len__ is 0 after clear()."""
        log = MessageLog(tmp_path / "chat.db")
        log.append("user", "x", "t")
        log.clear()
        assert len(log) == 0
        log.close()
# ---------------------------------------------------------------------------
# MessageLog — pruning
# ---------------------------------------------------------------------------
class TestMessageLogPrune:
"""Tests for automatic pruning via _prune()."""
def test_prune_keeps_at_most_max_messages(self, tmp_path):
"""After exceeding MAX_MESSAGES, oldest messages are pruned."""
log = MessageLog(tmp_path / "chat.db")
        # Temporarily lowering the limit via the monkeypatch fixture is not
        # straightforward because _prune reads the module-level MAX_MESSAGES
        # constant, so we patch it directly.
import infrastructure.chat_store as cs
original = cs.MAX_MESSAGES
cs.MAX_MESSAGES = 5
try:
for i in range(8):
log.append("user", f"msg{i}", f"t{i}")
assert len(log) == 5
finally:
cs.MAX_MESSAGES = original
log.close()
def test_prune_keeps_newest_messages(self, tmp_path):
"""Pruning removes oldest messages and keeps the newest ones."""
import infrastructure.chat_store as cs
log = MessageLog(tmp_path / "chat.db")
original = cs.MAX_MESSAGES
cs.MAX_MESSAGES = 3
try:
for i in range(5):
log.append("user", f"msg{i}", f"t{i}")
messages = log.all()
contents = [m.content for m in messages]
assert contents == ["msg2", "msg3", "msg4"]
finally:
cs.MAX_MESSAGES = original
log.close()
def test_no_prune_when_below_limit(self, tmp_path):
"""No messages are pruned while count is at or below MAX_MESSAGES."""
log = MessageLog(tmp_path / "chat.db")
import infrastructure.chat_store as cs
original = cs.MAX_MESSAGES
cs.MAX_MESSAGES = 10
try:
for i in range(10):
log.append("user", f"msg{i}", f"t{i}")
assert len(log) == 10
finally:
cs.MAX_MESSAGES = original
log.close()
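# Illustrative sketch — NOT the shipped implementation. A _prune consistent
# with the behaviour exercised above: once the table grows past MAX_MESSAGES,
# delete the oldest rows so only the newest MAX_MESSAGES (by id) remain. The
# function name and SQL are assumptions for illustration only.
import sqlite3
def _prune_sketch(conn: sqlite3.Connection) -> None:
    # Keep the newest MAX_MESSAGES rows, drop everything older.
    conn.execute(
        "DELETE FROM chat_messages WHERE id NOT IN ("
        " SELECT id FROM chat_messages ORDER BY id DESC LIMIT ?)",
        (MAX_MESSAGES,),
    )
    conn.commit()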
# ---------------------------------------------------------------------------
# MessageLog — close / lifecycle
# ---------------------------------------------------------------------------
class TestMessageLogClose:
"""Tests for MessageLog.close()."""
def test_close_sets_conn_none(self, tmp_db: Path):
ml = MessageLog(db_path=tmp_db)
ml.append("user", "x", "t0")
ml.close()
assert ml._conn is None
def test_close_is_safe_before_first_use(self, tmp_path):
"""close() on a fresh (never-used) instance does not raise."""
log = MessageLog(tmp_path / "chat.db")
log.close() # should not raise
def test_close_idempotent(self, tmp_db: Path):
ml = MessageLog(db_path=tmp_db)
ml.close()
ml.close() # Should not raise
def test_close_multiple_times_is_safe(self, tmp_path):
"""close() can be called multiple times without error."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "hi", "t")
log.close()
log.close() # second close should not raise
def test_close_sets_conn_to_none(self, tmp_path):
"""close() sets the internal _conn attribute to None."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "hi", "t")
assert log._conn is not None
log.close()
assert log._conn is None
def test_reopen_after_close(self, tmp_db: Path):
ml = MessageLog(db_path=tmp_db)
ml.append("user", "before", "t0")
ml.close()
# Should reconnect on next use
ml.append("user", "after", "t1")
assert len(ml) == 2
ml.close()
# ---------------------------------------------------------------------------
# Thread safety
# ---------------------------------------------------------------------------
class TestMessageLogThreadSafety:
    """Thread-safety tests for MessageLog."""
    def test_concurrent_appends(self, tmp_path):
        """Multiple threads can append messages without data loss or errors."""
        log = MessageLog(tmp_path / "chat.db")
        errors: list[Exception] = []
        def worker(n: int) -> None:
            try:
                for i in range(5):
                    log.append("user", f"t{n}-{i}", f"ts-{n}-{i}")
            except Exception as exc:  # noqa: BLE001
                errors.append(exc)
        threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        assert errors == [], f"Concurrent append raised: {errors}"
        # All 20 messages should be present (4 threads × 5 messages)
        assert len(log) == 20
        log.close()
    def test_concurrent_reads_and_writes(self, tmp_path):
        """Concurrent reads and writes do not corrupt state."""
        log = MessageLog(tmp_path / "chat.db")
        errors: list[Exception] = []
        def writer() -> None:
            try:
                for i in range(10):
                    log.append("user", f"msg{i}", f"t{i}")
            except Exception as exc:  # noqa: BLE001
                errors.append(exc)
        def reader() -> None:
            try:
                for _ in range(10):
                    log.all()
            except Exception as exc:  # noqa: BLE001
                errors.append(exc)
        threads = [threading.Thread(target=writer)] + [
            threading.Thread(target=reader) for _ in range(3)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        assert errors == [], f"Concurrent read/write raised: {errors}"
        log.close()
# ── Pruning ────────────────────────────────────────────────────────────
class TestPrune:
    def test_prune_keeps_max_messages(self, tmp_db: Path):
        with patch("src.infrastructure.chat_store.MAX_MESSAGES", 5):
            ml = MessageLog(db_path=tmp_db)
            for i in range(10):
                ml.append("user", f"msg{i}", f"t{i}")
            # Should have pruned to 5
            assert len(ml) == 5
            msgs = ml.all()
            # Oldest should be pruned, newest kept
            assert msgs[0].content == "msg5"
            assert msgs[-1].content == "msg9"
            ml.close()
    def test_no_prune_under_limit(self, tmp_db: Path):
        with patch("src.infrastructure.chat_store.MAX_MESSAGES", 100):
            ml = MessageLog(db_path=tmp_db)
            for i in range(10):
                ml.append("user", f"msg{i}", f"t{i}")
            assert len(ml) == 10
            ml.close()
# ── Thread safety ──────────────────────────────────────────────────────
class TestThreadSafety:
    def test_concurrent_appends(self, tmp_db: Path):
        ml = MessageLog(db_path=tmp_db)
        errors = []
        def writer(start: int):
            try:
                for i in range(20):
                    ml.append("user", f"msg{start + i}", f"t{start + i}")
            except Exception as e:
                errors.append(e)
        threads = [threading.Thread(target=writer, args=(i * 20,)) for i in range(5)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        assert not errors, f"Thread errors: {errors}"
        assert len(ml) == 100
        ml.close()
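# Illustrative sketch — an assumption about the locking pattern these
# concurrency tests imply, not a copy of the real MessageLog: a single
# threading.Lock serialises every write against the shared connection, which
# is what lets N writer threads land all of their rows without error.
import sqlite3
import threading
_sketch_lock = threading.Lock()
def _append_sketch(conn: sqlite3.Connection, role: str, content: str,
                   timestamp: str, source: str = "browser") -> None:
    with _sketch_lock:
        conn.execute(
            "INSERT INTO chat_messages (role, content, timestamp, source) "
            "VALUES (?, ?, ?, ?)",
            (role, content, timestamp, source),
        )
        conn.commit()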
# ---------------------------------------------------------------------------
# Edge cases
# ---------------------------------------------------------------------------
class TestMessageLogEdgeCases:
    """Edge-case tests for MessageLog."""
def test_empty_content(self, log: MessageLog):
log.append("user", "", "t0")
msgs = log.all()
assert len(msgs) == 1
assert msgs[0].content == ""
def test_empty_content_stored_and_retrieved(self, tmp_path):
"""Empty string content can be stored and retrieved."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "", "2024-01-01T00:00:00")
assert log.all()[0].content == ""
log.close()
def test_unicode_content(self, log: MessageLog):
log.append("user", "こんにちは 🎉 مرحبا", "t0")
msgs = log.all()
assert msgs[0].content == "こんにちは 🎉 مرحبا"
def test_unicode_content_stored_and_retrieved(self, tmp_path):
"""Unicode characters in content are stored and retrieved correctly."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "こんにちは 🌍", "2024-01-01T00:00:00")
assert log.all()[0].content == "こんにちは 🌍"
log.close()
def test_multiline_content(self, log: MessageLog):
content = "line1\nline2\nline3"
log.append("user", content, "t0")
assert log.all()[0].content == content
def test_newline_in_content(self, tmp_path):
"""Newlines in content are preserved."""
log = MessageLog(tmp_path / "chat.db")
multiline = "line1\nline2\nline3"
log.append("agent", multiline, "2024-01-01T00:00:00")
assert log.all()[0].content == multiline
log.close()
def test_default_db_path_attribute(self):
"""MessageLog without explicit path uses the module-level DB_PATH."""
from infrastructure.chat_store import DB_PATH
log = MessageLog()
assert log._db_path == DB_PATH
# Do NOT call close() here — this is the global singleton's path
def test_custom_db_path_used(self, tmp_path):
"""MessageLog uses the provided db_path."""
db = tmp_path / "custom.db"
log = MessageLog(db)
log.append("user", "test", "t")
assert db.exists()
log.close()
def test_recent_limit_zero_returns_empty(self, tmp_path):
"""recent(limit=0) returns an empty list."""
log = MessageLog(tmp_path / "chat.db")
log.append("user", "msg", "t")
assert log.recent(limit=0) == []
log.close()
def test_all_roles_stored_correctly(self, tmp_path):
"""Different role values are stored and retrieved correctly."""
log = MessageLog(tmp_path / "chat.db")
for role in ("user", "agent", "error", "system"):
log.append(role, f"{role} message", "t")
messages = log.all()
assert [m.role for m in messages] == ["user", "agent", "error", "system"]
log.close()
def test_special_sql_characters(self, log: MessageLog):
log.append("user", "Robert'; DROP TABLE chat_messages;--", "t0")
msgs = log.all()
assert len(msgs) == 1
assert "DROP TABLE" in msgs[0].content
View File
@@ -1,21 +1,10 @@
"""Tests for the async event bus (infrastructure.events.bus)."""
import sqlite3
from pathlib import Path
from unittest.mock import patch
import pytest
import infrastructure.events.bus as bus_module
from infrastructure.events.bus import (
Event,
EventBus,
emit,
event_bus,
get_event_bus,
init_event_bus_persistence,
on,
)
from infrastructure.events.bus import Event, EventBus, emit, event_bus, on
class TestEvent:
@@ -360,111 +349,3 @@ class TestEventBusPersistence:
assert mode == "wal"
finally:
conn.close()
async def test_persist_event_exception_is_swallowed(self, tmp_path):
"""_persist_event must not propagate SQLite errors."""
from unittest.mock import MagicMock
bus = EventBus()
bus.enable_persistence(tmp_path / "events.db")
# Make the INSERT raise an OperationalError
mock_conn = MagicMock()
mock_conn.execute.side_effect = sqlite3.OperationalError("simulated failure")
from contextlib import contextmanager
@contextmanager
def fake_ctx():
yield mock_conn
with patch.object(bus, "_get_persistence_conn", fake_ctx):
# Should not raise
bus._persist_event(Event(type="x", source="s"))
async def test_replay_exception_returns_empty(self, tmp_path):
"""replay() must return [] when SQLite query fails."""
from unittest.mock import MagicMock
bus = EventBus()
bus.enable_persistence(tmp_path / "events.db")
mock_conn = MagicMock()
mock_conn.execute.side_effect = sqlite3.OperationalError("simulated failure")
from contextlib import contextmanager
@contextmanager
def fake_ctx():
yield mock_conn
with patch.object(bus, "_get_persistence_conn", fake_ctx):
result = bus.replay()
assert result == []
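# Illustrative sketch — an assumption about the pattern the two tests above
# pin down, not the actual EventBus code: persistence failures are swallowed
# so a broken SQLite write can never break event delivery, and a failed read
# degrades to an empty replay. The table name and columns are placeholders.
import sqlite3
def _persist_event_sketch(bus, event) -> None:
    try:
        with bus._get_persistence_conn() as conn:
            conn.execute(
                "INSERT INTO events (type, source) VALUES (?, ?)",
                (event.type, event.source),
            )
    except sqlite3.Error:
        pass  # never propagate persistence errors to emitters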
# ── Singleton helpers ─────────────────────────────────────────────────────────
class TestSingletonHelpers:
"""Test get_event_bus(), init_event_bus_persistence(), and module __getattr__."""
def test_get_event_bus_returns_same_instance(self):
"""get_event_bus() is a true singleton."""
a = get_event_bus()
b = get_event_bus()
assert a is b
def test_module_event_bus_attr_is_singleton(self):
"""Accessing bus_module.event_bus via __getattr__ returns the singleton."""
assert bus_module.event_bus is get_event_bus()
def test_module_getattr_unknown_raises(self):
"""Accessing an unknown module attribute raises AttributeError."""
with pytest.raises(AttributeError):
_ = bus_module.no_such_attr # type: ignore[attr-defined]
def test_init_event_bus_persistence_sets_path(self, tmp_path):
"""init_event_bus_persistence() enables persistence on the singleton."""
bus = get_event_bus()
original_path = bus._persistence_db_path
try:
bus._persistence_db_path = None # reset for the test
db_path = tmp_path / "test_init.db"
init_event_bus_persistence(db_path)
assert bus._persistence_db_path == db_path
finally:
bus._persistence_db_path = original_path
def test_init_event_bus_persistence_is_idempotent(self, tmp_path):
"""Calling init_event_bus_persistence() twice keeps the first path."""
bus = get_event_bus()
original_path = bus._persistence_db_path
try:
bus._persistence_db_path = None
first_path = tmp_path / "first.db"
second_path = tmp_path / "second.db"
init_event_bus_persistence(first_path)
init_event_bus_persistence(second_path) # should be ignored
assert bus._persistence_db_path == first_path
finally:
bus._persistence_db_path = original_path
def test_init_event_bus_persistence_default_path(self):
"""init_event_bus_persistence() uses 'data/events.db' when no path given."""
bus = get_event_bus()
original_path = bus._persistence_db_path
try:
bus._persistence_db_path = None
# Patch enable_persistence to capture what path it receives
captured = {}
def fake_enable(path: Path) -> None:
captured["path"] = path
with patch.object(bus, "enable_persistence", side_effect=fake_enable):
init_event_bus_persistence()
assert captured["path"] == Path("data/events.db")
finally:
bus._persistence_db_path = original_path
View File
@@ -1416,7 +1416,9 @@ class TestFilterProviders:
def test_frontier_required_no_anthropic_raises(self):
router = CascadeRouter(config_path=Path("/nonexistent"))
router.providers = [Provider(name="ollama-p", type="ollama", enabled=True, priority=1)]
router.providers = [
Provider(name="ollama-p", type="ollama", enabled=True, priority=1)
]
with pytest.raises(RuntimeError, match="No Anthropic provider configured"):
router._filter_providers("frontier_required")
View File
@@ -18,10 +18,6 @@ def _make_settings(**env_overrides):
"""Create a fresh Settings instance with isolated env vars."""
from config import Settings
# Prevent Pydantic from reading .env file (local .env pollutes defaults)
_orig_config = Settings.model_config.copy()
Settings.model_config["env_file"] = None
# Strip keys that might bleed in from the test environment
clean_env = {
k: v
@@ -86,10 +82,7 @@ def _make_settings(**env_overrides):
}
clean_env.update(env_overrides)
with patch.dict(os.environ, clean_env, clear=True):
try:
return Settings()
finally:
Settings.model_config.update(_orig_config)
return Settings()
# ── normalize_ollama_url ──────────────────────────────────────────────────────
@@ -699,12 +692,12 @@ class TestGetEffectiveOllamaModel:
"""get_effective_ollama_model walks fallback chain."""
def test_returns_primary_when_available(self):
from config import get_effective_ollama_model, settings
from config import get_effective_ollama_model
with patch("config.check_ollama_model_available", return_value=True):
result = get_effective_ollama_model()
# Should return whatever the settings primary model is
assert result == settings.ollama_model
# Default is qwen3:14b
assert result == "qwen3:14b"
def test_falls_back_when_primary_unavailable(self):
from config import get_effective_ollama_model, settings
View File
@@ -1,217 +0,0 @@
"""Unit tests for the Dreaming mode engine."""
import sqlite3
from contextlib import closing
from datetime import UTC, datetime, timedelta
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from timmy.dreaming import DreamingEngine, DreamRecord, _SESSION_GAP_SECONDS
# ── Fixtures ──────────────────────────────────────────────────────────────────
@pytest.fixture()
def tmp_dreams_db(tmp_path):
"""Return a temporary path for the dreams database."""
return tmp_path / "dreams.db"
@pytest.fixture()
def engine(tmp_dreams_db):
"""DreamingEngine backed by a temp database."""
return DreamingEngine(db_path=tmp_dreams_db)
@pytest.fixture()
def chat_db(tmp_path):
"""Create a minimal chat database with some messages."""
db_path = tmp_path / "chat.db"
with closing(sqlite3.connect(str(db_path))) as conn:
conn.execute("""
CREATE TABLE chat_messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
role TEXT NOT NULL,
content TEXT NOT NULL,
timestamp TEXT NOT NULL,
source TEXT NOT NULL DEFAULT 'browser'
)
""")
now = datetime.now(UTC)
messages = [
("user", "Hello, can you help me?", (now - timedelta(hours=2)).isoformat()),
("agent", "Of course! What do you need?", (now - timedelta(hours=2, seconds=-5)).isoformat()),
("user", "How does Python handle errors?", (now - timedelta(hours=2, seconds=-60)).isoformat()),
("agent", "Python uses try/except blocks.", (now - timedelta(hours=2, seconds=-120)).isoformat()),
("user", "Thanks!", (now - timedelta(hours=2, seconds=-180)).isoformat()),
]
conn.executemany(
"INSERT INTO chat_messages (role, content, timestamp) VALUES (?, ?, ?)",
messages,
)
conn.commit()
return db_path
# ── Idle detection ─────────────────────────────────────────────────────────────
class TestIdleDetection:
def test_not_idle_immediately(self, engine):
assert engine.is_idle() is False
def test_idle_after_threshold(self, engine):
engine._last_activity_time = datetime.now(UTC) - timedelta(minutes=20)
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_idle_threshold_minutes = 10
assert engine.is_idle() is True
def test_not_idle_when_threshold_zero(self, engine):
engine._last_activity_time = datetime.now(UTC) - timedelta(hours=99)
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_idle_threshold_minutes = 0
assert engine.is_idle() is False
def test_record_activity_resets_timer(self, engine):
engine._last_activity_time = datetime.now(UTC) - timedelta(minutes=30)
engine.record_activity()
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_idle_threshold_minutes = 10
assert engine.is_idle() is False
# ── Status dict ───────────────────────────────────────────────────────────────
class TestGetStatus:
def test_status_shape(self, engine):
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_enabled = True
mock_settings.dreaming_idle_threshold_minutes = 10
status = engine.get_status()
assert "enabled" in status
assert "dreaming" in status
assert "idle" in status
assert "dream_count" in status
assert "idle_minutes" in status
def test_dream_count_starts_at_zero(self, engine):
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_enabled = True
mock_settings.dreaming_idle_threshold_minutes = 10
assert engine.get_status()["dream_count"] == 0
# ── Session grouping ──────────────────────────────────────────────────────────
class TestGroupIntoSessions:
def test_single_session(self, engine):
now = datetime.now(UTC)
rows = [
{"role": "user", "content": "hi", "timestamp": now.isoformat()},
{"role": "agent", "content": "hello", "timestamp": (now + timedelta(seconds=10)).isoformat()},
]
sessions = engine._group_into_sessions(rows)
assert len(sessions) == 1
assert len(sessions[0]) == 2
def test_splits_on_large_gap(self, engine):
now = datetime.now(UTC)
gap = _SESSION_GAP_SECONDS + 100
rows = [
{"role": "user", "content": "hi", "timestamp": now.isoformat()},
{"role": "agent", "content": "hello", "timestamp": (now + timedelta(seconds=gap)).isoformat()},
]
sessions = engine._group_into_sessions(rows)
assert len(sessions) == 2
def test_empty_input(self, engine):
assert engine._group_into_sessions([]) == []
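# Illustrative sketch — an assumption about the grouping rule these tests
# describe, not the engine's actual code: walk messages in order and start a
# new session whenever the gap to the previous message exceeds
# _SESSION_GAP_SECONDS.
def _group_into_sessions_sketch(rows: list[dict]) -> list[list[dict]]:
    sessions: list[list[dict]] = []
    current: list[dict] = []
    previous = None
    for row in rows:
        ts = datetime.fromisoformat(row["timestamp"])
        if previous is not None and (ts - previous).total_seconds() > _SESSION_GAP_SECONDS:
            sessions.append(current)
            current = []
        current.append(row)
        previous = ts
    if current:
        sessions.append(current)
    return sessions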
# ── Dream storage ─────────────────────────────────────────────────────────────
class TestDreamStorage:
def test_store_and_retrieve(self, engine):
dream = engine._store_dream(
session_excerpt="User asked about Python.",
decision_point="Python uses try/except blocks.",
simulation="I could have given a code example.",
proposed_rule="When explaining errors, include a code snippet.",
)
assert dream.id
assert dream.proposed_rule == "When explaining errors, include a code snippet."
retrieved = engine.get_recent_dreams(limit=1)
assert len(retrieved) == 1
assert retrieved[0].id == dream.id
def test_count_increments(self, engine):
assert engine.count_dreams() == 0
engine._store_dream(
session_excerpt="test", decision_point="test", simulation="test", proposed_rule="test"
)
assert engine.count_dreams() == 1
# ── dream_once integration ─────────────────────────────────────────────────────
class TestDreamOnce:
@pytest.mark.asyncio
async def test_skips_when_disabled(self, engine):
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_enabled = False
result = await engine.dream_once()
assert result is None
@pytest.mark.asyncio
async def test_skips_when_not_idle(self, engine):
engine._last_activity_time = datetime.now(UTC)
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_enabled = True
mock_settings.dreaming_idle_threshold_minutes = 60
result = await engine.dream_once()
assert result is None
@pytest.mark.asyncio
async def test_skips_when_already_dreaming(self, engine):
engine._is_dreaming = True
with patch("timmy.dreaming.settings") as mock_settings:
mock_settings.dreaming_enabled = True
mock_settings.dreaming_idle_threshold_minutes = 0
result = await engine.dream_once()
# Reset for cleanliness
engine._is_dreaming = False
assert result is None
@pytest.mark.asyncio
async def test_dream_produces_record_when_idle(self, engine, chat_db):
"""Full cycle: idle + chat data + mocked LLM → produces DreamRecord."""
engine._last_activity_time = datetime.now(UTC) - timedelta(hours=1)
with (
patch("timmy.dreaming.settings") as mock_settings,
patch("timmy.dreaming.DreamingEngine._call_agent", new_callable=AsyncMock) as mock_agent,
patch("infrastructure.chat_store.DB_PATH", chat_db),
):
mock_settings.dreaming_enabled = True
mock_settings.dreaming_idle_threshold_minutes = 10
mock_settings.dreaming_timeout_seconds = 30
mock_agent.side_effect = [
"I could have provided a concrete try/except example.", # simulation
"When explaining errors, always include a runnable code snippet.", # rule
]
result = await engine.dream_once()
assert result is not None
assert isinstance(result, DreamRecord)
assert result.simulation
assert result.proposed_rule
assert engine.count_dreams() == 1
View File
@@ -2,15 +2,10 @@
from __future__ import annotations
from datetime import UTC, datetime, timedelta
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from timmy.vassal.agent_health import AgentHealthReport, AgentStatus
pytestmark = pytest.mark.unit
# ---------------------------------------------------------------------------
# AgentStatus
# ---------------------------------------------------------------------------
@@ -40,25 +35,6 @@ def test_agent_status_stuck():
assert s.needs_reassignment is True
def test_agent_status_checked_at_is_iso_string():
s = AgentStatus(agent="claude")
# Should be parseable as an ISO datetime
dt = datetime.fromisoformat(s.checked_at)
assert dt.tzinfo is not None
def test_agent_status_multiple_stuck_issues():
s = AgentStatus(agent="kimi", stuck_issue_numbers=[1, 2, 3])
assert s.is_stuck is True
assert s.needs_reassignment is True
def test_agent_status_active_but_not_stuck():
s = AgentStatus(agent="claude", active_issue_numbers=[5], is_idle=False)
assert s.is_stuck is False
assert s.needs_reassignment is False
# ---------------------------------------------------------------------------
# AgentHealthReport
# ---------------------------------------------------------------------------
@@ -71,22 +47,11 @@ def test_report_any_stuck():
assert report.any_stuck is True
def test_report_not_any_stuck():
report = AgentHealthReport(agents=[AgentStatus(agent="claude"), AgentStatus(agent="kimi")])
assert report.any_stuck is False
def test_report_all_idle():
report = AgentHealthReport(agents=[AgentStatus(agent="claude"), AgentStatus(agent="kimi")])
assert report.all_idle is True
def test_report_not_all_idle():
claude = AgentStatus(agent="claude", active_issue_numbers=[1], is_idle=False)
report = AgentHealthReport(agents=[claude, AgentStatus(agent="kimi")])
assert report.all_idle is False
def test_report_for_agent_found():
kimi = AgentStatus(agent="kimi", active_issue_numbers=[42])
report = AgentHealthReport(agents=[AgentStatus(agent="claude"), kimi])
@@ -99,223 +64,6 @@ def test_report_for_agent_not_found():
assert report.for_agent("timmy") is None
def test_report_generated_at_is_iso_string():
report = AgentHealthReport()
dt = datetime.fromisoformat(report.generated_at)
assert dt.tzinfo is not None
def test_report_empty_agents():
report = AgentHealthReport(agents=[])
assert report.any_stuck is False
assert report.all_idle is True
# ---------------------------------------------------------------------------
# _issue_created_time
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_issue_created_time_valid():
from timmy.vassal.agent_health import _issue_created_time
issue = {"created_at": "2024-01-15T10:30:00Z"}
result = await _issue_created_time(issue)
assert result is not None
assert result.year == 2024
assert result.month == 1
assert result.day == 15
@pytest.mark.asyncio
async def test_issue_created_time_missing_key():
from timmy.vassal.agent_health import _issue_created_time
result = await _issue_created_time({})
assert result is None
@pytest.mark.asyncio
async def test_issue_created_time_invalid_format():
from timmy.vassal.agent_health import _issue_created_time
result = await _issue_created_time({"created_at": "not-a-date"})
assert result is None
@pytest.mark.asyncio
async def test_issue_created_time_with_timezone():
from timmy.vassal.agent_health import _issue_created_time
issue = {"created_at": "2024-06-01T12:00:00+00:00"}
result = await _issue_created_time(issue)
assert result is not None
assert result.tzinfo is not None
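# Illustrative sketch — an assumption about the parsing behaviour the four
# tests above expect, not the module's actual code: accept ISO-8601 with
# either a trailing "Z" or an explicit offset, and return None for missing or
# malformed values.
async def _issue_created_time_sketch(issue: dict) -> datetime | None:
    raw = issue.get("created_at")
    if not raw:
        return None
    try:
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return None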
# ---------------------------------------------------------------------------
# _fetch_labeled_issues — mocked HTTP client
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_fetch_labeled_issues_success():
from timmy.vassal.agent_health import _fetch_labeled_issues
mock_resp = MagicMock()
mock_resp.status_code = 200
mock_resp.json.return_value = [
{"number": 1, "title": "Fix bug"},
{"number": 2, "title": "Add feature", "pull_request": {"url": "..."}},
]
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _fetch_labeled_issues(
mock_client, "http://gitea/api/v1", {}, "owner/repo", "claude-ready"
)
# Only non-PR issues returned
assert len(result) == 1
assert result[0]["number"] == 1
@pytest.mark.asyncio
async def test_fetch_labeled_issues_http_error():
from timmy.vassal.agent_health import _fetch_labeled_issues
mock_resp = MagicMock()
mock_resp.status_code = 401
mock_resp.json.return_value = []
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _fetch_labeled_issues(
mock_client, "http://gitea/api/v1", {}, "owner/repo", "claude-ready"
)
assert result == []
@pytest.mark.asyncio
async def test_fetch_labeled_issues_exception():
from timmy.vassal.agent_health import _fetch_labeled_issues
mock_client = AsyncMock()
mock_client.get = AsyncMock(side_effect=ConnectionError("network down"))
result = await _fetch_labeled_issues(
mock_client, "http://gitea/api/v1", {}, "owner/repo", "claude-ready"
)
assert result == []
@pytest.mark.asyncio
async def test_fetch_labeled_issues_filters_pull_requests():
from timmy.vassal.agent_health import _fetch_labeled_issues
mock_resp = MagicMock()
mock_resp.status_code = 200
mock_resp.json.return_value = [
{"number": 10, "title": "Issue"},
{"number": 11, "title": "PR", "pull_request": {"url": "http://gitea/pulls/11"}},
{"number": 12, "title": "Another Issue"},
]
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _fetch_labeled_issues(
mock_client, "http://gitea/api/v1", {}, "owner/repo", "claude-ready"
)
# Issues with truthy pull_request field are excluded
assert len(result) == 2
assert all(i["number"] in (10, 12) for i in result)
# ---------------------------------------------------------------------------
# _last_comment_time — mocked HTTP client
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_last_comment_time_with_comments():
from timmy.vassal.agent_health import _last_comment_time
mock_resp = MagicMock()
mock_resp.status_code = 200
mock_resp.json.return_value = [
{"updated_at": "2024-03-10T14:00:00Z", "created_at": "2024-03-10T13:00:00Z"}
]
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _last_comment_time(mock_client, "http://gitea/api/v1", {}, "owner/repo", 42)
assert result is not None
assert result.year == 2024
assert result.month == 3
@pytest.mark.asyncio
async def test_last_comment_time_uses_created_at_fallback():
from timmy.vassal.agent_health import _last_comment_time
mock_resp = MagicMock()
mock_resp.status_code = 200
mock_resp.json.return_value = [
{"created_at": "2024-03-10T13:00:00Z"} # no updated_at
]
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _last_comment_time(mock_client, "http://gitea/api/v1", {}, "owner/repo", 42)
assert result is not None
@pytest.mark.asyncio
async def test_last_comment_time_no_comments():
from timmy.vassal.agent_health import _last_comment_time
mock_resp = MagicMock()
mock_resp.status_code = 200
mock_resp.json.return_value = []
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _last_comment_time(mock_client, "http://gitea/api/v1", {}, "owner/repo", 99)
assert result is None
@pytest.mark.asyncio
async def test_last_comment_time_http_error():
from timmy.vassal.agent_health import _last_comment_time
mock_resp = MagicMock()
mock_resp.status_code = 404
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=mock_resp)
result = await _last_comment_time(mock_client, "http://gitea/api/v1", {}, "owner/repo", 99)
assert result is None
@pytest.mark.asyncio
async def test_last_comment_time_exception():
from timmy.vassal.agent_health import _last_comment_time
mock_client = AsyncMock()
mock_client.get = AsyncMock(side_effect=TimeoutError("timed out"))
result = await _last_comment_time(mock_client, "http://gitea/api/v1", {}, "owner/repo", 7)
assert result is None
# ---------------------------------------------------------------------------
# check_agent_health — no Gitea in unit tests
# ---------------------------------------------------------------------------
@@ -342,138 +90,6 @@ async def test_check_agent_health_no_token():
assert status.agent == "claude"
@pytest.mark.asyncio
async def test_check_agent_health_detects_stuck_issue(monkeypatch):
"""Issues with last activity before the cutoff are flagged as stuck."""
import timmy.vassal.agent_health as ah
old_time = (datetime.now(UTC) - timedelta(minutes=200)).isoformat()
async def _fake_fetch(client, base_url, headers, repo, label):
return [{"number": 55, "created_at": old_time}]
async def _fake_last_comment(client, base_url, headers, repo, issue_number):
return datetime.now(UTC) - timedelta(minutes=200)
monkeypatch.setattr(ah, "_fetch_labeled_issues", _fake_fetch)
monkeypatch.setattr(ah, "_last_comment_time", _fake_last_comment)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with patch("config.settings", mock_settings):
status = await ah.check_agent_health("claude", stuck_threshold_minutes=120)
assert 55 in status.active_issue_numbers
assert 55 in status.stuck_issue_numbers
assert status.is_stuck is True
@pytest.mark.asyncio
async def test_check_agent_health_active_not_stuck(monkeypatch):
"""Recent activity means issue is active but not stuck."""
import timmy.vassal.agent_health as ah
recent_time = (datetime.now(UTC) - timedelta(minutes=5)).isoformat()
async def _fake_fetch(client, base_url, headers, repo, label):
return [{"number": 77, "created_at": recent_time}]
async def _fake_last_comment(client, base_url, headers, repo, issue_number):
return datetime.now(UTC) - timedelta(minutes=5)
monkeypatch.setattr(ah, "_fetch_labeled_issues", _fake_fetch)
monkeypatch.setattr(ah, "_last_comment_time", _fake_last_comment)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with patch("config.settings", mock_settings):
status = await ah.check_agent_health("claude", stuck_threshold_minutes=120)
assert 77 in status.active_issue_numbers
assert 77 not in status.stuck_issue_numbers
assert status.is_idle is False
@pytest.mark.asyncio
async def test_check_agent_health_uses_issue_created_when_no_comments(monkeypatch):
"""Falls back to issue created_at when no comment time is available."""
import timmy.vassal.agent_health as ah
old_time = (datetime.now(UTC) - timedelta(minutes=300)).isoformat()
async def _fake_fetch(client, base_url, headers, repo, label):
return [{"number": 99, "created_at": old_time}]
async def _fake_last_comment(client, base_url, headers, repo, issue_number):
return None # No comments
monkeypatch.setattr(ah, "_fetch_labeled_issues", _fake_fetch)
monkeypatch.setattr(ah, "_last_comment_time", _fake_last_comment)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with patch("config.settings", mock_settings):
status = await ah.check_agent_health("kimi", stuck_threshold_minutes=120)
assert 99 in status.stuck_issue_numbers
@pytest.mark.asyncio
async def test_check_agent_health_gitea_disabled(monkeypatch):
"""When gitea_enabled=False, returns idle status without querying."""
import timmy.vassal.agent_health as ah
mock_settings = MagicMock()
mock_settings.gitea_enabled = False
mock_settings.gitea_token = "fake-token"
with patch("config.settings", mock_settings):
status = await ah.check_agent_health("claude")
assert status.is_idle is True
assert status.active_issue_numbers == []
@pytest.mark.asyncio
async def test_check_agent_health_fetch_exception(monkeypatch):
"""HTTP exception during check is handled gracefully."""
import timmy.vassal.agent_health as ah
async def _bad_fetch(client, base_url, headers, repo, label):
raise RuntimeError("connection refused")
monkeypatch.setattr(ah, "_fetch_labeled_issues", _bad_fetch)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with patch("config.settings", mock_settings):
status = await ah.check_agent_health("claude")
assert isinstance(status, AgentStatus)
assert status.is_idle is True
# ---------------------------------------------------------------------------
# get_full_health_report
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_get_full_health_report_returns_both_agents():
from timmy.vassal.agent_health import get_full_health_report
@@ -482,127 +98,3 @@ async def test_get_full_health_report_returns_both_agents():
agent_names = {a.agent for a in report.agents}
assert "claude" in agent_names
assert "kimi" in agent_names
@pytest.mark.asyncio
async def test_get_full_health_report_structure():
from timmy.vassal.agent_health import get_full_health_report
report = await get_full_health_report()
assert isinstance(report, AgentHealthReport)
assert len(report.agents) == 2
# ---------------------------------------------------------------------------
# nudge_stuck_agent
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_nudge_stuck_agent_no_token():
"""Returns False gracefully when Gitea is not configured."""
from timmy.vassal.agent_health import nudge_stuck_agent
mock_settings = MagicMock()
mock_settings.gitea_enabled = False
mock_settings.gitea_token = ""
with patch("config.settings", mock_settings):
result = await nudge_stuck_agent("claude", 123)
assert result is False
@pytest.mark.asyncio
async def test_nudge_stuck_agent_success(monkeypatch):
"""Returns True when comment is posted successfully."""
import timmy.vassal.agent_health as ah
mock_resp = MagicMock()
mock_resp.status_code = 201
mock_client_instance = AsyncMock()
mock_client_instance.post = AsyncMock(return_value=mock_resp)
mock_client_instance.__aenter__ = AsyncMock(return_value=mock_client_instance)
mock_client_instance.__aexit__ = AsyncMock(return_value=False)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with (
patch("config.settings", mock_settings),
patch("httpx.AsyncClient", return_value=mock_client_instance),
):
result = await ah.nudge_stuck_agent("claude", 55)
assert result is True
@pytest.mark.asyncio
async def test_nudge_stuck_agent_http_failure(monkeypatch):
"""Returns False when API returns non-2xx status."""
import timmy.vassal.agent_health as ah
mock_resp = MagicMock()
mock_resp.status_code = 500
mock_client_instance = AsyncMock()
mock_client_instance.post = AsyncMock(return_value=mock_resp)
mock_client_instance.__aenter__ = AsyncMock(return_value=mock_client_instance)
mock_client_instance.__aexit__ = AsyncMock(return_value=False)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with (
patch("config.settings", mock_settings),
patch("httpx.AsyncClient", return_value=mock_client_instance),
):
result = await ah.nudge_stuck_agent("kimi", 77)
assert result is False
@pytest.mark.asyncio
async def test_nudge_stuck_agent_gitea_disabled(monkeypatch):
"""Returns False when gitea_enabled=False."""
import timmy.vassal.agent_health as ah
mock_settings = MagicMock()
mock_settings.gitea_enabled = False
mock_settings.gitea_token = "fake-token"
with patch("config.settings", mock_settings):
result = await ah.nudge_stuck_agent("claude", 42)
assert result is False
@pytest.mark.asyncio
async def test_nudge_stuck_agent_exception(monkeypatch):
"""Returns False on network exception."""
import timmy.vassal.agent_health as ah
mock_client_instance = AsyncMock()
mock_client_instance.post = AsyncMock(side_effect=ConnectionError("refused"))
mock_client_instance.__aenter__ = AsyncMock(return_value=mock_client_instance)
mock_client_instance.__aexit__ = AsyncMock(return_value=False)
mock_settings = MagicMock()
mock_settings.gitea_enabled = True
mock_settings.gitea_token = "fake-token"
mock_settings.gitea_url = "http://gitea"
mock_settings.gitea_repo = "owner/repo"
with (
patch("config.settings", mock_settings),
patch("httpx.AsyncClient", return_value=mock_client_instance),
):
result = await ah.nudge_stuck_agent("claude", 10)
assert result is False
View File
@@ -2,17 +2,11 @@
from __future__ import annotations
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from timmy.vassal.backlog import AgentTarget, TriagedIssue
from timmy.vassal.dispatch import (
DispatchRecord,
_apply_label_to_issue,
_get_or_create_label,
_post_dispatch_comment,
clear_dispatch_registry,
get_dispatch_registry,
)
@@ -118,244 +112,3 @@ def test_dispatch_record_defaults():
assert r.label_applied is False
assert r.comment_posted is False
assert r.dispatched_at # has a timestamp
# ---------------------------------------------------------------------------
# _get_or_create_label
# ---------------------------------------------------------------------------
_HEADERS = {"Authorization": "token x"}
_BASE_URL = "http://gitea"
_REPO = "org/repo"
def _mock_response(status_code: int, json_data=None):
resp = MagicMock()
resp.status_code = status_code
resp.json.return_value = json_data or {}
return resp
@pytest.mark.asyncio
async def test_get_or_create_label_finds_existing():
"""Returns the ID of an existing label without creating it."""
existing = [{"name": "claude-ready", "id": 42}, {"name": "other", "id": 7}]
client = AsyncMock()
client.get.return_value = _mock_response(200, existing)
result = await _get_or_create_label(client, _BASE_URL, _HEADERS, _REPO, "claude-ready")
assert result == 42
client.post.assert_not_called()
@pytest.mark.asyncio
async def test_get_or_create_label_creates_when_missing():
"""Creates the label when it doesn't exist in the list."""
client = AsyncMock()
# GET returns empty list
client.get.return_value = _mock_response(200, [])
# POST creates label
client.post.return_value = _mock_response(201, {"id": 99})
result = await _get_or_create_label(client, _BASE_URL, _HEADERS, _REPO, "claude-ready")
assert result == 99
client.post.assert_called_once()
@pytest.mark.asyncio
async def test_get_or_create_label_returns_none_on_get_error():
"""Returns None if the GET raises an exception."""
client = AsyncMock()
client.get.side_effect = Exception("network error")
result = await _get_or_create_label(client, _BASE_URL, _HEADERS, _REPO, "claude-ready")
assert result is None
@pytest.mark.asyncio
async def test_get_or_create_label_returns_none_on_create_error():
"""Returns None if POST raises an exception."""
client = AsyncMock()
client.get.return_value = _mock_response(200, [])
client.post.side_effect = Exception("post failed")
result = await _get_or_create_label(client, _BASE_URL, _HEADERS, _REPO, "claude-ready")
assert result is None
@pytest.mark.asyncio
async def test_get_or_create_label_uses_default_color_for_unknown():
"""Unknown label name uses '#cccccc' fallback color."""
client = AsyncMock()
client.get.return_value = _mock_response(200, [])
client.post.return_value = _mock_response(201, {"id": 5})
await _get_or_create_label(client, _BASE_URL, _HEADERS, _REPO, "unknown-label")
call_kwargs = client.post.call_args
assert call_kwargs.kwargs["json"]["color"] == "#cccccc"
# ---------------------------------------------------------------------------
# _apply_label_to_issue
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_apply_label_to_issue_success():
"""Returns True when label is found and applied."""
client = AsyncMock()
client.get.return_value = _mock_response(200, [{"name": "claude-ready", "id": 10}])
client.post.return_value = _mock_response(201)
result = await _apply_label_to_issue(client, _BASE_URL, _HEADERS, _REPO, 42, "claude-ready")
assert result is True
@pytest.mark.asyncio
async def test_apply_label_to_issue_returns_false_when_no_label_id():
"""Returns False when label ID cannot be obtained."""
client = AsyncMock()
client.get.side_effect = Exception("unavailable")
result = await _apply_label_to_issue(client, _BASE_URL, _HEADERS, _REPO, 42, "claude-ready")
assert result is False
@pytest.mark.asyncio
async def test_apply_label_to_issue_returns_false_on_bad_status():
"""Returns False when the apply POST returns a non-2xx status."""
client = AsyncMock()
client.get.return_value = _mock_response(200, [{"name": "claude-ready", "id": 10}])
client.post.return_value = _mock_response(403)
result = await _apply_label_to_issue(client, _BASE_URL, _HEADERS, _REPO, 42, "claude-ready")
assert result is False
# ---------------------------------------------------------------------------
# _post_dispatch_comment
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_post_dispatch_comment_success():
"""Returns True on successful comment post."""
client = AsyncMock()
client.post.return_value = _mock_response(201)
issue = _make_triaged(7, "Some issue", AgentTarget.CLAUDE, priority=75)
result = await _post_dispatch_comment(client, _BASE_URL, _HEADERS, _REPO, issue, "claude-ready")
assert result is True
body = client.post.call_args.kwargs["json"]["body"]
assert "Claude" in body
assert "claude-ready" in body
assert "75" in body
@pytest.mark.asyncio
async def test_post_dispatch_comment_failure():
"""Returns False when comment POST returns a non-2xx status."""
client = AsyncMock()
client.post.return_value = _mock_response(500)
issue = _make_triaged(8, "Other issue", AgentTarget.KIMI)
result = await _post_dispatch_comment(client, _BASE_URL, _HEADERS, _REPO, issue, "kimi-ready")
assert result is False
# ---------------------------------------------------------------------------
# _perform_gitea_dispatch — settings-level gate
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_perform_gitea_dispatch_skips_when_disabled():
"""Does not call Gitea when gitea_enabled is False."""
import config
from timmy.vassal.dispatch import _perform_gitea_dispatch
mock_settings = SimpleNamespace(gitea_enabled=False, gitea_token="tok")
with patch.object(config, "settings", mock_settings):
issue = _make_triaged(9, "Disabled", AgentTarget.CLAUDE)
record = DispatchRecord(
issue_number=9,
issue_title="Disabled",
agent=AgentTarget.CLAUDE,
rationale="r",
)
await _perform_gitea_dispatch(issue, record)
assert record.label_applied is False
assert record.comment_posted is False
@pytest.mark.asyncio
async def test_perform_gitea_dispatch_skips_when_no_token():
"""Does not call Gitea when gitea_token is empty."""
import config
from timmy.vassal.dispatch import _perform_gitea_dispatch
mock_settings = SimpleNamespace(gitea_enabled=True, gitea_token="")
with patch.object(config, "settings", mock_settings):
issue = _make_triaged(10, "No token", AgentTarget.CLAUDE)
record = DispatchRecord(
issue_number=10,
issue_title="No token",
agent=AgentTarget.CLAUDE,
rationale="r",
)
await _perform_gitea_dispatch(issue, record)
assert record.label_applied is False
@pytest.mark.asyncio
async def test_perform_gitea_dispatch_updates_record():
"""Record is mutated to reflect label/comment success."""
import config
from timmy.vassal.dispatch import _perform_gitea_dispatch
mock_settings = SimpleNamespace(
gitea_enabled=True,
gitea_token="tok",
gitea_url="http://gitea",
gitea_repo="org/repo",
)
mock_client = AsyncMock()
# GET labels → empty list, POST create label → id 1
mock_client.get.return_value = _mock_response(200, [])
mock_client.post.side_effect = [
_mock_response(201, {"id": 1}), # create label
_mock_response(201), # apply label
_mock_response(201), # post comment
]
with (
patch.object(config, "settings", mock_settings),
patch("httpx.AsyncClient") as mock_cls,
):
mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
issue = _make_triaged(11, "Full dispatch", AgentTarget.CLAUDE)
record = DispatchRecord(
issue_number=11,
issue_title="Full dispatch",
agent=AgentTarget.CLAUDE,
rationale="r",
)
await _perform_gitea_dispatch(issue, record)
assert record.label_applied is True
assert record.comment_posted is True
View File
@@ -2,14 +2,10 @@
from __future__ import annotations
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from timmy.vassal.orchestration_loop import VassalCycleRecord, VassalOrchestrator
pytestmark = pytest.mark.unit
# ---------------------------------------------------------------------------
# VassalCycleRecord
# ---------------------------------------------------------------------------
@@ -140,186 +136,3 @@ def test_module_singleton_exists():
from timmy.vassal import VassalOrchestrator, vassal_orchestrator
assert isinstance(vassal_orchestrator, VassalOrchestrator)
# ---------------------------------------------------------------------------
# Error recovery — steps degrade gracefully
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_run_cycle_continues_when_backlog_fails():
"""A backlog step failure must not abort the cycle."""
from timmy.vassal.dispatch import clear_dispatch_registry
clear_dispatch_registry()
orch = VassalOrchestrator()
with patch(
"timmy.vassal.orchestration_loop.VassalOrchestrator._step_backlog",
new_callable=AsyncMock,
side_effect=RuntimeError("gitea down"),
):
# _step_backlog raises, but run_cycle should still complete
# (the error is caught inside run_cycle via the graceful-degrade wrapper)
# In practice _step_backlog itself catches; here we patch at a higher level
# to confirm record still finalises.
try:
record = await orch.run_cycle()
except RuntimeError:
# If the orchestrator doesn't swallow it, the test still validates
# that the cycle progressed to the patched call.
return
assert record.finished_at
assert record.cycle_id == 1
@pytest.mark.asyncio
async def test_run_cycle_records_backlog_error():
"""Backlog errors are recorded in VassalCycleRecord.errors."""
from timmy.vassal.dispatch import clear_dispatch_registry
clear_dispatch_registry()
orch = VassalOrchestrator()
with patch(
"timmy.vassal.backlog.fetch_open_issues",
new_callable=AsyncMock,
side_effect=ConnectionError("gitea unreachable"),
):
record = await orch.run_cycle()
assert any("backlog" in e for e in record.errors)
assert record.finished_at
@pytest.mark.asyncio
async def test_run_cycle_records_agent_health_error():
"""Agent health errors are recorded in VassalCycleRecord.errors."""
from timmy.vassal.dispatch import clear_dispatch_registry
clear_dispatch_registry()
orch = VassalOrchestrator()
with patch(
"timmy.vassal.agent_health.get_full_health_report",
new_callable=AsyncMock,
side_effect=RuntimeError("health check failed"),
):
record = await orch.run_cycle()
assert any("agent_health" in e for e in record.errors)
assert record.finished_at
@pytest.mark.asyncio
async def test_run_cycle_records_house_health_error():
"""House health errors are recorded in VassalCycleRecord.errors."""
from timmy.vassal.dispatch import clear_dispatch_registry
clear_dispatch_registry()
orch = VassalOrchestrator()
with patch(
"timmy.vassal.house_health.get_system_snapshot",
new_callable=AsyncMock,
side_effect=OSError("disk check failed"),
):
record = await orch.run_cycle()
assert any("house_health" in e for e in record.errors)
assert record.finished_at
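
All three error-recording tests rely on the same contract: run_cycle wraps each step in a try/except that appends a tagged message to record.errors and still stamps finished_at. A minimal sketch of that wrapper, with the step names and record fields taken from the assertions above and everything else assumed:

# Minimal sketch of the graceful-degrade pattern the tests exercise; the
# real run_cycle may structure its steps differently.
from datetime import datetime, timezone


async def _run_step(record, name: str, step_coro) -> None:
    try:
        await step_coro
    except Exception as exc:  # every step failure is survivable
        record.errors.append(f"{name}: {exc}")


# Inside run_cycle, roughly:
#   await _run_step(record, "backlog", self._step_backlog(record))
#   await _run_step(record, "agent_health", self._step_agent_health(record))
#   await _run_step(record, "house_health", self._step_house_health(record))
#   record.finished_at = datetime.now(timezone.utc)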
# ---------------------------------------------------------------------------
# Task assignment counting
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_run_cycle_counts_dispatched_issues():
"""Issues dispatched during a cycle are counted in the record."""
from timmy.vassal.backlog import AgentTarget, TriagedIssue
from timmy.vassal.dispatch import clear_dispatch_registry
clear_dispatch_registry()
orch = VassalOrchestrator(max_dispatch_per_cycle=5)
fake_issues = [
TriagedIssue(number=i, title=f"Issue {i}", body="", agent_target=AgentTarget.CLAUDE)
for i in range(1, 4)
]
with (
patch(
"timmy.vassal.backlog.fetch_open_issues",
new_callable=AsyncMock,
return_value=[{"number": i, "title": f"Issue {i}", "labels": [], "assignees": []} for i in range(1, 4)],
),
patch(
"timmy.vassal.backlog.triage_issues",
return_value=fake_issues,
),
patch(
"timmy.vassal.dispatch.dispatch_issue",
new_callable=AsyncMock,
),
):
record = await orch.run_cycle()
assert record.issues_fetched == 3
assert record.issues_dispatched == 3
assert record.dispatched_to_claude == 3
@pytest.mark.asyncio
async def test_run_cycle_respects_max_dispatch_cap():
"""Dispatch cap prevents flooding agents in a single cycle."""
from timmy.vassal.backlog import AgentTarget, TriagedIssue
from timmy.vassal.dispatch import clear_dispatch_registry
clear_dispatch_registry()
orch = VassalOrchestrator(max_dispatch_per_cycle=2)
fake_issues = [
TriagedIssue(number=i, title=f"Issue {i}", body="", agent_target=AgentTarget.CLAUDE)
for i in range(1, 6)
]
with (
patch(
"timmy.vassal.backlog.fetch_open_issues",
new_callable=AsyncMock,
return_value=[{"number": i, "title": f"Issue {i}", "labels": [], "assignees": []} for i in range(1, 6)],
),
patch(
"timmy.vassal.backlog.triage_issues",
return_value=fake_issues,
),
patch(
"timmy.vassal.dispatch.dispatch_issue",
new_callable=AsyncMock,
),
):
record = await orch.run_cycle()
assert record.issues_fetched == 5
assert record.issues_dispatched == 2 # capped
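
The counting and capping tests together imply that the backlog step slices the triaged list at max_dispatch_per_cycle before dispatching and bumps per-agent counters as it goes. Purely as a sketch (only the record attribute names come from the assertions; the rest is assumed):

# Sketch of the capping and counting behaviour exercised above; not the
# actual _step_backlog implementation.
from timmy.vassal import backlog, dispatch
from timmy.vassal.backlog import AgentTarget


async def _step_backlog_sketch(self, record) -> None:
    raw = await backlog.fetch_open_issues()
    triaged = backlog.triage_issues(raw)
    record.issues_fetched = len(triaged)

    for issue in triaged[: self.max_dispatch_per_cycle]:  # per-cycle cap
        await dispatch.dispatch_issue(issue)
        record.issues_dispatched += 1
        if issue.agent_target is AgentTarget.CLAUDE:
            record.dispatched_to_claude += 1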
# ---------------------------------------------------------------------------
# _resolve_interval
# ---------------------------------------------------------------------------
def test_resolve_interval_uses_explicit_value():
orch = VassalOrchestrator(cycle_interval=60.0)
assert orch._resolve_interval() == 60.0
def test_resolve_interval_falls_back_to_300():
    """Without an explicit cycle_interval, the 300-second default is used."""
    # No interval is supplied at construction, so the real _resolve_interval
    # (not a mock) should fall back to the assumed 300-second default.
    orch = VassalOrchestrator()
    assert orch._resolve_interval() == 300.0
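
Given those two cases, _resolve_interval presumably prefers an explicit constructor value and otherwise falls back to a 300-second default, along the lines of this sketch (any settings-based override is omitted here and would be an assumption):

# Sketch only; the real method may also consult settings before defaulting.
def _resolve_interval(self) -> float:
    if self.cycle_interval is not None:
        return self.cycle_interval  # explicit value wins
    return 300.0  # default: one cycle every five minutes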