Add optional prompt argument to `timmy tick` so custom journal
prompts can be passed from the CLI (seed_type="prompted").
Fix extract_user_name() learning verbs as names (e.g. "Serving").
Now requires the candidate word to start with a capital letter in
the original message, rejects common verb suffixes (-ing, -tion,
etc.), and deduplicates the naive regex in TimmyWithMemory to use
the fixed ConversationManager.extract_user_name() instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Model upgrade:
- qwen2.5:14b → qwen3.5:latest across config, tools, and docs
- Added qwen3.5 to multimodal model registry
Self-hosted Gitea CI:
- .gitea/workflows/tests.yml: lint + test jobs via act_runner
- Unified Dockerfile: pre-baked deps from poetry.lock for fast CI
- sitepackages=true in tox for ~2s dep resolution (was ~40s)
- OLLAMA_URL set to dead port in CI to prevent real LLM calls
Test isolation fixes:
- Smoke test fixture mocks create_timmy (was hitting real Ollama)
- WebSocket sends initial_state before joining broadcast pool (race fix)
- Tests use settings.ollama_model/url instead of hardcoded values
- skip_ci marker for Ollama-dependent tests, excluded in CI tox envs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* polish: streamline nav, extract inline styles, improve tablet UX
- Restructure desktop nav from 8+ flat links + overflow dropdown into
5 grouped dropdowns (Core, Agents, Intel, System, More) matching
the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* fix(security): move auth-gate credentials to environment variables
Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
* refactor(tooling): migrate from black+isort+bandit to ruff
Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.
https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: remove invalid show_tool_calls kwarg crashing Agent init (regression)
show_tool_calls was removed in f95c960 (Feb 26) because agno 2.5.x
doesn't accept it, then reintroduced in fd0ede0 (Mar 8) without
runtime testing — mocked tests hid the breakage.
Replace the bogus assertion with a regression guard and an allowlist
test that catches unknown kwargs before they reach production.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: auto-install git hooks, add black/isort to dev deps
- Add .githooks/ with portable pre-commit hook (macOS + Linux)
- make install now auto-activates hooks via core.hooksPath
- Add black and isort to poetry dev group (were only in CI via raw pip)
- Fix black formatting on 2 files flagged by CI
- Fix test_autoresearch_perplexity patching wrong module path
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Wire up automatic error-to-task escalation and fix the agentic loop
stopping after the first tool call.
Auto-escalation:
- Add swarm.task_queue.models with create_task() bridge to existing
task queue SQLite DB
- Add swarm.event_log with EventType enum, log_event(), and SQLite
persistence + WebSocket broadcast
- Wire capture_error() into request logging middleware so unhandled
HTTP exceptions auto-create [BUG] tasks with stack traces, git
context, and push notifications (5-min dedup window)
Agentic loop (Round 11 Bug #1):
- Wrap agent_chat() in asyncio.to_thread() to stop blocking the
event loop (fixes Discord heartbeat warnings)
- Enable Agno's native multi-turn tool chaining via show_tool_calls
and tool_call_limit on the Agent config
- Strengthen multi-step continuation prompts with explicit examples
Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* chore: stop tracking runtime-generated self-modify reports
These 65 files in data/self_modify_reports/ are auto-generated at
runtime and already listed in .gitignore. Tracking them caused
conflicts when pulling from main.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve 8 dashboard bugs from Round 4 testing report
- Fix Ollama timeout regression: request_timeout → timeout (agno API)
- Add Bootstrap JS to base.html (fixes creative UI tab switching)
- Send initial_state on Swarm Live WebSocket connect
- Add /api/queue/status endpoint (stops 404 log spam from chat panel)
- Populate agent tools from registry on /tools page
- Add notification bell dropdown with /api/notifications endpoint
- All 1157 tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Round 2+3 bug fix batch:
1. Ollama timeout: Add request_timeout=300 to prevent socket read errors
on complex 30-60s prompts (production crash fix)
2. Memory API: Create missing HTMX partial templates (memory_facts.html,
memory_results.html) so Save/Search buttons work
3. CALM page: Add create_tables() call so SQLAlchemy tables exist on
first request (was returning HTTP 500)
4. Task Queue: Full SQLite-backed rebuild with CRUD endpoints, HTMX
partials, and action buttons (approve/veto/pause/cancel/retry)
5. Work Orders: Full SQLite-backed rebuild with submit/approve/reject/
execute pipeline and HTMX polling partials
6. Memory READ tool: Add memory_read function so Timmy stops calling
read_file when trying to recall stored facts
Also: Close GitHub issues #115, #114, #112, #110 as won't-fix.
Comment on #107 confirming prune_memories() already wired to startup.
Tests: 33 new tests across 4 test files, all passing.
Full suite: 1155 passed, 2 pre-existing failures (hands_shell).
Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Remove persona system, identity, and all Timmy references
Strip the codebase to pure orchestration logic:
- Delete TIMMY_IDENTITY.md and memory/self/identity.md
- Gut brain/identity.py to no-op stubs (empty returns)
- Remove all system prompts reinforcing Timmy's character, faith,
sovereignty, sign-off ("Sir, affirmative"), and agent roster
- Replace identity-laden prompts with generic local-AI-assistant prompts
- Remove "You work for Timmy" from all sub-agent system prompts
- Rename PersonaTools → AgentTools, PERSONA_TOOLKITS → AGENT_TOOLKITS
- Replace "timmy" agent ID with "orchestrator" across routes, marketplace,
tools catalog, and orchestrator class
- Strip Timmy references from config comments, templates, telegram bot,
chat API, and dashboard UI
- Delete tests/brain/test_identity.py entirely
- Fix all test assertions that checked for persona identity content
729 tests pass (2 pre-existing failures in test_calm.py unrelated).
https://claude.ai/code/session_01LjQGUE6nk9W9674zaxrYxy
* Add Taskosaur (PM + AI task execution) to docker-compose
Spins up Taskosaur alongside the dashboard on `docker compose up`:
- postgres:16-alpine (port 5432, Taskosaur DB)
- redis:7-alpine (Bull queue backend)
- taskosaur (ports 3000 API / 3001 UI)
- dashboard now depends_on taskosaur healthy
- TASKOSAUR_API_URL injected into dashboard environment
Dashboard can reach Taskosaur at http://taskosaur:3000/api on the
internal network. Frontend UI accessible at http://localhost:3001.
https://claude.ai/code/session_01LjQGUE6nk9W9674zaxrYxy
---------
Co-authored-by: Claude <noreply@anthropic.com>
- Add MultiModalManager with capability detection for vision/audio/tools
- Define fallback chains: vision (llama3.2:3b -> llava:7b -> moondream)
tools (llama3.1:8b-instruct -> qwen2.5:7b)
- Update CascadeRouter to detect content type and select appropriate models
- Add model pulling with automatic fallback in agent creation
- Update providers.yaml with multi-modal model configurations
- Update OllamaAdapter to use model resolution with vision support
Tests: All 96 infrastructure tests pass
- Add GrokBackend class in src/timmy/backends.py with full sync/async
support, health checks, usage stats, and cost estimation in sats
- Add consult_grok tool to Timmy's toolkit for proactive Grok queries
- Extend cascade router with Grok provider type for failover chain
- Add Grok Mode toggle card to Mission Control dashboard (HTMX live)
- Add "Ask Grok" button on chat input for direct Grok queries
- Add /grok/* routes: status, toggle, chat, stats endpoints
- Integrate Lightning invoice generation for Grok usage monetization
- Add GROK_ENABLED, XAI_API_KEY, GROK_DEFAULT_MODEL, GROK_MAX_SATS_PER_QUERY,
GROK_FREE config settings via pydantic-settings
- Update .env.example and docker-compose.yml with Grok env vars
- Add 21 tests covering backend, tools, and route endpoints (all green)
Local-first ethos preserved: Grok is premium augmentation only,
disabled by default, and Lightning-payable when enabled.
https://claude.ai/code/session_01FygwN8wS8J6WGZ8FPb7XGV
Addresses 14 bugs from 3 rounds of deep chat evaluation:
- Add chat-to-task pipeline in agents.py with regex-based intent detection,
agent extraction, priority extraction, and title cleaning
- Filter meta-questions ("how do I create a task?") from task creation
- Inject real-time date/time context into every chat message
- Inject live queue state when user asks about tasks
- Ground system prompts with agent roster, honesty guardrails, self-knowledge,
math delegation template, anti-filler rules, values-conflict guidance
- Add CSS for markdown code blocks, inline code, lists, blockquotes in chat
- Add highlight.js CDN for syntax highlighting in chat responses
- Reduce small-model memory context budget (4000→2000) for expanded prompt
- Add 27 comprehensive tests covering the full chat-to-task pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove show_tool_calls kwarg (not in Agno 2.5.3), which crashed Agent.__init__
- Guard memory_search against top_k=None from model, return formatted string
- Skip Telegram/Discord startup silently when no token configured
- Replace placeholder MEMORY.md with proper structured hot memory document
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Timmy was exhibiting severe incoherence (no memory between messages, tool call
leakage, chain-of-thought narration, random tool invocations) due to creating
a brand new agent per HTTP request and giving a 3B model (llama3.2) a 73-line
system prompt with complex tool-calling instructions it couldn't follow.
Key changes:
- Add session.py singleton with stable session_id for conversation continuity
- Add _model_supports_tools() to strip tools from small models (< 7B)
- Add two-tier prompts: lite (12 lines) for small models, full for capable ones
- Add response sanitizer to strip leaked JSON tool calls and CoT narration
- Set show_tool_calls=False to prevent raw tool JSON in output
- Wire ConversationManager for user name extraction
- Deprecate orphaned memory_layers.py (unused 4-layer system)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit replaces the previous memory_layers.py with a proper three-tier
memory system as specified by the user:
## Tier 1 — Hot Memory (MEMORY.md)
- Single flat file always loaded into system context
- Contains: current status, standing rules, agent roster, key decisions
- ~300 lines max, pruned monthly
- Managed by HotMemory class
## Tier 2 — Structured Vault (memory/)
- Directory with three namespaces:
• self/ — identity.md, user_profile.md, methodology.md
• notes/ — session logs, AARs, research
• aar/ — post-task retrospectives
- Markdown format, Obsidian-compatible
- Append-only, date-stamped
- Managed by VaultMemory class
## Handoff Protocol
- last-session-handoff.md written at session end
- Contains: summary, key decisions, open items, next steps
- Auto-loaded at next session start
- Maintains continuity across resets
## Implementation
### New Files:
- src/timmy/memory_system.py — Core memory system
- MEMORY.md — Hot memory template
- memory/self/*.md — Identity, user profile, methodology
### Modified:
- src/timmy/agent.py — Integrated with memory system
- create_timmy() injects memory context
- TimmyWithMemory class with automatic fact extraction
- tests/test_agent.py — Updated for memory context
## Key Principles
- Hot memory = small and curated
- Vault = append-only, never delete
- Handoffs = continuity mechanism
- Flat files = human-readable, portable
## Usage
All 973 tests pass.
## Security (Workset A)
- XSS: Verified templates use safe DOM methods (textContent, createElement)
- Secrets: Fail-fast in production mode when L402 secrets not set
- Environment mode: Add TIMMY_ENV (development|production) validation
## Privacy (Workset C)
- Add telemetry_enabled config (default: False for sovereign AI)
- Pass telemetry setting to Agno Agent
- Update .env.example with TELEMETRY_ENABLED and TIMMY_ENV docs
## Agent Intelligence (Workset D)
- Enhanced TIMMY_SYSTEM_PROMPT with:
- Tool usage guidelines (when to use, when not to)
- Memory awareness documentation
- Operating mode documentation
- Help reduce unnecessary tool calls for simple queries
All 895 tests pass.
Telemetry disabled by default aligns with sovereign AI vision.
- Change Toolkit.add_tool() to Toolkit.register() (method was renamed in Agno)
- Fix PythonTools method: python -> run_python_code
- Fix FileTools method: write_file -> save_file
- Fix FileTools base_dir parameter: str -> Path object
- Fix Agent tools parameter: pass Toolkit wrapped in list
These fixes resolve critical startup errors that prevented Timmy agent from initializing:
- AttributeError: 'Toolkit' object has no attribute 'add_tool'
- AttributeError: 'PythonTools' object has no attribute 'python'
- TypeError: 'Toolkit' object is not iterable
All 895 tests pass after these changes.
Quality review: Agent now fully functional with working inference, memory,
and self-awareness capabilities.
Config (src/config.py):
- pydantic-settings Settings class: OLLAMA_URL, OLLAMA_MODEL, DEBUG
- Reads from .env (gitignored) with sane defaults
- settings singleton imported by health.py and agent.py
Removes two hardcodes:
- health.py: OLLAMA_URL="http://localhost:11434" → settings.ollama_url
- agent.py: Ollama(id="llama3.2") → settings.ollama_model
app.py:
- logging.basicConfig at INFO — requests/errors now visible in terminal
- docs_url/redoc_url gated on settings.debug (off by default)
pyproject.toml:
- pydantic-settings>=2.0.0 added to main dependencies
- hatch wheel config updated to include src/config.py
.env.example: documents all three env vars with inline comments
.gitignore: add !.env.example negation so the template gets committed
.github/workflows/tests.yml: runs pytest --cov on every push/PR
(ubuntu-latest, Python 3.11, pip cache)
All 27 tests pass.
https://claude.ai/code/session_01M4L3R98N5fgXFZRvV8X9b6