* fix: name extraction blocklist, memory preview escaping, and gitignore cleanup
- Add _NAME_BLOCKLIST to extract_user_name() to reject gerunds and UI-state
words like "Sending" that were incorrectly captured as user names
- Collapse whitespace in get_memory_status() preview so newlines survive
JSON serialization without showing raw \n escape sequences
- Broaden .gitignore from specific memory/self/user_profile.md to memory/self/
and untrack memory/self/methodology.md (runtime-edited file)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: catch Ollama connection errors in session.py + add 71 smoke tests
- Wrap agent.run() in session.py with try/except so Ollama connection
failures return a graceful fallback message instead of dumping raw
tracebacks to Docker logs
- Add tests/test_smoke.py with 71 tests covering every GET route:
core pages, feature pages, JSON APIs, and a parametrized no-500 sweep
— catches import errors, template failures, and schema mismatches
that unit tests miss
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: agentic loop for multi-step tasks + Round 10 regression fixes
Agentic loop (Parts 1-4):
- Add multi-step chaining instructions to system prompt
- New agentic_loop.py with plan→execute→adapt→summarize flow
- Register plan_and_execute tool for background task execution
- Add max_agent_steps config setting (default: 10)
- Discord fix: 300s timeout, typing indicator, send error handling
- 16 new unit + e2e tests for agentic loop
Round 10 regressions (R1-R5, P1):
- R1: Fix literal \n escape sequences in tool responses
- R2: Chat timeout/error feedback in agent panel
- R3: /hands infinite spinner → static empty states
- R4: /self-coding infinite spinner → static stats + journal
- R5: /grok/status raw JSON → HTML dashboard template
- P1: VETO confirmation dialog on task cards
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: briefing route 500 in CI when agno is MagicMock stub
_call_agent() returned a MagicMock instead of a string when agno is
stubbed in tests, causing SQLite "Error binding parameter 4" on save.
Ensure the return value is always an actual string.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: briefing route 500 in CI — graceful degradation at route level
When agno is stubbed with MagicMock in CI, agent.run() returns a
MagicMock instead of raising — so the exception handler never fires
and a MagicMock propagates as the summary to SQLite, which can't
bind it.
Fix: catch at the route level and return a fallback Briefing object.
This follows the project's graceful degradation pattern — the briefing
page always renders, even when the backend is completely unavailable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: upgrade primary model from llama3.1:8b to qwen2.5:14b
- Swap OLLAMA_MODEL_PRIMARY to qwen2.5:14b for better reasoning
- llama3.1:8b-instruct becomes fallback
- Update .env default and README quick start
- Fix hardcoded model assertions in tests
qwen2.5:14b provides significantly better multi-step reasoning
and tool calling reliability while still running locally on
modest hardware. The 8B model remains as automatic fallback.
* security: centralize config, harden uploads, fix silent exceptions
- Add 9 pydantic Settings fields (skip_embeddings, disable_csrf,
rqlite_url, brain_source, brain_db_path, csrf_cookie_secure,
chat_api_max_body_bytes, timmy_test_mode) to centralize env-var access
- Migrate 8 os.environ.get() calls across 5 source files to use
`from config import settings` per project convention
- Add path traversal defense-in-depth to file upload endpoint
- Add 1MB request body size limit to chat API
- Make CSRF cookie secure flag configurable via settings
- Replace 2 silent `except: pass` blocks with debug logging in session.py
- Remove unused `import os` from brain/memory.py and csrf.py
- Update 5 CSRF test fixtures to patch settings instead of os.environ
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Add sandboxed calculator tool to Timmy's toolkit so arithmetic questions
get exact answers instead of LLM hallucinations
- Update system prompts (lite + full) to instruct Timmy to always use the
calculator and never attempt multi-digit math in his head
- Add self-contradiction guard to both prompts ("commit to your facts")
- Render Timmy's chat responses as markdown via marked.js + DOMPurify
instead of raw escaped text
- Suppress empty briefing notification on startup when there are 0
pending approval items
- Add calculator to session response sanitizer regex
- 18 new calculator tests, 2 updated briefing notification tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Timmy was exhibiting severe incoherence (no memory between messages, tool call
leakage, chain-of-thought narration, random tool invocations) due to creating
a brand new agent per HTTP request and giving a 3B model (llama3.2) a 73-line
system prompt with complex tool-calling instructions it couldn't follow.
Key changes:
- Add session.py singleton with stable session_id for conversation continuity
- Add _model_supports_tools() to strip tools from small models (< 7B)
- Add two-tier prompts: lite (12 lines) for small models, full for capable ones
- Add response sanitizer to strip leaked JSON tool calls and CoT narration
- Set show_tool_calls=False to prevent raw tool JSON in output
- Wire ConversationManager for user name extraction
- Deprecate orphaned memory_layers.py (unused 4-layer system)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>