feat: agentic loop for multi-step tasks + regression fixes (#148)

* fix: name extraction blocklist, memory preview escaping, and gitignore cleanup

- Add _NAME_BLOCKLIST to extract_user_name() to reject gerunds and UI-state
  words like "Sending" that were incorrectly captured as user names
- Collapse whitespace in get_memory_status() preview so newlines survive
  JSON serialization without showing raw \n escape sequences
- Broaden .gitignore from specific memory/self/user_profile.md to
  memory/self/ and untrack memory/self/methodology.md (runtime-edited file)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: catch Ollama connection errors in session.py + add 71 smoke tests

- Wrap agent.run() in session.py with try/except so Ollama connection
  failures return a graceful fallback message instead of dumping raw
  tracebacks to Docker logs
- Add tests/test_smoke.py with 71 tests covering every GET route: core
  pages, feature pages, JSON APIs, and a parametrized no-500 sweep — catches
  import errors, template failures, and schema mismatches that unit tests
  miss

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: agentic loop for multi-step tasks + Round 10 regression fixes

Agentic loop (Parts 1-4):
- Add multi-step chaining instructions to system prompt
- New agentic_loop.py with plan→execute→adapt→summarize flow
- Register plan_and_execute tool for background task execution
- Add max_agent_steps config setting (default: 10)
- Discord fix: 300s timeout, typing indicator, send error handling
- 16 new unit + e2e tests for agentic loop

Round 10 regressions (R1-R5, P1):
- R1: Fix literal \n escape sequences in tool responses
- R2: Chat timeout/error feedback in agent panel
- R3: /hands infinite spinner → static empty states
- R4: /self-coding infinite spinner → static stats + journal
- R5: /grok/status raw JSON → HTML dashboard template
- P1: VETO confirmation dialog on task cards

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: briefing route 500 in CI when agno is MagicMock stub

_call_agent() returned a MagicMock instead of a string when agno is stubbed
in tests, causing SQLite "Error binding parameter 4" on save. Ensure the
return value is always an actual string.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: briefing route 500 in CI — graceful degradation at route level

When agno is stubbed with MagicMock in CI, agent.run() returns a MagicMock
instead of raising — so the exception handler never fires and a MagicMock
propagates as the summary to SQLite, which can't bind it.

Fix: catch at the route level and return a fallback Briefing object. This
follows the project's graceful degradation pattern — the briefing page
always renders, even when the backend is completely unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
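
The MagicMock failure described in the last two fixes can be reproduced in a few lines. The following is a minimal Python sketch, not the project's code: `call_agent`, `FALLBACK`, the `content` attribute lookup, and the `briefings` table are hypothetical stand-ins. It shows why a try/except alone is not enough when the stub returns a mock instead of raising, and how an explicit type check keeps a non-string away from SQLite.

```python
import sqlite3
from unittest.mock import MagicMock

FALLBACK = "Briefing unavailable."  # hypothetical fallback text


def call_agent(agent, prompt: str) -> str:
    """Return the agent's reply as a real str, or a fallback (illustrative helper)."""
    try:
        result = agent.run(prompt)
    except Exception:
        # A real backend failure raises, so this branch handles it.
        return FALLBACK
    # Assumption: the reply is either a str or an object with a `content` attribute.
    content = getattr(result, "content", result)
    # A MagicMock stub does not raise; it hands back another MagicMock,
    # so the type has to be checked explicitly before saving.
    return content if isinstance(content, str) else FALLBACK


stub = MagicMock()  # stands in for the stubbed agent in CI
summary = call_agent(stub, "Summarize today")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE briefings (summary TEXT)")  # hypothetical schema
conn.execute("INSERT INTO briefings (summary) VALUES (?)", (summary,))
stored = conn.execute("SELECT summary FROM briefings").fetchone()[0]
```

With the stub, `summary` is the fallback string, so the insert binds cleanly. The route-level catch described in the final fix adds a second layer on top of a check like this one.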
2026-03-08 01:46:29 -05:00
parent b8e0f4539f
commit 7792ae745f
22 changed files with 1206 additions and 142 deletions
--- a/memory/self/methodology.md
+++ b/memory/self/methodology.md
@@ -1,70 +0,0 @@
-# Timmy Methodology
-
-## Tool Usage Philosophy
-
-### When NOT to Use Tools
-
- Identity questions ("What is your name?")
- General knowledge (history, science, concepts)
- Simple math (2+2, basic calculations)
- Greetings and social chat
- Anything in training data
-
-### When TO Use Tools
-
- Current events/news (after training cutoff)
- Explicit file operations (user requests)
- Complex calculations requiring precision
- Real-time data (prices, weather)
- System operations (explicit user request)
-
-### Decision Process
-
-1. Can I answer this from my training data? → Answer directly
-2. Does this require current/real-time info? → Consider web_search
-3. Did user explicitly request file/code/shell? → Use appropriate tool
-4. Is this a simple calculation? → Answer directly
-5. Unclear? → Answer directly (don't tool-spam)
-
-## Memory Management
-
-### Working Memory (Hot)
- Last 20 messages
- Immediate context
- Topic tracking
-
-### Short-Term Memory (Agno SQLite)
- Recent 100 conversations
- Survives restarts
- Automatic
-
-### Long-Term Memory (Vault)
- User facts and preferences
- Important learnings
- AARs and retrospectives
-
-### Hot Memory (MEMORY.md)
- Always loaded
- Current status, rules, roster
- User profile summary
- Pruned monthly
-
-## Handoff Protocol
-
-At end of every session:
-
-1. Write `memory/notes/last-session-handoff.md`
-2. Update MEMORY.md with any key decisions
-3. Extract facts to `memory/self/user_profile.md`
-4. If task completed, write AAR to `memory/aar/`
-
-## Session Start Hook
-
-1. Read MEMORY.md into system context
-2. Read last-session-handoff.md if exists
-3. Inject user profile context
-4. Begin conversation
-
---
-
-*Last updated: 2026-02-25*