fix: Upgrade model to llama3.1:8b-instruct + fix git tool cwd

Change 1: Model Upgrade (Primary Fix)
- Changed default model from llama3.2 to llama3.1:8b-instruct
- llama3.1:8b-instruct is fine-tuned for reliable tool/function calling
- llama3.2 (3B) consistently hallucinated tool output in testing
- Added fallback to qwen2.5:14b if primary unavailable
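
The fallback rule above can be sketched as a small selection function. This is only an illustration: `choose_model` and the way the list of installed models is obtained are hypothetical, and the commit does not show how availability is actually checked.

```python
PRIMARY_MODEL = "llama3.1:8b-instruct"
FALLBACK_MODEL = "qwen2.5:14b"


def choose_model(installed: list[str]) -> str:
    """Return the primary model if installed, otherwise the fallback.

    `installed` is assumed to be the list of model names the local
    Ollama instance reports; how it is fetched is outside this sketch.
    """
    available = set(installed)
    for name in (PRIMARY_MODEL, FALLBACK_MODEL):
        if name in available:
            return name
    raise RuntimeError(
        f"Neither {PRIMARY_MODEL} nor {FALLBACK_MODEL} is installed; "
        f"run: ollama pull {PRIMARY_MODEL}"
    )
```

Raising when neither model is present is a design choice for this sketch; it surfaces the missing `ollama pull` step early instead of failing later during a tool call.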

Change 2: Structured Output Foundation
- Session init now loads real repository data (the recent git log) on the first message
- Prepares for JSON schema enforcement of model output

Change 3: Git Tool Working Directory Fix
- Rewrote git_tools.py to use subprocess with cwd=REPO_ROOT
- REPO_ROOT auto-detected at module load time
- All git commands now run from correct directory
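
A minimal sketch of the working-directory fix described above, assuming the repository root is found by walking upward to the nearest `.git` entry. The helper names `find_repo_root` and `run_git` are hypothetical; the actual contents of git_tools.py are not shown in this commit view.

```python
import subprocess
from pathlib import Path


def find_repo_root(start: Path) -> Path:
    """Walk upward from `start` until a directory containing .git is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    # Hypothetical fallback: use the starting directory if no repo is found.
    return start


# Assumption: the module locates the repository relative to its own file.
# The guard keeps the sketch usable where __file__ is undefined (e.g. a REPL).
_START = Path(__file__).parent if "__file__" in globals() else Path.cwd()

# Detected once at module load time, as the commit message describes.
REPO_ROOT = find_repo_root(_START)


def run_git(*args: str) -> str:
    """Run a git command from REPO_ROOT and return its stripped stdout.

    Raises FileNotFoundError if the git executable is not installed.
    """
    result = subprocess.run(
        ["git", *args],
        cwd=REPO_ROOT,  # independent of the caller's current directory
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()
```

Passing `cwd=` to `subprocess.run` avoids calling `os.chdir`, so the process-wide working directory is left untouched.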

Change 4: Session Init with Git Log
- _session_init() reads git log --oneline -15 on first message
- Recent commits prepended to system prompt
- Timmy can now answer 'what's new?' from actual commit data
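
The session-init behavior above might look roughly like the following. This is a sketch under assumptions: `recent_commits` and `build_system_prompt` are hypothetical names, and the real `_session_init()` may format the prompt differently.

```python
import subprocess


def recent_commits(repo_root: str, limit: int = 15) -> str:
    """Return `git log --oneline -<limit>` output, or "" if git fails."""
    try:
        result = subprocess.run(
            ["git", "log", "--oneline", f"-{limit}"],
            cwd=repo_root,
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing git binary, missing directory, or a hung command.
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def build_system_prompt(base_prompt: str, commit_log: str) -> str:
    """Prepend recent commits to the base prompt when any are available."""
    if not commit_log:
        return base_prompt
    return f"Recent commits:\n{commit_log}\n\n{base_prompt}"
```

Returning an empty string on failure lets the session start with the unmodified prompt rather than with an error message the model could mistake for commit data.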

Change 5: Documentation
- Updated README with new model requirement
- Added CHANGELOG_2025-02-27.md

Users must run: ollama pull llama3.1:8b-instruct

All 18 git tool tests pass.
Alexander Payne
2026-02-26 13:42:36 -05:00
parent f403d69bc1
commit d9e556d4c1
5 changed files with 688 additions and 106 deletions


@@ -59,7 +59,12 @@ make install
# 3. Start Ollama (separate terminal)
ollama serve
-ollama pull llama3.2
+ollama pull llama3.1:8b-instruct # Required for reliable tool calling
+# Note: llama3.1:8b-instruct is used instead of llama3.2 because it is
+# specifically fine-tuned for reliable tool/function calling.
+# llama3.2 (3B) was found to hallucinate tool output consistently in testing.
+# Fallback: qwen2.5:14b if llama3.1:8b-instruct is not available.
# 4. Launch dashboard
make dev
@@ -193,7 +198,7 @@ cp .env.example .env
| Variable | Default | Purpose |
|----------|---------|---------|
| `OLLAMA_URL` | `http://localhost:11434` | Ollama host |
-| `OLLAMA_MODEL` | `llama3.2` | Model served by Ollama |
+| `OLLAMA_MODEL` | `llama3.1:8b-instruct` | Model for tool calling. Use llama3.1:8b-instruct for reliable tool use; fallback to qwen2.5:14b |
| `DEBUG` | `false` | Enable `/docs` and `/redoc` |
| `TIMMY_MODEL_BACKEND` | `ollama` | `ollama` \| `airllm` \| `auto` |
| `AIRLLM_MODEL_SIZE` | `70b` | `8b` \| `70b` \| `405b` |