* test: remove hardcoded sleeps, add pytest-timeout
  - Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
  - Add pytest-timeout dependency and --timeout=30 to prevent hangs
  - Fixes test flakiness and improves test suite speed
* feat: add Aider AI tool to Forge's toolkit
  - Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist
  - Register tool in Forge's code toolkit
  - Add functional tests for the Aider tool
* config: add opencode.json with local Ollama provider for sovereign AI
* feat: Timmy fixes and improvements
  ## Bug Fixes
  - Fix read_file path resolution: add ~ expansion, proper relative path handling
  - Add repo_root to config.py with auto-detection from .git location
  - Fix hardcoded llama3.2 - now dynamic from settings.ollama_model
  ## Timmy's Requests
  - Add communication protocol to AGENTS.md (read context first, explain changes)
  - Create DECISIONS.md for architectural decision documentation
  - Add reasoning guidance to system prompts (step-by-step, state uncertainty)
  - Update tests to reflect correct model name (llama3.1:8b-instruct)
  ## Testing
  - All 177 dashboard tests pass
  - All 32 prompt/tool tests pass
---------
Co-authored-by: Alexander Payne <apayne@MM.local>
AGENTS.md — Timmy Time Development Standards for AI Agents
Read CLAUDE.md for architecture patterns and conventions.
Communication Protocol
Before making changes, always:
- Read CLAUDE.md and AGENTS.md fully
- Explore the relevant src/ modules to understand existing patterns
- Explain what you're changing and why in plain English
- Provide decision rationale - don't just make changes, explain the reasoning
For Timmy's growth goals:
- Improve reasoning in complex/uncertain situations: think step-by-step, consider alternatives
- When uncertain, state uncertainty explicitly rather than guessing
- Document major decisions in DECISIONS.md
Non-Negotiable Rules
- Tests must stay green. Run make test before committing.
- No cloud dependencies. All AI computation runs on localhost.
- No new top-level files without purpose. Don't litter the root directory.
- Follow existing patterns — singletons, graceful degradation, pydantic-settings.
- Security defaults: Never hard-code secrets.
- XSS prevention: Never use innerHTML with untrusted content.
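The "no cloud dependencies" rule above could be enforced in code with a small guard. This is a minimal sketch, not part of the project: `LOCAL_HOSTS`, `is_local_endpoint`, and `require_local` are hypothetical names, and the allow-list is an assumption about what "localhost" should cover.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; the rule itself only says "localhost".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_endpoint(url: str) -> bool:
    """Return True when the URL's host is on the loopback allow-list."""
    # urlparse lower-cases the hostname and strips IPv6 brackets.
    host = urlparse(url).hostname
    return host in LOCAL_HOSTS


def require_local(url: str) -> str:
    """Return the URL unchanged, or raise if it points off-machine."""
    if not is_local_endpoint(url):
        raise ValueError(f"non-local AI endpoint rejected: {url}")
    return url
```

A caller might wrap the configured model URL, for example `require_local("http://localhost:11434")`, so that a misconfigured cloud endpoint fails at startup rather than silently sending data off the machine. Inside Docker the model server may be reachable under a service name instead, in which case the allow-list would need extending.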
Agent Roster
Build Tier
Local (Ollama) — Primary workhorse. Free. Unrestricted. Best for: everything, iterative dev, Docker swarm workers.
Kimi (Moonshot) — Paid. Large-context feature drops, new subsystems, persona agents. Avoid: touching CI/pyproject.toml, adding cloud calls, removing tests.
DeepSeek — Near-free. Second-opinion generation, large refactors (R1 for hard problems). Avoid: bypassing review tier for security modules.
Review Tier
Claude (Anthropic) — Architecture, tests, docs, CI/CD, PR review. Avoid: large one-shot feature dumps.
Gemini (Google) — Docs, frontend polish, boilerplate, diff summaries. Avoid: security modules, Python business logic without Claude review.
Manus AI — Security audits, coverage gaps, L402 validation. Avoid: large refactors, new features, prompt changes.
Docker Agents
Container agents poll the coordinator's HTTP API (not in-memory SwarmComms):
GET /internal/tasks → list tasks open for bidding
POST /internal/bids → submit a bid
COORDINATOR_URL=http://dashboard:8000 is set by docker-compose.
make docker-build # build image
make docker-up # start dashboard
make docker-agent # add a worker
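The two coordinator endpoints above can be sketched with the standard library alone. This is an illustration under assumptions: the helper names (`tasks_request`, `bid_request`, `send`) are hypothetical, and the bid payload fields and the JSON response format are not defined in this document.

```python
import json
import os
import urllib.request

# COORDINATOR_URL is set by docker-compose; the fallback is for local runs.
COORDINATOR_URL = os.environ.get("COORDINATOR_URL", "http://dashboard:8000")


def tasks_request(base_url: str = COORDINATOR_URL) -> urllib.request.Request:
    """Build the GET request that lists tasks open for bidding."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/internal/tasks", method="GET"
    )


def bid_request(payload: dict, base_url: str = COORDINATOR_URL) -> urllib.request.Request:
    """Build the POST request that submits a bid.

    The payload fields are an assumption; this document does not define them.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/internal/bids",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0):
    """Send a request and decode the response body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A worker loop would call `send(tasks_request())` on an interval, pick a task, and call `send(bid_request({...}))`. Building the requests separately from sending them keeps the URL and payload logic testable without a running coordinator.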
File Conventions
| Pattern | Convention |
|---|---|
| New route | src/dashboard/routes/<name>.py + register in app.py |
| New template | src/dashboard/templates/<name>.html extends base.html |
| New subsystem | Add to existing src/<package>/ — see module map in CLAUDE.md |
| New test | tests/<module>/test_<feature>.py (mirror source structure) |
| Secrets | Via config.settings + startup warning if default |
| DB files | Project root or data/ — never in src/ |
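
The Secrets row above (read via settings, warn at startup if still default) might look roughly like the following. The project uses pydantic-settings; this sketch swaps in a plain dataclass so it runs with the standard library only. `API_SECRET`, `DEFAULT_SECRET`, and the function names are hypothetical, not taken from config.py.

```python
import logging
import os
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)

# Hypothetical placeholder value, for illustration only.
DEFAULT_SECRET = "change-me"


@dataclass
class Settings:
    # Read from the environment at construction time, never from source code.
    api_secret: str = field(
        default_factory=lambda: os.environ.get("API_SECRET", DEFAULT_SECRET)
    )

    def default_secret_fields(self) -> list[str]:
        """Names of secret fields still holding the placeholder default."""
        secret_fields = ("api_secret",)
        return [name for name in secret_fields if getattr(self, name) == DEFAULT_SECRET]


def warn_on_default_secrets(settings: Settings) -> list[str]:
    """Log a startup warning per secret left at its default; return the names."""
    names = settings.default_secret_fields()
    for name in names:
        logger.warning("%s is using the default value; set it via the environment", name)
    return names
```

Calling `warn_on_default_secrets(Settings())` once during application startup surfaces an unset secret in the logs without blocking local development.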
Roadmap
v2.0 Exodus (in progress): Swarm + L402 + Voice + Marketplace + Hands
v3.0 Revelation (planned): Lightning treasury + .app bundle + federation