# AGENTS.md: Timmy Time Development Standards for AI Agents

This file is the authoritative reference for any AI agent contributing to this
repository. Read it first. Every time.

---

## 1. Project at a Glance

**Timmy Time** is a local-first, sovereign AI agent system. No cloud. No telemetry.
Bitcoin Lightning economics baked in.

| Thing            | Value                                               |
|------------------|-----------------------------------------------------|
| Language         | Python 3.11+                                        |
| Web framework    | FastAPI + Jinja2 + HTMX                             |
| Agent framework  | Agno (wraps Ollama or AirLLM)                       |
| Persistence      | SQLite (`timmy.db`, `data/swarm.db`)                |
| Tests            | pytest (must stay green)                            |
| Entry points     | `timmy`, `timmy-serve`, `self-tdd`                  |
| Config           | pydantic-settings, reads `.env`                     |
| Containers       | Docker (each agent can run as an isolated service)  |

```
src/
  config.py            # Central settings (OLLAMA_URL, DEBUG, etc.)
  timmy/               # Core agent: agent.py, backends.py, cli.py, prompts.py
  dashboard/           # FastAPI app + routes + Jinja2 templates
    app.py
    store.py           # In-memory MessageLog singleton
    routes/            # agents, health, swarm, swarm_ws, marketplace,
    │                  #   mobile, mobile_test, voice, voice_enhanced,
    │                  #   swarm_internal (HTTP API for Docker agents)
    templates/         # base.html + page templates + partials/
  swarm/               # Multi-agent coordinator, registry, bidder, tasks, comms
    docker_runner.py   # Spawn agents as Docker containers
  timmy_serve/         # L402 Lightning proxy, payment handler, TTS, CLI
  spark/               # Intelligence engine: events, predictions, advisory
  creative/            # Creative director + video assembler pipeline
  tools/               # Git, image, music, video tools for persona agents
  lightning/           # Lightning backend abstraction (mock + LND)
  agent_core/          # Substrate-agnostic agent interface
  voice/               # NLU intent detection (regex-based, no cloud)
  ws_manager/          # WebSocket manager (ws_manager singleton)
  notifications/       # Push notification store (notifier singleton)
  shortcuts/           # Siri Shortcuts API endpoints
  telegram_bot/        # Telegram bridge
  self_tdd/            # Continuous test watchdog
tests/                 # One test_*.py per module, all mocked
static/                # style.css + bg.svg (arcane theme)
docs/                  # GitHub Pages site
```

---

## 2. Non-Negotiable Rules

1. **Tests must stay green.** Run `make test` before committing.
2. **No cloud dependencies.** All AI computation runs on localhost.
3. **No new top-level files without purpose.** Don't litter the root directory.
4. **Follow existing patterns:** singletons, graceful degradation, pydantic-settings config.
5. **Security defaults:** Never hard-code secrets. Warn at startup when defaults are in use.
6. **XSS prevention:** Never use `innerHTML` with untrusted content.

---

## 3. Agent Roster

Agents are divided into two tiers: **Builders** generate code and features;
**Reviewers** provide quality gates, feedback, and hardening. The Local agent
is the primary workhorse; use it as much as possible to minimise cost.

---

### 🏗️ BUILD TIER

---

### Local: Ollama (primary workhorse)

**Model:** Any, e.g. `qwen2.5-coder`, `deepseek-coder-v2`, `codellama`, or whatever
is loaded in Ollama. The owner decides the model; this agent is unrestricted.
**Cost:** Free. Runs on the host machine.

**Best for:**
- Everything. This is the default agent for all coding tasks.
- Iterative development, fast feedback loops, bulk generation
- Running as a Docker swarm worker (scales horizontally at zero marginal cost)
- Experimenting with new models without changing any other code

**Conventions to follow:**
- Communicate with the coordinator over HTTP (`COORDINATOR_URL` env var)
- Register capabilities honestly so the auction system routes tasks well
- Write tests for anything non-trivial

**No restrictions.** If a model can do it, do it.

---

### Kimi (Moonshot AI)

**Model:** Moonshot large-context models.
**Cost:** Paid API.

**Best for:**
- Large-context feature drops (new pages, new subsystems, new agent personas)
- Implementing roadmap items that require reading many files at once
- Generating boilerplate for new agents (Echo, Mace, Helm, Seer, Forge, Quill)

**Conventions to follow:**
- Deliver working code with accompanying tests (even if minimal)
- Match the arcane CSS theme by extending `static/style.css`
- New agents follow the `SwarmNode` + `Registry` + Docker pattern
- Lightning-gated endpoints follow the L402 pattern in `src/timmy_serve/l402_proxy.py`

**Avoid:**
- Touching CI/CD or pyproject.toml without coordinating
- Adding cloud API calls
- Removing existing tests

---

### DeepSeek (DeepSeek API)

**Model:** `deepseek-chat` (V3) or `deepseek-reasoner` (R1).
**Cost:** Near-free (~$0.14/M tokens).

**Best for:**
- Second-opinion feature generation when Kimi is busy or context is smaller
- Large refactors with reasoning traces (use R1 for hard problems)
- Code review passes before merging Kimi PRs
- Anything that doesn't need a frontier model but benefits from strong reasoning

**Conventions to follow:**
- Same conventions as Kimi
- Prefer V3 for straightforward tasks; R1 for anything requiring multi-step logic
- Submit PRs for review by Claude before merging

**Avoid:**
- Bypassing the review tier for security-sensitive modules
- Touching `src/swarm/coordinator.py` without Claude review

---

### 🔍 REVIEW TIER

---

### Claude (Anthropic)

**Model:** Claude Sonnet.
**Cost:** Paid API.

**Best for:**
- Architecture decisions and code-quality review
- Writing and fixing tests; keeping coverage green
- Updating documentation (README, AGENTS.md, inline comments)
- CI/CD, tooling, Docker infrastructure
- Debugging tricky async or import issues
- Reviewing PRs from Local, Kimi, and DeepSeek before merge

**Conventions to follow:**
- Prefer editing existing files over creating new ones
- Keep route files thin; business logic lives in the module, not the route
- Use `from config import settings` for all env-var access
- New routes go in `src/dashboard/routes/`, registered in `app.py`
- Always add a corresponding `tests/test_.py`

**Avoid:**
- Large one-shot feature dumps (use Local or Kimi)
- Touching `src/swarm/coordinator.py` for security work (that's Manus's lane)

---

### Gemini (Google)

**Model:** Gemini 2.0 Flash (free tier) or Pro.
**Cost:** Free tier generous; upgrade only if needed.

**Best for:**
- Documentation, README updates, inline docstrings
- Frontend polish: HTML templates, CSS, accessibility review
- Boilerplate generation (test stubs, config files, GitHub Actions)
- Summarising large diffs for human review

**Conventions to follow:**
- Submit changes as PRs; always include a plain-English summary of what changed
- For CSS changes, test at mobile breakpoint (≤768px) before submitting
- Never modify Python business logic without Claude review

**Avoid:**
- Security-sensitive modules (that's Manus's lane)
- Changing auction or payment logic
- Large Python refactors

---

### Manus AI

**Strengths:** Precision security work, targeted bug fixes, coverage gap analysis.

**Best for:**
- Security audits (XSS, injection, secret exposure)
- Closing test coverage gaps for existing modules
- Performance profiling of specific endpoints
- Validating L402/Lightning payment flows

**Conventions to follow:**
- Scope tightly: one security issue per PR
- Every security fix must have a regression test
- Use `pytest-cov` output to identify gaps before writing new tests
- Document the vulnerability class in the PR description

**Avoid:**
- Large-scale refactors (that's Claude's lane)
- New feature work (use Local or Kimi)
- Changing agent personas or prompt content

---

## 4. Docker: Running Agents as Containers

Each agent can run as an isolated Docker container. Containers share the
`data/` volume for SQLite and communicate with the coordinator over HTTP.

```bash
make docker-build      # build the image
make docker-up         # start dashboard + deps
make docker-agent      # spawn one agent worker (LOCAL model)
make docker-down       # stop everything
make docker-logs       # tail all service logs
```

### How container agents communicate

Container agents cannot use the in-memory `SwarmComms` channel. Instead they
poll the coordinator's internal HTTP API:

```
GET  /internal/tasks    → list tasks open for bidding
POST /internal/bids     → submit a bid
```

Set `COORDINATOR_URL=http://dashboard:8000` in the container environment
(docker-compose sets this automatically).

### Spawning a container agent from Python

```python
from swarm.docker_runner import DockerAgentRunner

runner = DockerAgentRunner(coordinator_url="http://dashboard:8000")
info = runner.spawn("Echo", image="timmy-time:latest")   # start an "Echo" agent container
runner.stop(info["container_id"])                        # stop it by container ID
```

---

## 5. Architecture Patterns

### Singletons (module-level instances)

```python
from dashboard.store import message_log
from notifications.push import notifier
from ws_manager.handler import ws_manager
from timmy_serve.payment_handler import payment_handler
from swarm.coordinator import coordinator
```

### Config access

```python
from config import settings

url = settings.ollama_url   # never os.environ.get() directly in route files
```

### HTMX pattern

```python
return templates.TemplateResponse(
    "partials/chat_message.html",
    {"request": request, "role": "user", "content": message}
)
```

### Graceful degradation

```python
import logging

logger = logging.getLogger(__name__)

try:
    result = await some_optional_service()
except Exception:
    logger.exception("optional service failed; using fallback")
    result = fallback_value   # log, don't crash
```

### Tests

- All heavy deps (`agno`, `airllm`, `pyttsx3`) are stubbed in `tests/conftest.py`
- Use `pytest.fixture` for shared state; prefer function scope
- Use `TestClient` from `fastapi.testclient` for route tests
- No real Ollama required; mock `agent.run()`

---

## 6. Running Locally

```bash
make install       # create venv + install dev deps
make test          # run full test suite
make dev           # start dashboard (http://localhost:8000)
make watch         # self-TDD watchdog (60s poll)
make test-cov      # coverage report
```

Or with Docker:

```bash
make docker-build  # build image
make docker-up     # start dashboard
make docker-agent  # add a Local agent worker
```

---

## 7. Roadmap (v2 → v3)

**v2.0.0: Exodus (in progress)**
- [x] Persistent swarm state across restarts
- [x] Docker infrastructure for agent containers
- [x] Implement Echo, Mace, Helm, Seer, Forge, Quill persona agents (+ Pixel, Lyra, Reel)
- [x] MCP tool integration for Timmy
- [ ] Real LND gRPC backend for `PaymentHandler` (replace mock)
- [ ] Marketplace frontend: wire `/marketplace` route to real data

**v3.0.0: Revelation (planned)**
- [ ] Bitcoin Lightning treasury (agent earns and spends sats autonomously)
- [ ] Single `.app` bundle for macOS (no Python install required)
- [ ] Federation: multiple Timmy instances discover and bid on each other's tasks
- [ ] Redis pub/sub replacing SQLite polling for high-throughput swarms

---

## 8. File Conventions

| Pattern | Convention |
|---------|-----------|
| New route | `src/dashboard/routes/.py` + register in `app.py` |
| New template | `src/dashboard/templates/.html` extends `base.html` |
| New partial | `src/dashboard/templates/partials/.html` |
| New subsystem | `src//` with `__init__.py` |
| New test file | `tests/test_.py` |
| Secrets | Read via `os.environ.get("VAR", "default")` + startup warning if default |
| DB files | `.db` files go in project root or `data/`, never in `src/` |
| Docker | One service per agent type in `docker-compose.yml` |
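
The Secrets convention in the table above (read via `os.environ.get` and warn at startup when the default is still in use) could look roughly like the sketch below. The helper name `read_secret`, the logger name, and the variable `TIMMY_EXAMPLE_SECRET` are hypothetical and chosen only for illustration; the repository's real helpers and variable names may differ.

```python
import logging
import os

# Hypothetical logger name; the project may use a different one.
logger = logging.getLogger("timmy.secrets")


def read_secret(name, default, env=None):
    """Read a secret from the environment and warn if the default is in use.

    `env` is injectable so tests can pass a plain dict instead of os.environ.
    """
    env = os.environ if env is None else env
    value = env.get(name, default)
    if value == default:
        # Log only the variable name, never the secret value itself.
        logger.warning(
            "%s is using its built-in default; set it in .env before deploying",
            name,
        )
    return value


# Usage at startup (hypothetical variable name and default):
example_secret = read_secret("TIMMY_EXAMPLE_SECRET", "changeme")
```

Passing `env` explicitly keeps the helper testable without touching the real process environment, which fits the "all mocked" test policy described earlier.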