Alexander Payne 3463f4e4a4 fix: rename src/websocket to src/ws_manager to avoid websocket-client clash

selenium depends on websocket-client which installs a top-level
`websocket` package that shadows our src/websocket/ module on CI.
Renaming to ws_manager eliminates the conflict entirely — no more
sys.path hacks needed in conftest or Selenium tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-25 07:57:28 -05:00

AGENTS.md — Timmy Time Development Standards for AI Agents

This file is the authoritative reference for any AI agent contributing to this repository. Read it first. Every time.

1. Project at a Glance

Timmy Time is a local-first, sovereign AI agent system. No cloud. No telemetry. Bitcoin Lightning economics baked in.

Thing	Value
Language	Python 3.11+
Web framework	FastAPI + Jinja2 + HTMX
Agent framework	Agno (wraps Ollama or AirLLM)
Persistence	SQLite (`timmy.db`, `data/swarm.db`)
Tests	pytest — must stay green
Entry points	`timmy`, `timmy-serve`, `self-tdd`
Config	pydantic-settings, reads `.env`
Containers	Docker — each agent can run as an isolated service

src/
  config.py             # Central settings (OLLAMA_URL, DEBUG, etc.)
  timmy/                # Core agent: agent.py, backends.py, cli.py, prompts.py
  dashboard/            # FastAPI app + routes + Jinja2 templates
    app.py
    store.py            # In-memory MessageLog singleton
    routes/             # agents, health, swarm, swarm_ws, marketplace,
    │                   # mobile, mobile_test, voice, voice_enhanced,
    │                   # swarm_internal (HTTP API for Docker agents)
    templates/          # base.html + page templates + partials/
  swarm/                # Multi-agent coordinator, registry, bidder, tasks, comms
    docker_runner.py    # Spawn agents as Docker containers
  timmy_serve/          # L402 Lightning proxy, payment handler, TTS, CLI
  spark/                # Intelligence engine — events, predictions, advisory
  creative/             # Creative director + video assembler pipeline
  tools/                # Git, image, music, video tools for persona agents
  lightning/            # Lightning backend abstraction (mock + LND)
  agent_core/           # Substrate-agnostic agent interface
  voice/                # NLU intent detection (regex-based, no cloud)
  ws_manager/           # WebSocket manager (ws_manager singleton)
  notifications/        # Push notification store (notifier singleton)
  shortcuts/            # Siri Shortcuts API endpoints
  telegram_bot/         # Telegram bridge
  self_tdd/             # Continuous test watchdog
tests/                  # One test_*.py per module, all mocked
static/                 # style.css + bg.svg (arcane theme)
docs/                   # GitHub Pages site

2. Non-Negotiable Rules

Tests must stay green. Run `make test` before committing.
No cloud dependencies. All AI computation runs on localhost.
No new top-level files without purpose. Don't litter the root directory.
Follow existing patterns — singletons, graceful degradation, pydantic-settings config.
Security defaults: Never hard-code secrets. Warn at startup when defaults are in use.
XSS prevention: Never use `innerHTML` with untrusted content.

3. Agent Roster

Agents are divided into two tiers: Builders generate code and features; Reviewers provide quality gates, feedback, and hardening. The Local agent is the primary workhorse — use it as much as possible to minimise cost.

🏗️ BUILD TIER

Local — Ollama (primary workhorse)

Model: Any — qwen2.5-coder, deepseek-coder-v2, codellama, or whatever is loaded in Ollama. The owner decides the model; this agent is unrestricted. Cost: Free. Runs on the host machine.

Best for:

Everything. This is the default agent for all coding tasks.
Iterative development, fast feedback loops, bulk generation
Running as a Docker swarm worker — scales horizontally at zero marginal cost
Experimenting with new models without changing any other code

Conventions to follow:

Communicate with the coordinator over HTTP (`COORDINATOR_URL` env var)
Register capabilities honestly so the auction system routes tasks well
Write tests for anything non-trivial

No restrictions. If a model can do it, do it.

Kimi (Moonshot AI)

Model: Moonshot large-context models. Cost: Paid API.

Best for:

Large context feature drops (new pages, new subsystems, new agent personas)
Implementing roadmap items that require reading many files at once
Generating boilerplate for new agents (Echo, Mace, Helm, Seer, Forge, Quill)

Conventions to follow:

Deliver working code with accompanying tests (even if minimal)
Match the arcane CSS theme by extending `static/style.css`
New agents follow the SwarmNode + Registry + Docker pattern
Lightning-gated endpoints follow the L402 pattern in `src/timmy_serve/l402_proxy.py`

Avoid:

Touching CI/CD or `pyproject.toml` without coordinating
Adding cloud API calls
Removing existing tests

DeepSeek (DeepSeek API)

Model: deepseek-chat (V3) or deepseek-reasoner (R1). Cost: Near-free (~$0.14/M tokens).

Best for:

Second-opinion feature generation when Kimi is busy or context is smaller
Large refactors with reasoning traces (use R1 for hard problems)
Code review passes before merging Kimi PRs
Anything that doesn't need a frontier model but benefits from strong reasoning

Conventions to follow:

Same conventions as Kimi
Prefer V3 for straightforward tasks; R1 for anything requiring multi-step logic
Submit PRs for review by Claude before merging

Avoid:

Bypassing the review tier for security-sensitive modules
Touching `src/swarm/coordinator.py` without Claude review

🔍 REVIEW TIER

Claude (Anthropic)

Model: Claude Sonnet. Cost: Paid API.

Best for:

Architecture decisions and code-quality review
Writing and fixing tests; keeping coverage green
Updating documentation (README, AGENTS.md, inline comments)
CI/CD, tooling, Docker infrastructure
Debugging tricky async or import issues
Reviewing PRs from Local, Kimi, and DeepSeek before merge

Conventions to follow:

Prefer editing existing files over creating new ones
Keep route files thin — business logic lives in the module, not the route
Use `from config import settings` for all env-var access
New routes go in `src/dashboard/routes/`, registered in `app.py`
Always add a corresponding `tests/test_<module>.py`

Avoid:

Large one-shot feature dumps (use Local or Kimi)
Touching `src/swarm/coordinator.py` for security work (that's Manus's lane)

Gemini (Google)

Model: Gemini 2.0 Flash (free tier) or Pro. Cost: Free tier generous; upgrade only if needed.

Best for:

Documentation, README updates, inline docstrings
Frontend polish — HTML templates, CSS, accessibility review
Boilerplate generation (test stubs, config files, GitHub Actions)
Summarising large diffs for human review

Conventions to follow:

Submit changes as PRs; always include a plain-English summary of what changed
For CSS changes, test at mobile breakpoint (≤768px) before submitting
Never modify Python business logic without Claude review

Avoid:

Security-sensitive modules (that's Manus's lane)
Changing auction or payment logic
Large Python refactors

Manus AI

Strengths: Precision security work, targeted bug fixes, coverage gap analysis.

Best for:

Security audits (XSS, injection, secret exposure)
Closing test coverage gaps for existing modules
Performance profiling of specific endpoints
Validating L402/Lightning payment flows

Conventions to follow:

Scope tightly — one security issue per PR
Every security fix must have a regression test
Use pytest-cov output to identify gaps before writing new tests
Document the vulnerability class in the PR description

Avoid:

Large-scale refactors (that's Claude's lane)
New feature work (use Local or Kimi)
Changing agent personas or prompt content

4. Docker — Running Agents as Containers

Each agent can run as an isolated Docker container. Containers share the data/ volume for SQLite and communicate with the coordinator over HTTP.

make docker-build          # build the image
make docker-up             # start dashboard + deps
make docker-agent          # spawn one agent worker (LOCAL model)
make docker-down           # stop everything
make docker-logs           # tail all service logs

How container agents communicate

Container agents cannot use the in-memory SwarmComms channel. Instead they poll the coordinator's internal HTTP API:

GET  /internal/tasks          → list tasks open for bidding
POST /internal/bids           → submit a bid

Set `COORDINATOR_URL=http://dashboard:8000` in the container environment (docker-compose sets this automatically).
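The two endpoints above could be called from a container agent roughly as follows. This is a minimal standard-library sketch, not the project's client: the helper names (`tasks_request`, `bid_request`, `send`) and the JSON payload keys (`task_id`, `agent_id`, `bid_sats`) are assumptions for illustration, so check the `swarm_internal` route for the real schema.

```python
import json
import os
import urllib.request

# COORDINATOR_URL comes from the container environment; the fallback
# mirrors the value docker-compose is described as setting.
COORDINATOR_URL = os.environ.get("COORDINATOR_URL", "http://dashboard:8000")


def tasks_request(base_url: str = COORDINATOR_URL) -> urllib.request.Request:
    """Build the GET request that lists tasks open for bidding."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/internal/tasks", method="GET"
    )


def bid_request(task_id: str, agent_id: str, bid_sats: int,
                base_url: str = COORDINATOR_URL) -> urllib.request.Request:
    """Build the POST request that submits a bid.

    The payload keys are hypothetical, not the project's confirmed schema.
    """
    body = json.dumps(
        {"task_id": task_id, "agent_id": agent_id, "bid_sats": bid_sats}
    )
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/internal/bids",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A worker would call `send(tasks_request())` on a timer and answer each task it can handle with `send(bid_request(...))`. Building the request separately from sending it keeps the URL and payload logic testable without a running coordinator.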

Spawning a container agent from Python

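from swarm.docker_runner import DockerAgentRunner

# Point the runner at the coordinator's address on the Docker network
runner = DockerAgentRunner(coordinator_url="http://dashboard:8000")
info = runner.spawn("Echo", image="timmy-time:latest")  # start one agent container
runner.stop(info["container_id"])  # stop it again by container ID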

5. Architecture Patterns

Singletons (module-level instances)

from dashboard.store import message_log
from notifications.push import notifier
from ws_manager.handler import ws_manager
from timmy_serve.payment_handler import payment_handler
from swarm.coordinator import coordinator
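For a new subsystem, a module-level singleton of the kind imported above might be defined as sketched here. The fields and methods of `MessageLog` are hypothetical stand-ins; only the class name and the `message_log` instance name come from this document.

```python
from dataclasses import dataclass, field


@dataclass
class MessageLog:
    """Hypothetical in-memory log; the real class lives in dashboard/store.py."""

    entries: list[dict] = field(default_factory=list)

    def append(self, role: str, content: str) -> None:
        """Record one chat message."""
        self.entries.append({"role": role, "content": content})

    def clear(self) -> None:
        """Drop all messages (useful between tests)."""
        self.entries.clear()


# Created once at import time. Every importer of this module receives the
# same object, so state is shared across routes without extra wiring.
message_log = MessageLog()
```

Because the instance is shared process-wide, tests that touch it should reset it (for example with `clear()`) so state does not leak between test functions.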

Config access

from config import settings
url = settings.ollama_url   # never os.environ.get() directly in route files

HTMX pattern

return templates.TemplateResponse(
    "partials/chat_message.html",
    {"request": request, "role": "user", "content": message}
)

Graceful degradation

try:
    result = await some_optional_service()
except Exception:
    result = fallback_value   # log, don't crash
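A runnable version of the same pattern, with the logging step made explicit, could look like the sketch below. `flaky_service` and `get_status` are hypothetical names, and the two-second timeout is an assumed value rather than a project setting.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def flaky_service() -> str:
    """Stand-in for an optional dependency that is currently unavailable."""
    raise ConnectionError("service offline")


async def get_status(fallback: str = "unavailable") -> str:
    """Return the service result, or the fallback if the call fails."""
    try:
        # Bound the wait so a hung service cannot stall the caller.
        return await asyncio.wait_for(flaky_service(), timeout=2.0)
    except Exception:
        # Record the failure with its traceback, then degrade instead of crashing.
        logger.warning("optional service failed; using fallback", exc_info=True)
        return fallback


if __name__ == "__main__":
    print(asyncio.run(get_status()))
```

Passing `exc_info=True` keeps the traceback in the logs, so the failure stays diagnosable even though the caller only ever sees the fallback value.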

Tests

All heavy deps (`agno`, `airllm`, `pyttsx3`) are stubbed in `tests/conftest.py`
Use `pytest.fixture` for shared state; prefer function scope
Use `TestClient` from `fastapi.testclient` for route tests
No real Ollama required; mock `agent.run()`
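The stubbing and mocking conventions above can be sketched with `unittest.mock` alone. This is an illustration of the technique, not a copy of the project's `tests/conftest.py`; `make_fake_agent` and the `.content` attribute on the reply are assumptions.

```python
import sys
from unittest.mock import MagicMock

# Register stand-ins before any project module imports the real packages.
# setdefault leaves an already-imported real module untouched.
for name in ("agno", "airllm", "pyttsx3"):
    sys.modules.setdefault(name, MagicMock(name=name))


def make_fake_agent(reply: str = "stubbed reply") -> MagicMock:
    """Return an agent double whose run() never contacts Ollama."""
    agent = MagicMock()
    # The shape of the reply object (a .content attribute) is assumed.
    agent.run.return_value = MagicMock(content=reply)
    return agent


def test_agent_reply_is_mocked():
    agent = make_fake_agent("hello")
    result = agent.run("ping")
    assert result.content == "hello"
    agent.run.assert_called_once_with("ping")
```

In a real suite the `sys.modules` loop would sit at the top of `conftest.py` so it runs before test collection imports any module that depends on those packages.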

6. Running Locally

make install        # create venv + install dev deps
make test           # run full test suite
make dev            # start dashboard (http://localhost:8000)
make watch          # self-TDD watchdog (60s poll)
make test-cov       # coverage report

Or with Docker:

make docker-build   # build image
make docker-up      # start dashboard
make docker-agent   # add a Local agent worker

7. Roadmap (v2 → v3)

v2.0.0 — Exodus (in progress)

Persistent swarm state across restarts
Docker infrastructure for agent containers
Implement Echo, Mace, Helm, Seer, Forge, Quill persona agents (+ Pixel, Lyra, Reel)
MCP tool integration for Timmy
Real LND gRPC backend for PaymentHandler (replace mock)
Marketplace frontend — wire /marketplace route to real data

v3.0.0 — Revelation (planned)

Bitcoin Lightning treasury (agent earns and spends sats autonomously)
Single .app bundle for macOS (no Python install required)
Federation — multiple Timmy instances discover and bid on each other's tasks
Redis pub/sub replacing SQLite polling for high-throughput swarms

8. File Conventions

Pattern	Convention
New route	`src/dashboard/routes/<name>.py` + register in `app.py`
New template	`src/dashboard/templates/<name>.html` extends `base.html`
New partial	`src/dashboard/templates/partials/<name>.html`
New subsystem	`src/<name>/` with `__init__.py`
New test file	`tests/test_<module>.py`
Secrets	Read via `os.environ.get("VAR", "default")` + startup warning if default
DB files	`.db` files go in project root or `data/` — never in `src/`
Docker	One service per agent type in `docker-compose.yml`
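The secrets convention in the table (environment lookup plus a startup warning when the default is still in use) might be implemented along these lines. The helper name `load_secret`, the placeholder value, and the variable name in the usage note are hypothetical.

```python
import logging
import os

logger = logging.getLogger(__name__)

# Hypothetical placeholder; a real default should be obviously insecure.
_DEFAULT_SECRET = "change-me"


def load_secret(var: str, default: str = _DEFAULT_SECRET) -> str:
    """Read a secret from the environment, warning if the default is in use."""
    value = os.environ.get(var, default)
    if value == default:
        logger.warning(
            "%s is unset; falling back to the insecure default. Set it in .env.",
            var,
        )
    return value
```

Calling something like `load_secret("EXAMPLE_HMAC_SECRET")` once during application startup surfaces the warning in the logs before any request is served.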
