- **Mode 1 — Per-Job (live):** Pay per request. Eval invoice (10 sats fixed) → Haiku judges the request → work invoice (dynamic, token-based) → Sonnet executes → result delivered.
- **Mode 2 — Session (live):** Pre-fund a credit balance. Requests automatically debit actual compute cost (eval + work tokens × 1.4 margin, converted to sats at live BTC/USD). No per-job invoices once active.
> **Note for repeat runs:** Tests 7 and 8c hit `GET /api/demo`, which is rate-limited to 5 req/hr per IP. If you run the testkit more than once in the same hour from the same IP, those two checks will return 429. This is expected behaviour — the rate limiter is working correctly. Run from a fresh IP (or wait an hour) for a clean 20/20.
- Stub mode is active (no real Lightning node). `paymentHash` is exposed on GET responses so the testkit can drive the full payment flow automatically. In production (real LNbits), `paymentHash` is hidden.
-`POST /api/dev/stub/pay/:hash` is only mounted when `NODE_ENV !== 'production'`.
- State machine advances server-side on every GET poll — no webhooks.
- AI models: Haiku for eval (cheap gating), Sonnet for work (full output).
- **Pricing:** eval = 10 sats fixed. Work invoice = actual token usage (input + output) × Anthropic per-token rate × 1.4 margin, converted at live BTC/USD. This is dynamic — a 53-char request typically produces an invoice of ~180 sats, not a fixed tier. The old 50/100/250 sat fixed tiers were replaced by this model.
- Max request length: 500 chars. Rate limiter: 5 req/hr/IP on `/api/demo` (in-memory, resets on server restart).
- Session expiry: 24 hours of inactivity. Balance is forfeited on expiry. Expiry is stated in the `expiresAt` field of every session response.
- Auth: `Authorization: Bearer <macaroon>` header. Macaroon is issued on first activation (GET /sessions/:id after deposit is paid).
- Cost per request: (eval tokens + work tokens) × model rate × 1.4 margin → converted to sats. If a request starts with enough balance but actual cost pushes balance negative, the request still completes and delivers — only the *next* request is blocked.
- If balance drops below 50 sats, session transitions to `paused`. Top up via `POST /sessions/:id/topup`. Session resumes automatically on the next GET poll once the topup invoice is paid.
- The same `POST /api/dev/stub/pay/:hash` endpoint works for all invoice types (eval, work, session deposit, topup).
### Eval + work latency (important for manual testers)
The eval call uses the real Anthropic API (Haiku), typically 2–5 seconds. The testkit uses polling loops (max 30s). Manual testers should poll with similar patience. The work call (Sonnet) typically runs 3–8 seconds.