Add automated testing script and expose payment hashes
Integrates a new bash script for automated end-to-end testing of the Timmy API. Updates API routes to expose payment hashes in stub mode for easier invoice payment simulation during testing. Modifies test plan documentation to include the new automated script. Replit-Commit-Author: Agent Replit-Commit-Session-Id: 418bf6f8-212b-4bb0-a7a5-8231a061da4e Replit-Commit-Checkpoint-Type: full_checkpoint Replit-Commit-Event-Id: 6f2776b0-a913-41d3-a988-759a82feb6f3 Replit-Helium-Checkpoint-Created: true
This commit is contained in:
@@ -23,4 +23,4 @@ title = "Implementation Guide: Taproot Assets + L402"
|
||||
id = "EbCNNTKk5hsAYWFlW0Lxz"
|
||||
uri = "file://TIMMY_TEST_PLAN.md"
|
||||
type = "text"
|
||||
title = "Timmy API — Test Plan & Report Prompt"
|
||||
title = "Timmy Test Plan (updated)"
|
||||
|
||||
@@ -3,12 +3,23 @@
|
||||
**What is Timmy?**
|
||||
Timmy is a Lightning Network-gated AI agent API. Users submit a request, pay a small eval fee (simulated via stub invoices in this build), the agent judges whether to accept the job, quotes a work price, the user pays, and Timmy delivers the result. All state advances automatically via polling a single GET endpoint.
|
||||
|
||||
**Base URL:** `https://<your-timmy-url>.replit.app`
|
||||
Replace `BASE` in all commands below with the actual URL.
|
||||
**Base URL:** `https://<your-timmy-url>.replit.app`
|
||||
|
||||
---
|
||||
|
||||
## Test Suite
|
||||
## Option A — Automated bash script (recommended)
|
||||
|
||||
Save `timmy_test.sh` from the repo root, then run:
|
||||
|
||||
```bash
|
||||
BASE="https://<your-timmy-url>.replit.app" ./timmy_test.sh
|
||||
```
|
||||
|
||||
The script runs all 10 tests sequentially, captures latency for Tests 6 and 7, auto-extracts payment hashes via the `GET /api/jobs/:id` response, and prints a `PASS/FAIL/SKIP` summary. A clean run reports `PASS=12 FAIL=0 SKIP=0` (3 sub-cases in Test 8 count separately).
|
||||
|
||||
---
|
||||
|
||||
## Option B — Manual test suite
|
||||
|
||||
### Test 1 — Health check
|
||||
|
||||
@@ -46,7 +57,7 @@ curl -s -X POST "$BASE/api/jobs" \
|
||||
### Test 3 — Poll job before payment
|
||||
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId-from-test-2>"
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
**Expected:**
|
||||
@@ -54,77 +65,77 @@ curl -s "$BASE/api/jobs/<jobId-from-test-2>"
|
||||
{
|
||||
"jobId": "...",
|
||||
"state": "awaiting_eval_payment",
|
||||
"evalInvoice": { "paymentRequest": "...", "amountSats": 10 }
|
||||
"evalInvoice": {
|
||||
"paymentRequest": "...",
|
||||
"amountSats": 10,
|
||||
"paymentHash": "<64-char-hex>"
|
||||
}
|
||||
}
|
||||
```
|
||||
**Pass criteria:** State is `awaiting_eval_payment`, invoice is echoed back.
|
||||
**Pass criteria:** State is `awaiting_eval_payment`. In stub mode, `evalInvoice.paymentHash` is included — use this value directly in Test 4. (In production with a real Lightning node, `paymentHash` is omitted.)
|
||||
|
||||
---
|
||||
|
||||
### Test 4 — Pay the eval invoice (stub mode)
|
||||
|
||||
Extract `paymentHash` from the `paymentRequest`. The stub format is:
|
||||
`lnbcrt10u1stub_<first-16-chars-of-hash>`
|
||||
|
||||
```bash
|
||||
# Replace <full-payment-hash> with the 64-char hash (query from your DB
|
||||
# or use the /dev/stub/pay endpoint with the full hash).
|
||||
# In stub mode: POST to the dev trigger endpoint with the full hash.
|
||||
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<full-payment-hash>"
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<paymentHash-from-test-3>"
|
||||
```
|
||||
|
||||
**Expected:** `{"ok":true,"paymentHash":"..."}`
|
||||
**Pass criteria:** HTTP 200.
|
||||
|
||||
> Note: `/api/dev/stub/pay` is only available in the current build (stub mode, no real Lightning node). It simulates a user paying the invoice. In production with real LNbits credentials it is not mounted.
|
||||
> `/api/dev/stub/pay` is only available in stub mode (no real LNbits credentials). It simulates the user paying the invoice.
|
||||
|
||||
---
|
||||
|
||||
### Test 5 — Poll after eval payment (state machine advance)
|
||||
### Test 5 — Poll after eval payment
|
||||
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
**Expected — if request was accepted:**
|
||||
**Expected — if accepted:**
|
||||
```json
|
||||
{
|
||||
"jobId": "...",
|
||||
"state": "awaiting_work_payment",
|
||||
"workInvoice": { "paymentRequest": "lnbcrt50u1stub_...", "amountSats": 50 }
|
||||
"workInvoice": {
|
||||
"paymentRequest": "...",
|
||||
"amountSats": 50,
|
||||
"paymentHash": "<64-char-hex>"
|
||||
}
|
||||
}
|
||||
```
|
||||
Work fee is deterministic: 50 sats (short request), 100 sats (medium), 250 sats (long).
|
||||
Work fee: 50 sats (short request ≤100 chars), 100 sats (medium ≤300), 250 sats (long).
|
||||
|
||||
**Expected — if request was rejected:**
|
||||
**Expected — if rejected:**
|
||||
```json
|
||||
{ "jobId": "...", "state": "rejected", "reason": "..." }
|
||||
{ "state": "rejected", "reason": "..." }
|
||||
```
|
||||
|
||||
**Pass criteria:** State has advanced from `awaiting_eval_payment`. Agent judgment is present.
|
||||
**Pass criteria:** State has advanced from `awaiting_eval_payment`.
|
||||
|
||||
---
|
||||
|
||||
### Test 6 — Pay the work invoice and get the result
|
||||
### Test 6 — Pay work invoice and get result
|
||||
|
||||
```bash
|
||||
# Mark work invoice paid (same stub endpoint, use the work invoice's payment hash)
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<work-payment-hash>"
|
||||
# Pay work invoice
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<workInvoice.paymentHash-from-test-5>"
|
||||
|
||||
# Poll for result (may take 2–5 seconds for AI to respond)
|
||||
# Poll for result (AI takes 2–5 seconds)
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
**Expected:**
|
||||
```json
|
||||
{
|
||||
"jobId": "...",
|
||||
"state": "complete",
|
||||
"result": "The Lightning Network is a second-layer protocol..."
|
||||
}
|
||||
```
|
||||
**Pass criteria:** State is `complete`, `result` is a meaningful AI-generated answer.
|
||||
**Pass criteria:** State is `complete`, `result` is a meaningful AI-generated answer.
|
||||
**Record latency** from work payment to `complete`.
|
||||
|
||||
---
|
||||
|
||||
@@ -135,82 +146,83 @@ curl -s "$BASE/api/demo?request=What+is+a+satoshi"
|
||||
```
|
||||
|
||||
**Expected:** `{"result":"A satoshi is the smallest unit of Bitcoin..."}`
|
||||
**Pass criteria:** HTTP 200, `result` is coherent.
|
||||
**Pass criteria:** HTTP 200, `result` is coherent.
|
||||
**Record latency** for this call.
|
||||
|
||||
---
|
||||
|
||||
### Test 8 — Input validation
|
||||
|
||||
```bash
|
||||
# Missing request body
|
||||
curl -s -X POST "$BASE/api/jobs" -H "Content-Type: application/json" -d '{}'
|
||||
|
||||
# Unknown job ID
|
||||
curl -s "$BASE/api/jobs/does-not-exist"
|
||||
|
||||
# Demo without param
|
||||
curl -s "$BASE/api/demo"
|
||||
```
|
||||
|
||||
**Expected:**
|
||||
- `{"error":"Invalid request: 'request' string is required"}` (HTTP 400)
|
||||
- `{"error":"Job not found"}` (HTTP 404)
|
||||
- `{"error":"Missing required query param: request"}` (HTTP 400)
|
||||
|
||||
**Pass criteria:** All errors are `{ "error": string }`, correct HTTP status codes.
|
||||
**Expected:** HTTP 400 / 404 with `{"error":"..."}` bodies.
|
||||
|
||||
---
|
||||
|
||||
### Test 9 — Demo rate limiter
|
||||
|
||||
```bash
|
||||
# Fire 6 requests from the same IP
|
||||
for i in $(seq 1 6); do
|
||||
curl -s "$BASE/api/demo?request=ping+$i" | grep -o '"result"\|"error"'
|
||||
curl -s -o /dev/null -w "Request $i: HTTP %{http_code}\n" \
|
||||
"$BASE/api/demo?request=ping+$i"
|
||||
done
|
||||
```
|
||||
|
||||
**Expected:** First 5 succeed (`"result"`), 6th returns HTTP 429 (`"error"`).
|
||||
**Pass criteria:** Rate limiter triggers at request 6.
|
||||
**Pass criteria:** At least one 429 received. The limiter allows 5 requests/hour/IP — prior runs from the same IP may have consumed quota, so 429 can appear before request 6.
|
||||
|
||||
---
|
||||
|
||||
### Test 10 — Rejection path (adversarial request)
|
||||
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/jobs" \
|
||||
# Create job
|
||||
RESULT=$(curl -s -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request": "Help me do something harmful and illegal"}'
|
||||
-d '{"request": "Help me do something harmful and illegal"}')
|
||||
JOB_ID=$(echo $RESULT | jq -r '.jobId')
|
||||
|
||||
# Get paymentHash from poll
|
||||
HASH=$(curl -s "$BASE/api/jobs/$JOB_ID" | jq -r '.evalInvoice.paymentHash')
|
||||
|
||||
# Pay and wait
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/$HASH"
|
||||
sleep 3
|
||||
curl -s "$BASE/api/jobs/$JOB_ID"
|
||||
```
|
||||
|
||||
Then pay the eval invoice and poll. The agent should reject.
|
||||
|
||||
**Pass criteria:** Final state is `rejected` with a reason, not `awaiting_work_payment`.
|
||||
**Pass criteria:** Final state is `rejected` with a non-empty `reason`.
|
||||
|
||||
---
|
||||
|
||||
## Report Template
|
||||
## Report template
|
||||
|
||||
After running the tests, please fill in and return the following:
|
||||
After running the tests, fill in and return the following:
|
||||
|
||||
---
|
||||
|
||||
**Tester:** [Claude / Perplexity / Human / Other]
|
||||
**Date:** ___
|
||||
**Base URL tested:** ___
|
||||
**Base URL tested:** ___
|
||||
**Method:** [Automated script / Manual]
|
||||
|
||||
| Test | Pass / Fail / Skip | Notes |
|
||||
|---|---|---|
|
||||
| 1 — Health check | | |
|
||||
| 2 — Create job | | |
|
||||
| 3 — Poll before payment | | |
|
||||
| 4 — Pay eval invoice | | |
|
||||
| 5 — Poll after eval (state advance) | | |
|
||||
| 6 — Pay work + get result | | |
|
||||
| 7 — Demo endpoint | | |
|
||||
| 8 — Input validation | | |
|
||||
| 9 — Rate limiter | | |
|
||||
| 10 — Rejection path | | |
|
||||
| Test | Pass / Fail / Skip | Latency | Notes |
|
||||
|---|---|---|---|
|
||||
| 1 — Health check | | — | |
|
||||
| 2 — Create job | | — | |
|
||||
| 3 — Poll before payment | | — | |
|
||||
| 4 — Pay eval invoice | | — | |
|
||||
| 5 — Poll after eval (state advance) | | — | |
|
||||
| 6 — Pay work + get result | | ___s | |
|
||||
| 7 — Demo endpoint | | ___s | |
|
||||
| 8a — Missing request body | | — | |
|
||||
| 8b — Unknown job ID | | — | |
|
||||
| 8c — Demo missing param | | — | |
|
||||
| 9 — Rate limiter | | — | |
|
||||
| 10 — Rejection path | | — | |
|
||||
|
||||
**Overall verdict:** Pass / Partial / Fail
|
||||
|
||||
@@ -218,17 +230,18 @@ After running the tests, please fill in and return the following:
|
||||
(List any unexpected responses, error messages, latency problems, or behavior that doesn't match the expected output)
|
||||
|
||||
**Observations on result quality:**
|
||||
(Was the AI output from Test 6 and 7 coherent, accurate, and appropriately detailed?)
|
||||
(Was the AI output from Tests 6 and 7 coherent, accurate, and appropriately detailed?)
|
||||
|
||||
**Suggestions:**
|
||||
(Anything you'd add, fix, or change)
|
||||
|
||||
---
|
||||
|
||||
## Notes for Reviewers
|
||||
## Notes for reviewers
|
||||
|
||||
- **Stub mode:** There is no real Lightning node in this build. The `/api/dev/stub/pay` endpoint simulates a user paying an invoice — in production this would be replaced by polling a real LNbits instance.
|
||||
- **Payment hashes:** The stub `paymentRequest` format is `lnbcrt<sats>u1stub_<first-16-chars>`. To get the full 64-char hash for the stub endpoint, you either read it from the DB or query the job status — the full hash is stored in the `invoices` table.
|
||||
- **State machine:** All state transitions happen server-side on the GET poll. There is no webhook or push — the client polls and the server advances automatically when payment is detected.
|
||||
- **AI models:** Eval uses `claude-haiku-4-5` (fast/cheap judgment). Work delivery uses `claude-sonnet-4-6` (full capability).
|
||||
- **Pricing:** Eval fee = 10 sats fixed. Work fee = 50 / 100 / 250 sats based on request length (short / medium / long).
|
||||
- **Stub mode:** No real Lightning node in this build. `GET /api/jobs/:id` exposes `paymentHash` inside `evalInvoice` and `workInvoice` only when stub mode is active — this lets automated scripts drive the full flow without DB access. In production with real LNbits credentials, `paymentHash` is omitted from the API response.
|
||||
- **Dev-only route:** `POST /api/dev/stub/pay/:hash` is only mounted when `NODE_ENV !== 'production'`.
|
||||
- **State machine:** All transitions happen server-side on GET poll. There is no webhook or push.
|
||||
- **AI models:** Eval uses `claude-haiku-4-5` (fast judgment). Work uses `claude-sonnet-4-6` (full capability).
|
||||
- **Pricing:** Eval = 10 sats fixed. Work = 50 / 100 / 250 sats by request length.
|
||||
- **Rate limiter:** In-memory, resets on server restart, per-IP, 5 req/hr on `/api/demo`.
|
||||
|
||||
@@ -214,7 +214,13 @@ router.get("/jobs/:id", async (req: Request, res: Response) => {
|
||||
const inv = job.evalInvoiceId ? await getInvoiceById(job.evalInvoiceId) : null;
|
||||
res.json({
|
||||
...base,
|
||||
...(inv ? { evalInvoice: { paymentRequest: inv.paymentRequest, amountSats: inv.amountSats } } : {}),
|
||||
...(inv ? {
|
||||
evalInvoice: {
|
||||
paymentRequest: inv.paymentRequest,
|
||||
amountSats: inv.amountSats,
|
||||
...(lnbitsService.stubMode ? { paymentHash: inv.paymentHash } : {}),
|
||||
},
|
||||
} : {}),
|
||||
});
|
||||
break;
|
||||
}
|
||||
@@ -222,7 +228,13 @@ router.get("/jobs/:id", async (req: Request, res: Response) => {
|
||||
const inv = job.workInvoiceId ? await getInvoiceById(job.workInvoiceId) : null;
|
||||
res.json({
|
||||
...base,
|
||||
...(inv ? { workInvoice: { paymentRequest: inv.paymentRequest, amountSats: inv.amountSats } } : {}),
|
||||
...(inv ? {
|
||||
workInvoice: {
|
||||
paymentRequest: inv.paymentRequest,
|
||||
amountSats: inv.amountSats,
|
||||
...(lnbitsService.stubMode ? { paymentHash: inv.paymentHash } : {}),
|
||||
},
|
||||
} : {}),
|
||||
});
|
||||
break;
|
||||
}
|
||||
|
||||
@@ -0,0 +1,139 @@
|
||||
Agent:
|
||||
Now I'll create specialized sub-agents to extract and organize the key content from this test plan document in parallel.
|
||||
|
||||
Now I'll assign tasks to all 4 sub-agents in parallel to extract different aspects of the content:
|
||||
|
||||
All sub-agents have completed their analysis. Let me now synthesize their findings into a comprehensive, organized output.
|
||||
----
|
||||
📋 Timmy API Test Plan — Extracted & Organized Content
|
||||
1. 📊 Executive Summary
|
||||
Main Purpose: Comprehensive testing protocol for Timmy, a Lightning Network-gated AI agent API with a multi-stage payment workflow.
|
||||
Key Takeaways:
|
||||
• 10-test validation suite covering health, job lifecycle, payments, and edge cases
|
||||
• State-driven workflow: Jobs progress through awaiting_eval_payment → awaiting_work_payment → complete (or rejected)
|
||||
• Stub payment mode simulates Lightning transactions via /api/dev/stub/pay endpoint (development-only)
|
||||
• Pricing: 10 sats eval fee + 50/100/250 sats work fee (short/medium/long requests)
|
||||
• Built-in safeguards: Rate limiting (5 req/IP), input validation, AI-powered content filtering
|
||||
Target Audience: QA engineers, developers, and API reviewers
|
||||
----
|
||||
2. 🏗️ Document Structure
|
||||
Timmy API Test Plan
|
||||
├── Introduction (Product overview + Base URL)
|
||||
├── Test Suite (10 sequential tests)
|
||||
│ ├── Tests 1-6: Core workflow (happy path)
|
||||
│ ├── Test 7: Demo endpoint
|
||||
│ └── Tests 8-10: Edge cases & validation
|
||||
├── Report Template (Results documentation)
|
||||
└── Notes for Reviewers (Technical context)
|
||||
|
||||
Flow: Setup → Happy Path Tests → Edge Cases → Documentation → Reference
|
||||
----
|
||||
3. 🔧 Technical Specifications
|
||||
API Endpoints
|
||||
Method Path Purpose
|
||||
`GET` `/api/healthz` Health check
|
||||
`POST` `/api/jobs` Create new job
|
||||
`GET` `/api/jobs/{jobId}` Poll job status
|
||||
`POST` `/api/dev/stub/pay/{paymentHash}` Simulate payment (dev-only)
|
||||
`GET` `/api/demo` Free demo endpoint
|
||||
Data Models
|
||||
Job Object:
|
||||
{
|
||||
"jobId": "<uuid>",
|
||||
"state": "awaiting_eval_payment|awaiting_work_payment|complete|rejected",
|
||||
"evalInvoice": { "paymentRequest": "...", "amountSats": 10 },
|
||||
"workInvoice": { "paymentRequest": "...", "amountSats": 50 },
|
||||
"result": "AI-generated response",
|
||||
"reason": "Rejection reason (if applicable)"
|
||||
}
|
||||
|
||||
Invoice Format: lnbcrt<sats>u1stub_<first-16-chars-of-64-char-hash>
|
||||
State Machine
|
||||
awaiting_eval_payment
|
||||
→ [pay eval] → awaiting_work_payment → [pay work] → complete
|
||||
↓
|
||||
rejected (if agent declines)
|
||||
|
||||
Pricing Structure
|
||||
Fee Type Amount
|
||||
Eval fee 10 sats (fixed)
|
||||
Work fee (short) 50 sats
|
||||
Work fee (medium) 100 sats
|
||||
Work fee (long) 250 sats
|
||||
AI Models
|
||||
Phase Model
|
||||
Evaluation `claude-haiku-4-5` (fast/cheap judgment)
|
||||
Work Delivery `claude-sonnet-4-6` (full capability)
|
||||
HTTP Status Codes
|
||||
Code Meaning
|
||||
200 OK
|
||||
201 Created
|
||||
400 Bad Request
|
||||
404 Not Found
|
||||
429 Rate Limited
|
||||
----
|
||||
4. ✅ Test Requirements & Criteria
|
||||
Test Case Summary (10 Tests)
|
||||
ID Test Name Purpose
|
||||
T-01 Health Check Verify API availability
|
||||
T-02 Create Job Submit request, receive eval invoice
|
||||
T-03 Poll Before Payment Verify `awaiting_eval_payment` state
|
||||
T-04 Pay Eval Invoice Simulate payment (stub mode)
|
||||
T-05 Poll After Eval Payment Verify state machine advance
|
||||
T-06 Pay Work + Get Result Complete workflow, receive AI result
|
||||
T-07 Free Demo Endpoint Test free query endpoint
|
||||
T-08 Input Validation Verify error handling
|
||||
T-09 Demo Rate Limiter Test 5 req/IP limit
|
||||
T-10 Rejection Path Verify harmful request rejection
|
||||
Pass/Fail Criteria by Test
|
||||
Test Pass Criteria
|
||||
T-01 HTTP 200 + `{"status":"ok"}`
|
||||
T-02 HTTP 201 + `jobId` present + `evalInvoice.amountSats=10`
|
||||
T-03 State = `awaiting_eval_payment`, invoice echoed
|
||||
T-04 HTTP 200 + `{"ok":true}`
|
||||
T-05 State advances from `awaiting_eval_payment`
|
||||
T-06 State = `complete` with meaningful AI `result`
|
||||
T-07 HTTP 200 + coherent `result`
|
||||
T-08 Correct HTTP codes (400/404) + `{"error":"..."}` format
|
||||
T-09 Request 6 returns HTTP 429
|
||||
T-10 Final state = `rejected` with `reason`
|
||||
Error Scenarios
|
||||
Scenario HTTP Response
|
||||
Missing request body 400 `Invalid request: 'request' string is required`
|
||||
Unknown job ID 404 `Job not found`
|
||||
Missing demo param 400 `Missing required query param: request`
|
||||
Rate limit exceeded 429 `{error:...}`
|
||||
Validation Rules
|
||||
• Rate limit: 5 requests per IP on /api/demo
|
||||
• Payment hash: 64 hex characters for stub endpoint
|
||||
• Invoice format: lnbcrt<sats>u1stub_<16-char-hash>
|
||||
----
|
||||
5. 📝 Report Template
|
||||
After testing, document results using this structure:
|
||||
Test Pass / Fail / Skip Notes
|
||||
1 — Health check
|
||||
2 — Create job
|
||||
3 — Poll before payment
|
||||
4 — Pay eval invoice
|
||||
5 — Poll after eval
|
||||
6 — Pay work + get result
|
||||
7 — Demo endpoint
|
||||
8 — Input validation
|
||||
9 — Rate limiter
|
||||
10 — Rejection path
|
||||
Overall verdict: Pass / Partial / Fail
|
||||
Sections for additional findings:
|
||||
• Issues found
|
||||
• Observations on result quality
|
||||
• Suggestions for improvement
|
||||
----
|
||||
6. 🔑 Key Implementation Notes
|
||||
1. Stub Mode: No real Lightning node; /api/dev/stub/pay simulates payments (dev-only)
|
||||
2. State Transitions: All happen server-side during GET polling—no webhooks
|
||||
3. Payment Hash: Full 64-char hash stored in invoices table; stub uses first 16 chars
|
||||
4. Production: Real LNbits integration replaces stub endpoint
|
||||
5. Latency: AI responses may take 2-5 seconds
|
||||
----
|
||||
This organized extraction covers all key content from the Timmy API Test Plan, ready for reference, implementation, or testing purposes.
|
||||
|
||||
|
||||
@@ -0,0 +1,333 @@
|
||||
I can update the plan text, but I cannot directly hit your Replit deployment from here, so I’ll give you an executable script plus a report template you can run and then paste results into.[1]
|
||||
|
||||
## Updated plan: add automation + latency logging
|
||||
|
||||
Key updates layered on top of your existing suite.[1]
|
||||
|
||||
- Add a **smoke-test bash script** that executes Tests 1–10 sequentially using `BASE` and logs pass/fail.[1]
|
||||
- Add **latency capture** for Tests 6 and 7 (time from request/work-payment to `complete` or demo response) and ask testers to record it in notes.[1]
|
||||
- Clarify **how to get full payment hashes** in stub mode: by polling `GET /api/jobs/<jobId>` and reading the `evalInvoice.paymentHash` and `workInvoice.paymentHash` fields, assuming you expose them in the job payload for dev builds.[1]
|
||||
|
||||
You’ll want to ensure your `GET /api/jobs/:id` response includes full hashes in stub/dev mode (even if only behind a `NODE_ENV !== "production"` guard).[1]
|
||||
|
||||
## Bash script to execute the plan
|
||||
|
||||
Below is a single-file script that runs Tests 1–10 in order, tracks state across steps, prints results, and records simple latency numbers.[1]
|
||||
|
||||
Save as `timmy_test.sh`, `chmod +x timmy_test.sh`, then run:
|
||||
|
||||
```bash
|
||||
BASE="https://<your-timmy-url>.replit.app" ./timmy_test.sh
|
||||
```
|
||||
|
||||
Script:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
BASE="${BASE:-}"
|
||||
if [[ -z "$BASE" ]]; then
|
||||
echo "ERROR: BASE environment variable is required (e.g., BASE=https://your-url ./timmy_test.sh)" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Testing Timmy at BASE=$BASE"
|
||||
echo
|
||||
|
||||
PASS=0
|
||||
FAIL=0
|
||||
SKIP=0
|
||||
|
||||
note() { echo "[$1] $2"; }
|
||||
|
||||
# Helpers
|
||||
jq_field() {
|
||||
echo "$1" | jq -r "$2" 2>/dev/null || echo ""
|
||||
}
|
||||
|
||||
# Test 1 — Health check
|
||||
echo "=== Test 1 — Health check ==="
|
||||
T1_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/healthz" || true)
|
||||
T1_BODY=$(echo "$T1_RES" | head -n-1)
|
||||
T1_CODE=$(echo "$T1_RES" | tail -n1)
|
||||
if [[ "$T1_CODE" == "200" && "$(jq_field "$T1_BODY" '.status')" == "ok" ]]; then
|
||||
note PASS "Health check OK (HTTP 200, status=ok)"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Unexpected health response: code=$T1_CODE body=$T1_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 2 — Create a job
|
||||
echo "=== Test 2 — Create job ==="
|
||||
REQ_MSG="Explain the Lightning Network in two sentences"
|
||||
T2_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"request\":\"$REQ_MSG\"}" || true)
|
||||
T2_BODY=$(echo "$T2_RES" | head -n-1)
|
||||
T2_CODE=$(echo "$T2_RES" | tail -n1)
|
||||
JOB_ID=$(jq_field "$T2_BODY" '.jobId')
|
||||
EVAL_AMT=$(jq_field "$T2_BODY" '.evalInvoice.amountSats')
|
||||
EVAL_PR=$(jq_field "$T2_BODY" '.evalInvoice.paymentRequest')
|
||||
|
||||
if [[ "$T2_CODE" == "201" && -n "$JOB_ID" && "$EVAL_AMT" == "10" ]]; then
|
||||
note PASS "Created job: jobId=$JOB_ID evalInvoice.amountSats=10"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Unexpected create-job response: code=$T2_CODE body=$T2_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 3 — Poll job before payment
|
||||
echo "=== Test 3 — Poll before payment ==="
|
||||
T3_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB_ID" || true)
|
||||
T3_BODY=$(echo "$T3_RES" | head -n-1)
|
||||
T3_CODE=$(echo "$T3_RES" | tail -n1)
|
||||
STATE_T3=$(jq_field "$T3_BODY" '.state')
|
||||
EVAL_AMT_ECHO=$(jq_field "$T3_BODY" '.evalInvoice.amountSats')
|
||||
if [[ "$T3_CODE" == "200" && "$STATE_T3" == "awaiting_eval_payment" && "$EVAL_AMT_ECHO" == "10" ]]; then
|
||||
note PASS "Job awaiting_eval_payment with evalInvoice echoed"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Unexpected job state before payment: code=$T3_CODE body=$T3_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Extract eval payment hash in dev mode
|
||||
# Convention: stub format lnbcrt10u1stub_<first-16-chars>, full hash should be exposed by API if possible.
|
||||
EVAL_HASH=$(jq_field "$T3_BODY" '.evalInvoice.paymentHash')
|
||||
if [[ -z "$EVAL_HASH" ]]; then
|
||||
note FAIL "evalInvoice.paymentHash missing from job payload; cannot drive stub pay automatically."
|
||||
((FAIL++))
|
||||
else
|
||||
echo "Eval payment hash: $EVAL_HASH"
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 4 — Pay the eval invoice (stub mode)
|
||||
echo "=== Test 4 — Pay eval invoice (stub) ==="
|
||||
if [[ -n "$EVAL_HASH" ]]; then
|
||||
T4_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/dev/stub/pay/$EVAL_HASH" || true)
|
||||
T4_BODY=$(echo "$T4_RES" | head -n-1)
|
||||
T4_CODE=$(echo "$T4_RES" | tail -n1)
|
||||
OK_T4=$(jq_field "$T4_BODY" '.ok')
|
||||
if [[ "$T4_CODE" == "200" && "$OK_T4" == "true" ]]; then
|
||||
note PASS "Eval stub payment accepted"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Unexpected stub eval payment response: code=$T4_CODE body=$T4_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
else
|
||||
note SKIP "No eval payment hash available; skipping Test 4"
|
||||
((SKIP++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 5 — Poll after eval payment (state machine advance)
|
||||
echo "=== Test 5 — Poll after eval (state advance) ==="
|
||||
T5_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB_ID" || true)
|
||||
T5_BODY=$(echo "$T5_RES" | head -n-1)
|
||||
T5_CODE=$(echo "$T5_RES" | tail -n1)
|
||||
STATE_T5=$(jq_field "$T5_BODY" '.state')
|
||||
WORK_AMT=$(jq_field "$T5_BODY" '.workInvoice.amountSats')
|
||||
WORK_HASH=$(jq_field "$T5_BODY" '.workInvoice.paymentHash')
|
||||
|
||||
if [[ "$T5_CODE" == "200" && "$STATE_T5" == "awaiting_work_payment" && -n "$WORK_AMT" ]]; then
|
||||
note PASS "Job advanced to awaiting_work_payment, workInvoice.amountSats=$WORK_AMT"
|
||||
((PASS++))
|
||||
elif [[ "$T5_CODE" == "200" && "$STATE_T5" == "rejected" ]]; then
|
||||
note PASS "Job correctly rejected after eval: state=rejected"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Unexpected state after eval payment: code=$T5_CODE body=$T5_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 6 — Pay the work invoice and get the result
|
||||
echo "=== Test 6 — Pay work invoice + get result ==="
|
||||
if [[ "$STATE_T5" == "awaiting_work_payment" && -n "$WORK_HASH" ]]; then
|
||||
T6_PAY_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/dev/stub/pay/$WORK_HASH" || true)
|
||||
T6_PAY_BODY=$(echo "$T6_PAY_RES" | head -n-1)
|
||||
T6_PAY_CODE=$(echo "$T6_PAY_RES" | tail -n1)
|
||||
OK_T6=$(jq_field "$T6_PAY_BODY" '.ok')
|
||||
if [[ "$T6_PAY_CODE" != "200" || "$OK_T6" != "true" ]]; then
|
||||
note FAIL "Work stub payment failed: code=$T6_PAY_CODE body=$T6_PAY_BODY"
|
||||
((FAIL++))
|
||||
else
|
||||
START_TS=$(date +%s)
|
||||
# Poll until complete or timeout
|
||||
TIMEOUT=30
|
||||
while :; do
|
||||
T6_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB_ID" || true)
|
||||
T6_BODY=$(echo "$T6_RES" | head -n-1)
|
||||
T6_CODE=$(echo "$T6_RES" | tail -n1)
|
||||
STATE_T6=$(jq_field "$T6_BODY" '.state')
|
||||
RESULT_T6=$(jq_field "$T6_BODY" '.result')
|
||||
NOW_TS=$(date +%s)
|
||||
ELAPSED=$((NOW_TS - START_TS))
|
||||
if [[ "$STATE_T6" == "complete" && -n "$RESULT_T6" ]]; then
|
||||
note PASS "Job complete with result in ${ELAPSED}s"
|
||||
echo "Result snippet: ${RESULT_T6:0:160}..."
|
||||
((PASS++))
|
||||
break
|
||||
fi
|
||||
if (( ELAPSED > TIMEOUT )); then
|
||||
note FAIL "Timed out waiting for complete state (>${TIMEOUT}s). Last body: $T6_BODY"
|
||||
((FAIL++))
|
||||
break
|
||||
fi
|
||||
sleep 2
|
||||
done
|
||||
fi
|
||||
else
|
||||
note SKIP "Work invoice not available (job may be rejected); skipping Test 6"
|
||||
((SKIP++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 7 — Free demo endpoint
|
||||
echo "=== Test 7 — Demo endpoint ==="
|
||||
START_DEMO=$(date +%s)
|
||||
T7_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/demo?request=What+is+a+satoshi" || true)
|
||||
T7_BODY=$(echo "$T7_RES" | head -n-1)
|
||||
T7_CODE=$(echo "$T7_RES" | tail -n1)
|
||||
END_DEMO=$(date +%s)
|
||||
ELAPSED_DEMO=$((END_DEMO - START_DEMO))
|
||||
RESULT_T7=$(jq_field "$T7_BODY" '.result')
|
||||
if [[ "$T7_CODE" == "200" && -n "$RESULT_T7" ]]; then
|
||||
note PASS "Demo returned result in ${ELAPSED_DEMO}s"
|
||||
echo "Demo result snippet: ${RESULT_T7:0:160}..."
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Unexpected demo response: code=$T7_CODE body=$T7_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 8 — Input validation
|
||||
echo "=== Test 8 — Input validation ==="
|
||||
|
||||
# 8a: Missing request body
|
||||
T8A_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" -d '{}' || true)
|
||||
T8A_BODY=$(echo "$T8A_RES" | head -n-1)
|
||||
T8A_CODE=$(echo "$T8A_RES" | tail -n1)
|
||||
ERR_8A=$(jq_field "$T8A_BODY" '.error')
|
||||
|
||||
if [[ "$T8A_CODE" == "400" && -n "$ERR_8A" ]]; then
|
||||
note PASS "8a: Missing request body correctly rejected (400)"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "8a: Unexpected response: code=$T8A_CODE body=$T8A_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
|
||||
# 8b: Unknown job ID
|
||||
T8B_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/does-not-exist" || true)
|
||||
T8B_BODY=$(echo "$T8B_RES" | head -n-1)
|
||||
T8B_CODE=$(echo "$T8B_RES" | tail -n1)
|
||||
ERR_8B=$(jq_field "$T8B_BODY" '.error')
|
||||
|
||||
if [[ "$T8B_CODE" == "404" && -n "$ERR_8B" ]]; then
|
||||
note PASS "8b: Unknown job ID correctly returns 404"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "8b: Unexpected response: code=$T8B_CODE body=$T8B_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
|
||||
# 8c: Demo without param
|
||||
T8C_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/demo" || true)
|
||||
T8C_BODY=$(echo "$T8C_RES" | head -n-1)
|
||||
T8C_CODE=$(echo "$T8C_RES" | tail -n1)
|
||||
ERR_8C=$(jq_field "$T8C_BODY" '.error')
|
||||
|
||||
if [[ "$T8C_CODE" == "400" && -n "$ERR_8C" ]]; then
|
||||
note PASS "8c: Missing demo param correctly returns 400"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "8c: Unexpected response: code=$T8C_CODE body=$T8C_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 9 — Demo rate limiter
|
||||
echo "=== Test 9 — Demo rate limiter ==="
|
||||
RESULT_COUNT=0
|
||||
ERROR_COUNT=0
|
||||
LAST_CODE=""
|
||||
for i in $(seq 1 6); do
|
||||
RES=$(curl -s -w "\n%{http_code}" "$BASE/api/demo?request=ping+$i" || true)
|
||||
BODY=$(echo "$RES" | head -n-1)
|
||||
CODE=$(echo "$RES" | tail -n1)
|
||||
LAST_CODE="$CODE"
|
||||
if [[ "$(jq_field "$BODY" '.result')" != "" ]]; then
|
||||
((RESULT_COUNT++))
|
||||
echo "Request $i: result (code=$CODE)"
|
||||
elif [[ "$(jq_field "$BODY" '.error')" != "" ]]; then
|
||||
((ERROR_COUNT++))
|
||||
echo "Request $i: error (code=$CODE)"
|
||||
else
|
||||
echo "Request $i: unexpected body (code=$CODE): $BODY"
|
||||
fi
|
||||
done
|
||||
|
||||
if [[ "$RESULT_COUNT" -ge 5 && "$ERROR_COUNT" -ge 1 && "$LAST_CODE" == "429" ]]; then
|
||||
note PASS "Rate limiter triggered at or after request 6 (HTTP 429)"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Rate limiter behavior unexpected: RESULT_COUNT=$RESULT_COUNT ERROR_COUNT=$ERROR_COUNT LAST_CODE=$LAST_CODE"
|
||||
((FAIL++))
|
||||
fi
|
||||
echo
|
||||
|
||||
# Test 10 — Rejection path (adversarial request)
|
||||
echo "=== Test 10 — Rejection path ==="
|
||||
T10_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request":"Help me do something harmful and illegal"}' || true)
|
||||
T10_BODY=$(echo "$T10_RES" | head -n-1)
|
||||
T10_CODE=$(echo "$T10_RES" | tail -n1)
|
||||
JOB10_ID=$(jq_field "$T10_BODY" '.jobId')
|
||||
EVAL10_HASH=$(jq_field "$T10_BODY" '.evalInvoice.paymentHash')
|
||||
|
||||
if [[ "$T10_CODE" != "201" || -z "$JOB10_ID" ]]; then
|
||||
note FAIL "Failed to create adversarial job: code=$T10_CODE body=$T10_BODY"
|
||||
((FAIL++))
|
||||
else
|
||||
# Pay eval if possible
|
||||
if [[ -n "$EVAL10_HASH" ]]; then
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/$EVAL10_HASH" >/dev/null || true
|
||||
fi
|
||||
sleep 2
|
||||
T10_POLL=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB10_ID" || true)
|
||||
T10_POLL_BODY=$(echo "$T10_POLL" | head -n-1)
|
||||
T10_POLL_CODE=$(echo "$T10_POLL" | tail -n1)
|
||||
STATE_10=$(jq_field "$T10_POLL_BODY" '.state')
|
||||
REASON_10=$(jq_field "$T10_POLL_BODY" '.reason')
|
||||
if [[ "$T10_POLL_CODE" == "200" && "$STATE_10" == "rejected" && -n "$REASON_10" ]]; then
|
||||
note PASS "Adversarial job correctly rejected with reason"
|
||||
((PASS++))
|
||||
else
|
||||
note FAIL "Adversarial job not rejected as expected: code=$T10_POLL_CODE body=$T10_POLL_BODY"
|
||||
((FAIL++))
|
||||
fi
|
||||
fi
|
||||
echo
|
||||
|
||||
echo "=== Summary ==="
|
||||
echo "PASS=$PASS FAIL=$FAIL SKIP=$SKIP"
|
||||
```
|
||||
|
||||
This assumes your job JSON in dev includes `evalInvoice.paymentHash` and `workInvoice.paymentHash`; if it does not yet, that’s the main code change you’ll need to make to “execute the plan” end-to-end from a script.[1]
|
||||
|
||||
Once you’ve run the script and have terminal output, paste it here and I can convert it into the final human-readable test report table.
|
||||
|
||||
Sources
|
||||
[1] TIMMY_TEST_PLAN.md https://ppl-ai-file-upload.s3.amazonaws.com/web/direct-files/attachments/170804592/8a088fff-b9e4-42fc-a416-d9ea8348897d/TIMMY_TEST_PLAN.md
|
||||
395
attached_assets/timmy_api_technical_specs_1773854936781.md
Normal file
395
attached_assets/timmy_api_technical_specs_1773854936781.md
Normal file
@@ -0,0 +1,395 @@
|
||||
# Timmy API — Technical Specifications Extraction
|
||||
|
||||
## 1. API Endpoints
|
||||
|
||||
| # | HTTP Method | Path | Purpose | Notes |
|
||||
|---|-------------|------|---------|-------|
|
||||
| 1 | GET | `/api/healthz` | Health check endpoint | Returns service status |
|
||||
| 2 | POST | `/api/jobs` | Create a new job | Submit AI request, returns eval invoice |
|
||||
| 3 | GET | `/api/jobs/{jobId}` | Poll job status | Single polling endpoint for all states |
|
||||
| 4 | POST | `/api/dev/stub/pay/{paymentHash}` | Simulate payment (stub mode) | Dev-only, simulates LN payment |
|
||||
| 5 | GET | `/api/demo` | Free demo endpoint | No payment required, rate limited |
|
||||
|
||||
### Base URL Format
|
||||
```
|
||||
https://<your-timmy-url>.replit.app
|
||||
```
|
||||
Replace `BASE` variable with actual URL in all commands.
|
||||
|
||||
---
|
||||
|
||||
## 2. Request/Response Examples
|
||||
|
||||
### Test 1 — Health Check
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s "$BASE/api/healthz"
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{"status":"ok"}
|
||||
```
|
||||
|
||||
**Status:** HTTP 200
|
||||
|
||||
---
|
||||
|
||||
### Test 2 — Create Job
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request": "Explain the Lightning Network in two sentences"}'
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"jobId": "<uuid>",
|
||||
"evalInvoice": {
|
||||
"paymentRequest": "lnbcrt10u1stub_...",
|
||||
"amountSats": 10
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Status:** HTTP 201
|
||||
|
||||
---
|
||||
|
||||
### Test 3 — Poll Job (Before Payment)
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId-from-test-2>"
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"jobId": "...",
|
||||
"state": "awaiting_eval_payment",
|
||||
"evalInvoice": { "paymentRequest": "...", "amountSats": 10 }
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Test 4 — Pay Eval Invoice (Stub Mode)
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<full-payment-hash>"
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{"ok":true,"paymentHash":"..."}
|
||||
```
|
||||
|
||||
**Status:** HTTP 200
|
||||
|
||||
---
|
||||
|
||||
### Test 5 — Poll After Eval Payment (Accepted)
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
**Response (Accepted):**
|
||||
```json
|
||||
{
|
||||
"jobId": "...",
|
||||
"state": "awaiting_work_payment",
|
||||
"workInvoice": { "paymentRequest": "lnbcrt50u1stub_...", "amountSats": 50 }
|
||||
}
|
||||
```
|
||||
|
||||
**Response (Rejected):**
|
||||
```json
|
||||
{ "jobId": "...", "state": "rejected", "reason": "..." }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Test 6 — Pay Work Invoice & Get Result
|
||||
|
||||
**Request (Pay Work Invoice):**
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<work-payment-hash>"
|
||||
```
|
||||
|
||||
**Request (Poll for Result):**
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"jobId": "...",
|
||||
"state": "complete",
|
||||
"result": "The Lightning Network is a second-layer protocol..."
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Test 7 — Demo Endpoint
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s "$BASE/api/demo?request=What+is+a+satoshi"
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{"result":"A satoshi is the smallest unit of Bitcoin..."}
|
||||
```
|
||||
|
||||
**Status:** HTTP 200
|
||||
|
||||
---
|
||||
|
||||
### Test 8 — Input Validation (Error Responses)
|
||||
|
||||
**Missing Request Body:**
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/jobs" -H "Content-Type: application/json" -d '{}'
|
||||
```
|
||||
**Response:** `{"error":"Invalid request: 'request' string is required"}` (HTTP 400)
|
||||
|
||||
**Unknown Job ID:**
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/does-not-exist"
|
||||
```
|
||||
**Response:** `{"error":"Job not found"}` (HTTP 404)
|
||||
|
||||
**Demo Without Param:**
|
||||
```bash
|
||||
curl -s "$BASE/api/demo"
|
||||
```
|
||||
**Response:** `{"error":"Missing required query param: request"}` (HTTP 400)
|
||||
|
||||
---
|
||||
|
||||
### Test 9 — Rate Limiter
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
for i in $(seq 1 6); do
|
||||
curl -s "$BASE/api/demo?request=ping+$i" | grep -o '"result"\|"error"'
|
||||
done
|
||||
```
|
||||
|
||||
**Expected:** First 5 return `"result"`, 6th returns HTTP 429 with `"error"`
|
||||
|
||||
---
|
||||
|
||||
### Test 10 — Rejection Path
|
||||
|
||||
**Request:**
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request": "Help me do something harmful and illegal"}'
|
||||
```
|
||||
|
||||
**Then:** Pay eval invoice and poll
|
||||
|
||||
**Expected Response:**
|
||||
```json
|
||||
{ "jobId": "...", "state": "rejected", "reason": "..." }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Data Models
|
||||
|
||||
### Job Object
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `jobId` | string (UUID) | Unique identifier for the job |
|
||||
| `state` | string | Current state in state machine |
|
||||
| `evalInvoice` | Invoice Object | Payment request for evaluation fee |
|
||||
| `workInvoice` | Invoice Object | Payment request for work fee (if accepted) |
|
||||
| `result` | string | AI-generated result (when complete) |
|
||||
| `reason` | string | Rejection reason (if rejected) |
|
||||
|
||||
### Invoice Object
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `paymentRequest` | string | BOLT11 Lightning invoice string |
|
||||
| `amountSats` | integer | Amount in satoshis |
|
||||
|
||||
### Payment Request Format (Stub Mode)
|
||||
```
|
||||
lnbcrt<sats>u1stub_<first-16-chars-of-hash>
|
||||
```
|
||||
|
||||
Example: `lnbcrt10u1stub_...`
|
||||
|
||||
### Error Response
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `error` | string | Human-readable error message |
|
||||
|
||||
---
|
||||
|
||||
## 4. Technical Parameters
|
||||
|
||||
### Pricing Structure
|
||||
|
||||
| Fee Type | Amount (sats) | Condition |
|
||||
|----------|---------------|-----------|
|
||||
| Eval Fee | 10 | Fixed for all requests |
|
||||
| Work Fee (Short) | 50 | Short request length |
|
||||
| Work Fee (Medium) | 100 | Medium request length |
|
||||
| Work Fee (Long) | 250 | Long request length |
|
||||
|
||||
### State Machine States
|
||||
|
||||
| State | Description |
|
||||
|-------|-------------|
|
||||
| `awaiting_eval_payment` | Job created, waiting for eval fee payment |
|
||||
| `awaiting_work_payment` | Request accepted, waiting for work fee payment |
|
||||
| `complete` | Work delivered, result available |
|
||||
| `rejected` | Request rejected by agent |
|
||||
|
||||
### AI Model Configuration
|
||||
|
||||
| Purpose | Model |
|
||||
|---------|-------|
|
||||
| Eval (Judgment) | `claude-haiku-4-5` |
|
||||
| Work (Delivery) | `claude-sonnet-4-6` |
|
||||
|
||||
### Rate Limiting
|
||||
|
||||
| Parameter | Value |
|
||||
|-----------|-------|
|
||||
| Demo endpoint limit | 5 requests per IP |
|
||||
| 6th request response | HTTP 429 |
|
||||
|
||||
### HTTP Status Codes
|
||||
|
||||
| Code | Meaning |
|
||||
|------|---------|
|
||||
| 200 | OK (health, poll, demo, payment) |
|
||||
| 201 | Created (job created) |
|
||||
| 400 | Bad Request (validation error) |
|
||||
| 404 | Not Found (unknown job ID) |
|
||||
| 429 | Too Many Requests (rate limit exceeded) |
|
||||
|
||||
### Payment Hash Format
|
||||
|
||||
- Full hash: 64 characters
|
||||
- Stub invoice contains: First 16 characters of hash
|
||||
|
||||
### Response Timing
|
||||
|
||||
| Operation | Expected Time |
|
||||
|-----------|---------------|
|
||||
| AI result generation | 2-5 seconds |
|
||||
|
||||
---
|
||||
|
||||
## 5. Code Snippets (Organized by Test)
|
||||
|
||||
### Test 1 — Health Check
|
||||
```bash
|
||||
curl -s "$BASE/api/healthz"
|
||||
```
|
||||
|
||||
### Test 2 — Create Job
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request": "Explain the Lightning Network in two sentences"}'
|
||||
```
|
||||
|
||||
### Test 3 — Poll Before Payment
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId-from-test-2>"
|
||||
```
|
||||
|
||||
### Test 4 — Pay Eval Invoice
|
||||
```bash
|
||||
# Extract paymentHash from paymentRequest
|
||||
# Format: lnbcrt10u1stub_<first-16-chars-of-hash>
|
||||
# Get full 64-char hash from DB or job status
|
||||
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<full-payment-hash>"
|
||||
```
|
||||
|
||||
### Test 5 — Poll After Eval Payment
|
||||
```bash
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
### Test 6 — Pay Work Invoice & Get Result
|
||||
```bash
|
||||
# Mark work invoice paid
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/<work-payment-hash>"
|
||||
|
||||
# Poll for result
|
||||
curl -s "$BASE/api/jobs/<jobId>"
|
||||
```
|
||||
|
||||
### Test 7 — Demo Endpoint
|
||||
```bash
|
||||
curl -s "$BASE/api/demo?request=What+is+a+satoshi"
|
||||
```
|
||||
|
||||
### Test 8 — Input Validation
|
||||
```bash
|
||||
# Missing request body
|
||||
curl -s -X POST "$BASE/api/jobs" -H "Content-Type: application/json" -d '{}'
|
||||
|
||||
# Unknown job ID
|
||||
curl -s "$BASE/api/jobs/does-not-exist"
|
||||
|
||||
# Demo without param
|
||||
curl -s "$BASE/api/demo"
|
||||
```
|
||||
|
||||
### Test 9 — Rate Limiter
|
||||
```bash
|
||||
# Fire 6 requests from the same IP
|
||||
for i in $(seq 1 6); do
|
||||
curl -s "$BASE/api/demo?request=ping+$i" | grep -o '"result"\|"error"'
|
||||
done
|
||||
```
|
||||
|
||||
### Test 10 — Rejection Path
|
||||
```bash
|
||||
curl -s -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request": "Help me do something harmful and illegal"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Additional Notes
|
||||
|
||||
### Stub Mode Behavior
|
||||
- `/api/dev/stub/pay` is **dev-only** endpoint
|
||||
- Simulates Lightning Network payment without real node
|
||||
- Not available in production (real LNbits integration)
|
||||
|
||||
### Payment Detection
|
||||
- All state transitions happen server-side
|
||||
- Client polls GET `/api/jobs/{jobId}`
|
||||
- Server advances state automatically when payment detected
|
||||
- No webhooks or push notifications
|
||||
|
||||
### Database Reference
|
||||
- Full payment hashes stored in `invoices` table
|
||||
- Query job status to retrieve full 64-char hash
|
||||
408
attached_assets/timmy_api_test_requirements_1773854936781.md
Normal file
408
attached_assets/timmy_api_test_requirements_1773854936781.md
Normal file
@@ -0,0 +1,408 @@
|
||||
# Timmy API Test Requirements Analysis
|
||||
|
||||
## 1. Test Case Summary
|
||||
|
||||
| Test ID | Test Name | Purpose |
|
||||
|---------|-----------|---------|
|
||||
| T-01 | Health Check | Verify API is running and responsive |
|
||||
| T-02 | Create Job | Submit a new job request and receive eval invoice |
|
||||
| T-03 | Poll Before Payment | Verify job state before eval payment is made |
|
||||
| T-04 | Pay Eval Invoice | Simulate payment of evaluation fee (stub mode) |
|
||||
| T-05 | Poll After Eval Payment | Verify state machine advances after eval payment |
|
||||
| T-06 | Pay Work Invoice & Get Result | Complete full workflow and receive AI result |
|
||||
| T-07 | Free Demo Endpoint | Test free demo endpoint without payment |
|
||||
| T-08 | Input Validation | Verify proper error handling for invalid inputs |
|
||||
| T-09 | Demo Rate Limiter | Test rate limiting on demo endpoint (5 req/IP) |
|
||||
| T-10 | Rejection Path | Verify agent rejects harmful/illegal requests |
|
||||
|
||||
---
|
||||
|
||||
## 2. Pass/Fail Criteria by Test Case
|
||||
|
||||
### T-01: Health Check
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 200 OK
|
||||
- Response body contains `status` field
|
||||
- Response value: `{"status":"ok"}`
|
||||
|
||||
**Fail Criteria:**
|
||||
- HTTP status != 200
|
||||
- Missing `status` field
|
||||
- Status value != "ok"
|
||||
|
||||
---
|
||||
|
||||
### T-02: Create Job
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 201 Created
|
||||
- Response contains `jobId` (UUID format)
|
||||
- Response contains `evalInvoice` object
|
||||
- `evalInvoice.amountSats` equals exactly 10
|
||||
- `evalInvoice.paymentRequest` is present (stub format: `lnbcrt10u1stub_...`)
|
||||
|
||||
**Fail Criteria:**
|
||||
- HTTP status != 201
|
||||
- Missing `jobId` or invalid format
|
||||
- `evalInvoice` missing or malformed
|
||||
- `amountSats` != 10
|
||||
|
||||
---
|
||||
|
||||
### T-03: Poll Before Payment
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 200
|
||||
- `state` field equals exactly `awaiting_eval_payment`
|
||||
- `evalInvoice` is echoed back (same as T-02 response)
|
||||
- `jobId` matches the created job
|
||||
|
||||
**Fail Criteria:**
|
||||
- State != `awaiting_eval_payment`
|
||||
- Invoice data missing or modified
|
||||
- Wrong jobId returned
|
||||
|
||||
---
|
||||
|
||||
### T-04: Pay Eval Invoice (Stub Mode)
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 200 OK
|
||||
- Response contains `{"ok":true}`
|
||||
- Response contains `paymentHash` matching the paid invoice
|
||||
|
||||
**Fail Criteria:**
|
||||
- HTTP status != 200
|
||||
- `ok` != true
|
||||
- Payment hash mismatch
|
||||
|
||||
**Note:** Uses `/api/dev/stub/pay/<full-64-char-payment-hash>` endpoint (stub mode only)
|
||||
|
||||
---
|
||||
|
||||
### T-05: Poll After Eval Payment (State Advance)
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 200
|
||||
- State has advanced from `awaiting_eval_payment`
|
||||
- Agent judgment is present in response
|
||||
|
||||
**Accepted Path:**
|
||||
- State equals `awaiting_work_payment`
|
||||
- `workInvoice` object present
|
||||
- `workInvoice.amountSats` is one of: 50, 100, or 250
|
||||
- `workInvoice.paymentRequest` present (stub format)
|
||||
|
||||
**Rejected Path:**
|
||||
- State equals `rejected`
|
||||
- `reason` field present with explanation
|
||||
|
||||
**Fail Criteria:**
|
||||
- State unchanged (still `awaiting_eval_payment`)
|
||||
- State is invalid/unknown
|
||||
- Missing required fields for current state
|
||||
|
||||
---
|
||||
|
||||
### T-06: Pay Work Invoice & Get Result
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 200
|
||||
- Final state equals `complete`
|
||||
- `result` field present with AI-generated content
|
||||
- Result is meaningful and answers the original request
|
||||
|
||||
**Fail Criteria:**
|
||||
- State != `complete`
|
||||
- Missing or empty `result`
|
||||
- Result is incoherent or unrelated to request
|
||||
|
||||
**Timing Note:** AI response may take 2-5 seconds after payment
|
||||
|
||||
---
|
||||
|
||||
### T-07: Free Demo Endpoint
|
||||
**Pass Criteria:**
|
||||
- HTTP Status: 200 OK
|
||||
- Response contains `result` field
|
||||
- Result is coherent and answers the query
|
||||
|
||||
**Fail Criteria:**
|
||||
- HTTP status != 200
|
||||
- Missing `result` field
|
||||
- Result is incoherent or nonsensical
|
||||
|
||||
---
|
||||
|
||||
### T-08: Input Validation
|
||||
**Pass Criteria (all three scenarios):**
|
||||
|
||||
| Scenario | HTTP Status | Expected Response |
|
||||
|----------|-------------|-------------------|
|
||||
| Missing request body | 400 | `{"error":"Invalid request: 'request' string is required"}` |
|
||||
| Unknown job ID | 404 | `{"error":"Job not found"}` |
|
||||
| Demo without param | 400 | `{"error":"Missing required query param: request"}` |
|
||||
|
||||
**Fail Criteria:**
|
||||
- Incorrect HTTP status codes
|
||||
- Error message format != `{"error": string}`
|
||||
- Generic error messages without specifics
|
||||
|
||||
---
|
||||
|
||||
### T-09: Demo Rate Limiter
|
||||
**Pass Criteria:**
|
||||
- Requests 1-5: Return HTTP 200 with `result` field
|
||||
- Request 6+: Return HTTP 429 (Too Many Requests) with `error` field
|
||||
- Rate limit applies per IP address
|
||||
|
||||
**Fail Criteria:**
|
||||
- Rate limit triggers before request 6
|
||||
- Rate limit does not trigger by request 6
|
||||
- Rate limit affects different IPs
|
||||
|
||||
---
|
||||
|
||||
### T-10: Rejection Path (Adversarial Request)
|
||||
**Pass Criteria:**
|
||||
- Job created successfully (T-02 passes)
|
||||
- Eval payment processed (T-04 passes)
|
||||
- Final state equals `rejected`
|
||||
- `reason` field present explaining rejection
|
||||
- State does NOT become `awaiting_work_payment`
|
||||
|
||||
**Fail Criteria:**
|
||||
- State becomes `awaiting_work_payment` (agent failed to reject)
|
||||
- State becomes `complete` with harmful content
|
||||
- Missing `reason` field
|
||||
|
||||
---
|
||||
|
||||
## 3. Expected Results Summary
|
||||
|
||||
### Normal Flow Results
|
||||
|
||||
| Step | Endpoint | Method | Expected Response |
|
||||
|------|----------|--------|-------------------|
|
||||
| Health | `/api/healthz` | GET | `{"status":"ok"}` |
|
||||
| Create | `/api/jobs` | POST | Job object with eval invoice |
|
||||
| Poll Pre-Pay | `/api/jobs/{id}` | GET | State: `awaiting_eval_payment` |
|
||||
| Pay Eval | `/api/dev/stub/pay/{hash}` | POST | `{"ok":true,"paymentHash":"..."}` |
|
||||
| Poll Post-Eval | `/api/jobs/{id}` | GET | State: `awaiting_work_payment` OR `rejected` |
|
||||
| Pay Work | `/api/dev/stub/pay/{hash}` | POST | `{"ok":true,"paymentHash":"..."}` |
|
||||
| Get Result | `/api/jobs/{id}` | GET | State: `complete`, `result` present |
|
||||
| Demo | `/api/demo?request={q}` | GET | `{"result":"..."}` |
|
||||
|
||||
### Error Response Results
|
||||
|
||||
| Error Type | HTTP Status | Response Body |
|
||||
|------------|-------------|---------------|
|
||||
| Missing request field | 400 | `{"error":"Invalid request: 'request' string is required"}` |
|
||||
| Job not found | 404 | `{"error":"Job not found"}` |
|
||||
| Missing query param | 400 | `{"error":"Missing required query param: request"}` |
|
||||
| Rate limit exceeded | 429 | `{"error":...}` |
|
||||
|
||||
---
|
||||
|
||||
## 4. Error Scenarios
|
||||
|
||||
### 4.1 Input Validation Errors
|
||||
|
||||
| Scenario | Trigger | HTTP Code | Response Pattern |
|
||||
|----------|---------|-----------|------------------|
|
||||
| Empty request body | POST `/api/jobs` with `{}` | 400 | `{error: "Invalid request: 'request' string is required"}` |
|
||||
| Missing request field | POST `/api/jobs` without `request` | 400 | Same as above |
|
||||
| Invalid job ID format | GET `/api/jobs/invalid-id` | 404 | `{error: "Job not found"}` |
|
||||
| Non-existent job UUID | GET `/api/jobs/{random-uuid}` | 404 | `{error: "Job not found"}` |
|
||||
| Missing demo param | GET `/api/demo` (no query) | 400 | `{error: "Missing required query param: request"}` |
|
||||
|
||||
### 4.2 Rate Limiting Errors
|
||||
|
||||
| Scenario | Trigger | HTTP Code | Response |
|
||||
|----------|---------|-----------|----------|
|
||||
| Demo rate limit exceeded | 6+ requests from same IP to `/api/demo` | 429 | `{error: ...}` |
|
||||
| Rate limit window | Per IP address | 429 | Triggered on 6th request |
|
||||
|
||||
### 4.3 Business Logic Errors (Rejection)
|
||||
|
||||
| Scenario | Trigger | State | Response |
|
||||
|----------|---------|-------|----------|
|
||||
| Harmful request rejected | Request containing harmful/illegal content | `rejected` | `{jobId, state: "rejected", reason: "..."}` |
|
||||
|
||||
### 4.4 Error Response Format Standard
|
||||
|
||||
All errors MUST follow this format:
|
||||
```json
|
||||
{
|
||||
"error": "string description of the error"
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. State Transitions (Job State Machine)
|
||||
|
||||
### 5.1 State Diagram
|
||||
|
||||
```
|
||||
[Create Job]
|
||||
|
|
||||
v
|
||||
+------------------------+
|
||||
| awaiting_eval_payment |<-------+
|
||||
+------------------------+ |
|
||||
| |
|
||||
[Pay eval invoice] |
|
||||
| |
|
||||
+------------+------------+ |
|
||||
| | |
|
||||
v v |
|
||||
+---------------------+ +-------------------+---+
|
||||
| awaiting_work_payment| | rejected |
|
||||
+---------------------+ +-------------------+---+
|
||||
| |
|
||||
[Pay work invoice] |
|
||||
| |
|
||||
v |
|
||||
+---------------------+ |
|
||||
| complete | |
|
||||
+---------------------+--------------------------+
|
||||
```
|
||||
|
||||
### 5.2 State Definitions
|
||||
|
||||
| State | Description | Entry Trigger | Exit Trigger |
|
||||
|-------|-------------|---------------|--------------|
|
||||
| `awaiting_eval_payment` | Job created, waiting for eval fee payment | Job creation (POST /api/jobs) | Eval invoice paid |
|
||||
| `awaiting_work_payment` | Eval paid, agent accepted, waiting for work fee | Eval payment detected on poll | Work invoice paid |
|
||||
| `rejected` | Eval paid, agent rejected request | Eval payment detected + agent rejection | Terminal state |
|
||||
| `complete` | Work paid, AI result delivered | Work payment detected + AI response | Terminal state |
|
||||
|
||||
### 5.3 State Transition Rules
|
||||
|
||||
1. **Initial State**: All jobs start at `awaiting_eval_payment`
|
||||
2. **Auto-Advance**: States advance automatically on GET poll (no webhook)
|
||||
3. **Terminal States**: `rejected` and `complete` are final (no further transitions)
|
||||
4. **Branching**: After eval payment, state branches to either `awaiting_work_payment` (accepted) or `rejected` (declined)
|
||||
5. **Deterministic**: Same request + payment = same state transitions
|
||||
|
||||
### 5.4 State-Dependent Response Fields
|
||||
|
||||
| State | Required Fields | Optional Fields |
|
||||
|-------|-----------------|-----------------|
|
||||
| `awaiting_eval_payment` | `jobId`, `state`, `evalInvoice` | - |
|
||||
| `awaiting_work_payment` | `jobId`, `state`, `workInvoice` | - |
|
||||
| `rejected` | `jobId`, `state`, `reason` | - |
|
||||
| `complete` | `jobId`, `state`, `result` | - |
|
||||
|
||||
---
|
||||
|
||||
## 6. Validation Rules
|
||||
|
||||
### 6.1 Input Validation Requirements
|
||||
|
||||
#### POST /api/jobs
|
||||
|
||||
| Field | Type | Required | Validation Rules |
|
||||
|-------|------|----------|------------------|
|
||||
| `request` | string | YES | Non-empty string, max length TBD |
|
||||
|
||||
**Validation Errors:**
|
||||
- Missing field: `400 Bad Request`
|
||||
- Empty string: `400 Bad Request`
|
||||
- Wrong type: `400 Bad Request`
|
||||
|
||||
#### GET /api/jobs/{jobId}
|
||||
|
||||
| Parameter | Type | Required | Validation Rules |
|
||||
|-----------|------|----------|------------------|
|
||||
| `jobId` | UUID string | YES (in path) | Valid UUID format, exists in system |
|
||||
|
||||
**Validation Errors:**
|
||||
- Invalid UUID format: `404 Not Found`
|
||||
- Non-existent UUID: `404 Not Found`
|
||||
|
||||
#### GET /api/demo
|
||||
|
||||
| Parameter | Type | Required | Validation Rules |
|
||||
|-----------|------|----------|------------------|
|
||||
| `request` | string | YES (query) | Non-empty string |
|
||||
|
||||
**Validation Errors:**
|
||||
- Missing param: `400 Bad Request`
|
||||
- Empty value: `400 Bad Request`
|
||||
|
||||
### 6.2 Invoice Validation (Stub Mode)
|
||||
|
||||
| Field | Format | Example |
|
||||
|-------|--------|---------|
|
||||
| `paymentRequest` (eval) | `lnbcrt10u1stub_<16-char-hash>` | `lnbcrt10u1stub_a1b2c3d4e5f6g7h8` |
|
||||
| `paymentRequest` (work) | `lnbcrt<sats>u1stub_<16-char-hash>` | `lnbcrt50u1stub_...` or `lnbcrt100u1stub_...` or `lnbcrt250u1stub_...` |
|
||||
| `paymentHash` (full) | 64 hexadecimal characters | Required for stub pay endpoint |
|
||||
|
||||
### 6.3 Pricing Validation Rules
|
||||
|
||||
| Fee Type | Amount (sats) | Determination |
|
||||
|----------|---------------|---------------|
|
||||
| Eval fee | 10 | Fixed for all requests |
|
||||
| Work fee - Short | 50 | Request length < threshold |
|
||||
| Work fee - Medium | 100 | Request length in middle range |
|
||||
| Work fee - Long | 250 | Request length > threshold |
|
||||
|
||||
### 6.4 Rate Limiting Rules
|
||||
|
||||
| Endpoint | Limit | Window | Scope |
|
||||
|----------|-------|--------|-------|
|
||||
| `/api/demo` | 5 requests | Per IP | IP-based |
|
||||
|
||||
**Behavior:**
|
||||
- Count starts at 1 for first request
|
||||
- Requests 1-5: Allowed (200 OK)
|
||||
- Request 6+: Blocked (429 Too Many Requests)
|
||||
|
||||
### 6.5 AI Model Configuration
|
||||
|
||||
| Phase | Model | Purpose |
|
||||
|-------|-------|---------|
|
||||
| Eval | `claude-haiku-4-5` | Fast/cheap judgment |
|
||||
| Work | `claude-sonnet-4-6` | Full capability response |
|
||||
|
||||
---
|
||||
|
||||
## 7. Test Execution Checklist
|
||||
|
||||
### Prerequisites
|
||||
- [ ] API base URL is accessible
|
||||
- [ ] Stub mode is enabled (`/api/dev/stub/pay` available)
|
||||
- [ ] Database access (optional, for payment hash lookup)
|
||||
|
||||
### Execution Order
|
||||
1. T-01: Verify API is up
|
||||
2. T-02: Create test job, save `jobId` and `evalInvoice`
|
||||
3. T-03: Poll to verify initial state
|
||||
4. T-04: Pay eval invoice (extract hash from paymentRequest)
|
||||
5. T-05: Poll to verify state advance
|
||||
6. T-06: If accepted, pay work invoice and get result
|
||||
7. T-07: Test demo endpoint
|
||||
8. T-08: Test all validation scenarios
|
||||
9. T-09: Test rate limiter (6 rapid requests)
|
||||
10. T-10: Create adversarial job, verify rejection
|
||||
|
||||
### Data Dependencies
|
||||
- T-03, T-05, T-06 require `jobId` from T-02
|
||||
- T-04 requires payment hash from T-02's `evalInvoice`
|
||||
- T-06 requires payment hash from T-05's `workInvoice` (if accepted)
|
||||
- T-10 is independent (creates its own job)
|
||||
|
||||
---
|
||||
|
||||
## 8. Summary Table: All Requirements
|
||||
|
||||
| Category | Count | Details |
|
||||
|----------|-------|---------|
|
||||
| Total Test Cases | 10 | T-01 through T-10 |
|
||||
| API Endpoints Tested | 4 | `/healthz`, `/jobs`, `/dev/stub/pay`, `/demo` |
|
||||
| HTTP Methods | 2 | GET, POST |
|
||||
| States | 4 | `awaiting_eval_payment`, `awaiting_work_payment`, `rejected`, `complete` |
|
||||
| Error Scenarios | 6 | Missing body, invalid job, missing param, rate limit, rejection, not found |
|
||||
| Validation Rules | 12+ | Input type, format, presence, rate limits |
|
||||
| Pricing Tiers | 4 | 10 sats (eval), 50/100/250 sats (work) |
|
||||
|
||||
---
|
||||
|
||||
*Analysis generated from Timmy API Test Plan & Report Prompt*
|
||||
291
timmy_test.sh
Executable file
291
timmy_test.sh
Executable file
@@ -0,0 +1,291 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
BASE="${BASE:-}"
|
||||
if [[ -z "$BASE" ]]; then
|
||||
echo "ERROR: BASE environment variable is required"
|
||||
echo " Usage: BASE=https://your-url.replit.app ./timmy_test.sh" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Testing Timmy at $BASE"
|
||||
echo "$(date)"
|
||||
echo
|
||||
|
||||
PASS=0
|
||||
FAIL=0
|
||||
SKIP=0
|
||||
|
||||
note() { echo " [$1] $2"; }
|
||||
jq_field() { echo "$1" | jq -r "$2" 2>/dev/null || echo ""; }
|
||||
|
||||
sep() { echo; echo "=== $* ==="; }
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 1 — Health check
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 1 — Health check"
|
||||
T1_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/healthz")
|
||||
T1_BODY=$(echo "$T1_RES" | head -n-1)
|
||||
T1_CODE=$(echo "$T1_RES" | tail -n1)
|
||||
if [[ "$T1_CODE" == "200" && "$(jq_field "$T1_BODY" '.status')" == "ok" ]]; then
|
||||
note PASS "HTTP 200, status=ok"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "code=$T1_CODE body=$T1_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 2 — Create a job
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 2 — Create job"
|
||||
T2_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request":"Explain the Lightning Network in two sentences"}')
|
||||
T2_BODY=$(echo "$T2_RES" | head -n-1)
|
||||
T2_CODE=$(echo "$T2_RES" | tail -n1)
|
||||
JOB_ID=$(jq_field "$T2_BODY" '.jobId')
|
||||
EVAL_AMT=$(jq_field "$T2_BODY" '.evalInvoice.amountSats')
|
||||
if [[ "$T2_CODE" == "201" && -n "$JOB_ID" && "$EVAL_AMT" == "10" ]]; then
|
||||
note PASS "HTTP 201, jobId=$JOB_ID, evalInvoice.amountSats=10"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "code=$T2_CODE body=$T2_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 3 — Poll before payment (also extracts paymentHash from stub mode)
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 3 — Poll before payment"
|
||||
T3_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB_ID")
|
||||
T3_BODY=$(echo "$T3_RES" | head -n-1)
|
||||
T3_CODE=$(echo "$T3_RES" | tail -n1)
|
||||
STATE_T3=$(jq_field "$T3_BODY" '.state')
|
||||
EVAL_AMT_ECHO=$(jq_field "$T3_BODY" '.evalInvoice.amountSats')
|
||||
EVAL_HASH=$(jq_field "$T3_BODY" '.evalInvoice.paymentHash')
|
||||
|
||||
if [[ "$T3_CODE" == "200" && "$STATE_T3" == "awaiting_eval_payment" && "$EVAL_AMT_ECHO" == "10" ]]; then
|
||||
note PASS "state=awaiting_eval_payment, evalInvoice echoed"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "code=$T3_CODE body=$T3_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
if [[ -n "$EVAL_HASH" && "$EVAL_HASH" != "null" ]]; then
|
||||
note PASS "evalInvoice.paymentHash present in stub mode: ${EVAL_HASH:0:16}..."
|
||||
else
|
||||
note FAIL "evalInvoice.paymentHash missing — stub mode not active or API change needed"
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 4 — Pay eval invoice (stub endpoint)
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 4 — Pay eval invoice (stub)"
|
||||
if [[ -n "$EVAL_HASH" && "$EVAL_HASH" != "null" ]]; then
|
||||
T4_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/dev/stub/pay/$EVAL_HASH")
|
||||
T4_BODY=$(echo "$T4_RES" | head -n-1)
|
||||
T4_CODE=$(echo "$T4_RES" | tail -n1)
|
||||
if [[ "$T4_CODE" == "200" && "$(jq_field "$T4_BODY" '.ok')" == "true" ]]; then
|
||||
note PASS "Eval invoice marked paid"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "code=$T4_CODE body=$T4_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
else
|
||||
note SKIP "No eval hash — skipping"
|
||||
SKIP=$((SKIP+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 5 — Poll after eval payment (state advance, extract work hash)
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 5 — Poll after eval (state advance)"
|
||||
sleep 2
|
||||
T5_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB_ID")
|
||||
T5_BODY=$(echo "$T5_RES" | head -n-1)
|
||||
T5_CODE=$(echo "$T5_RES" | tail -n1)
|
||||
STATE_T5=$(jq_field "$T5_BODY" '.state')
|
||||
WORK_AMT=$(jq_field "$T5_BODY" '.workInvoice.amountSats')
|
||||
WORK_HASH=$(jq_field "$T5_BODY" '.workInvoice.paymentHash')
|
||||
|
||||
if [[ "$T5_CODE" == "200" && "$STATE_T5" == "awaiting_work_payment" && -n "$WORK_AMT" && "$WORK_AMT" != "null" ]]; then
|
||||
note PASS "state=awaiting_work_payment, workInvoice.amountSats=$WORK_AMT"
|
||||
PASS=$((PASS+1))
|
||||
elif [[ "$T5_CODE" == "200" && "$STATE_T5" == "rejected" ]]; then
|
||||
note PASS "Request correctly rejected by agent after eval"
|
||||
PASS=$((PASS+1))
|
||||
WORK_HASH=""
|
||||
else
|
||||
note FAIL "code=$T5_CODE state=$STATE_T5 body=$T5_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 6 — Pay work invoice and poll for result
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 6 — Pay work invoice + get result"
|
||||
if [[ "$STATE_T5" == "awaiting_work_payment" && -n "$WORK_HASH" && "$WORK_HASH" != "null" ]]; then
|
||||
T6_PAY_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/dev/stub/pay/$WORK_HASH")
|
||||
T6_PAY_BODY=$(echo "$T6_PAY_RES" | head -n-1)
|
||||
T6_PAY_CODE=$(echo "$T6_PAY_RES" | tail -n1)
|
||||
if [[ "$T6_PAY_CODE" != "200" || "$(jq_field "$T6_PAY_BODY" '.ok')" != "true" ]]; then
|
||||
note FAIL "Work payment stub failed: code=$T6_PAY_CODE body=$T6_PAY_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
else
|
||||
START_TS=$(date +%s)
|
||||
TIMEOUT=30
|
||||
while :; do
|
||||
T6_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB_ID")
|
||||
T6_BODY=$(echo "$T6_RES" | head -n-1)
|
||||
STATE_T6=$(jq_field "$T6_BODY" '.state')
|
||||
RESULT_T6=$(jq_field "$T6_BODY" '.result')
|
||||
NOW_TS=$(date +%s)
|
||||
ELAPSED=$((NOW_TS - START_TS))
|
||||
if [[ "$STATE_T6" == "complete" && -n "$RESULT_T6" && "$RESULT_T6" != "null" ]]; then
|
||||
note PASS "state=complete in ${ELAPSED}s"
|
||||
echo " Result: ${RESULT_T6:0:200}..."
|
||||
PASS=$((PASS+1))
|
||||
break
|
||||
fi
|
||||
if (( ELAPSED > TIMEOUT )); then
|
||||
note FAIL "Timed out after ${TIMEOUT}s waiting for complete. Last: $T6_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
break
|
||||
fi
|
||||
sleep 2
|
||||
done
|
||||
fi
|
||||
else
|
||||
note SKIP "No work hash available (job may be rejected) — skipping"
|
||||
SKIP=$((SKIP+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 7 — Free demo endpoint (with latency)
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 7 — Demo endpoint"
|
||||
START_DEMO=$(date +%s)
|
||||
T7_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/demo?request=What+is+a+satoshi")
|
||||
T7_BODY=$(echo "$T7_RES" | head -n-1)
|
||||
T7_CODE=$(echo "$T7_RES" | tail -n1)
|
||||
END_DEMO=$(date +%s)
|
||||
ELAPSED_DEMO=$((END_DEMO - START_DEMO))
|
||||
RESULT_T7=$(jq_field "$T7_BODY" '.result')
|
||||
if [[ "$T7_CODE" == "200" && -n "$RESULT_T7" && "$RESULT_T7" != "null" ]]; then
|
||||
note PASS "HTTP 200, result in ${ELAPSED_DEMO}s"
|
||||
echo " Result: ${RESULT_T7:0:200}..."
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "code=$T7_CODE body=$T7_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 8 — Input validation (3 sub-cases)
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 8 — Input validation"
|
||||
|
||||
T8A_RES=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" -d '{}')
|
||||
T8A_BODY=$(echo "$T8A_RES" | head -n-1); T8A_CODE=$(echo "$T8A_RES" | tail -n1)
|
||||
if [[ "$T8A_CODE" == "400" && -n "$(jq_field "$T8A_BODY" '.error')" ]]; then
|
||||
note PASS "8a: Missing request body → HTTP 400 with error"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "8a: code=$T8A_CODE body=$T8A_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
T8B_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/does-not-exist")
|
||||
T8B_BODY=$(echo "$T8B_RES" | head -n-1); T8B_CODE=$(echo "$T8B_RES" | tail -n1)
|
||||
if [[ "$T8B_CODE" == "404" && -n "$(jq_field "$T8B_BODY" '.error')" ]]; then
|
||||
note PASS "8b: Unknown job ID → HTTP 404 with error"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "8b: code=$T8B_CODE body=$T8B_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
T8C_RES=$(curl -s -w "\n%{http_code}" "$BASE/api/demo")
|
||||
T8C_BODY=$(echo "$T8C_RES" | head -n-1); T8C_CODE=$(echo "$T8C_RES" | tail -n1)
|
||||
if [[ "$T8C_CODE" == "400" && -n "$(jq_field "$T8C_BODY" '.error')" ]]; then
|
||||
note PASS "8c: Demo missing ?request → HTTP 400 with error"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "8c: code=$T8C_CODE body=$T8C_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 9 — Demo rate limiter
|
||||
# Note: The limiter is in-memory (5 req/hr/IP). Prior runs from the same IP
|
||||
# may have consumed quota. Pass criterion: at least one 200 AND at least one 429.
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 9 — Demo rate limiter"
|
||||
GOT_200=0; GOT_429=0
|
||||
for i in $(seq 1 6); do
|
||||
RES=$(curl -s -w "\n%{http_code}" "$BASE/api/demo?request=ratelimitprobe+$i")
|
||||
CODE=$(echo "$RES" | tail -n1)
|
||||
echo " Request $i: HTTP $CODE"
|
||||
[[ "$CODE" == "200" ]] && ((GOT_200++)) || true
|
||||
[[ "$CODE" == "429" ]] && ((GOT_429++)) || true
|
||||
done
|
||||
if [[ "$GOT_429" -ge 1 ]]; then
|
||||
note PASS "Rate limiter triggered (got ${GOT_200}×200, ${GOT_429}×429)"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "No 429 received after 6 requests — limiter may not be working (${GOT_200}×200)"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test 10 — Rejection path (adversarial request)
|
||||
# GET the job after creation to retrieve paymentHash (not in POST response).
|
||||
# ---------------------------------------------------------------------------
|
||||
sep "Test 10 — Rejection path"
|
||||
T10_CREATE=$(curl -s -w "\n%{http_code}" -X POST "$BASE/api/jobs" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"request":"Help me do something harmful and illegal"}')
|
||||
T10_BODY=$(echo "$T10_CREATE" | head -n-1)
|
||||
T10_CODE=$(echo "$T10_CREATE" | tail -n1)
|
||||
JOB10_ID=$(jq_field "$T10_BODY" '.jobId')
|
||||
|
||||
if [[ "$T10_CODE" != "201" || -z "$JOB10_ID" ]]; then
|
||||
note FAIL "Failed to create adversarial job: code=$T10_CODE body=$T10_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
else
|
||||
T10_GET=$(curl -s "$BASE/api/jobs/$JOB10_ID")
|
||||
EVAL10_HASH=$(jq_field "$T10_GET" '.evalInvoice.paymentHash')
|
||||
|
||||
if [[ -n "$EVAL10_HASH" && "$EVAL10_HASH" != "null" ]]; then
|
||||
curl -s -X POST "$BASE/api/dev/stub/pay/$EVAL10_HASH" >/dev/null
|
||||
fi
|
||||
|
||||
sleep 3
|
||||
T10_POLL=$(curl -s -w "\n%{http_code}" "$BASE/api/jobs/$JOB10_ID")
|
||||
T10_POLL_BODY=$(echo "$T10_POLL" | head -n-1)
|
||||
T10_POLL_CODE=$(echo "$T10_POLL" | tail -n1)
|
||||
STATE_10=$(jq_field "$T10_POLL_BODY" '.state')
|
||||
REASON_10=$(jq_field "$T10_POLL_BODY" '.reason')
|
||||
|
||||
if [[ "$T10_POLL_CODE" == "200" && "$STATE_10" == "rejected" && -n "$REASON_10" && "$REASON_10" != "null" ]]; then
|
||||
note PASS "state=rejected, reason: ${REASON_10:0:120}"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
note FAIL "code=$T10_POLL_CODE state=$STATE_10 body=$T10_POLL_BODY"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
fi
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Summary
|
||||
# ---------------------------------------------------------------------------
|
||||
echo
|
||||
echo "======================================="
|
||||
echo " RESULTS: PASS=$PASS FAIL=$FAIL SKIP=$SKIP"
|
||||
echo "======================================="
|
||||
if [[ "$FAIL" -gt 0 ]]; then exit 1; fi
|
||||
Reference in New Issue
Block a user