import { Router, type Request, type Response } from "express";
import { readFileSync } from "fs";
import { resolve, dirname } from "path";
import { fileURLToPath } from "url";
const router = Router();
/**
* GET /api/testkit
*
* Returns a self-contained bash script pre-configured with this server's
* BASE URL. Agents and testers can run the full test suite with one command:
*
* curl -s https://your-url.replit.app/api/testkit | bash
*
* Cross-platform: works on Linux and macOS (avoids GNU-only head -n-1).
*
* Audit log (what changed and why):
* - T3b REMOVED: "paymentHash present" was a separate PASS on the same HTTP
* response as T3. Artificial count inflation. Folded into T3 as one assertion.
* - T9 TIGHTENED: assertion is now GOT_429 -ge 3 (not -ge 1). Rate limit is 5/hr;
* after T8c (slot 1) and T7 (slot 2), T9 must see exactly 3×200 then 3×429.
* - T17 ADDED: GET /api/world/state - new core route, zero prior coverage.
* - T18 ADDED: createdAt/completedAt timestamp fields - present in code, never asserted.
* - T19 ADDED: X-RateLimit-* headers on /api/demo - set in code, never verified.
* - T20 ADDED: POST /api/jobs/:id/refund guards - financial endpoint, zero prior coverage.
* - T21 ADDED: GET /api/jobs/:id/stream SSE replay on completed job - never tested.
* - T22 ADDED: GET /api/sessions/:id unknown ID → 404 - never tested.
* - T23 ADDED: POST /api/bootstrap stub flow - highest-value endpoint, zero prior coverage.
* Guarded on stubMode=true; polls until state=provisioning|ready (20 s timeout).
* - T24 ADDED: costLedger completeness after job completion - 8 fields, honest-accounting
* invariant (actualAmountSats <= workAmountSats), refundState enum check.
*/
router.get("/testkit", (req: Request, res: Response) => {
const proto =
(req.headers["x-forwarded-proto"] as string | undefined)?.split(",")[0]?.trim() ?? "https";
const host = (req.headers["x-forwarded-host"] as string | undefined) ?? req.hostname;
const base = `${proto}://${host}`;
const script = `#!/usr/bin/env bash
set -euo pipefail
BASE="${base}"
echo "Timmy Test Kit"
echo "Target: \$BASE"
echo "$(date)"
echo
PASS=0
FAIL=0
SKIP=0
note() { echo " [\$1] \$2"; }
sep() { echo; echo "=== \$* ==="; }
# body_of: strip last line (HTTP status code); works on GNU and BSD (macOS)
body_of() { echo "\$1" | sed '$d'; }
code_of() { echo "\$1" | tail -n1; }
# ---------------------------------------------------------------------------
# Test 1 - Health check
# ---------------------------------------------------------------------------
sep "Test 1 — Health check"
T1_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/healthz")
T1_BODY=$(body_of "\$T1_RES"); T1_CODE=$(code_of "\$T1_RES")
if [[ "\$T1_CODE" == "200" ]] && [[ "$(echo "\$T1_BODY" | jq -r '.status' 2>/dev/null)" == "ok" ]]; then
note PASS "HTTP 200, status=ok"
PASS=\$((PASS+1))
else
note FAIL "code=\$T1_CODE body=\$T1_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 2 - Create job
# ---------------------------------------------------------------------------
sep "Test 2 — Create job"
T2_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" \\
-d '{"request":"Explain the Lightning Network in two sentences"}')
T2_BODY=$(body_of "\$T2_RES"); T2_CODE=$(code_of "\$T2_RES")
JOB_ID=$(echo "\$T2_BODY" | jq -r '.jobId' 2>/dev/null || echo "")
EVAL_AMT=$(echo "\$T2_BODY" | jq -r '.evalInvoice.amountSats' 2>/dev/null || echo "")
if [[ "\$T2_CODE" == "201" && -n "\$JOB_ID" && "\$EVAL_AMT" == "10" ]]; then
note PASS "HTTP 201, jobId=\$JOB_ID, evalInvoice.amountSats=10"
PASS=\$((PASS+1))
else
note FAIL "code=\$T2_CODE body=\$T2_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 3 - Poll before payment
# Audit note: T3b ("paymentHash present as separate PASS") was removed here.
# It was an additional PASS count on the same HTTP response as T3, just a
# stub-mode guard. Merged into a single assertion so the count is honest.
# ---------------------------------------------------------------------------
sep "Test 3 — Poll before payment"
T3_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$JOB_ID")
T3_BODY=$(body_of "\$T3_RES"); T3_CODE=$(code_of "\$T3_RES")
STATE_T3=$(echo "\$T3_BODY" | jq -r '.state' 2>/dev/null || echo "")
EVAL_AMT_ECHO=$(echo "\$T3_BODY" | jq -r '.evalInvoice.amountSats' 2>/dev/null || echo "")
EVAL_HASH=$(echo "\$T3_BODY" | jq -r '.evalInvoice.paymentHash' 2>/dev/null || echo "")
if [[ "\$T3_CODE" == "200" && "\$STATE_T3" == "awaiting_eval_payment" && \\
"\$EVAL_AMT_ECHO" == "10" && -n "\$EVAL_HASH" && "\$EVAL_HASH" != "null" ]]; then
note PASS "state=awaiting_eval_payment, evalInvoice echoed, paymentHash present (stub active)"
PASS=\$((PASS+1))
else
note FAIL "code=\$T3_CODE state=\$STATE_T3 evalAmt=\$EVAL_AMT_ECHO hash=\${EVAL_HASH:-missing}"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 4 - Pay eval invoice (stub endpoint)
# ---------------------------------------------------------------------------
sep "Test 4 — Pay eval invoice (stub)"
if [[ -n "\$EVAL_HASH" && "\$EVAL_HASH" != "null" ]]; then
T4_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/dev/stub/pay/\$EVAL_HASH")
T4_BODY=$(body_of "\$T4_RES"); T4_CODE=$(code_of "\$T4_RES")
if [[ "\$T4_CODE" == "200" ]] && [[ "$(echo "\$T4_BODY" | jq -r '.ok' 2>/dev/null)" == "true" ]]; then
note PASS "Eval invoice marked paid"
PASS=\$((PASS+1))
else
note FAIL "code=\$T4_CODE body=\$T4_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No eval hash — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 5 - Poll after eval payment (with retry loop; real AI eval takes 25 s)
# ---------------------------------------------------------------------------
sep "Test 5 — Poll after eval (state advance)"
START_T5=$(date +%s)
T5_TIMEOUT=30
STATE_T5=""; WORK_AMT=""; WORK_HASH=""; T5_BODY=""; T5_CODE=""
while :; do
T5_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$JOB_ID")
T5_BODY=$(body_of "\$T5_RES"); T5_CODE=$(code_of "\$T5_RES")
STATE_T5=$(echo "\$T5_BODY" | jq -r '.state' 2>/dev/null || echo "")
WORK_AMT=$(echo "\$T5_BODY" | jq -r '.workInvoice.amountSats' 2>/dev/null || echo "")
WORK_HASH=$(echo "\$T5_BODY" | jq -r '.workInvoice.paymentHash' 2>/dev/null || echo "")
NOW_T5=$(date +%s); ELAPSED_T5=\$((NOW_T5 - START_T5))
if [[ "\$STATE_T5" == "awaiting_work_payment" || "\$STATE_T5" == "rejected" ]]; then break; fi
if (( ELAPSED_T5 > T5_TIMEOUT )); then break; fi
sleep 2
done
if [[ "\$T5_CODE" == "200" && "\$STATE_T5" == "awaiting_work_payment" && -n "\$WORK_AMT" && "\$WORK_AMT" != "null" ]]; then
note PASS "state=awaiting_work_payment in \$ELAPSED_T5 s, workInvoice.amountSats=\$WORK_AMT"
PASS=\$((PASS+1))
elif [[ "\$T5_CODE" == "200" && "\$STATE_T5" == "rejected" ]]; then
note PASS "Request correctly rejected by agent after eval (in \$ELAPSED_T5 s)"
PASS=\$((PASS+1))
WORK_HASH=""
else
note FAIL "code=\$T5_CODE state=\$STATE_T5 body=\$T5_BODY (after \$ELAPSED_T5 s)"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 6 - Pay work invoice + poll for result
# ---------------------------------------------------------------------------
sep "Test 6 — Pay work invoice + get result"
STATE_T6=""
if [[ "\$STATE_T5" == "awaiting_work_payment" && -n "\$WORK_HASH" && "\$WORK_HASH" != "null" ]]; then
T6_PAY_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/dev/stub/pay/\$WORK_HASH")
T6_PAY_BODY=$(body_of "\$T6_PAY_RES"); T6_PAY_CODE=$(code_of "\$T6_PAY_RES")
if [[ "\$T6_PAY_CODE" != "200" ]] || [[ "$(echo "\$T6_PAY_BODY" | jq -r '.ok' 2>/dev/null)" != "true" ]]; then
note FAIL "Work payment stub failed: code=\$T6_PAY_CODE body=\$T6_PAY_BODY"
FAIL=\$((FAIL+1))
else
START_TS=$(date +%s)
TIMEOUT=30
while :; do
T6_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$JOB_ID")
T6_BODY=$(body_of "\$T6_RES")
STATE_T6=$(echo "\$T6_BODY" | jq -r '.state' 2>/dev/null || echo "")
RESULT_T6=$(echo "\$T6_BODY" | jq -r '.result' 2>/dev/null || echo "")
NOW_TS=$(date +%s); ELAPSED=\$((NOW_TS - START_TS))
if [[ "\$STATE_T6" == "complete" && -n "\$RESULT_T6" && "\$RESULT_T6" != "null" ]]; then
note PASS "state=complete in \$ELAPSED s"
echo " Result: \${RESULT_T6:0:200}..."
PASS=\$((PASS+1))
break
fi
if (( ELAPSED > TIMEOUT )); then
note FAIL "Timed out after \$TIMEOUT s. Last body: \$T6_BODY"
FAIL=\$((FAIL+1))
break
fi
sleep 2
done
fi
else
note SKIP "No work hash (job may be rejected) — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 8 - Input validation
# Run BEFORE test 7 to avoid consuming rate-limit quota before T9 can count it.
# T8c hits /api/demo without ?request=; the demo rate limiter fires first
# (consumes slot 1), then validation returns 400. This is intentional ordering.
# ---------------------------------------------------------------------------
sep "Test 8 — Input validation"
T8A_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" -d '{}')
T8A_BODY=$(body_of "\$T8A_RES"); T8A_CODE=$(code_of "\$T8A_RES")
if [[ "\$T8A_CODE" == "400" && -n "$(echo "\$T8A_BODY" | jq -r '.error' 2>/dev/null)" ]]; then
note PASS "8a: Missing request body → HTTP 400"
PASS=\$((PASS+1))
else
note FAIL "8a: code=\$T8A_CODE body=\$T8A_BODY"
FAIL=\$((FAIL+1))
fi
T8B_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/does-not-exist")
T8B_BODY=$(body_of "\$T8B_RES"); T8B_CODE=$(code_of "\$T8B_RES")
if [[ "\$T8B_CODE" == "404" && -n "$(echo "\$T8B_BODY" | jq -r '.error' 2>/dev/null)" ]]; then
note PASS "8b: Unknown job ID → HTTP 404"
PASS=\$((PASS+1))
else
note FAIL "8b: code=\$T8B_CODE body=\$T8B_BODY"
FAIL=\$((FAIL+1))
fi
# 8c: demo with no ?request= param. Rate limiter fires first (slot 1), then
# validation returns 400. Does NOT count as a 200 for rate-limit purposes.
T8C_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/demo")
T8C_BODY=$(body_of "\$T8C_RES"); T8C_CODE=$(code_of "\$T8C_RES")
if [[ "\$T8C_CODE" == "400" && -n "$(echo "\$T8C_BODY" | jq -r '.error' 2>/dev/null)" ]]; then
note PASS "8c: Demo missing param → HTTP 400"
PASS=\$((PASS+1))
else
note FAIL "8c: code=\$T8C_CODE body=\$T8C_BODY"
FAIL=\$((FAIL+1))
fi
LONG_STR=$(node -e "process.stdout.write('x'.repeat(501))" 2>/dev/null || python3 -c "print('x'*501,end='')" 2>/dev/null || printf '%501s' | tr ' ' 'x')
T8D_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" \\
-d "{\\"request\\":\\"\$LONG_STR\\"}")
T8D_BODY=$(body_of "\$T8D_RES"); T8D_CODE=$(code_of "\$T8D_RES")
T8D_ERR=$(echo "\$T8D_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T8D_CODE" == "400" && "\$T8D_ERR" == *"500 characters"* ]]; then
note PASS "8d: 501-char request → HTTP 400 with character limit error"
PASS=\$((PASS+1))
else
note FAIL "8d: code=\$T8D_CODE body=\$T8D_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 7 - Demo endpoint (after T8, before rate-limit exhaustion in T9)
# Rate-limit slot accounting: T8c used slot 1. T7 uses slot 2. T9 gets 3 more.
# ---------------------------------------------------------------------------
sep "Test 7 — Demo endpoint"
START_DEMO=$(date +%s)
T7_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/demo?request=What+is+a+satoshi")
T7_BODY=$(body_of "\$T7_RES"); T7_CODE=$(code_of "\$T7_RES")
END_DEMO=$(date +%s); ELAPSED_DEMO=\$((END_DEMO - START_DEMO))
RESULT_T7=$(echo "\$T7_BODY" | jq -r '.result' 2>/dev/null || echo "")
if [[ "\$T7_CODE" == "200" && -n "\$RESULT_T7" && "\$RESULT_T7" != "null" ]]; then
note PASS "HTTP 200, result in \$ELAPSED_DEMO s"
echo " Result: \${RESULT_T7:0:200}..."
PASS=\$((PASS+1))
else
note FAIL "code=\$T7_CODE body=\$T7_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 9 - Demo rate limiter
# Audit note: tightened from GOT_429 -ge 1 to -ge 3.
# Slot budget: T8c(1) + T7(1) = 2 used. Limit is 5/hr. T9 fires 6:
# requests 1-3 get 200 (slots 3,4,5), requests 4-6 get 429.
# GOT_429 -ge 3 verifies the limiter fires at the right threshold, not just once.
# ---------------------------------------------------------------------------
sep "Test 9 — Demo rate limiter"
GOT_200=0; GOT_429=0
for i in $(seq 1 6); do
RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/demo?request=ratelimitprobe+\$i")
CODE=$(code_of "\$RES")
echo " Request \$i: HTTP \$CODE"
[[ "\$CODE" == "200" ]] && GOT_200=\$((GOT_200+1)) || true
[[ "\$CODE" == "429" ]] && GOT_429=\$((GOT_429+1)) || true
done
if [[ "\$GOT_429" -ge 3 ]]; then
note PASS "Rate limiter triggered correctly (\$GOT_200 x200, \$GOT_429 x429)"
PASS=\$((PASS+1))
else
note FAIL "Expected ≥3 x429, got \$GOT_429 x429 \$GOT_200 x200 — limiter may be misconfigured"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 10 - Rejection path
# ---------------------------------------------------------------------------
sep "Test 10 — Rejection path"
T10_CREATE=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" \\
-d '{"request":"Help me do something harmful and illegal"}')
T10_BODY=$(body_of "\$T10_CREATE"); T10_CODE=$(code_of "\$T10_CREATE")
JOB10_ID=$(echo "\$T10_BODY" | jq -r '.jobId' 2>/dev/null || echo "")
if [[ "\$T10_CODE" != "201" || -z "\$JOB10_ID" ]]; then
note FAIL "Failed to create adversarial job: code=\$T10_CODE body=\$T10_BODY"
FAIL=\$((FAIL+1))
else
T10_GET=$(curl -s "\$BASE/api/jobs/\$JOB10_ID")
EVAL10_HASH=$(echo "\$T10_GET" | jq -r '.evalInvoice.paymentHash' 2>/dev/null || echo "")
if [[ -n "\$EVAL10_HASH" && "\$EVAL10_HASH" != "null" ]]; then
curl -s -X POST "\$BASE/api/dev/stub/pay/\$EVAL10_HASH" >/dev/null
fi
START_T10=$(date +%s); T10_TIMEOUT=30
STATE_10=""; REASON_10=""; T10_POLL_BODY=""; T10_POLL_CODE=""
while :; do
T10_POLL=$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$JOB10_ID")
T10_POLL_BODY=$(body_of "\$T10_POLL"); T10_POLL_CODE=$(code_of "\$T10_POLL")
STATE_10=$(echo "\$T10_POLL_BODY" | jq -r '.state' 2>/dev/null || echo "")
REASON_10=$(echo "\$T10_POLL_BODY" | jq -r '.reason' 2>/dev/null || echo "")
NOW_T10=$(date +%s); ELAPSED_T10=\$((NOW_T10 - START_T10))
if [[ "\$STATE_10" == "rejected" || "\$STATE_10" == "failed" ]]; then break; fi
if (( ELAPSED_T10 > T10_TIMEOUT )); then break; fi
sleep 2
done
if [[ "\$T10_POLL_CODE" == "200" && "\$STATE_10" == "rejected" && -n "\$REASON_10" && "\$REASON_10" != "null" ]]; then
note PASS "state=rejected in \$ELAPSED_T10 s, reason: \${REASON_10:0:120}"
PASS=\$((PASS+1))
else
note FAIL "code=\$T10_POLL_CODE state=\$STATE_10 body=\$T10_POLL_BODY (after \$ELAPSED_T10 s)"
FAIL=\$((FAIL+1))
fi
fi
# ---------------------------------------------------------------------------
# Test 17 - World state endpoint (GET /api/world/state)
# Core new route, previously zero coverage.
# Verifies shape: timmyState.{mood,activity}, agentStates.{alpha,beta,gamma,delta},
# recentEvents is an array, updatedAt is an ISO 8601 timestamp.
# ---------------------------------------------------------------------------
sep "Test 17 — World state endpoint"
T17_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/world/state")
T17_BODY=$(body_of "\$T17_RES"); T17_CODE=$(code_of "\$T17_RES")
T17_MOOD=$(echo "\$T17_BODY" | jq -r '.timmyState.mood' 2>/dev/null || echo "")
T17_ACT=$(echo "\$T17_BODY" | jq -r '.timmyState.activity' 2>/dev/null || echo "")
T17_ALPHA=$(echo "\$T17_BODY" | jq -r '.agentStates.alpha' 2>/dev/null || echo "")
T17_BETA=$(echo "\$T17_BODY" | jq -r '.agentStates.beta' 2>/dev/null || echo "")
T17_EVENTS_TYPE=$(echo "\$T17_BODY" | jq -r '.recentEvents | type' 2>/dev/null || echo "")
T17_UPDATED=$(echo "\$T17_BODY" | jq -r '.updatedAt' 2>/dev/null || echo "")
T17_DATE_OK=false
[[ "\$T17_UPDATED" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}T ]] && T17_DATE_OK=true || true
if [[ "\$T17_CODE" == "200" \\
&& -n "\$T17_MOOD" && "\$T17_MOOD" != "null" \\
&& -n "\$T17_ACT" && "\$T17_ACT" != "null" \\
&& -n "\$T17_ALPHA" && "\$T17_ALPHA" != "null" \\
&& -n "\$T17_BETA" && "\$T17_BETA" != "null" \\
&& "\$T17_EVENTS_TYPE" == "array" \\
&& "\$T17_DATE_OK" == "true" ]]; then
note PASS "HTTP 200, timmyState={\$T17_MOOD/\$T17_ACT}, agentStates present, recentEvents=array, updatedAt=ISO"
PASS=\$((PASS+1))
else
note FAIL "code=\$T17_CODE mood=\$T17_MOOD act=\$T17_ACT alpha=\$T17_ALPHA events=\$T17_EVENTS_TYPE updated=\$T17_UPDATED"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 18 - Job response timestamps (createdAt / completedAt)
# These fields exist in the code but were never asserted in any test.
# Uses the main JOB_ID from T2/T6 (state depends on whether T6 completed).
# Rule: createdAt always ISO; completedAt is ISO if state=complete, null otherwise.
# ---------------------------------------------------------------------------
sep "Test 18 — Job response timestamps (createdAt / completedAt)"
if [[ -n "\$JOB_ID" ]]; then
T18_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$JOB_ID")
T18_BODY=$(body_of "\$T18_RES"); T18_CODE=$(code_of "\$T18_RES")
T18_STATE=$(echo "\$T18_BODY" | jq -r '.state' 2>/dev/null || echo "")
T18_CREATED=$(echo "\$T18_BODY" | jq -r '.createdAt' 2>/dev/null || echo "")
T18_COMPLETED=$(echo "\$T18_BODY" | jq -r '.completedAt' 2>/dev/null || echo "")
T18_OK=true
# createdAt must always be an ISO timestamp
[[ "\$T18_CREATED" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}T ]] || T18_OK=false
# completedAt is ISO when state=complete, null (the JSON null string) otherwise
if [[ "\$T18_STATE" == "complete" ]]; then
[[ "\$T18_COMPLETED" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}T ]] || T18_OK=false
else
[[ "\$T18_COMPLETED" == "null" ]] || T18_OK=false
fi
if [[ "\$T18_CODE" == "200" && "\$T18_OK" == "true" ]]; then
note PASS "createdAt=\${T18_CREATED:0:24} completedAt=\$T18_COMPLETED (state=\$T18_STATE)"
PASS=\$((PASS+1))
else
note FAIL "code=\$T18_CODE createdAt=\$T18_CREATED completedAt=\$T18_COMPLETED state=\$T18_STATE"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No job ID — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 19 - X-RateLimit-* headers on /api/demo responses
# These headers are always set (even on 429). By T19 the rate-limit window is
# exhausted, so we expect a 429, but the headers still prove the middleware runs.
# Uses curl -si to capture both headers and body in one pass.
# ---------------------------------------------------------------------------
sep "Test 19 — X-RateLimit-* response headers on /api/demo"
T19_OUT=$(curl -si "\$BASE/api/demo?request=header+probe+test" 2>/dev/null || true)
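# "|| true" keeps set -e / pipefail from aborting the script when grep finds no header,
# so a missing header is reported as FAIL below instead of killing the run.
T19_LIMIT=$(echo "\$T19_OUT" | grep -i "^X-RateLimit-Limit:" | head -1 | tr -d '\\r' | awk '{print \$2}' || true)
T19_REMAINING=$(echo "\$T19_OUT" | grep -i "^X-RateLimit-Remaining:" | head -1 | tr -d '\\r' | awk '{print \$2}' || true)
T19_RESET=$(echo "\$T19_OUT" | grep -i "^X-RateLimit-Reset:" | head -1 | tr -d '\\r' | awk '{print \$2}' || true)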
if [[ -n "\$T19_LIMIT" && -n "\$T19_REMAINING" && -n "\$T19_RESET" ]]; then
note PASS "X-RateLimit-Limit=\$T19_LIMIT X-RateLimit-Remaining=\$T19_REMAINING X-RateLimit-Reset=\$T19_RESET"
PASS=\$((PASS+1))
else
note FAIL "Missing rate-limit headers: Limit='\$T19_LIMIT' Remaining='\$T19_REMAINING' Reset='\$T19_RESET'"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 20 - Refund endpoint guards (POST /api/jobs/:id/refund)
# This financial endpoint was completely untested. Three guard cases:
# 20a: missing invoice body → 400 (fires before job-state check)
# 20b: unknown job ID → 404
# 20c: non-complete job → 409 "not complete" (fresh awaiting_eval_payment job)
# ---------------------------------------------------------------------------
sep "Test 20 — Refund endpoint guards"
# 20a: missing invoice field → 400; fires before any state checks
if [[ -n "\$JOB_ID" ]]; then
T20A_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs/\$JOB_ID/refund" \\
-H "Content-Type: application/json" -d '{}')
T20A_BODY=$(body_of "\$T20A_RES"); T20A_CODE=$(code_of "\$T20A_RES")
if [[ "\$T20A_CODE" == "400" ]]; then
note PASS "20a: Missing invoice body → HTTP 400"
PASS=\$((PASS+1))
else
note FAIL "20a: Expected 400, got code=\$T20A_CODE body=\$T20A_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "20a: No job ID"
SKIP=\$((SKIP+1))
fi
# 20b: unknown job → 404
T20B_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs/does-not-exist/refund" \\
-H "Content-Type: application/json" -d '{"invoice":"lnbc1testplaceholder"}')
T20B_CODE=$(code_of "\$T20B_RES")
if [[ "\$T20B_CODE" == "404" ]]; then
note PASS "20b: Unknown job ID → HTTP 404"
PASS=\$((PASS+1))
else
note FAIL "20b: Expected 404, got code=\$T20B_CODE"
FAIL=\$((FAIL+1))
fi
# 20c: create a fresh job (awaiting_eval_payment); refund must reject with 409
T20C_NEW=$(curl -s -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" \\
-d '{"request":"Refund guard probe — not harmful"}')
T20C_ID=$(echo "\$T20C_NEW" | jq -r '.jobId' 2>/dev/null || echo "")
if [[ -n "\$T20C_ID" && "\$T20C_ID" != "null" ]]; then
T20C_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs/\$T20C_ID/refund" \\
-H "Content-Type: application/json" -d '{"invoice":"lnbc1testplaceholder"}')
T20C_BODY=$(body_of "\$T20C_RES"); T20C_CODE=$(code_of "\$T20C_RES")
if [[ "\$T20C_CODE" == "409" ]]; then
note PASS "20c: Non-complete job (awaiting_eval_payment) → HTTP 409"
PASS=\$((PASS+1))
else
note FAIL "20c: Expected 409, got code=\$T20C_CODE body=\$T20C_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "20c: Could not create probe job"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 21 - SSE stream replay on completed job (GET /api/jobs/:id/stream)
# For a complete job the handler immediately sends token + done and closes.
# curl -N disables buffering; --max-time 10 prevents hanging if job is not complete.
# Guarded: only runs when STATE_T6=complete to avoid the 90 s timeout that
# would occur if the job is in rejected/failed state.
# ---------------------------------------------------------------------------
sep "Test 21 — SSE stream replays completed job result"
if [[ -n "\$JOB_ID" && "\$STATE_T6" == "complete" ]]; then
T21_STREAM=$(curl -sN --max-time 10 "\$BASE/api/jobs/\$JOB_ID/stream" 2>/dev/null || true)
if echo "\$T21_STREAM" | grep -q "^event: token" && echo "\$T21_STREAM" | grep -q "^event: done"; then
note PASS "SSE stream: received 'event: token' and 'event: done' events"
PASS=\$((PASS+1))
else
note FAIL "SSE stream missing expected events. Got: \${T21_STREAM:0:300}"
FAIL=\$((FAIL+1))
fi
else
note SKIP "Skipping — job not complete (STATE_T6=\${STATE_T6:-not_set}) or no job ID"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 11 - Session: create session
# ---------------------------------------------------------------------------
sep "Test 11 — Session: create session (awaiting_payment)"
T11_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/sessions" \\
-H "Content-Type: application/json" \\
-d '{"amount_sats": 200}')
T11_BODY=$(body_of "\$T11_RES"); T11_CODE=$(code_of "\$T11_RES")
SESSION_ID=$(echo "\$T11_BODY" | jq -r '.sessionId' 2>/dev/null || echo "")
T11_STATE=$(echo "\$T11_BODY" | jq -r '.state' 2>/dev/null || echo "")
T11_AMT=$(echo "\$T11_BODY" | jq -r '.invoice.amountSats' 2>/dev/null || echo "")
DEPOSIT_HASH=$(echo "\$T11_BODY" | jq -r '.invoice.paymentHash' 2>/dev/null || echo "")
if [[ "\$T11_CODE" == "201" && -n "\$SESSION_ID" && "\$T11_STATE" == "awaiting_payment" && "\$T11_AMT" == "200" ]]; then
note PASS "HTTP 201, sessionId=\$SESSION_ID, state=awaiting_payment, amount=200"
PASS=\$((PASS+1))
else
note FAIL "code=\$T11_CODE body=\$T11_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 12 - Session: poll before payment
# ---------------------------------------------------------------------------
sep "Test 12 — Session: poll before payment"
T12_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/sessions/\$SESSION_ID")
T12_BODY=$(body_of "\$T12_RES"); T12_CODE=$(code_of "\$T12_RES")
T12_STATE=$(echo "\$T12_BODY" | jq -r '.state' 2>/dev/null || echo "")
if [[ -z "\$DEPOSIT_HASH" || "\$DEPOSIT_HASH" == "null" ]]; then
DEPOSIT_HASH=$(echo "\$T12_BODY" | jq -r '.invoice.paymentHash' 2>/dev/null || echo "")
fi
if [[ "\$T12_CODE" == "200" && "\$T12_STATE" == "awaiting_payment" ]]; then
note PASS "state=awaiting_payment before payment"
PASS=\$((PASS+1))
else
note FAIL "code=\$T12_CODE body=\$T12_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# Test 13 - Session: pay deposit + activate
# ---------------------------------------------------------------------------
sep "Test 13 — Session: pay deposit (stub) + auto-advance to active"
SESSION_MACAROON=""
if [[ -n "\$DEPOSIT_HASH" && "\$DEPOSIT_HASH" != "null" ]]; then
curl -s -X POST "\$BASE/api/dev/stub/pay/\$DEPOSIT_HASH" >/dev/null
sleep 1
T13_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/sessions/\$SESSION_ID")
T13_BODY=$(body_of "\$T13_RES"); T13_CODE=$(code_of "\$T13_RES")
T13_STATE=$(echo "\$T13_BODY" | jq -r '.state' 2>/dev/null || echo "")
T13_BAL=$(echo "\$T13_BODY" | jq -r '.balanceSats' 2>/dev/null || echo "")
SESSION_MACAROON=$(echo "\$T13_BODY" | jq -r '.macaroon' 2>/dev/null || echo "")
if [[ "\$T13_CODE" == "200" && "\$T13_STATE" == "active" && "\$T13_BAL" == "200" && -n "\$SESSION_MACAROON" && "\$SESSION_MACAROON" != "null" ]]; then
note PASS "state=active, balanceSats=200, macaroon present"
PASS=\$((PASS+1))
else
note FAIL "code=\$T13_CODE state=\$T13_STATE bal=\$T13_BAL body=\$T13_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No deposit hash (stub mode not active)"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 14 - Session: submit request (accepted path)
# ---------------------------------------------------------------------------
sep "Test 14 — Session: submit request (accepted)"
if [[ -n "\$SESSION_MACAROON" && "\$SESSION_MACAROON" != "null" ]]; then
START_T14=$(date +%s)
T14_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/sessions/\$SESSION_ID/request" \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer \$SESSION_MACAROON" \\
-d '{"request":"What is Bitcoin in one sentence?"}')
T14_BODY=$(body_of "\$T14_RES"); T14_CODE=$(code_of "\$T14_RES")
T14_STATE=$(echo "\$T14_BODY" | jq -r '.state' 2>/dev/null || echo "")
T14_DEBITED=$(echo "\$T14_BODY" | jq -r '.debitedSats' 2>/dev/null || echo "")
T14_BAL=$(echo "\$T14_BODY" | jq -r '.balanceRemaining' 2>/dev/null || echo "")
END_T14=$(date +%s); ELAPSED_T14=\$((END_T14 - START_T14))
if [[ "\$T14_CODE" == "200" && ("\$T14_STATE" == "complete" || "\$T14_STATE" == "rejected") && -n "\$T14_DEBITED" && "\$T14_DEBITED" != "null" && -n "\$T14_BAL" ]]; then
note PASS "state=\$T14_STATE in \${ELAPSED_T14}s, debitedSats=\$T14_DEBITED, balanceRemaining=\$T14_BAL"
PASS=\$((PASS+1))
else
note FAIL "code=\$T14_CODE body=\$T14_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No macaroon — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 15 - Session: missing/invalid macaroon → 401
# ---------------------------------------------------------------------------
sep "Test 15 — Session: reject request without valid macaroon"
if [[ -n "\$SESSION_ID" ]]; then
T15_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/sessions/\$SESSION_ID/request" \\
-H "Content-Type: application/json" \\
-d '{"request":"What is Bitcoin?"}')
T15_CODE=$(code_of "\$T15_RES")
if [[ "\$T15_CODE" == "401" ]]; then
note PASS "HTTP 401 without macaroon"
PASS=\$((PASS+1))
else
note FAIL "Expected 401, got code=\$T15_CODE"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No session ID — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 16 - Session: topup invoice creation
# ---------------------------------------------------------------------------
sep "Test 16 — Session: topup invoice creation"
if [[ -n "\$SESSION_MACAROON" && "\$SESSION_MACAROON" != "null" ]]; then
T16_RES=$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/sessions/\$SESSION_ID/topup" \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer \$SESSION_MACAROON" \\
-d '{"amount_sats": 500}')
T16_BODY=$(body_of "\$T16_RES"); T16_CODE=$(code_of "\$T16_RES")
T16_PR=$(echo "\$T16_BODY" | jq -r '.topup.paymentRequest' 2>/dev/null || echo "")
T16_AMT=$(echo "\$T16_BODY" | jq -r '.topup.amountSats' 2>/dev/null || echo "")
if [[ "\$T16_CODE" == "200" && -n "\$T16_PR" && "\$T16_PR" != "null" && "\$T16_AMT" == "500" ]]; then
note PASS "Topup invoice created, amountSats=500"
PASS=\$((PASS+1))
else
note FAIL "code=\$T16_CODE body=\$T16_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No macaroon — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# Test 22 - Session: unknown session ID → 404
# Previously zero coverage. Sessions GET was only tested with a valid ID (T12).
# ---------------------------------------------------------------------------
sep "Test 22 — Session: unknown session ID → 404"
T22_RES=$(curl -s -w "\\n%{http_code}" "\$BASE/api/sessions/does-not-exist")
T22_BODY=$(body_of "\$T22_RES"); T22_CODE=$(code_of "\$T22_RES")
T22_ERR=$(echo "\$T22_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T22_CODE" == "404" && -n "\$T22_ERR" && "\$T22_ERR" != "null" ]]; then
note PASS "HTTP 404 with error field for unknown session ID"
PASS=\$((PASS+1))
else
note FAIL "Expected 404 with error, got code=\$T22_CODE body=\$T22_BODY"
FAIL=\$((FAIL+1))
fi
task(#24): Bootstrap route + cost-ledger testkit coverage — 29/29 PASS Task: Add T23 (bootstrap stub flow) and T24 (cost-ledger completeness) to the testkit, bringing total from 27/27 to 29/29 PASS with 0 FAIL, 0 SKIP. ## What was changed - `artifacts/api-server/src/routes/testkit.ts`: - Updated audit-log comment block to document T23 + T24 additions. - Inserted Test 23 after T22 (line ~654): POST /api/bootstrap → assert 201 + bootstrapJobId present. Guard on stubMode=true; SKIP if real DO mode (prevents hanging). Stub-pay the paymentHash via /api/dev/stub/pay/:hash. Poll GET /api/bootstrap/:id every 2s (20s timeout) until state=provisioning or state=ready; assert message field present. - Inserted Test 24 after T23: Guarded on STATE_T6=complete (reuses completed job from T6). GET /api/jobs/:id, extract costLedger. Assert all 8 fields non-null: actualInputTokens, actualOutputTokens, totalTokens, actualCostUsd, actualAmountSats, workAmountSats, refundAmountSats, refundState. Honest-accounting invariant: actualAmountSats <= workAmountSats. refundAmountSats >= 0. refundState must match ^(not_applicable|pending|paid)$. ## No deviations from task spec - T23 guard logic matches spec exactly (stubMode check before poll). - T24 fields match the 8 specified in task-24.md plus the invariants. - No changes to bootstrap.ts or jobs.ts — existing routes already correct. ## Test run result 29/29 PASS, 0 FAIL, 0 SKIP (fresh server restart, rate-limit slots clean). T23: state=provisioning in 1s. T24: actualAmountSats(179)<=workAmountSats(182), refundAmountSats=3, refundState=pending.
2026-03-19 04:04:49 +00:00
# ---------------------------------------------------------------------------
# Test 23 - Bootstrap: create → stub-pay → poll provisioning state
# Highest-value paid feature (10,000 sats default), previously zero coverage.
# Guarded on stubMode=true: real DO provisioning requires DO_API_TOKEN (out of scope).
# Polls GET /api/bootstrap/:id until state=provisioning or state=ready (20 s timeout).
# ---------------------------------------------------------------------------
sep "Test 23 — Bootstrap: create + stub-pay + poll provisioning"
T23_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/bootstrap" \\
-H "Content-Type: application/json")
T23_BODY=\$(body_of "\$T23_RES"); T23_CODE=\$(code_of "\$T23_RES")
BOOTSTRAP_ID=\$(echo "\$T23_BODY" | jq -r '.bootstrapJobId' 2>/dev/null || echo "")
T23_STUB=\$(echo "\$T23_BODY" | jq -r '.stubMode' 2>/dev/null || echo "false")
BOOTSTRAP_HASH=\$(echo "\$T23_BODY" | jq -r '.invoice.paymentHash' 2>/dev/null || echo "")
if [[ "\$T23_CODE" != "201" || -z "\$BOOTSTRAP_ID" || "\$BOOTSTRAP_ID" == "null" ]]; then
note FAIL "Bootstrap create failed: code=\$T23_CODE body=\$T23_BODY"
FAIL=\$((FAIL+1))
elif [[ "\$T23_STUB" != "true" ]]; then
note SKIP "stubMode=false — skipping (requires DO_API_TOKEN for real provisioning)"
SKIP=\$((SKIP+1))
elif [[ -z "\$BOOTSTRAP_HASH" || "\$BOOTSTRAP_HASH" == "null" ]]; then
note FAIL "stubMode=true but invoice.paymentHash missing — cannot stub-pay: body=\$T23_BODY"
FAIL=\$((FAIL+1))
else
curl -s -X POST "\$BASE/api/dev/stub/pay/\$BOOTSTRAP_HASH" >/dev/null
START_T23=\$(date +%s); T23_TIMEOUT=20
T23_STATE=""; T23_MSG=""; T23_POLL_CODE=""; T23_POLL_ID=""
while :; do
T23_POLL=\$(curl -s -w "\\n%{http_code}" "\$BASE/api/bootstrap/\$BOOTSTRAP_ID")
T23_POLL_BODY=\$(body_of "\$T23_POLL"); T23_POLL_CODE=\$(code_of "\$T23_POLL")
T23_STATE=\$(echo "\$T23_POLL_BODY" | jq -r '.state' 2>/dev/null || echo "")
T23_MSG=\$(echo "\$T23_POLL_BODY" | jq -r '.message' 2>/dev/null || echo "")
T23_POLL_ID=\$(echo "\$T23_POLL_BODY" | jq -r '.bootstrapJobId' 2>/dev/null || echo "")
NOW_T23=\$(date +%s); ELAPSED_T23=\$((NOW_T23 - START_T23))
if [[ "\$T23_STATE" == "provisioning" || "\$T23_STATE" == "ready" ]]; then break; fi
if (( ELAPSED_T23 > T23_TIMEOUT )); then break; fi
sleep 2
done
if [[ "\$T23_POLL_CODE" == "200" \\
&& ("\$T23_STATE" == "provisioning" || "\$T23_STATE" == "ready") \\
&& -n "\$T23_MSG" && "\$T23_MSG" != "null" \\
&& "\$T23_POLL_ID" == "\$BOOTSTRAP_ID" ]]; then
note PASS "state=\$T23_STATE in \${ELAPSED_T23}s, message present, bootstrapJobId=\$BOOTSTRAP_ID (echoed in poll)"
PASS=\$((PASS+1))
else
note FAIL "code=\$T23_POLL_CODE state=\$T23_STATE elapsed=\${ELAPSED_T23}s pollId=\$T23_POLL_ID (expected \$BOOTSTRAP_ID) body=\$T23_POLL_BODY"
FAIL=\$((FAIL+1))
fi
fi
# ---------------------------------------------------------------------------
# Test 24 - Cost ledger completeness after job completion
# Uses the completed JOB_ID from T6 (guarded on STATE_T6=complete).
# Verifies 8 cost-ledger fields are non-null:
# actualInputTokens, actualOutputTokens, totalTokens, actualCostUsd,
# actualAmountSats, workAmountSats, refundAmountSats, refundState.
# Honest-accounting invariant: actualAmountSats <= workAmountSats.
# refundState must be one of: not_applicable | pending | paid.
# ---------------------------------------------------------------------------
sep "Test 24 — Cost ledger completeness (job completed)"
if [[ -n "\$JOB_ID" && "\$STATE_T6" == "complete" ]]; then
T24_RES=\$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$JOB_ID")
T24_BODY=\$(body_of "\$T24_RES"); T24_CODE=\$(code_of "\$T24_RES")
T24_LEDGER=\$(echo "\$T24_BODY" | jq '.costLedger' 2>/dev/null || echo "null")
T24_REFUND_STATE=\$(echo "\$T24_LEDGER" | jq -r '.refundState' 2>/dev/null || echo "")
T24_FIELDS_OK=true
for field in actualInputTokens actualOutputTokens totalTokens actualCostUsd actualAmountSats workAmountSats refundAmountSats refundState; do
VAL=\$(echo "\$T24_LEDGER" | jq -r ".\$field" 2>/dev/null || echo "")
[[ -z "\$VAL" || "\$VAL" == "null" ]] && T24_FIELDS_OK=false || true
done
T24_INV_OK=\$(echo "\$T24_LEDGER" | jq '
(.actualAmountSats != null) and
(.workAmountSats != null) and
(.actualAmountSats <= .workAmountSats) and
(.refundAmountSats >= 0)
' 2>/dev/null || echo "false")
if [[ "\$T24_CODE" == "200" \\
&& "\$T24_LEDGER" != "null" \\
&& "\$T24_FIELDS_OK" == "true" \\
&& "\$T24_INV_OK" == "true" \\
&& "\$T24_REFUND_STATE" =~ ^(not_applicable|pending|paid)\$ ]]; then
T24_ACTUAL=\$(echo "\$T24_LEDGER" | jq -r '.actualAmountSats' 2>/dev/null || echo "?")
T24_WORK=\$(echo "\$T24_LEDGER" | jq -r '.workAmountSats' 2>/dev/null || echo "?")
T24_REFUND=\$(echo "\$T24_LEDGER" | jq -r '.refundAmountSats' 2>/dev/null || echo "?")
note PASS "8 fields non-null, actualAmountSats(\$T24_ACTUAL)<=workAmountSats(\$T24_WORK), refundAmountSats=\$T24_REFUND, refundState=\$T24_REFUND_STATE"
PASS=\$((PASS+1))
else
note FAIL "code=\$T24_CODE ledger=\$T24_LEDGER fields_ok=\$T24_FIELDS_OK inv_ok=\$T24_INV_OK refundState=\$T24_REFUND_STATE"
FAIL=\$((FAIL+1))
fi
else
note SKIP "Skipping — job not complete (STATE_T6=\${STATE_T6:-not_set}) or no JOB_ID"
SKIP=\$((SKIP+1))
fi
task/35: Testkit T25–T36 — Nostr identity + trust engine coverage ## 12 new tests added to artifacts/api-server/src/routes/testkit.ts Inserted before the Summary block, after T24 (cost ledger). T25 — POST /identity/challenge: HTTP 200, nonce=64-char hex, expiresAt=ISO T26 — POST /identity/verify {}: HTTP 400, non-empty error T27 — POST /identity/verify fake nonce: HTTP 401, error contains "Nonce not found" (uses a plausible-looking event structure to hit the nonce check, not the signature check — tests the right layer) T28 — GET /identity/me no header: HTTP 401, error contains "Missing" T29 — GET /identity/me invalid token: HTTP 401 (Invalid/expired wording) T30 — POST /sessions bad X-Nostr-Token: HTTP 401, "Invalid or expired", no sessionId T31 — POST /jobs bad X-Nostr-Token: HTTP 401, "Invalid or expired" T32 — POST /sessions anonymous: HTTP 201, trust_tier="anonymous"; captures T32_SESSION_ID T33 — POST /jobs anonymous: HTTP 201, trust_tier="anonymous"; captures T33_JOB_ID T34 — GET /jobs/:id (using T33_JOB_ID): HTTP 200, trust_tier non-null and "anonymous" T35 — GET /sessions/:id (using T32_SESSION_ID): HTTP 200, trust_tier="anonymous" T36 — Full challenge→sign→verify E2E: inline node CJS script generates ephemeral secp256k1 keypair via nostr-tools CJS bundle, POSTs challenge, signs kind=27235 event with finalizeEvent(), verifies → nostr_token, GETs /identity/me, asserts tier=new, interactionCount=0, pubkey matches. Guard: SKIP if node not in PATH or script fails. ## nostr-tools import strategy nostr-tools v2 is ESM-only. CJS workaround: the package ships a CJS bundle at lib/cjs/index.js. T36 uses require() with the absolute path to that bundle. Falls back to bare require('nostr-tools') for portability, exits with code 1 if neither works — bash guard catches this and marks T36 SKIP (not FAIL). ## Stubs T37–T40 added as bash block comments after T36 Format: `# FUTURE T3N: <description>` so they are grepped easily. 
Covers: GET /api/estimate (cost preview), anonymous Lightning gate, trusted free tier, Timmy-initiates-zap. Does not affect PASS/FAIL totals. ## TIMMY_TEST_PLAN.md updated New "Nostr identity + trust engine (tests 25–36)" section added to the test table. ## TypeScript: 0 errors. All 12 tests smoke-tested individually against localhost:8080. T25-T35: all correct HTTP status codes and JSON fields verified via curl. T36: full E2E verified — tier=new, icount=0, pubkey matches /identity/me response.
2026-03-19 21:09:50 +00:00
# ===========================================================================
# T25-T36 - Nostr identity + trust engine
# ===========================================================================
# ---------------------------------------------------------------------------
# T25 - POST /identity/challenge returns valid nonce
# ---------------------------------------------------------------------------
sep "Test 25 — POST /identity/challenge returns valid nonce"
T25_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/identity/challenge")
T25_BODY=\$(body_of "\$T25_RES"); T25_CODE=\$(code_of "\$T25_RES")
T25_NONCE=\$(echo "\$T25_BODY" | jq -r '.nonce' 2>/dev/null || echo "")
T25_EXP=\$(echo "\$T25_BODY" | jq -r '.expiresAt' 2>/dev/null || echo "")
T25_NONCE_OK=false; T25_EXP_OK=false
[[ "\$T25_NONCE" =~ ^[0-9a-f]{64}\$ ]] && T25_NONCE_OK=true || true
[[ "\$T25_EXP" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}T ]] && T25_EXP_OK=true || true
if [[ "\$T25_CODE" == "200" && "\$T25_NONCE_OK" == "true" && "\$T25_EXP_OK" == "true" ]]; then
note PASS "HTTP 200, nonce=64-char-hex, expiresAt=ISO"
PASS=\$((PASS+1))
else
note FAIL "code=\$T25_CODE nonce='\$T25_NONCE' expiresAt='\$T25_EXP'"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T26 - POST /identity/verify with missing event body → 400
# ---------------------------------------------------------------------------
sep "Test 26 — POST /identity/verify missing event → 400"
T26_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/identity/verify" \\
-H "Content-Type: application/json" -d '{}')
T26_BODY=\$(body_of "\$T26_RES"); T26_CODE=\$(code_of "\$T26_RES")
T26_ERR=\$(echo "\$T26_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T26_CODE" == "400" && -n "\$T26_ERR" && "\$T26_ERR" != "null" ]]; then
note PASS "HTTP 400, error: \${T26_ERR:0:80}"
PASS=\$((PASS+1))
else
note FAIL "code=\$T26_CODE body=\$T26_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T27 - POST /identity/verify with unknown nonce → 401 "Nonce not found"
# Uses a real-looking (structurally plausible) event with a fake nonce.
# ---------------------------------------------------------------------------
sep "Test 27 — POST /identity/verify unknown nonce → 401"
T27_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/identity/verify" \\
-H "Content-Type: application/json" \\
-d '{"event":{"pubkey":"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa","content":"0000000000000000000000000000000000000000000000000000000000000000","kind":27235,"tags":[],"created_at":1700000000,"id":"bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb","sig":"cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc"}}')
T27_BODY=\$(body_of "\$T27_RES"); T27_CODE=\$(code_of "\$T27_RES")
T27_ERR=\$(echo "\$T27_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T27_CODE" == "401" && "\$T27_ERR" == *"Nonce not found"* ]]; then
note PASS "HTTP 401, error contains 'Nonce not found'"
PASS=\$((PASS+1))
else
note FAIL "code=\$T27_CODE err='\$T27_ERR' body=\$T27_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T28 - GET /identity/me with no X-Nostr-Token header → 401 "Missing"
# ---------------------------------------------------------------------------
sep "Test 28 — GET /identity/me without token → 401"
T28_RES=\$(curl -s -w "\\n%{http_code}" "\$BASE/api/identity/me")
T28_BODY=\$(body_of "\$T28_RES"); T28_CODE=\$(code_of "\$T28_RES")
T28_ERR=\$(echo "\$T28_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T28_CODE" == "401" && "\$T28_ERR" == *"Missing"* ]]; then
note PASS "HTTP 401, error contains 'Missing'"
PASS=\$((PASS+1))
else
note FAIL "code=\$T28_CODE err='\$T28_ERR'"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T29 - GET /identity/me with invalid X-Nostr-Token → 401
# ---------------------------------------------------------------------------
sep "Test 29 — GET /identity/me with invalid token → 401"
T29_RES=\$(curl -s -w "\\n%{http_code}" "\$BASE/api/identity/me" \\
-H "X-Nostr-Token: totally.invalid.token")
T29_BODY=\$(body_of "\$T29_RES"); T29_CODE=\$(code_of "\$T29_RES")
T29_ERR=\$(echo "\$T29_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T29_CODE" == "401" && ( "\$T29_ERR" == *"Invalid"* || "\$T29_ERR" == *"expired"* ) ]]; then
note PASS "HTTP 401, error: \${T29_ERR:0:80}"
PASS=\$((PASS+1))
else
note FAIL "code=\$T29_CODE err='\$T29_ERR'"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T30 - POST /sessions with bogus X-Nostr-Token → 401
# ---------------------------------------------------------------------------
sep "Test 30 — POST /sessions with bad token → 401"
T30_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/sessions" \\
-H "Content-Type: application/json" \\
-H "X-Nostr-Token: badtoken" \\
-d '{"amount_sats":200}')
T30_BODY=\$(body_of "\$T30_RES"); T30_CODE=\$(code_of "\$T30_RES")
T30_ERR=\$(echo "\$T30_BODY" | jq -r '.error' 2>/dev/null || echo "")
T30_SESSION=\$(echo "\$T30_BODY" | jq -r '.sessionId // empty' 2>/dev/null || echo "")
if [[ "\$T30_CODE" == "401" && "\$T30_ERR" == *"Invalid or expired"* && -z "\$T30_SESSION" ]]; then
note PASS "HTTP 401, error contains 'Invalid or expired', no sessionId created"
PASS=\$((PASS+1))
else
note FAIL "code=\$T30_CODE err='\$T30_ERR' sessionId='\$T30_SESSION'"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T31 - POST /jobs with bogus X-Nostr-Token → 401
# ---------------------------------------------------------------------------
sep "Test 31 — POST /jobs with bad token → 401"
T31_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" \\
-H "X-Nostr-Token: badtoken" \\
-d '{"request":"What is Bitcoin?"}')
T31_BODY=\$(body_of "\$T31_RES"); T31_CODE=\$(code_of "\$T31_RES")
T31_ERR=\$(echo "\$T31_BODY" | jq -r '.error' 2>/dev/null || echo "")
if [[ "\$T31_CODE" == "401" && "\$T31_ERR" == *"Invalid or expired"* ]]; then
note PASS "HTTP 401, error contains 'Invalid or expired'"
PASS=\$((PASS+1))
else
note FAIL "code=\$T31_CODE err='\$T31_ERR' body=\$T31_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T32 - POST /sessions (anonymous) includes trust_tier = "anonymous"
# Capture T32_SESSION_ID for reuse in T35.
# ---------------------------------------------------------------------------
sep "Test 32 — POST /sessions anonymous → trust_tier=anonymous"
T32_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/sessions" \\
-H "Content-Type: application/json" \\
-d '{"amount_sats":200}')
T32_BODY=\$(body_of "\$T32_RES"); T32_CODE=\$(code_of "\$T32_RES")
T32_SESSION_ID=\$(echo "\$T32_BODY" | jq -r '.sessionId' 2>/dev/null || echo "")
T32_TIER=\$(echo "\$T32_BODY" | jq -r '.trust_tier' 2>/dev/null || echo "")
if [[ "\$T32_CODE" == "201" && "\$T32_TIER" == "anonymous" && -n "\$T32_SESSION_ID" && "\$T32_SESSION_ID" != "null" ]]; then
note PASS "HTTP 201, trust_tier=anonymous, sessionId=\${T32_SESSION_ID:0:8}..."
PASS=\$((PASS+1))
else
note FAIL "code=\$T32_CODE trust_tier='\$T32_TIER' body=\$T32_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T33 - POST /jobs (anonymous) includes trust_tier = "anonymous"
# Capture T33_JOB_ID for reuse in T34.
# ---------------------------------------------------------------------------
sep "Test 33 — POST /jobs anonymous → trust_tier=anonymous"
T33_RES=\$(curl -s -w "\\n%{http_code}" -X POST "\$BASE/api/jobs" \\
-H "Content-Type: application/json" \\
-d '{"request":"T33 trust tier probe"}')
T33_BODY=\$(body_of "\$T33_RES"); T33_CODE=\$(code_of "\$T33_RES")
T33_JOB_ID=\$(echo "\$T33_BODY" | jq -r '.jobId' 2>/dev/null || echo "")
T33_TIER=\$(echo "\$T33_BODY" | jq -r '.trust_tier' 2>/dev/null || echo "")
if [[ "\$T33_CODE" == "201" && "\$T33_TIER" == "anonymous" && -n "\$T33_JOB_ID" && "\$T33_JOB_ID" != "null" ]]; then
note PASS "HTTP 201, trust_tier=anonymous, jobId=\${T33_JOB_ID:0:8}..."
PASS=\$((PASS+1))
else
note FAIL "code=\$T33_CODE trust_tier='\$T33_TIER' body=\$T33_BODY"
FAIL=\$((FAIL+1))
fi
# ---------------------------------------------------------------------------
# T34 - GET /jobs/:id always includes trust_tier (uses T33_JOB_ID)
# ---------------------------------------------------------------------------
sep "Test 34 — GET /jobs/:id always includes trust_tier"
if [[ -n "\$T33_JOB_ID" && "\$T33_JOB_ID" != "null" ]]; then
T34_RES=\$(curl -s -w "\\n%{http_code}" "\$BASE/api/jobs/\$T33_JOB_ID")
T34_BODY=\$(body_of "\$T34_RES"); T34_CODE=\$(code_of "\$T34_RES")
T34_TIER=\$(echo "\$T34_BODY" | jq -r '.trust_tier' 2>/dev/null || echo "")
if [[ "\$T34_CODE" == "200" && -n "\$T34_TIER" && "\$T34_TIER" != "null" && "\$T34_TIER" == "anonymous" ]]; then
note PASS "HTTP 200, trust_tier=\$T34_TIER (anonymous job)"
PASS=\$((PASS+1))
else
note FAIL "code=\$T34_CODE trust_tier='\$T34_TIER' body=\$T34_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No T33_JOB_ID available — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# T35 - GET /sessions/:id always includes trust_tier (uses T32_SESSION_ID)
# ---------------------------------------------------------------------------
sep "Test 35 — GET /sessions/:id always includes trust_tier"
if [[ -n "\$T32_SESSION_ID" && "\$T32_SESSION_ID" != "null" ]]; then
T35_RES=\$(curl -s -w "\\n%{http_code}" "\$BASE/api/sessions/\$T32_SESSION_ID")
T35_BODY=\$(body_of "\$T35_RES"); T35_CODE=\$(code_of "\$T35_RES")
T35_TIER=\$(echo "\$T35_BODY" | jq -r '.trust_tier' 2>/dev/null || echo "")
if [[ "\$T35_CODE" == "200" && "\$T35_TIER" == "anonymous" ]]; then
note PASS "HTTP 200, trust_tier=\$T35_TIER"
PASS=\$((PASS+1))
else
note FAIL "code=\$T35_CODE trust_tier='\$T35_TIER' body=\$T35_BODY"
FAIL=\$((FAIL+1))
fi
else
note SKIP "No T32_SESSION_ID available — skipping"
SKIP=\$((SKIP+1))
fi
# ---------------------------------------------------------------------------
# T36 - Full challenge → sign → verify end-to-end flow (inline node script)
# Uses the nostr-tools CJS bundle: absolute node_modules path first, bare module name as a portable fallback.
# Guards on node availability and script success; SKIP if either fails.
# ---------------------------------------------------------------------------
sep "Test 36 — Full challenge → sign → verify (Nostr identity flow)"
T36_SKIP=false
if ! command -v node >/dev/null 2>&1; then
note SKIP "node not found in PATH — skipping T36"
SKIP=\$((SKIP+1))
T36_SKIP=true
fi
if [[ "\$T36_SKIP" == "false" ]]; then
# Write a temp CommonJS script so we can import nostr-tools via require()
# with an absolute path to the CJS bundle. Falls back to SKIP on any failure.
T36_TMPFILE=\$(mktemp /tmp/nostr_t36_XXXXXX.cjs)
cat > "\$T36_TMPFILE" << 'NODESCRIPT'
'use strict';
const https = require('https');
const http = require('http');
const BASE = process.argv[2];
// Try the absolute CJS bundle path first (dev/Replit), then bare module name.
let nt;
const NOSTR_CJS = '/home/runner/workspace/artifacts/api-server/node_modules/nostr-tools/lib/cjs/index.js';
try { nt = require(NOSTR_CJS); } catch { try { nt = require('nostr-tools'); } catch { process.stderr.write('nostr-tools not importable\\n'); process.exit(1); } }
const { generateSecretKey, getPublicKey, finalizeEvent } = nt;
function request(url, opts, body) {
return new Promise((resolve, reject) => {
const u = new URL(url);
const mod = u.protocol === 'https:' ? https : http;
const req = mod.request(u, opts, (res) => {
let data = '';
res.on('data', c => data += c);
res.on('end', () => resolve({ status: res.statusCode, body: data }));
});
req.on('error', reject);
if (body) req.write(body);
req.end();
});
}
async function main() {
// 1. Generate ephemeral keypair
const sk = generateSecretKey();
const pubkey = getPublicKey(sk);
// 2. Get challenge nonce
const chalRes = await request(BASE + '/api/identity/challenge', { method: 'POST', headers: { 'Content-Type': 'application/json' } }, '{}');
if (chalRes.status !== 200) { process.stderr.write('challenge failed: ' + chalRes.status + '\n'); process.exit(1); }
const { nonce } = JSON.parse(chalRes.body);
// 3. Build and sign Nostr event (kind=27235, content=nonce)
const event = finalizeEvent({ kind: 27235, content: nonce, tags: [], created_at: Math.floor(Date.now() / 1000) }, sk);
// 4. POST /identity/verify
const verRes = await request(BASE + '/api/identity/verify',
{ method: 'POST', headers: { 'Content-Type': 'application/json' } },
JSON.stringify({ event }));
if (verRes.status !== 200) { process.stderr.write('verify failed: ' + verRes.status + ' ' + verRes.body + '\n'); process.exit(1); }
const verBody = JSON.parse(verRes.body);
const token = verBody.nostr_token;
const tier = verBody.trust && verBody.trust.tier;
const icount = verBody.trust && verBody.trust.interactionCount;
// 5. GET /identity/me
const meRes = await request(BASE + '/api/identity/me',
{ method: 'GET', headers: { 'X-Nostr-Token': token } });
if (meRes.status !== 200) { process.stderr.write('identity/me failed: ' + meRes.status + '\n'); process.exit(1); }
const meBody = JSON.parse(meRes.body);
const meTier = meBody.trust && meBody.trust.tier;
const mePubkey = meBody.pubkey;
// Print result as single JSON line
process.stdout.write(JSON.stringify({ pubkey, tier, icount, meTier, mePubkey }) + '\n');
}
main().catch(err => { process.stderr.write(String(err) + '\n'); process.exit(1); });
NODESCRIPT
T36_OUT=\$(node "\$T36_TMPFILE" "\$BASE" 2>/dev/null)
T36_EXIT=\$?
rm -f "\$T36_TMPFILE"
if [[ \$T36_EXIT -ne 0 || -z "\$T36_OUT" ]]; then
note SKIP "node script failed (nostr-tools unavailable or network error)"
SKIP=\$((SKIP+1))
else
T36_TIER=\$(echo "\$T36_OUT" | jq -r '.tier' 2>/dev/null || echo "")
T36_ICOUNT=\$(echo "\$T36_OUT" | jq -r '.icount' 2>/dev/null || echo "")
T36_METIER=\$(echo "\$T36_OUT" | jq -r '.meTier' 2>/dev/null || echo "")
T36_PK=\$(echo "\$T36_OUT" | jq -r '.pubkey' 2>/dev/null || echo "")
T36_MEPK=\$(echo "\$T36_OUT" | jq -r '.mePubkey' 2>/dev/null || echo "")
if [[ "\$T36_TIER" == "new" && "\$T36_ICOUNT" == "0" \\
&& "\$T36_METIER" == "new" && "\$T36_PK" == "\$T36_MEPK" && -n "\$T36_PK" ]]; then
note PASS "challenge→sign→verify OK: tier=\$T36_TIER icount=\$T36_ICOUNT identity/me pubkey matches"
PASS=\$((PASS+1))
else
note FAIL "tier=\$T36_TIER icount=\$T36_ICOUNT meTier=\$T36_METIER pkMatch=\$([[ "\$T36_PK" == "\$T36_MEPK" ]] && echo yes || echo no)"
FAIL=\$((FAIL+1))
fi
fi
fi
# ===========================================================================
# FUTURE STUBS: placeholders for upcoming tasks (do not affect PASS/FAIL)
# ===========================================================================
# These are bash comments only. They document planned tests so future tasks
# can implement them with the correct numbering context.
#
# FUTURE T37 (Task #27, cost routing): GET /api/estimate returns cost preview
# GET \$BASE/api/estimate?request=<text>
# Assert HTTP 200, estimatedSats is a positive integer
# Assert model, inputTokens, outputTokens are present
#
# FUTURE T38 (Task #27, cost routing): Anonymous job always hits Lightning gate
# Create anonymous job, poll to awaiting_work_payment
# Assert response.free_tier is absent or false in all poll responses
#
# FUTURE T39 (Task #27, free tier): Nostr-identified trusted identity gets a free response
# Requires identity with trust_score >= 50 (trusted tier) and daily budget not exhausted
# Submit request with identity token
# Assert HTTP 200, response.free_tier == true, no invoice created
#
# FUTURE T40 (Task #29, Timmy as peer): Timmy initiates a zap
# POST to /api/identity/me/tip (or similar)
# Assert Timmy initiates a Lightning outbound payment to caller's LNURL
# ---------------------------------------------------------------------------
# Summary
# ---------------------------------------------------------------------------
echo
echo "======================================="
echo " RESULTS: PASS=\$PASS FAIL=\$FAIL SKIP=\$SKIP"
echo "======================================="
if [[ "\$FAIL" -gt 0 ]]; then exit 1; fi
`;
res.setHeader("Content-Type", "text/x-shellscript; charset=utf-8");
res.setHeader("Content-Disposition", 'inline; filename="timmy_test.sh"');
res.send(script);
});
// Resolve the plan path relative to this module's directory, not the process cwd.
const _dirname = dirname(fileURLToPath(import.meta.url));
const PLAN_PATH = resolve(_dirname, "../../../../TIMMY_TEST_PLAN.md");
/**
* GET /api/testkit/plan
*
* Returns TIMMY_TEST_PLAN.md verbatim as text/markdown.
* The file is read on each request, so edits are served without a restart.
*/
router.get("/testkit/plan", (_req: Request, res: Response) => {
let content: string;
try {
content = readFileSync(PLAN_PATH, "utf-8");
} catch {
res.status(500).json({ error: "TIMMY_TEST_PLAN.md not found on server" });
return;
}
res.setHeader("Content-Type", "text/markdown; charset=utf-8");
res.send(content);
});
/**
* GET /api/testkit/report
*
* Returns only the report template section of TIMMY_TEST_PLAN.md:
* everything from the "## Report template" heading to end-of-file.
* Returned as text/plain so agents can copy and fill in directly.
*/
router.get("/testkit/report", (_req: Request, res: Response) => {
let content: string;
try {
content = readFileSync(PLAN_PATH, "utf-8");
} catch {
res.status(500).json({ error: "TIMMY_TEST_PLAN.md not found on server" });
return;
}
const marker = "## Report template";
const idx = content.indexOf(marker);
if (idx === -1) {
res.status(500).json({ error: "Report template section not found in plan" });
return;
}
res.setHeader("Content-Type", "text/plain; charset=utf-8");
res.send(content.slice(idx));
});
export default router;