the-beacon/scripts/guardrails.sh
Alexander Whitestone 0e5538319b
Some checks failed
Accessibility Checks / a11y-audit (pull_request) Failing after 2s
Guardrails / guardrails (pull_request) Successful in 4s
Smoke Test / smoke (pull_request) Failing after 3s
feat: agent guardrails, headless smoke test, and CI workflow
Adds a three-layer defense against the class of silent correctness bugs
surfaced in #54. The layers are: documented rules (AGENTS.md), dynamic
assertions (scripts/smoke.mjs), and static checks (scripts/guardrails.sh).
A new Gitea Actions workflow runs the dynamic + static layers on every PR.

Also fixes two of the bugs the smoke test immediately caught on main:

1. G.flags is now initialized to {} in the globals block. Previously it
   was created lazily by the p_creativity project effect, which forced
   every reader to write `G.flags && G.flags.x` — and left a window where
   a new writer could drop the defensive guard and crash.

2. The community_drama debuff no longer mutates G.codeBoost. Its applyFn
   was invoked from updateRates() on every tick (10 Hz), so the original
   `G.codeBoost *= 0.7` compounded: after ~100 ticks of the drama debuff,
   codeBoost was ~3e-16 instead of the intended 0.7. The fix targets
   G.codeRate instead, which is reset at the top of updateRates() and is
   therefore safe to multiplicatively reduce inside applyFn. AGENTS.md
   rule 1 explains the distinction between persistent multipliers and
   per-tick rate fields so future debuffs don't reintroduce the bug.
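
The compounding described in item 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses awk from a shell rather than the game's own code, and `boost` and `rate` are stand-in names for a persistent multiplier and a per-tick field, not the real `G.codeBoost` and `G.codeRate`.

```shell
#!/usr/bin/env bash
# Illustrative arithmetic only; `boost` and `rate` are stand-ins, not game.js fields.

# Persistent multiplier: the 0.7 factor is applied again on every tick,
# so the value decays geometrically (0.7^100 after 100 ticks).
compounded=$(awk 'BEGIN {
  boost = 1
  for (i = 0; i < 100; i++) boost *= 0.7
  printf "%.1e", boost
}')

# Per-tick field: the value is reset at the top of each tick before the
# debuff is applied, so it ends at 0.7 no matter how many ticks run.
per_tick=$(awk 'BEGIN {
  for (i = 0; i < 100; i++) {
    rate = 1      # reset at the top of the tick
    rate *= 0.7   # debuff applied once per tick
  }
  printf "%.1f", rate
}')

printf 'persistent after 100 ticks: %s\n' "$compounded"   # 3.2e-16
printf 'per-tick after 100 ticks:   %s\n' "$per_tick"     # 0.7
```

The first loop reproduces the ~3e-16 figure quoted above; the second shows why multiplying a field that is reset every tick is safe.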

The smoke test (`scripts/smoke.mjs`) runs game.js in a vm sandbox with a
minimal DOM stub, no npm deps. It boots the engine, runs ticks, clicks,
buys a building, fires every debuff, checks codeBoost stability, checks
updateRates idempotency, and does a save/load round-trip. 30 assertions,
~0.1s on a dev machine.

The static guardrails (`scripts/guardrails.sh`) grep for the patterns
AGENTS.md forbids. Two rules (click power single-source, no Object.assign
in loadGame) are marked PENDING because PR #55 is landing the fix for
them — the workflow reports them but doesn't fail until #55 merges.
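
One way a PENDING rule could be reported without failing the job is sketched below. The `report` helper, its `hard`/`pending` mode argument, and the sample hit lines are hypothetical and are not taken from `scripts/guardrails.sh`; treat this as a sketch under those assumptions rather than the script's actual mechanism.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: `report`, its mode argument, and the sample hits are
# illustrative assumptions, not code from scripts/guardrails.sh.
set -u
fail=0

# report MODE NAME HITS
#   MODE "hard":    any hit marks the run as failed.
#   MODE "pending": hits are counted and printed, but the run is not failed.
report() {
  local mode=$1 name=$2 hits=$3 count=0
  if [ -n "$hits" ]; then
    count=$(printf '%s\n' "$hits" | grep -c .)
  fi
  if [ "$count" -eq 0 ]; then
    printf '%s: PASS\n' "$name"
  elif [ "$mode" = "pending" ]; then
    printf '%s: PENDING (%s violation(s), not failing)\n' "$name" "$count"
  else
    printf '%s: FAIL (%s violation(s))\n' "$name" "$count"
    fail=1
  fi
}

# Call report directly (not inside $(...)) so that fail=1 reaches this shell.
report pending "rule 2" "game.js:10: sample hit
game.js:42: sample hit"
pending_fail=$fail   # a pending rule leaves fail untouched

report hard "rule 1" "game.js:7: sample hit"
hard_fail=$fail      # a hard rule with hits sets fail=1
```

With a helper shaped like this, converting a rule to a hard failure once the blocking PR merges would be a one-word change at the call site.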

Refs: #54
2026-04-10 20:50:50 -04:00


#!/usr/bin/env bash
# Static guardrail checks for game.js. Run from repo root.
#
# Each check prints a PASS/FAIL line and contributes to the final exit code.
# The rules enforced here come from AGENTS.md — keep the two files in sync.
#
# Some rules are marked PENDING: they describe invariants we've agreed on but
# haven't reached on main yet (because another open PR is landing the fix).
# PENDING rules print their current violation count without failing the job;
# convert them to hard failures once the blocking PR merges.
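set -u
# Fail loudly when run from the wrong directory: with game.js missing, the
# grep and awk checks below would find nothing and report PASS.
if [ ! -f game.js ]; then
  printf '%s\n' "guardrails: game.js not found; run from repo root" >&2
  exit 2
fi
fail=0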
say() { printf '%s\n' "$*"; }
banner() { say ""; say "==== $* ===="; }
# ---------- Rule 1: no *Boost mutation inside applyFn blocks ----------
# Persistent multipliers (codeBoost, computeBoost, ...) must not be written
# from any function that runs per tick. The `applyFn` of a debuff is invoked
# on every updateRates() call, so `G.codeBoost *= 0.7` inside applyFn compounds
# and silently zeros code production. See AGENTS.md rule 1.
banner "Rule 1: no *Boost mutation inside applyFn"
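rule1_hits=$(awk '
  # Enter an applyFn block. There is no "next" here: the line that opens the
  # block must itself be scanned, both for its braces and for a one-line body.
  /applyFn:/ && !inFn { inFn = 1; brace = 0; start = NR }
  inFn {
    # Flag assignments (=, *=, /=, +=, -=) but not comparisons (==, ===).
    if ($0 ~ /(codeBoost|computeBoost|knowledgeBoost|userBoost|impactBoost)[[:space:]]*[*\/+-]?=([^=]|$)/) {
      print FILENAME ":" NR ": " $0
    }
    opens = gsub(/\{/, "{")
    closes = gsub(/\}/, "}")
    brace += opens - closes
    # Leave once the braces balance; stay on the opening line if its "{"
    # has not appeared yet.
    if (brace <= 0 && (opens > 0 || NR > start)) inFn = 0
  }
' game.js)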
if [ -z "$rule1_hits" ]; then
  say " PASS"
else
  say " FAIL — see AGENTS.md rule 1"
  say "$rule1_hits"
  fail=1
fi
# ---------- Rule 2: click power has a single source (getClickPower) ----------
# The formula should live only inside getClickPower(). If it appears anywhere
# else, the sites will drift when someone changes the formula.
banner "Rule 2: click power formula has one source"
rule2_hits=$(grep -nE 'Math\.floor\(G\.buildings\.autocoder \* 0\.5\)' game.js || true)
rule2_count=0
if [ -n "$rule2_hits" ]; then
  rule2_count=$(printf '%s\n' "$rule2_hits" | grep -c .)
fi
if [ "$rule2_count" -le 1 ]; then
  say " PASS ($rule2_count site)"
else
  say " FAIL — $rule2_count sites; inline into getClickPower() only"
  printf '%s\n' "$rule2_hits"
  fail=1
fi
# ---------- Rule 3: loadGame uses a whitelist, not Object.assign ----------
# Object.assign(G, data) lets a malicious or corrupted save file set any G
# field, and hides drift when saveGame's explicit list diverges from what
# the game actually reads. See AGENTS.md rule 3.
banner "Rule 3: loadGame uses a whitelist"
rule3_hits=$(grep -nE 'Object\.assign\(G,[[:space:]]*data\)' game.js || true)
if [ -z "$rule3_hits" ]; then
  say " PASS"
else
  say " FAIL — see AGENTS.md rule 3"
  printf '%s\n' "$rule3_hits"
  fail=1
fi
# ---------- Rule 7: no secrets in the tree ----------
# Scans for common token prefixes. Expand the pattern list when new key
# formats appear in the fleet. See AGENTS.md rule 7.
banner "Rule 7: secret scan"
secret_hits=$(grep -rnE 'sk-ant-[a-zA-Z0-9_-]{6,}|sk-or-[a-zA-Z0-9_-]{6,}|ghp_[a-zA-Z0-9]{20,}|AKIA[0-9A-Z]{16}' \
  --include='*.js' --include='*.json' --include='*.md' --include='*.html' \
  --include='*.yml' --include='*.yaml' --include='*.py' --include='*.sh' \
  --exclude-dir=.git --exclude-dir=.gitea . || true)
# Strip our own literal-prefix patterns (this file, AGENTS.md, workflow) so the
# check doesn't match the very grep that implements it. The filter is anchored
# to the path field of grep's "path:line:text" output, so a hit in another
# file is kept even when its text happens to mention one of these names.
secret_hits=$(printf '%s\n' "$secret_hits" | grep -v -E '^[^:]*(AGENTS\.md|guardrails\.sh|guardrails\.yml):' || true)
if [ -z "$secret_hits" ]; then
  say " PASS"
else
  say " FAIL"
  printf '%s\n' "$secret_hits"
  fail=1
fi
banner "result"
if [ "$fail" = "0" ]; then
  say "all guardrails passed"
  exit 0
else
  say "one or more guardrails failed"
  exit 1
fi
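
A brace-tracking scanner of the kind Rule 1 relies on can be sanity-checked against a throwaway fixture. The sketch below is an illustration under stated assumptions: the fixture contents and the `scan_applyfn` function name are hypothetical, and the awk program is a simplified variant (two field names, `[ \t]` instead of a POSIX class), not the exact program from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical fixture and helper name; a simplified variant of the Rule 1 scan.
set -u
fixture=$(mktemp)
trap 'rm -f "$fixture"' EXIT

cat > "$fixture" <<'EOF'
const debuffs = [
  {
    id: 'bad',
    applyFn: () => {
      G.codeRate *= 0.9;
      G.codeBoost *= 0.7;
    },
  },
  {
    id: 'ok',
    applyFn: () => {
      if (G.codeBoost === 1) G.codeRate *= 0.7;
    },
  },
];
G.codeBoost = 2;
EOF

# Print "line: text" for every *Boost assignment found inside an applyFn body.
scan_applyfn() {
  awk '
    /applyFn:/ && !inFn { inFn = 1; depth = 0; start = NR }
    inFn {
      # Assignment operators only; "==" and "===" comparisons are skipped.
      if ($0 ~ /(codeBoost|computeBoost)[ \t]*[*\/+-]?=([^=]|$)/) print NR ": " $0
      opens = gsub(/\{/, "{")
      closes = gsub(/\}/, "}")
      depth += opens - closes
      if (depth <= 0 && (opens > 0 || NR > start)) inFn = 0
    }
  ' "$1"
}

hits=$(scan_applyfn "$fixture")
printf '%s\n' "$hits"
```

Only fixture line 6 is reported: the second body line of the first `applyFn` is caught, the `===` comparison on line 12 is ignored, and the assignment on line 16 is outside any `applyFn`.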