Timmy_Foundation/the-beacon

Alexander Whitestone 0e5538319b

Accessibility Checks / a11y-audit (pull_request) Failing after 2s

Guardrails / guardrails (pull_request) Successful in 4s

Smoke Test / smoke (pull_request) Failing after 3s

feat: agent guardrails, headless smoke test, and CI workflow

Adds a three-layer defense against the class of silent correctness bugs
surfaced in #54. The layers are: documented rules (AGENTS.md), dynamic
assertions (scripts/smoke.mjs), and static checks (scripts/guardrails.sh).
A new Gitea Actions workflow runs the dynamic + static layers on every PR.

Also fixes two of the bugs the smoke test immediately caught on main:

1. G.flags is now initialized to {} in the globals block. Previously it
   was created lazily by the p_creativity project effect, which forced
   every reader to write `G.flags && G.flags.x` — and left a window where
   a new writer could drop the defensive guard and crash.

2. The community_drama debuff no longer mutates G.codeBoost. Its applyFn
   was invoked from updateRates() on every tick (10 Hz), so the original
   `G.codeBoost *= 0.7` compounded: after ~100 ticks of the drama debuff,
   codeBoost was ~3e-16 instead of the intended 0.7. The fix targets
   G.codeRate instead, which is reset at the top of updateRates() and is
   therefore safe to multiplicatively reduce inside applyFn. AGENTS.md
   rule 1 explains the distinction between persistent multipliers and
   per-tick rate fields so future debuffs don't reintroduce the bug.

The smoke test (`scripts/smoke.mjs`) runs game.js in a vm sandbox with a
minimal DOM stub, no npm deps. It boots the engine, runs ticks, clicks,
buys a building, fires every debuff, checks codeBoost stability, checks
updateRates idempotency, and does a save/load round-trip. 30 assertions,
~0.1s on a dev machine.

The static guardrails (`scripts/guardrails.sh`) grep for the patterns
AGENTS.md forbids. Two rules (click power single-source, no Object.assign
in loadGame) are marked PENDING because PR #55 is landing the fix for
them — the workflow reports them but doesn't fail until #55 merges.

Refs: #54

2026-04-10 20:50:50 -04:00

AGENTS.md — guardrails for agents contributing to The Beacon

This file documents the non-obvious rules that any contributor (human or AI) should know before touching game.js. It is enforced in two layers: a headless smoke test (scripts/smoke.mjs) and static grep-based checks (.gitea/workflows/guardrails.yml). Both run on every PR.

The Beacon is a single-file, browser-only idle game. The codebase is small, the mechanics are interlocked, and the most common failure mode is quietly wrong math — the game keeps rendering, the player just slowly loses. These rules exist to make that class of bug impossible to land.

1. The persistent-multiplier rule (`*Boost`)

G.codeBoost, G.computeBoost, G.knowledgeBoost, G.userBoost, G.impactBoost are persistent multipliers. They are set once (by projects, sprints, alignment events) and read on every tick inside updateRates(). They are never reset between ticks.

G.codeRate, G.computeRate, etc. are per-tick rate fields. They are reset to 0 at the top of updateRates() and rebuilt from scratch every tick.

The bug class this created: community_drama's original applyFn did G.codeBoost *= 0.7. Because applyFn is invoked from updateRates() every 100 ms, the factor compounded on every tick: 0.7^100 ≈ 3e-16, so codeBoost decayed to ~3e-16 after about 100 ticks (roughly ten seconds of play). Rendering was fine and the click button worked; the player just saw their code rate silently vanish.

The rule:

Inside any function that runs more than once per persistent state change — specifically updateRates(), tick(), applyFn, or anything invoked from them — never mutate G.codeBoost, G.computeBoost, G.knowledgeBoost, G.userBoost, or G.impactBoost.

If you want a debuff to reduce code production, mutate G.codeRate inside the debuff's applyFn. G.codeRate is zeroed at the top of updateRates(), so *= 0.7 applies exactly once per tick — which is what you want.
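The difference between the two targets can be seen in a small simulation. This is a minimal sketch, not the game's actual code: the state object, the base production of 10, and both applyFn variants are simplified stand-ins, and updateRates() here models only the reset-then-rebuild behavior described above.

```javascript
// Simplified stand-in for the engine: one persistent multiplier, one per-tick rate.
function makeState() {
  return { codeBoost: 1, codeRate: 0, activeDebuffs: [] };
}

// Models the documented behavior: reset the rate, rebuild it, then run every applyFn.
function updateRates(G) {
  G.codeRate = 0;                  // per-tick field, rebuilt from scratch
  G.codeRate += 10 * G.codeBoost;  // hypothetical base production of 10
  for (const debuff of G.activeDebuffs) debuff.applyFn(G);
}

// Buggy variant: mutates the persistent multiplier, so the factor compounds.
const buggy = makeState();
buggy.activeDebuffs.push({ applyFn: (G) => { G.codeBoost *= 0.7; } });

// Fixed variant: mutates the per-tick rate, which is reset before the next tick.
const fixed = makeState();
fixed.activeDebuffs.push({ applyFn: (G) => { G.codeRate *= 0.7; } });

for (let tick = 0; tick < 100; tick++) {
  updateRates(buggy);
  updateRates(fixed);
}

console.log(buggy.codeBoost);                 // on the order of 3e-16
console.log(fixed.codeBoost, fixed.codeRate); // 1 and roughly 7
```

In the fixed variant the 0.7 factor is applied to a freshly rebuilt value each tick, so the reduction never accumulates.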

*Boost fields should only be written by one-shot events: project effect() callbacks, sprint start/end, alignment resolutions. Those run zero or one times per player action, not per tick.

The guardrails workflow runs grep -nE '(codeBoost|computeBoost|knowledgeBoost|userBoost|impactBoost)\s*[*/+-]?=' game.js and fails the job if any hit falls inside an applyFn block or inside the updateRates()/tick() bodies.
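To see what that pattern does and does not flag, here is the same expression as a JavaScript regular expression run against a few sample lines. This is only an illustration: findBoostWrites and the sample input are hypothetical, and the sketch does not attempt the second half of the check (deciding whether a hit sits inside applyFn, updateRates(), or tick()).

```javascript
// Same pattern as the grep above, expressed as a JavaScript regular expression.
const BOOST_WRITE = /(codeBoost|computeBoost|knowledgeBoost|userBoost|impactBoost)\s*[*\/+-]?=/;

// Return 1-based line numbers of lines that appear to write a *Boost field.
function findBoostWrites(source) {
  const hits = [];
  source.split("\n").forEach((line, index) => {
    if (BOOST_WRITE.test(line)) hits.push(index + 1);
  });
  return hits;
}

// Hypothetical sample input, not taken from game.js.
const sample = [
  "G.codeBoost *= 0.7;",         // write: flagged
  "G.codeRate *= 0.7;",          // per-tick field: not flagged
  "rate = base * G.userBoost;",  // read only: not flagged
  "G.impactBoost = 2;",          // write: flagged
].join("\n");

console.log(findBoostWrites(sample)); // [ 1, 4 ]
```

Note that the pattern only looks at the first `=` after the field name, so a comparison such as `G.codeBoost === 1` also matches and would need to be filtered out or reviewed by hand.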

2. Click power has exactly one source of truth

The click-power formula (1 + floor(autocoder * 0.5) + max(0, phase - 1) * 2) * codeBoost used to live in three places: writeCode(), autoType(), and the Swarm Protocol branch of updateRates(). Changing one without the others was trivially easy and hard to notice.

The rule:

Click power is computed only by getClickPower(). Any function that needs "how much code does one click generate right now" must call getClickPower() directly. Do not inline the formula.

The guardrails check greps for Math.floor(G.buildings.autocoder * 0.5) and fails if it appears outside getClickPower().
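A minimal sketch of what the single source of truth might look like, following the formula quoted above. It is not the actual implementation: G is passed as a parameter here for testability, and G.phase and G.code are assumed field names.

```javascript
// Single source of truth for click power, following the formula quoted above.
// G.phase and G.code are assumed names; G.buildings.autocoder and G.codeBoost appear in the rules.
function getClickPower(G) {
  const fromAutocoders = Math.floor(G.buildings.autocoder * 0.5);
  const fromPhase = Math.max(0, G.phase - 1) * 2;
  return (1 + fromAutocoders + fromPhase) * G.codeBoost;
}

// Callers delegate instead of inlining the formula.
function writeCode(G) {
  G.code += getClickPower(G);
}

const state = { buildings: { autocoder: 5 }, phase: 3, codeBoost: 2, code: 0 };
writeCode(state);
console.log(state.code); // (1 + 2 + 4) * 2 = 14
```

With this shape, autoType() and the Swarm Protocol branch would call getClickPower() the same way writeCode() does, so a formula change happens in one place.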

3. Save ↔ load must stay symmetric

saveGame() writes a hand-curated set of fields to localStorage. loadGame() should restore exactly those fields, no more and no less. The old loadGame() used Object.assign(G, data), which copied whatever was in the JSON, including keys the game never wrote. That was simultaneously an injection surface (a malicious save file could set arbitrary keys on G) and a silent drift trap (fields added to G but forgotten in saveGame() would reset on every reload).

The rule:

The list of save fields must be defined exactly once, as a top-level const SAVE_FIELDS array. Both saveGame() and loadGame() read that array. Loading uses a whitelisted copy, not Object.assign.

When you add a new field to G that represents persistent player state, add it to SAVE_FIELDS. If you're unsure whether a field should persist, ask: "if the player refreshes the page, do they expect this to be the same?"

The smoke test includes a save → load round-trip check (scripts/smoke.mjs section 7). Extend the fields it sets/verifies whenever you add new persistent state.
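The symmetric save/load pattern could look roughly like the following. This is a sketch under stated assumptions: the field names in SAVE_FIELDS, the "save" storage key, and the in-memory storage object (standing in for localStorage so the example runs outside a browser) are all hypothetical.

```javascript
// Hypothetical field list; the real SAVE_FIELDS would name the game's persistent fields.
const SAVE_FIELDS = ["code", "phase", "codeBoost"];

// In-memory stand-in for localStorage so the sketch runs outside a browser.
const storage = {
  data: new Map(),
  setItem(key, value) { this.data.set(key, String(value)); },
  getItem(key) { return this.data.has(key) ? this.data.get(key) : null; },
};

function saveGame(G) {
  const snapshot = {};
  for (const field of SAVE_FIELDS) snapshot[field] = G[field];
  storage.setItem("save", JSON.stringify(snapshot));
}

function loadGame(G) {
  const raw = storage.getItem("save");
  if (raw === null) return false;
  const data = JSON.parse(raw);
  // Whitelisted copy: only fields listed in SAVE_FIELDS are restored.
  for (const field of SAVE_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(data, field)) G[field] = data[field];
  }
  return true;
}

// Round trip: saved fields come back, unknown keys in the JSON are ignored.
const before = { code: 42, phase: 2, codeBoost: 1.5, codeRate: 9 };
saveGame(before);
const tampered = { ...JSON.parse(storage.getItem("save")), isAdmin: true };
storage.setItem("save", JSON.stringify(tampered));

const after = { code: 0, phase: 1, codeBoost: 1, codeRate: 0 };
loadGame(after);
console.log(after); // { code: 42, phase: 2, codeBoost: 1.5, codeRate: 0 }
```

Because both functions iterate the same array, a field cannot be saved without being loaded or vice versa, and the extra isAdmin key in the tampered save never reaches the game state.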

4. `applyFn` is called once per `updateRates()`, not once per event

G.activeDebuffs[].applyFn is invoked at the end of updateRates(). That function runs ~10 times per second. If you want a debuff to apply a rate reduction (subtract from/multiply down a per-tick rate field) the math works out — because the field was just reset. If you want it to prevent progression, set a flag (G.flags.*) and check that flag in the places that generate progression, rather than continuously rewriting state.

See rule 1. These are the same rule viewed from two angles.
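The flag-based approach for blocking progression might be sketched as follows. Every name here (researchBlocked, generateKnowledge, the resolve callback, and the debuff shape) is hypothetical and chosen only for illustration.

```javascript
// Hypothetical names throughout: researchBlocked, generateKnowledge, and the debuff shape.
const game = { knowledge: 0, flags: {}, activeDebuffs: [] };

const blockade = {
  // Safe to run every tick: assigning true repeatedly has no cumulative effect.
  applyFn: () => { game.flags.researchBlocked = true; },
  // Runs once, when the player resolves the event.
  resolve: () => { game.flags.researchBlocked = false; },
};

// The code that generates progression checks the flag.
function generateKnowledge(amount) {
  if (game.flags.researchBlocked) return;
  game.knowledge += amount;
}

game.activeDebuffs.push(blockade);
for (let tick = 0; tick < 10; tick++) {
  for (const debuff of game.activeDebuffs) debuff.applyFn();
  generateKnowledge(1);
}
const whileBlocked = game.knowledge;
console.log(whileBlocked); // 0 while the debuff is active

blockade.resolve();
game.activeDebuffs.length = 0;
generateKnowledge(1);
console.log(game.knowledge); // 1 after resolution
```

Setting a boolean is idempotent, so it does not matter how many times per second applyFn runs.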

5. Event `resolveCost` should live on the event definition, not be duplicated

Currently, each entry in the EVENTS array declares resolveCost twice — once as a property of the event object, and again inside the object pushed onto G.activeDebuffs. Keep them in sync until someone refactors this into a single source. When adding a new event, copy the resolveCost literally between the two sites. The smoke test does not yet catch drift here; a follow-up PR should pull the debuff object construction out into a helper.
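One possible shape for the helper mentioned above is sketched here. It is an assumption about a future refactor, not existing code: makeDebuff, the event's id field, and the structure and values of resolveCost are all hypothetical.

```javascript
// Hypothetical event definition; the real EVENTS entries have more fields.
const EVENTS = [
  {
    id: "community_drama",
    resolveCost: { resource: "trust", amount: 15 }, // illustrative shape and values
    applyFn: (G) => { G.codeRate *= 0.7; },
  },
];

// Builds the object pushed onto G.activeDebuffs from the event definition,
// so resolveCost is declared in exactly one place.
function makeDebuff(eventDef) {
  return {
    id: eventDef.id,
    resolveCost: eventDef.resolveCost,
    applyFn: eventDef.applyFn,
  };
}

const gameState = { codeRate: 10, activeDebuffs: [] };
gameState.activeDebuffs.push(makeDebuff(EVENTS[0]));
console.log(gameState.activeDebuffs[0].resolveCost === EVENTS[0].resolveCost); // true
```

Because the debuff reuses the definition's resolveCost rather than copying a literal, the two sites cannot drift apart.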

6. Don't trust `G.flags` to exist implicitly

G.flags is initialized as {} at the top of the file. Do not reassign G.flags to a new object anywhere else: other code assumes it is a live object reference and reads sub-fields like G.flags.creativity directly. New sub-flags go inside G.flags. If a new top-level flag is unavoidable, use the somethingFlag naming convention (see deployFlag, pactFlag, etc.), though that pattern is being consolidated into G.flags over time.
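A short illustration of why the up-front initialization matters. The two state objects are stand-ins for G, not the game's real globals.

```javascript
// With flags initialized up front, an unset sub-flag reads as undefined (falsy).
const initialized = { flags: {} };
console.log(Boolean(initialized.flags.creativity)); // false

// Without the initialization, the same read throws a TypeError.
const lazy = {};
let threw = false;
try {
  void lazy.flags.creativity;
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true

// One-shot writers add sub-flags to the existing object rather than replacing it.
initialized.flags.creativity = true;
```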

7. Copyright, assets, secrets

No third-party assets without a license note.
No API keys, tokens, or credentials in the repo. The smoke workflow scans for sk-ant-, sk-or-, ghp_, AKIA literal prefixes.
Educational blurbs (edu: strings in BDEF/PDEFS) are authored content — don't generate new ones from other people's copyrighted material.
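A prefix scan like the one described in the second item could be approximated as below. This is a sketch only: how the workflow actually performs the scan is not shown in this file, findSecretPrefixes is a hypothetical name, and the key in the sample is a made-up placeholder.

```javascript
// Literal prefixes listed in the rule above.
const SECRET_PREFIXES = ["sk-ant-", "sk-or-", "ghp_", "AKIA"];

// Returns { line, prefix } for every line containing one of the prefixes.
function findSecretPrefixes(text) {
  const hits = [];
  text.split("\n").forEach((line, index) => {
    for (const prefix of SECRET_PREFIXES) {
      if (line.includes(prefix)) hits.push({ line: index + 1, prefix });
    }
  });
  return hits;
}

// Hypothetical input; the key below is a placeholder, not a real credential.
const sampleSource = 'const name = "beacon";\nconst key = "ghp_EXAMPLEONLY";';
console.log(findSecretPrefixes(sampleSource)); // [ { line: 2, prefix: 'ghp_' } ]
```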

Running the guardrails locally

node scripts/smoke.mjs          # headless smoke test
grep -nE '[A-Za-z]+Boost\s*\*=' game.js   # spot persistent-multiplier mutations
node -c game.js                 # syntax check

The CI job .gitea/workflows/guardrails.yml runs all three on every PR. A failure blocks merge; see the job log for exactly which invariant broke.

How to add a new guardrail

Write a test in scripts/smoke.mjs that fails on the bug you just found, so it would have caught the bug before it landed.
Fix the bug.
Confirm the test now passes.
Add a short rule to this file explaining why the invariant exists (usually a war story helps).
If the bug is detectable by grep, add a check to .gitea/workflows/guardrails.yml so it fails fast on PRs.

The smoke test's job isn't to be exhaustive — it's to encode the specific class of bugs that have actually hit production, so we never see the same one twice.
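As an illustration of encoding an invariant as a regression check, here is a minimal sketch. It does not reproduce scripts/smoke.mjs: the check helper, the toy state object, and the simplified updateRates() are hypothetical stand-ins for the real engine.

```javascript
// Minimal assertion helper; the real smoke test's helper may differ.
const failures = [];
function check(label, condition) {
  if (!condition) failures.push(label);
}

// Toy engine standing in for game.js (hypothetical, simplified).
const engine = { codeBoost: 1, codeRate: 0, buildings: { autocoder: 4 } };
function updateRates() {
  engine.codeRate = 0;
  engine.codeRate += engine.buildings.autocoder * engine.codeBoost;
}

// Invariant: calling updateRates() twice in a row must not change any state.
updateRates();
const first = JSON.stringify(engine);
updateRates();
const second = JSON.stringify(engine);
check("updateRates is idempotent", first === second);

// Invariant: persistent multipliers survive many ticks unchanged.
for (let tick = 0; tick < 100; tick++) updateRates();
check("codeBoost is stable across ticks", engine.codeBoost === 1);

console.log(failures.length === 0 ? "ok" : failures.join("\n")); // prints "ok"
```

A compounding bug like the original community_drama debuff would fail both checks, since the serialized state would differ between consecutive calls.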
