Adds a three-layer defense against the class of silent correctness bugs surfaced in #54. The layers are: documented rules (AGENTS.md), dynamic assertions (scripts/smoke.mjs), and static checks (scripts/guardrails.sh). A new Gitea Actions workflow runs the dynamic + static layers on every PR.
Also fixes two of the bugs the smoke test immediately caught on main:
1. G.flags is now initialized to {} in the globals block. Previously it was created lazily by the p_creativity project effect, which forced every reader to write `G.flags && G.flags.x` and left a window where a new writer could drop the defensive guard and crash.
2. The community_drama debuff no longer mutates G.codeBoost. Its applyFn was invoked from updateRates() on every tick (10 Hz), so the original `G.codeBoost *= 0.7` compounded: after ~100 ticks of the drama debuff, codeBoost was ~3e-16 instead of the intended 0.7. The fix targets G.codeRate instead, which is reset at the top of updateRates() and is therefore safe to multiplicatively reduce inside applyFn. AGENTS.md rule 1 explains the distinction between persistent multipliers and per-tick rate fields so future debuffs don't reintroduce the bug.
The smoke test (`scripts/smoke.mjs`) runs game.js in a vm sandbox with a minimal DOM stub and no npm deps. It boots the engine, runs ticks, clicks, buys a building, fires every debuff, checks codeBoost stability, checks updateRates idempotency, and does a save/load round-trip. 30 assertions, ~0.1s on a dev machine.
The static guardrails (`scripts/guardrails.sh`) grep for the patterns AGENTS.md forbids. Two rules (click power single-source, no Object.assign in loadGame) are marked PENDING because PR #55 is landing the fix for them; the workflow reports them but doesn't fail until #55 merges.
Refs: #54
AGENTS.md — guardrails for agents contributing to The Beacon
This file documents the non-obvious rules that any contributor (human or AI) should know before touching game.js. It is enforced in two layers: a headless smoke test (scripts/smoke.mjs) and static grep-based checks (.gitea/workflows/guardrails.yml). Both run on every PR.
The Beacon is a single-file, browser-only idle game. The codebase is small, the mechanics are interlocked, and the most common failure mode is quietly wrong math — the game keeps rendering, the player just slowly loses. These rules exist to make that class of bug impossible to land.
1. The persistent-multiplier rule (*Boost)
G.codeBoost, G.computeBoost, G.knowledgeBoost, G.userBoost, G.impactBoost are persistent multipliers. They are set once (by projects, sprints, alignment events) and read on every tick inside updateRates(). They are never reset between ticks.
G.codeRate, G.computeRate, etc. are per-tick rate fields. They are reset to 0 at the top of updateRates() and rebuilt from scratch every tick.
The bug class this created: community_drama's original applyFn did G.codeBoost *= 0.7. Because applyFn is invoked from updateRates() every 100 ms, codeBoost decayed to ~3e-16 after about 100 ticks (roughly 10 seconds of play). The rendering was fine and the click button worked; the player just saw their code rate silently vanish.
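The compounding is easy to reproduce in isolation. The sketch below is not taken from game.js; it only replays the arithmetic, assuming the multiplier starts at 1 and the debuff fires once per tick. The helper name decayedBoost is made up for illustration.

```javascript
// Stand-alone arithmetic sketch (not game.js): replay the buggy debuff.
// Assumes codeBoost starts at 1 and applyFn fires once per tick.
function decayedBoost(ticks, factor = 0.7, start = 1) {
  let boost = start;
  for (let i = 0; i < ticks; i++) {
    boost *= factor; // the bug: a persistent multiplier shrinks on every tick
  }
  return boost;
}

console.log(decayedBoost(1));   // 0.7, the intended one-time reduction
console.log(decayedBoost(10));  // about 0.028 after 10 ticks
console.log(decayedBoost(100)); // about 3.2e-16 after 100 ticks
```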
The rule:
Inside any function that runs more than once per persistent state change (specifically updateRates(), tick(), applyFn, or anything invoked from them), never mutate G.codeBoost, G.computeBoost, G.knowledgeBoost, G.userBoost, or G.impactBoost.
If you want a debuff to reduce code production, mutate G.codeRate inside the debuff's applyFn. G.codeRate is zeroed at the top of updateRates(), so *= 0.7 applies exactly once per tick, which is what you want.
*Boost fields should only be written by one-shot events: project effect() callbacks, sprint start/end, alignment resolutions. Those run zero or one times per player action, not per tick.
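As a rough illustration of the rule, here is a minimal stand-in for the engine. The base production constant and the shape of the debuff object are assumptions; only the field names G.codeBoost, G.codeRate, G.activeDebuffs, and applyFn come from this document.

```javascript
// Minimal stand-in for the engine; the rate formula and constants are assumed.
const G = { codeBoost: 1, codeRate: 0, activeDebuffs: [] };
const BASE_CODE_PER_SEC = 10; // hypothetical production before multipliers

function updateRates() {
  G.codeRate = 0;                                // per-tick field: rebuilt from scratch
  G.codeRate += BASE_CODE_PER_SEC * G.codeBoost; // persistent multiplier is only read
  for (const debuff of G.activeDebuffs) {
    debuff.applyFn();                            // runs on every tick
  }
}

// Safe: scales the freshly rebuilt rate, so the reduction never compounds.
G.activeDebuffs.push({
  id: 'community_drama',
  applyFn: () => { G.codeRate *= 0.7; },
});

for (let i = 0; i < 600; i++) updateRates();     // 600 ticks, one minute at 10 Hz

console.log(G.codeBoost); // 1: the persistent multiplier is untouched
console.log(G.codeRate);  // ~7: 70% of the base rate, however many ticks ran
```

Swapping the applyFn body for `G.codeBoost *= 0.7` in this sketch drives both values toward zero within a few hundred iterations, which is the failure the rule is meant to prevent.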
The guardrails workflow runs grep -nE '(codeBoost|computeBoost|knowledgeBoost|userBoost|impactBoost)\s*[*/+-]?=' game.js and fails the job if any hit falls inside an applyFn block or inside the updateRates()/tick() bodies.
2. Click power has exactly one source of truth
The click-power formula (1 + floor(autocoder * 0.5) + max(0, phase - 1) * 2) * codeBoost used to live in three places: writeCode(), autoType(), and the Swarm Protocol branch of updateRates(). Changing one without the others was trivially easy and hard to notice.
The rule:
Click power is computed only by getClickPower(). Any function that needs "how much code does one click generate right now" must call getClickPower() directly. Do not inline the formula.
The guardrails check greps for Math.floor(G.buildings.autocoder * 0.5) and fails if it appears outside getClickPower().
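A minimal sketch of what the single source of truth could look like, built directly from the formula quoted above. The G.phase field name and the sample state values are assumptions for illustration; the real getClickPower() may differ in detail.

```javascript
// Sketch of a single source of truth for click power.
// G.phase and the sample state values are assumptions for illustration.
const G = { buildings: { autocoder: 4 }, phase: 3, codeBoost: 1.5 };

function getClickPower() {
  const base = 1;
  const autocoderBonus = Math.floor(G.buildings.autocoder * 0.5);
  const phaseBonus = Math.max(0, G.phase - 1) * 2;
  return (base + autocoderBonus + phaseBonus) * G.codeBoost;
}

// Callers ask for the value instead of re-deriving it.
function writeCode() {
  return getClickPower();
}

console.log(getClickPower()); // (1 + 2 + 4) * 1.5 = 10.5
```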
3. Save ↔ load must stay symmetric
saveGame() writes a hand-curated set of fields to localStorage. loadGame() should restore exactly those fields, no more and no less. The old loadGame() used Object.assign(G, data), which copied whatever was in the JSON, including keys the game never wrote. That was simultaneously an injection surface (a malicious save file could set arbitrary keys on G) and a silent drift trap (fields added to G but forgotten in saveGame would reset every reload).
The rule:
The list of save fields must be defined exactly once, as a top-level const SAVE_FIELDS array. Both saveGame() and loadGame() read that array. Loading uses a whitelisted copy, not Object.assign.
When you add a new field to G that represents persistent player state, add it to SAVE_FIELDS. If you're unsure whether a field should persist, ask: "if the player refreshes the page, do they expect this to be the same?"
The smoke test includes a save → load round-trip check (scripts/smoke.mjs section 7). Extend the fields it sets/verifies whenever you add new persistent state.
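The whitelisted copy can be sketched roughly as follows. The entries in SAVE_FIELDS, the storage key, and the in-memory storage stub are all hypothetical; the stub stands in for localStorage so the round trip also runs under Node.

```javascript
// Sketch only: the SAVE_FIELDS entries and the storage key are hypothetical.
const SAVE_FIELDS = ['code', 'phase', 'codeBoost'];
const SAVE_KEY = 'beacon-save';

const G = { code: 0, phase: 1, codeBoost: 1, codeRate: 0 };

function saveGame(storage) {
  const data = {};
  for (const key of SAVE_FIELDS) data[key] = G[key];
  storage.setItem(SAVE_KEY, JSON.stringify(data));
}

function loadGame(storage) {
  const raw = storage.getItem(SAVE_KEY);
  if (raw === null) return false;
  const data = JSON.parse(raw);
  for (const key of SAVE_FIELDS) {
    // Copy only whitelisted keys that are actually present in the save.
    if (Object.prototype.hasOwnProperty.call(data, key)) G[key] = data[key];
  }
  return true;
}

// In-memory stand-in for localStorage.
const memoryStorage = {
  store: new Map(),
  getItem(k) { return this.store.has(k) ? this.store.get(k) : null; },
  setItem(k, v) { this.store.set(k, String(v)); },
};

G.code = 1234;
saveGame(memoryStorage);
G.code = 0;

// Simulate a tampered save that carries a key the game never wrote.
const tampered = { ...JSON.parse(memoryStorage.getItem(SAVE_KEY)), evil: true };
memoryStorage.setItem(SAVE_KEY, JSON.stringify(tampered));

loadGame(memoryStorage);
console.log(G.code, 'evil' in G); // 1234 false
```

Because both functions iterate the same array, a field that is added to SAVE_FIELDS is saved and restored together, and a key outside the array is ignored on load.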
4. applyFn is called once per updateRates(), not once per event
G.activeDebuffs[].applyFn is invoked at the end of updateRates(). That function runs ~10 times per second. If you want a debuff to apply a rate reduction (subtract from/multiply down a per-tick rate field) the math works out — because the field was just reset. If you want it to prevent progression, set a flag (G.flags.*) and check that flag in the places that generate progression, rather than continuously rewriting state.
See rule 1. These are the same rule viewed from two angles.
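For the "prevent progression" case, a flag-based debuff might look like the sketch below. The flag name deployFrozen, the users field, and the three functions are hypothetical and exist only to show where the flag is set and where it is checked.

```javascript
// Sketch: block progression with a flag instead of rewriting state every tick.
// deployFrozen, users, and the functions below are hypothetical names.
const G = { flags: {}, users: 0, activeDebuffs: [] };

function addDeployFreeze() {
  G.flags.deployFrozen = true;            // set once, when the event fires
  G.activeDebuffs.push({
    id: 'deploy_freeze',
    applyFn: () => {},                    // nothing to do per tick
  });
}

function resolveDeployFreeze() {
  G.flags.deployFrozen = false;
  G.activeDebuffs = G.activeDebuffs.filter((d) => d.id !== 'deploy_freeze');
}

function tryDeploy() {
  if (G.flags.deployFrozen) return false; // gate at the point of progression
  G.users += 1;
  return true;
}
```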
5. Event resolveCost should live on the event definition, not be duplicated
Currently, each entry in the EVENTS array declares resolveCost twice — once as a property of the event object, and again inside the object pushed onto G.activeDebuffs. Keep them in sync until someone refactors this into a single source. When adding a new event, copy the resolveCost literally between the two sites. The smoke test does not yet catch drift here; a follow-up PR should pull the debuff object construction out into a helper.
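One possible shape for that helper is sketched below. Everything except the names EVENTS, resolveCost, applyFn, and community_drama is an assumption, including the structure of the cost object and the helper name makeDebuff.

```javascript
// Sketch of the proposed helper; fields other than resolveCost are assumed.
const EVENTS = [
  {
    id: 'community_drama',
    resolveCost: { resource: 'trust', amount: 15 }, // hypothetical cost shape
    applyFn: (G) => { G.codeRate *= 0.7; },
  },
];

function makeDebuff(eventDef) {
  return {
    id: eventDef.id,
    applyFn: eventDef.applyFn,
    // Read the cost from the definition rather than restating the literal.
    resolveCost: eventDef.resolveCost,
  };
}

const debuff = makeDebuff(EVENTS[0]);
console.log(debuff.resolveCost === EVENTS[0].resolveCost); // true
```

With construction centralized like this, the second copy of the literal disappears, so there is nothing left to drift.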
6. Don't trust G.flags to exist implicitly
G.flags is initialized as {} at the top of the file. Do not replace it with a reference somewhere else — other code assumes G.flags is a live object reference and reads sub-fields like G.flags.creativity directly. New sub-flags go inside G.flags; new top-level flags should use the somethingFlag naming convention (see deployFlag, pactFlag, etc.) but that pattern is being consolidated into G.flags over time.
7. Copyright, assets, secrets
- No third-party assets without a license note.
- No API keys, tokens, or credentials in the repo. The smoke workflow scans for the literal prefixes sk-ant-, sk-or-, ghp_, and AKIA.
- Educational blurbs (edu: strings in BDEF/PDEFS) are authored content; don't generate new ones from other people's copyrighted material.
Running the guardrails locally
node scripts/smoke.mjs # headless smoke test
grep -nE 'Boost\s*\*=' game.js     # spot persistent-multiplier mutations
node -c game.js # syntax check
The CI job .gitea/workflows/guardrails.yml runs all three on every PR. A failure blocks merge; see the job log for exactly which invariant broke.
How to add a new guardrail
- Write a test in scripts/smoke.mjs that fails on the bug you just found, i.e. one that would have caught it before it landed.
- Fix the bug.
- Confirm the test now passes.
- Add a short rule to this file explaining why the invariant exists (usually a war story helps).
- If the bug is detectable by grep, add a check to .gitea/workflows/guardrails.yml so it fails fast on PRs.
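The first step might take a shape like the sketch below. The check helper and the stand-in engine are hypothetical and are not the contents of scripts/smoke.mjs; the point is only that each invariant becomes a named assertion and any failure produces a non-zero exit code.

```javascript
// Sketch of a regression check; `check` and the stand-in engine are hypothetical.
const failures = [];
function check(name, condition) {
  if (!condition) failures.push(name);
}

// Stand-in engine with the invariant under test: updateRates() is idempotent.
const G = { codeBoost: 2, codeRate: 0 };
function updateRates() {
  G.codeRate = 0;
  G.codeRate += 5 * G.codeBoost;
}

updateRates();
const first = G.codeRate;
updateRates();
check('updateRates is idempotent', G.codeRate === first);
check('codeBoost is not mutated by ticking', G.codeBoost === 2);

if (failures.length > 0) {
  console.error('FAILED: ' + failures.join(', '));
  process.exitCode = 1; // a non-zero exit is what lets CI block the merge
}
```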
The smoke test's job isn't to be exhaustive — it's to encode the specific class of bugs that have actually hit production, so we never see the same one twice.