timmy-home/experiments/README.md
Hermes Agent 56763740d0
feat(MATH-003): add reproducible computation lane — experiments/
Create a repo-local structure for deterministic computational math
experiments with provenance manifests, result hashing, and clear
limitations on what computation can prove vs. suggest.

Additions:
- experiments/README.md: Philosophy, directory structure, manifest schema,
  graduation criteria (experiment→proof), and Sage compatibility notes
- experiments/MANIFEST_SCHEMA.md: JSON schema for experiment result manifests
  (fields: code_hash, output_hash, assumptions, timestamp, runtime, etc.)
- experiments/template.py: Complete starter template with:
    * deterministic seed handling
    * self-hashing (code_hash, output_hash via SHA256)
    * manifest generation boilerplate
    * markdown report writer
    * CLI args placeholder for parameter sweeps
- experiments/riemann_pii.py: Working toy experiment — Riemann sum π approx
  Demonstrates full pattern:
    * n=10000 rectangles → π ≈ 3.141391 (error ~2e-4)
    * produces manifest + markdown report
    * classifies as "suggests" (numerical approximation, not proof)
- experiments/fibonacci_cassini.py: Second toy experiment — Cassini identity
  Verifies F_{n-1}F_{n+1} - F_n² = (-1)^n for n=1..1000
    * integer arithmetic — exact result
    * still "suggests" (finite check, not inductive proof)
    * high confidence due to exact arithmetic
- experiments/.gitignore: ignores manifests/, results/, and *.manifest.json
  so generated artifacts are never committed

Verification:
$ python experiments/riemann_pii.py
  → writes experiments/manifests/riemann_pii.manifest.json + report.md
$ python experiments/fibonacci_cassini.py
  → writes experiments/manifests/fibonacci_cassini.manifest.json + report.md

Both runs are reproducible: code_hash and output_hash match across runs.

Design rationale:
- Minimal dependencies: pure Python 3.10+ (no Sage yet; optional in the future)
- Self-contained: each experiment is a single script
- Deterministic: no randomness unless a fixed seed is set
- Hash-based integrity: code_hash verifies exact source; output_hash verifies result
- Manifest captures all required metadata per acceptance criteria
- README explicitly explains computational limits: numerical computation
  cannot replace analytical proof; results are bounded by assumptions and precision

Closes #879
2026-04-26 14:25:39 -04:00


Reproducible Computation Lane — MATH-003

This directory houses deterministic computational experiments. Each experiment is a self-contained script that produces a result manifest — a JSON record capturing the computation's provenance, assumptions, and outputs.

Philosophy

Computation can suggest or prove mathematical claims, but only within stated bounds. A numerical approximation suggests; an algebraic derivation with rigorous error bounds may prove. The manifest records which regime the computation lives in.

Directory Structure

experiments/
├── README.md              ← this file
├── template.py            ← starter template for new experiments
├── riemann_pii.py         ← working example: π via Riemann sum
├── fibonacci_cassini.py   ← working example: Cassini identity check
├── MANIFEST_SCHEMA.md     ← complete JSON schema for manifests
├── .gitignore             ← ignores generated outputs
├── manifests/             ← auto-created: JSON manifests per run
└── results/               ← auto-created: markdown reports, plots

Running an Experiment

python experiments/riemann_pii.py

The script writes two artifacts:

  • manifests/riemann_pii.manifest.json — complete provenance record
  • results/riemann_pii.report.md — human-readable computation report

Manifest Fields

| Field | Meaning |
|---|---|
| experiment_name | Script name (without extension) |
| problem_source | URL or citation where the problem originates |
| assumptions | List of constraints that bound the computation's validity |
| code_hash | sha256: hash of the source script bytes |
| output_hash | sha256: hash of the primary numeric/structured result |
| timestamp_utc | ISO 8601 UTC timestamp when computation completed |
| runtime_seconds | Wall-clock execution time |
| result | The actual computed value(s); must be JSON-serializable |
| suggests_or_proves | "suggests" or "proves"; declarative regime |
| confidence | "high", "medium", "low"; based on error bounds, convergence |
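
As a rough illustration of how the fields above could be filled in, here is a minimal sketch. It is not the code in template.py: the helper names (sha256_bytes, hash_result, build_manifest), the "sha256:" prefix format, and the canonical-JSON encoding used for output_hash are all assumptions made for this example.

```python
import hashlib
import json
import time
from datetime import datetime, timezone
from pathlib import Path


def sha256_bytes(data: bytes) -> str:
    """Hex digest with a 'sha256:' prefix (the prefix format is an assumption)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def hash_result(result) -> str:
    """Hash a JSON-serializable result through a canonical encoding.

    Sorted keys and fixed separators keep the hash independent of dict
    insertion order and whitespace (an assumed convention, not the schema's).
    """
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return sha256_bytes(canonical.encode("utf-8"))


def build_manifest(script_path, result, assumptions, started, *,
                   problem_source="", regime="suggests", confidence="medium"):
    """Assemble a manifest dict; `started` is a time.perf_counter() reading."""
    script = Path(script_path)
    return {
        "experiment_name": script.stem,
        "problem_source": problem_source,
        "assumptions": list(assumptions),
        "code_hash": sha256_bytes(script.read_bytes()),
        "output_hash": hash_result(result),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "runtime_seconds": time.perf_counter() - started,
        "result": result,
        "suggests_or_proves": regime,
        "confidence": confidence,
    }
```

A script would typically record `started = time.perf_counter()` before computing, then call `build_manifest(__file__, result, assumptions, started)` and write the returned dict with `json.dump`. Hashing a canonical encoding rather than raw printed output is one way to keep output_hash stable across runs; other conventions are possible.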

Graduation: Experiment → Proof/Review Packet

An experiment result graduates from "suggests" to "proves" when:

  1. Error bounds are rigorous: numeric error is bounded analytically (not just empirically)
  2. Code is audited: independent review verifies algorithm correctness
  3. Replication: a rerun with the same code hash and inputs reproduces the same output hash
  4. Peer review: result is submitted as part of a review packet (see rcas/)

Until then, treat computational results as suggestive evidence within the stated assumptions — not as final mathematical truth.
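
The replication criterion can be sketched as a comparison of two manifests. This is a hypothetical helper, not part of the repository. The manifest has no dedicated inputs field, so the sketch treats the assumptions list as a stand-in for inputs; that substitution is an assumption of this example.

```python
import json
from pathlib import Path


def load_manifest(path) -> dict:
    """Read a manifest JSON file into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def check_replication(original: dict, rerun: dict) -> str:
    """Classify a rerun against the original manifest.

    Returns one of:
      'different-code'        -> code_hash differs, so the runs are not comparable
      'different-assumptions' -> same code, but the stated assumptions changed
      'replicated'            -> same code and assumptions, same output_hash
      'mismatch'              -> same code and assumptions, different output_hash
    """
    if original["code_hash"] != rerun["code_hash"]:
        return "different-code"
    if original["assumptions"] != rerun["assumptions"]:
        return "different-assumptions"
    if original["output_hash"] == rerun["output_hash"]:
        return "replicated"
    return "mismatch"
```

A 'mismatch' verdict is the interesting case: identical code and assumptions producing a different output hash points to hidden nondeterminism or an environment difference. Note that timestamp_utc and runtime_seconds are deliberately left out of the comparison, since they vary between runs by design.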

Limitations

  • Numerical results are bounded by floating-point precision (relative precision of about 1e-16 for double, with rounding errors that can accumulate)
  • Convergence-based methods rely on empirical error checks; the absence of observed error is not a proof
  • Random or Monte Carlo methods are non-deterministic unless a fixed seed is set
  • Symbolic computation (e.g., Sage) may introduce its own assumptions about algebraic closures
  • A computation proves only what its assumptions allow — do not extrapolate
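
The points above can be made concrete with two small checks. This is a sketch under stated assumptions, not the code in the repository's example scripts: the integrand 4*sqrt(1 - x^2), the right-endpoint rule, and both function names are choices made here for illustration.

```python
import math


def riemann_pi(n: int) -> float:
    """Right-endpoint Riemann sum of 4*sqrt(1 - x^2) on [0, 1].

    The integrand and endpoint rule are assumptions for this sketch.
    The result is a floating-point approximation: it can suggest, not prove.
    """
    total = math.fsum(4.0 * math.sqrt(1.0 - (i / n) ** 2) for i in range(1, n + 1))
    return total / n


def cassini_holds_up_to(n_max: int) -> bool:
    """Check F_{n-1} * F_{n+1} - F_n^2 == (-1)^n for n = 1..n_max.

    Python integers are arbitrary precision, so every check is exact,
    but the loop still covers only finitely many n.
    """
    prev, curr = 0, 1  # F_0, F_1
    for n in range(1, n_max + 1):
        nxt = prev + curr  # F_{n+1}
        if prev * nxt - curr * curr != (-1) ** n:
            return False
        prev, curr = curr, nxt
    return True


# Floating-point rounding in a single line: this comparison is False.
rounding_is_visible = (0.1 + 0.2 == 0.3)

if __name__ == "__main__":
    approx = riemann_pi(10_000)
    print(f"riemann_pi(10000) = {approx:.6f}, |error| = {abs(approx - math.pi):.1e}")
    print("Cassini holds for n = 1..1000:", cassini_holds_up_to(1000))
```

The first function is limited by both discretization error and floating-point rounding. The second is exact at every step, yet a True result for n = 1..1000 says nothing about n = 1001; only an inductive argument covers all n.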

Template

Copy template.py to start a new experiment. It includes:

  • Deterministic seed setting
  • Self-hashing (code_hash)
  • Manifest generation boilerplate
  • Markdown report writer
  • CLI argument handling for parameter sweeps
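
For orientation, the seed and CLI pieces of such a starter might look like the sketch below. It is a hypothetical skeleton, not the contents of template.py: the flag names --n and --seed, the function names, and the Monte Carlo workload are all assumptions chosen only to exercise the seed.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse CLI parameters; flag names here are hypothetical."""
    parser = argparse.ArgumentParser(description="Hypothetical experiment skeleton")
    parser.add_argument("--n", type=int, default=10_000, help="number of samples")
    parser.add_argument("--seed", type=int, default=0, help="fixed seed for randomness")
    return parser.parse_args(argv)


def run(n: int, seed: int) -> dict:
    """Monte Carlo estimate of pi with a seeded, local generator."""
    rng = random.Random(seed)  # local generator avoids hidden global state
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return {"n": n, "seed": seed, "estimate": 4.0 * inside / n}


def main(argv=None) -> dict:
    args = parse_args(argv)
    return run(args.n, args.seed)
```

Calling `main(["--n", "1000", "--seed", "7"])` twice returns the same dict, which is the property the manifest's output_hash relies on. A parameter sweep can then be a plain loop such as `[run(n, seed=0) for n in (1_000, 10_000, 100_000)]`.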

Sage Compatibility

Experiments may optionally depend on SageMath. If problem_source mentions Sage, or the assumptions reference symbolic manipulation, note the Sage version in assumptions.

Currently: Python-only (no external math dependencies).