Commit Graph

3635 Commits

Author SHA1 Message Date
Alexander Whitestone
5989600d80 feat: time-aware model routing for cron jobs (#317)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 1m1s
Empirical audit: cron error rate peaks at 18:00 (9.4%) vs 4.0% at 09:00.
During configured high-error windows, automatically route cron jobs to
more capable models when the user is not present to correct errors.

- agent/smart_model_routing.py: resolve_cron_model() + _hour_in_window()
- cron/scheduler.py: wired into run_job() after base model resolution
- tests/test_cron_model_routing.py: 16 tests

Config:
  cron_model_routing:
    enabled: true
    fallback_model: "anthropic/claude-sonnet-4"
    fallback_provider: "openrouter"
    windows:
      - {start_hour: 17, end_hour: 22, reason: evening_error_peak}
      - {start_hour: 2, end_hour: 5, reason: overnight_api_instability}

Features: midnight-wrap, per-window overrides, first-match-wins,
graceful degradation on malformed config.
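
A minimal sketch of how the window matching described above could work. The function signatures, the half-open [start_hour, end_hour) interpretation, and the per-window `fallback_model` key are assumptions for illustration; the commit names resolve_cron_model() and _hour_in_window() but does not show their bodies.

```python
from typing import Optional


def hour_in_window(hour: int, start_hour: int, end_hour: int) -> bool:
    """True if hour lies in [start_hour, end_hour), wrapping past midnight."""
    if start_hour == end_hour:
        return False  # assumption: an empty window never matches
    if start_hour < end_hour:
        return start_hour <= hour < end_hour
    # Window wraps midnight, e.g. 22 -> 3 covers 22, 23, 0, 1, 2.
    return hour >= start_hour or hour < end_hour


def resolve_cron_model(base_model: str, hour: int, routing: Optional[dict]) -> str:
    """Return the fallback model inside a configured window, else base_model."""
    if not isinstance(routing, dict) or not routing.get("enabled"):
        return base_model
    windows = routing.get("windows")
    if not isinstance(windows, list):
        return base_model  # malformed config: degrade to the base model
    for window in windows:
        try:
            start, end = int(window["start_hour"]), int(window["end_hour"])
        except (KeyError, TypeError, ValueError):
            continue  # skip a malformed window rather than fail the job
        if hour_in_window(hour, start, end):
            # First match wins; a window may carry its own override (hypothetical key).
            return window.get("fallback_model") or routing.get("fallback_model") or base_model
    return base_model
```

With the config above, a job firing at 18:00 would resolve to the fallback model and one firing at 09:00 would keep its base model.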

Closes #317
2026-04-13 20:19:37 -04:00
1ec02cf061 Merge pull request 'fix(gateway): reject known-weak placeholder tokens at startup' (#371) from fix/weak-credential-guard into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 3m6s
2026-04-13 20:33:00 +00:00
Alexander Whitestone
1156875cb5 fix(gateway): reject known-weak placeholder tokens at startup
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 3m8s
Fixes #318

Cherry-picked concept from ferris fork (f724079).

Problem: Users who copy .env.example without changing values
get confusing auth failures at gateway startup.

Fix: _guard_weak_credentials() checks TELEGRAM_BOT_TOKEN,
DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN, HASS_TOKEN against
known-weak placeholder patterns (your-token-here, fake, xxx,
etc.) and minimum length requirements. Warns at startup.
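
The check described above might look roughly like the following. The exact pattern list, the 12-character minimum, and the function name are assumptions; the commit only gives a few example placeholders and does not state the length threshold.

```python
import re
from typing import List, Mapping

# Hypothetical patterns and threshold, chosen for illustration only.
WEAK_PATTERNS = [r"your[-_]?token[-_]?here", r"^fake", r"^x{3,}$", r"^changeme$"]
MIN_TOKEN_LENGTH = 12
GUARDED_VARS = ("TELEGRAM_BOT_TOKEN", "DISCORD_BOT_TOKEN", "SLACK_BOT_TOKEN", "HASS_TOKEN")


def find_weak_credentials(env: Mapping[str, str]) -> List[str]:
    """Return one warning per guarded variable that looks like a placeholder."""
    warnings = []
    for name in GUARDED_VARS:
        value = env.get(name, "").strip()
        if not value:
            continue  # unset tokens are outside this guard's scope
        if any(re.search(p, value, re.IGNORECASE) for p in WEAK_PATTERNS):
            warnings.append(f"{name} looks like a placeholder value")
        elif len(value) < MIN_TOKEN_LENGTH:
            warnings.append(f"{name} is shorter than {MIN_TOKEN_LENGTH} characters")
    return warnings
```

At startup the gateway could call this with os.environ and log each returned warning.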

Tests: 6 tests (no tokens, placeholder, case-insensitive,
short token, valid pass-through, multiple weak). All pass.
2026-04-13 16:32:56 -04:00
f4c102400e Merge pull request 'feat(memory): enable temporal decay with access-recency boost — #241' (#367) from feat/temporal-decay-holographic-memory into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 31s
Merge PR #367: feat(memory): enable temporal decay with access-recency boost
2026-04-13 19:51:04 +00:00
6555ccabc1 Merge pull request 'fix(tools): validate handler return types at dispatch boundary' (#369) from fix/tool-return-type-validation into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 21s
2026-04-13 19:47:56 +00:00
Alexander Whitestone
8c712866c4 fix(tools): validate handler return types at dispatch boundary
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 22s
Fixes #297

Problem: Tool handlers that return dict/list/None instead of a
JSON string crash the agent loop with cryptic errors. No error
proofing at the boundary.
Fix: In handle_function_call(), after dispatch returns:
1. If result is not str → wrap in JSON with _type_warning
2. If result is str but not valid JSON → wrap in {"output": ...}
3. Log type violations for analysis
4. Valid JSON strings pass through unchanged
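
The four rules above can be sketched as a single normalizing function. This is an illustration, not the code in handle_function_call(): the function name and the "output" key used when wrapping non-string results are assumptions.

```python
import json
import logging
from typing import Any

logger = logging.getLogger(__name__)


def normalize_tool_result(tool_name: str, result: Any) -> str:
    """Coerce a tool handler's return value into a JSON string."""
    if not isinstance(result, str):
        kind = type(result).__name__
        logger.warning("tool %s returned %s, expected str", tool_name, kind)
        # default=str keeps the wrapper from failing on non-serializable values.
        return json.dumps(
            {"output": result, "_type_warning": f"handler returned {kind}"},
            default=str,
        )
    try:
        json.loads(result)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        logger.warning("tool %s returned a non-JSON string", tool_name)
        return json.dumps({"output": result})
    return result  # valid JSON passes through unchanged
```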

Tests: 4 new tests (dict, None, non-JSON string, valid JSON).
All 16 tests in test_model_tools.py pass.
2026-04-13 15:47:52 -04:00
8fb59aae64 Merge pull request 'fix(tools): memory no-match is success, not error' (#368) from fix/memory-no-match-not-error into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 22s
2026-04-13 19:41:08 +00:00
Alexander Whitestone
95bde9d3cb fix(tools): memory no-match is success, not error
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 24s
Fixes #313

Problem: MemoryStore.replace() and .remove() return
{"success": false, "error": "No entry matched..."} when the
search substring is not found. This is a valid outcome, not
an error. The empirical audit showed 58.4% error rate on the
memory tool, but 98.4% of those were just empty search results.

Fix: Return {"success": true, "result": "no_match", "message": ...}
instead. This drops the memory tool error rate from ~58% to ~1%.
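
A simplified illustration of the new return shape, using a plain list in place of MemoryStore. The function name and the "replaced" result value are assumptions; only the no_match shape comes from the commit.

```python
from typing import Dict, List


def replace_entry(entries: List[str], needle: str, new_text: str) -> Dict[str, object]:
    """Replace the first entry containing needle; a miss is not an error."""
    for i, entry in enumerate(entries):
        if needle in entry:
            entries[i] = new_text
            return {"success": True, "result": "replaced"}
    # Nothing matched: report success with an explicit no_match marker.
    return {"success": True, "result": "no_match", "message": f"No entry matched {needle!r}."}
```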

Tests updated: test_replace_no_match and test_remove_no_match
now assert success=True with result="no_match".
All 33 memory tool tests pass.
2026-04-13 15:40:48 -04:00
Alexander Whitestone
aa6eabb816 feat(memory): enable temporal decay with access-recency boost
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 23s
The holographic retriever had temporal decay implemented but disabled
(half_life=0). All facts scored equally regardless of age — a 2-year-old
fact about a deprecated tool scored the same as yesterday's deployment
config.

This commit:
1. Changes default temporal_decay_half_life from 0 to 60 days
   - 60 days: facts lose half their relevance every 2 months
   - Configurable via config.yaml: plugins.hermes-memory-store.temporal_decay_half_life
   - Added to config schema so `hermes memory setup` exposes it

2. Adds access-recency boost to search scoring
   - Facts accessed within 1 half-life get up to 1.5x boost on their decay factor
   - Boost tapers linearly from 1.5 (just accessed) to 1.0 (1 half-life ago)
   - Capped at 1.0 effective score (boost can't exceed fresh-fact score)
   - Prevents actively-used facts from decaying prematurely

3. Scoring pipeline: score = relevance * trust * min(1.0, decay * access_boost)
   - Fresh facts: decay=1.0, boost≈1.5 → score unchanged
   - 60-day-old, recently accessed: decay=0.5, boost≈1.25 → score=0.625
   - 60-day-old, not accessed: decay=0.5, boost=1.0 → score=0.5
   - 120-day-old, not accessed: decay=0.25, boost=1.0 → score=0.25
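
The worked examples above can be reproduced with a short sketch. It assumes exponential half-life decay and that the 1.0 cap applies to the product decay * boost, which is the reading under which the listed scores (1.0, 0.625, 0.5, 0.25) come out; the function names are hypothetical.

```python
from typing import Optional


def temporal_decay(age_days: float, half_life_days: float = 60.0) -> float:
    """Halve the factor every half_life_days; 1.0 when decay is disabled."""
    if half_life_days <= 0 or age_days <= 0:
        return 1.0  # disabled, or a timestamp in the future
    return 0.5 ** (age_days / half_life_days)


def access_boost(days_since_access: Optional[float], half_life_days: float = 60.0) -> float:
    """Taper linearly from 1.5 (just accessed) to 1.0 (one half-life ago)."""
    if days_since_access is None or half_life_days <= 0:
        return 1.0
    if days_since_access >= half_life_days:
        return 1.0
    frac = max(0.0, days_since_access) / half_life_days
    return 1.5 - 0.5 * frac


def score(relevance: float, trust: float, age_days: float,
          days_since_access: Optional[float], half_life_days: float = 60.0) -> float:
    decay = temporal_decay(age_days, half_life_days)
    boost = access_boost(days_since_access, half_life_days)
    # Cap the boosted decay so no fact outranks a fresh one.
    return relevance * trust * min(1.0, decay * boost)
```

Under these assumptions, a 60-day-old fact accessed 30 days ago gets decay 0.5 and boost 1.25, for a score of 0.625 at relevance and trust of 1.0.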

23 tests covering:
- Temporal decay formula (fresh, 1HL, 2HL, 3HL, disabled, None, invalid, future)
- Access recency boost (just accessed, halfway, at HL, beyond HL, disabled, range)
- Integration (recently-accessed old fact > equally-old unaccessed fact)
- Default config verification (half_life=60, not 0)

Fixes #241
2026-04-13 15:38:12 -04:00
3b89bfbab2 fix(tools): ast.parse() preflight in execute_code — eliminates ~1,400 sandbox errors (#366)
Some checks failed
Forge CI / smoke-and-build (push) Failing after 23s
2026-04-13 19:26:06 +00:00
3e6e183ad2 Merge pull request 'fix(cron): deploy sync guard + kwarg filter + script failure marker' (#364) from fix/cron-sync-guard-v2 into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 23s
2026-04-13 19:13:31 +00:00
Alexander Whitestone
9c38e28f4d fix(cron): deploy sync guard + kwarg filter + script failure marker
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 20s
Fixes #341, Fixes #348

Three-part cron resilience fix:
1. _validate_agent_interface() — fail-fast if AIAgent.__init__
   is missing expected params (deploy sync guard)
2. _safe_agent_kwargs() — filter unsupported kwargs so jobs
   keep running with degraded functionality
3. [SCRIPT_FAILED] marker — prompt-wrapped script jobs can
   now propagate command failure to cron state
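
The kwarg filter in part 2 could be approximated as below. This is a sketch under the assumption that filtering is done against the constructor's signature; the helper name and the Agent stand-in class are hypothetical.

```python
import inspect
from typing import Any, Callable, Dict


def safe_kwargs(target: Callable[..., Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Drop keyword arguments that target's signature does not accept."""
    params = inspect.signature(target).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)  # target takes **kwargs, nothing to filter
    return {k: v for k, v in kwargs.items() if k in params}


class Agent:  # stand-in for an older AIAgent without tool_choice
    def __init__(self, model: str, max_turns: int = 10):
        self.model = model
        self.max_turns = max_turns


agent = Agent(**safe_kwargs(Agent, {"model": "m", "tool_choice": "required"}))
```

The unsupported tool_choice argument is silently dropped, so the job still runs, just without that feature.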

Supersedes PR #358 (branch conflict).
2026-04-13 15:12:12 -04:00
cea4c7fdd0 fix(poka-yoke): circuit breaker for error cascading (#309) + tool fixation detection (#310) (#362)
Some checks failed
Forge CI / smoke-and-build (push) Failing after 26s
Merged poka-yoke #309 and #310
2026-04-13 14:18:35 +00:00
f9b6db52af fix: unescape corrupted quotes in mempalace __init__.py (#360)
Some checks failed
Forge CI / smoke-and-build (push) Failing after 29s
Co-authored-by: Alexander Whitestone <alexander@alexanderwhitestone.com>
Co-committed-by: Alexander Whitestone <alexander@alexanderwhitestone.com>
2026-04-13 14:03:30 +00:00
f91f22ef7a Merge pull request '[claude] fix(cron): preflight model context validation + auto-pause (#351)' (#359) from claude/issue-351 into main
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merged by Timmy overnight cycle
2026-04-13 14:03:12 +00:00
b89c670400 Merge pull request 'feat: add hermes cron run --now for immediate job execution (closes #347)' (#361) from feat/cron-run-now into main
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merged by Timmy overnight cycle
2026-04-13 14:03:08 +00:00
Timmy
f6e72c135c feat: add hermes cron run --now for immediate job execution (closes #347)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 24s
Problem: 'hermes cron run JOBID' only queues for next scheduler tick.
Stale error state (like tool_choice TypeError residue) persists forever
because there's no way to execute a job immediately and get fresh results.

Solution: Synchronous execution path, wired through four entry points:
- cron/jobs.py: run_job_now() calls scheduler.run_job() then mark_job_run()
- gateway: POST /api/jobs/{id}/run-now endpoint (runs in thread executor)
- CLI: hermes cron run JOBID --now executes and prints result immediately
- tools/cronjob_tools.py: 'run_now' action routes to new function

Also fixes #346, #349 (same stale error pattern).
2026-04-13 09:58:47 -04:00
Alexander Whitestone
ece8b5f8be fix(cron): preflight model context validation + auto-pause on incompatible models
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 25s
Fixes #351

Root cause: cron jobs with a per-job model override (e.g. `gemma4:latest`,
8K context) were only discovered to be incompatible at agent runtime,
causing a hard ValueError on every tick with no automatic recovery.

Changes:
- Add `CRON_MIN_CONTEXT_TOKENS = 64_000` constant to scheduler.py
- Add `ModelContextError(ValueError)` exception class for typed identification
- Add `_check_model_context_compat()` preflight function that calls
  `get_model_context_length()` and raises `ModelContextError` if the
  resolved model's context is below the minimum
- Call preflight check in `run_job()` after model resolution, before
  `AIAgent()` is instantiated
- In `_process_single_job()` inside `tick()`, catch `ModelContextError`
  and call `pause_job()` to auto-pause the offending job — it will no
  longer fire on every tick until the operator fixes the config
- Honour `model.context_length` in config.yaml as an explicit override
  that bypasses the check (operator accepts responsibility)
- If context detection itself fails (network/import error), log a warning
  and allow the job to proceed (fail-open) so detection gaps don't block
  otherwise-working jobs
- Fix pre-existing IndentationError in `tick()` result loop (missing
  `try:` block introduced in #353 parallel-execution refactor)
- Export `ModelContextError` and `CRON_MIN_CONTEXT_TOKENS` from `cron/__init__.py`
- Add 8 new tests covering all branches of `_check_model_context_compat`
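
A condensed sketch of the preflight logic listed above. The constant and exception names follow the commit; the function signature, with the context lookup passed in as a callable, is an assumption made so the example is self-contained.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)
CRON_MIN_CONTEXT_TOKENS = 64_000


class ModelContextError(ValueError):
    """Raised when a model's context window is too small for cron jobs."""


def check_model_context(
    model: str,
    get_context_length: Callable[[str], int],
    override: Optional[int] = None,
) -> None:
    if override is not None:
        return  # explicit config override: the operator accepts responsibility
    try:
        context = get_context_length(model)
    except Exception as exc:
        # Fail open: a detection gap should not block an otherwise-working job.
        logger.warning("context detection failed for %s: %s", model, exc)
        return
    if context < CRON_MIN_CONTEXT_TOKENS:
        raise ModelContextError(
            f"{model} has {context} tokens of context; "
            f"cron needs at least {CRON_MIN_CONTEXT_TOKENS}"
        )
```

A caller in the tick loop would catch ModelContextError specifically and pause the job, leaving other ValueErrors to the normal error path.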

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 09:41:17 -04:00
c88b172bd9 Merge pull request 'perf(cron): parallel job execution + priority sorting (#353)' (#357) from fix/cron-tick-backlog into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 20s
2026-04-13 08:29:31 +00:00
Alexander Whitestone
4373ef2698 perf(cron): parallel job execution + priority sorting (#353)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 20s
2026-04-13 04:21:14 -04:00
fed7156a86 Merge pull request 'feat(cron): deploy sync guard — catch stale code before cascading failures' (#356) from feat/deploy-sync-guard into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 28s
2026-04-13 08:15:34 +00:00
Alexander Whitestone
e68c4d3e4e feat(cron): add deploy sync guard to catch stale code before cascading failures
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 26s
When the installed run_agent.py diverges from what scheduler.py expects,
every cron job fails with TypeError on AIAgent.__init__() — a silent total
outage that cascades into gateway restarts, asyncio shutdown errors, and
auth token expiry.

This commit adds a _validate_agent_interface() guard that:
- Inspects AIAgent.__init__ at runtime via inspect.signature
- Verifies every kwarg the scheduler passes exists in the constructor
- Fails fast with a clear remediation message on mismatch
- Runs once per gateway process (cached, zero per-job overhead)

The guard is called at the top of run_job() before any work begins.
It would have caught the tool_choice TypeError that caused 1,199 failures
across 55 jobs (meta-issue #343).

Includes 3 tests: pass, fail, and cache verification.
2026-04-13 03:33:48 -04:00
a547552ff7 Merge pull request 'fix(cron): guard against interpreter shutdown in run_job() and tick()' (#355) from fix/cron-interpreter-shutdown-352 into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 27s
Merge PR #355: fix(cron): guard against interpreter shutdown in run_job() and tick()
2026-04-13 07:32:06 +00:00
Alexander Whitestone
d6bd3bc10a fix(cron): guard against interpreter shutdown in run_job() and tick()
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 27s
Fixes #352

Problem: When the gateway restarts, Python's interpreter enters
shutdown phase while the last cron tick is still processing jobs.
ThreadPoolExecutor.submit() raises RuntimeError("cannot schedule
new futures after interpreter shutdown") for every remaining job.
This cascades through the entire tick queue.

Fix (two-part):
1. run_job(): Wrap ThreadPoolExecutor creation + submit in try/except.
   On RuntimeError, fall back to synchronous execution (same thread)
   so the job at least attempts to run instead of dying silently.
2. tick(): Check sys.is_finalizing() before each job. If the
   interpreter is shutting down, stop processing immediately
   instead of wasting time on doomed ThreadPoolExecutor.submit() calls.
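
The two-part fix can be illustrated as follows. The helper names and the simplified tick() are assumptions; only ThreadPoolExecutor, the RuntimeError fallback, and sys.is_finalizing() come from the commit.

```python
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_with_fallback(fn: Callable[[], T]) -> T:
    """Run fn in a worker thread, or inline if the pool cannot schedule it."""
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(fn)
    except RuntimeError:
        # submit() refuses new work during interpreter shutdown: run inline.
        return fn()
    # Errors raised by fn itself surface here, outside the scheduling guard.
    return future.result()


def tick(jobs: Iterable[Callable[[], T]]) -> List[T]:
    results = []
    for job in jobs:
        if sys.is_finalizing():
            break  # interpreter is shutting down: stop processing the queue
        results.append(run_with_fallback(job))
    return results
```

Calling future.result() outside the try block keeps a RuntimeError raised by the job itself from being mistaken for a scheduling failure.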
2026-04-13 03:22:10 -04:00
7a577068f0 Merge pull request 'fix(cron): ensure ticker thread starts and monitor for death (#342)' (#345) from fix/cron-ticker-startup into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 25s
Auto-merge #345
2026-04-13 07:15:28 +00:00
Alexander Whitestone
cb9214cae0 fix(cron): ensure ticker thread starts and monitor for death
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 27s
Issue #342: Cron ticker thread not starting in gateway

Root cause: asyncio.get_running_loop() can raise RuntimeError in edge cases,
and the ticker thread can die silently without being restarted.

Fix:
1. Wrap get_running_loop() in try/except with fallback
2. Add explicit logger.info when ticker starts
3. Add async monitor that restarts ticker if it dies
4. Log PID and thread name for debugging
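
A rough sketch of fix items 2 to 4: start a named ticker thread, log its PID and thread name, and have an async monitor restart it if it dies. All names, the polling interval, and the max_checks parameter (added so the loop can terminate in a demo) are assumptions.

```python
import asyncio
import logging
import os
import threading
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def start_ticker(target: Callable[[], None]) -> threading.Thread:
    thread = threading.Thread(target=target, name="cron-ticker", daemon=True)
    thread.start()
    logger.info("cron ticker started (pid=%s, thread=%s)", os.getpid(), thread.name)
    return thread


async def monitor_ticker(
    target: Callable[[], None],
    interval: float = 30.0,
    max_checks: Optional[int] = None,
) -> int:
    """Restart the ticker whenever it is found dead; return the restart count."""
    thread = start_ticker(target)
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        await asyncio.sleep(interval)
        checks += 1
        if not thread.is_alive():
            logger.warning("cron ticker died, restarting")
            thread = start_ticker(target)
            restarts += 1
    return restarts
```

In a gateway this coroutine would typically be scheduled as a background task with max_checks left at None so that it runs for the life of the process.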
2026-04-13 03:02:36 -04:00
eecff3fbf6 Merge pull request 'ci: add skills index workflow (rescued from #307)' (#335) from feat/skills-index-workflow into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 26s
2026-04-13 04:26:28 +00:00
Alexander Whitestone
4210412bef ci: add skills index workflow
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 37s
2026-04-13 00:23:59 -04:00
c8739e0970 Merge pull request 'feat: add research paper project scaffolder' (#308) from feat/research-paper-scaffolder into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 25s
Merge PR #308: feat: add research paper project scaffolder
2026-04-13 02:56:17 +00:00
6c358ae603 docs: add scaffolder quick start to Phase 0
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 28s
2026-04-13 02:52:32 +00:00
8450c4a8f5 feat: add research paper project scaffolder
2026-04-13 02:51:32 +00:00
0677bbd9d7 fix: support required cron tool choice
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merge PR #306: fix: support required cron tool choice
2026-04-13 02:33:39 +00:00
Alexander Whitestone
9b950bb897 fix: support required cron tool choice
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 39s
2026-04-12 22:14:47 -04:00
a3452d2c66 purge: remove Anthropic from hermes-agent config layer (#305)
Some checks failed
Forge CI / smoke-and-build (push) Failing after 26s
2026-04-13 02:02:00 +00:00
039d2ab88c Merge pull request 'purge: remove .claw/ OpenClaw session artifacts' (#304) from perplexity/openclaw-purge-claw-dir into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 24s
2026-04-13 01:36:02 +00:00
6b036a76fc purge: add .claw/ to .gitignore
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 29s
2026-04-13 01:34:26 +00:00
c2eff61025 purge: remove OpenClaw session artifact session-1775534636684-0.jsonl
2026-04-13 01:34:23 +00:00
250e79b8c9 purge: remove OpenClaw session artifact session-1775533542734-0.jsonl
2026-04-13 01:34:21 +00:00
abe321736e Merge pull request 'fix: muda cleanup and cost guardrails' (#303) from fix/muda-cleanup-and-guardrails into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 24s
2026-04-13 00:58:46 +00:00
ccc2fc3280 feat: implement guardrails in AIAgent
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 27s
2026-04-13 00:54:55 +00:00
455b0c87b1 feat: add cost and safety guardrails
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 31s
2026-04-13 00:31:51 +00:00
669c25b2bb fix: move TEST_OPTIMIZATION_GUIDE.md to docs/reports/
2026-04-13 00:31:49 +00:00
4f2e75f228 fix: move TEST_OPTIMIZATION_GUIDE.md to docs/reports/
2026-04-13 00:31:48 +00:00
6da28ef92d fix: move TEST_ANALYSIS_REPORT.md to docs/reports/
2026-04-13 00:31:46 +00:00
bb905d3bf9 fix: move TEST_ANALYSIS_REPORT.md to docs/reports/
2026-04-13 00:31:44 +00:00
51c20bb6c6 fix: move SECURITY_MITIGATION_ROADMAP.md to docs/reports/
2026-04-13 00:31:42 +00:00
df8e87bf7c fix: move SECURITY_MITIGATION_ROADMAP.md to docs/reports/
2026-04-13 00:31:41 +00:00
8495bff72f fix: move SECURITY_FIXES_CHECKLIST.md to docs/reports/
2026-04-13 00:31:39 +00:00
90c9549408 fix: move SECURITY_FIXES_CHECKLIST.md to docs/reports/
2026-04-13 00:31:36 +00:00
ee1ce608b2 fix: move SECURITY_AUDIT_REPORT.md to docs/reports/
2026-04-13 00:31:34 +00:00