Commit Graph

4383 Commits

Author SHA1 Message Date
18e3533a0a Merge pull request 'feat: The Budgetary Sovereign Router — Efficiency Sauce' (#1008) from feat/budgetary-router-1776864510362 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 13:38:40 +00:00
60ccd825ec Merge pull request 'feat: The Sovereign Teleport — State Migration Sauce' (#1007) from feat/sovereign-teleport-1776864503956 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 13:38:36 +00:00
e7d5a7f2cf Merge pull request 'feat: The Scavenger Fixer — Closing the Autonomous Loop' (#975) from feat/autonomous-scavenger-fix-1776827712502 into main
All checks were successful
Lint / lint (push) Successful in 13s
2026-04-22 13:38:03 +00:00
9aaac192cf Merge pull request 'test(#798): Parallel tool calling — 2+ tools per response' (#988) from fix/798 into main
All checks were successful
Lint / lint (push) Successful in 9s
2026-04-22 13:36:37 +00:00
f3d88ec31d Merge pull request '[claude] Wire Gemma 4 vision into browser_tool for screenshot analysis (#816)' (#947) from claude/issue-816 into main
All checks were successful
Lint / lint (push) Successful in 13s
2026-04-22 13:36:20 +00:00
2f22570622 Merge pull request 'feat(web-console): Self-healing browser CDP + operator cockpit (#394)' (#934) from feat/web-console-394 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 13:36:14 +00:00
2022322606 Merge pull request 'feat: Deep Dive Security Integration - Multilayer Defense' (#929) from feat/security-deep-dive-1776732106631 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 13:36:08 +00:00
d6ec32fe93 Merge pull request 'feat: implement SHIELD Multilingual Defense & Input Sanitization' (#918) from feat/shield-multilingual-1776700482647 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 13:36:05 +00:00
2b284e75f6 Merge pull request 'feat: Multi-Agent Concurrency Guard — "Secret Sauce" for Fleet Scaling' (#969) from feat/fleet-concurrency-guard-1776826501792 into main
All checks were successful
Lint / lint (push) Successful in 16s
2026-04-22 13:29:01 +00:00
efa1fc034e feat: Budgetary Sovereign Router — Complexity-aware steering
All checks were successful
Lint / lint (pull_request) Successful in 25s
2026-04-22 13:28:31 +00:00
99d925d40b feat: Sovereign Teleport — Cross-environment agent migration
All checks were successful
Lint / lint (pull_request) Successful in 28s
2026-04-22 13:28:25 +00:00
Alexander Whitestone
ed250b1ca8 test(#798): Strengthen parallel tool calling tests + fix flaky concurrent tests
All checks were successful
Lint / lint (pull_request) Successful in 10s
- Add TestAIAgentConcurrentExecution with 8 integration tests exercising
  _execute_tool_calls_concurrent through AIAgent for 2/3/4-tool batches,
  pass-rate reporting, and Gemma 4-style read patterns.
- Fix test_malformed_json_args_forces_sequential: use JSON array '[1,2,3]'
  instead of unrepairable garbage now that repair_and_load_json handles
  most malformed input.
- Fix test_concurrent_handles_tool_error: replace racy call_count list
  with deterministic failure based on tool_call_id to eliminate flaky
  failures under ThreadPoolExecutor.

Closes #798
2026-04-22 01:34:24 -04:00
16eab5d503 Merge pull request '[claude] A2A auth — mutual TLS between fleet agents (#806)' (#948) from claude/issue-806 into main
All checks were successful
Lint / lint (push) Successful in 13s
Merge PR #948: A2A auth — mutual TLS between fleet agents (#806)
2026-04-22 03:19:42 +00:00
81f7347bcb feat: Scavenger Fixer — Autonomous tech debt healing
All checks were successful
Lint / lint (pull_request) Successful in 22s
2026-04-22 03:15:17 +00:00
c7a2d439c1 Merge pull request 'feat: The Sovereign Scavenger — Automated Tech Debt Recovery' (#974) from feat/sovereign-scavenger-1776827259631 into main
All checks were successful
Lint / lint (push) Successful in 12s
2026-04-22 03:14:14 +00:00
8ad8520bd2 Merge pull request 'feat: Execution Safety Sentry — GOFAI Risk Analysis' (#973) from feat/static-analyzer-gofai-1776826921747 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 03:14:07 +00:00
9c7c88823f Merge pull request 'feat: Local Inference Story — Freeing the fleet from cloud dependency' (#972) from feat/local-inference-bridge-1776826896029 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 03:14:03 +00:00
aa45e02238 Merge pull request 'feat: GOFAI Semantic Sentry — Deterministic code verification' (#971) from feat/symbolic-verify-gofai-1776826842170 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-22 03:14:01 +00:00
3266c39e8e feat: Sovereign Scavenger — Turning tech debt into actionable backlog
All checks were successful
Lint / lint (pull_request) Successful in 18s
2026-04-22 03:07:40 +00:00
93a855d4e3 feat: Static Risk Analyzer (GOFAI) for execution safety
All checks were successful
Lint / lint (pull_request) Successful in 8s
2026-04-22 03:02:02 +00:00
5a0bdb556e feat: Local Inference Bridge — Bypassing cloud for local tasks
All checks were successful
Lint / lint (pull_request) Successful in 17s
2026-04-22 03:01:37 +00:00
d619d279f8 feat: Symbolic Sentry (GOFAI) for deterministic code audits
All checks were successful
Lint / lint (pull_request) Successful in 15s
2026-04-22 03:00:44 +00:00
77d2430a44 feat: add Fleet-Wide File Concurrency Guard
All checks were successful
Lint / lint (pull_request) Successful in 19s
2026-04-22 02:55:04 +00:00
Alexander Whitestone
9ef7682ee2 chore: merge remote claude/issue-816 — deduplicate gemma-4-27b-it in models.py
All checks were successful
Lint / lint (pull_request) Successful in 30s
Merged prior implementation (PR #947) and resolved conflicts.
Removed duplicate "gemma-4-27b-it" entry introduced during merge.

Refs #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:27:51 -04:00
Alexander Whitestone
e157a22639 feat: wire Gemma 4 vision into browser_tool for screenshot analysis
- Add `_BROWSER_VISION_DEFAULT_MODEL = "google/gemma-4-27b-it"` constant
- Rewrite `_get_vision_model()` with 4-tier resolution:
  1. BROWSER_VISION_MODEL env var (browser-specific override)
  2. auxiliary.browser_vision.model config key
  3. AUXILIARY_VISION_MODEL env var (backward compat)
  4. google/gemma-4-27b-it default (Gemma 4 native multimodal)
- Extract `_load_browser_vision_config()` helper for testability
- Always set call_kwargs["model"] (remove redundant `if vision_model` guard)
- Read timeout from auxiliary.browser_vision.timeout before auxiliary.vision.timeout
- Register gemma-4-27b-it in Gemini provider model catalog
- Document auxiliary.browser_vision section in cli-config.yaml.example
- Add 12 unit tests in tests/tools/test_browser_vision_model.py covering all
  resolution tiers, backward compat, error fallthrough, and type guarantees

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:26:03 -04:00
Alexander Whitestone
671283389c feat: Wire Gemma 4 vision into browser_tool for screenshot analysis
All checks were successful
Lint / lint (pull_request) Successful in 8s
_get_vision_model() now resolves via a layered priority chain:
  1. BROWSER_VISION_MODEL env var (browser-specific override)
  2. config.yaml browser.vision_model
  3. AUXILIARY_VISION_MODEL env var (backward-compat shared override)
  4. google/gemma-4-27b-it — Gemma 4 native multimodal default

Add browser.vision_model config key to hermes_cli/config.py defaults
with inline documentation.

call_kwargs["model"] is now always set (model is never None), and a
debug log line records which model is in use for each screenshot.

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 20:51:04 -04:00
Alexander Whitestone
17cc4bac90 feat: complete Gemma 4 browser_vision wiring — task routing, timeout, tests
All checks were successful
Lint / lint (pull_request) Successful in 10s
Building on the Gemma 4 default already on this branch:

- Change call_llm() task from "vision" to "browser_vision" in browser_vision()
  so auxiliary.browser_vision.* config is consulted for provider/model/timeout
- Route call_llm(task="browser_vision") through the vision provider resolution
  path in auxiliary_client.py (same as task="vision")
- Fix timeout resolution: check auxiliary.browser_vision.timeout before
  auxiliary.vision.timeout (allows browser-specific timeout override)
- Add timeout option to auxiliary.browser_vision in cli-config.yaml.example
- Add test_browser_vision_gemma4.py covering: task routing assertions,
  call_llm() vision branch routing, and timeout config key ordering

Refs #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:43:42 -04:00
Alexander Whitestone
1843545d66 chore: merge remote branch — resolve conflicts, use canonical implementation
All checks were successful
Lint / lint (pull_request) Successful in 8s
Merge remote claude/issue-816 which contains the full Gemma 4 browser
vision implementation. Resolved conflicts by taking the remote's cleaner
variable names and docstrings while keeping the same 4-tier resolution
logic. All 12 tests pass.

Refs #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:50:22 -04:00
Alexander Whitestone
c643ac90da feat: wire Gemma 4 vision into browser_tool for screenshot analysis
- Add `_BROWSER_VISION_DEFAULT_MODEL = "google/gemma-4-27b-it"` constant
- Rewrite `_get_vision_model()` with 4-tier resolution:
    1. BROWSER_VISION_MODEL env var (browser-specific override)
    2. auxiliary.browser_vision.model config key
    3. AUXILIARY_VISION_MODEL env var (backward compat)
    4. Gemma 4 27B default
- Remove `if vision_model:` guard — function now always returns a string
- Update browser_vision tool description to surface Gemma 4 as default
- Register gemma-4-27b-it in Gemini provider model catalog (models.py)
- Document auxiliary.browser_vision.model in cli-config.yaml.example
- Add 14 unit tests covering all priority levels and backward compat

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:47:03 -04:00
Alexander Whitestone
da9c4cf10c feat: wire Gemma 4 vision into browser_tool for screenshot analysis
All checks were successful
Lint / lint (pull_request) Successful in 7s
Extends `_get_vision_model()` with a 5-level resolution chain:
1. `BROWSER_VISION_MODEL` env var — browser-specific override
2. `auxiliary.browser.vision_model` config key — per-install default
3. `AUXILIARY_VISION_MODEL` env var — backward-compat shared override
4. Auto-select `gemma-4-27b-it` when the main provider is Gemini/Google
5. `None` — fall through to `call_llm` vision router

Adds `_BROWSER_VISION_DEFAULT_MODEL = "gemma-4-27b-it"` constant and
registers `gemma-4-27b-it` in the Gemini provider model catalog.

16 new tests in `tests/tools/test_browser_vision_model.py` cover each
priority level, edge cases (empty env, config exceptions, wrong provider).

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:18:30 -04:00
Alexander Whitestone
4214082fb6 feat: A2A auth — mutual TLS between fleet agents
All checks were successful
Lint / lint (pull_request) Successful in 8s
Implements mTLS for securing agent-to-agent communication in the Hermes
fleet. Fixes #806.

Changes:
- scripts/gen_fleet_ca.sh: generate a self-signed Fleet CA (4096-bit RSA,
  10-year validity) that signs all agent certificates
- scripts/gen_agent_cert.sh: generate per-agent certs (Timmy, Allegro,
  Ezra) signed by the fleet CA with SAN entries and clientAuth/serverAuth
  extended key usage
- agent/mtls.py: new module providing:
  - build_server_ssl_context() — TLS_SERVER context with CERT_REQUIRED,
    enforces client cert against Fleet CA
  - build_client_ssl_context() — TLS_CLIENT context for outbound A2A calls
  - MTLSMiddleware — ASGI middleware that rejects unauthenticated requests
    to A2A routes (/.well-known/agent-card*, /api/agent-card, /a2a/) with
    HTTP 403 when mTLS is enabled
  - is_mtls_configured() — checks HERMES_MTLS_CERT/KEY/CA env vars
- hermes_cli/web_server.py: wire MTLSMiddleware into the FastAPI app;
  pass SSL context to uvicorn when HERMES_MTLS_* env vars are set so
  the server runs TLS with mandatory client cert verification
- ansible/roles/hermes_mtls/: Ansible role to distribute Fleet CA cert,
  agent cert, and agent key to fleet nodes; writes an env file with
  HERMES_MTLS_* vars and restarts the hermes-gateway service
- ansible/fleet_mtls.yml: fleet-wide playbook referencing the role for
  Timmy, Allegro, and Ezra nodes
- tests/test_mtls.py: 15 tests covering is_mtls_configured, SSL context
  creation with real cryptography-generated certs, and MTLSMiddleware
  (unauthorized agent rejected → 403, authorized agent accepted → 200)

mTLS is opt-in: set HERMES_MTLS_CERT, HERMES_MTLS_KEY, and HERMES_MTLS_CA
to enable. When unset, the server behaves exactly as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:04:00 -04:00
Alexander Whitestone
95bb842a21 feat: Wire Gemma 4 vision into browser_tool for screenshot analysis
All checks were successful
Lint / lint (pull_request) Successful in 8s
Default browser_vision screenshots to google/gemma-4-27b-it (Gemma 4
native multimodal) for reduced latency and unified text+vision model.

Resolution order for _get_vision_model():
1. BROWSER_VISION_MODEL env var (new, browser-specific override)
2. auxiliary.browser_vision.model in config.yaml (new config key)
3. AUXILIARY_VISION_MODEL env var (existing global vision override)
4. Default: google/gemma-4-27b-it

Backward compatibility: existing AUXILIARY_VISION_MODEL users are
unaffected — their override still flows through to browser_vision.

Also documents the new auxiliary.browser_vision config section in
cli-config.yaml.example and adds 14 unit tests covering the full
priority chain.

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 17:14:32 -04:00
Alexander Whitestone
ac28444bf2 feat: add A2AMTLSServer routing API, A2AMTLSClient, and expand tests to 20 (#806)
All checks were successful
Lint / lint (pull_request) Successful in 9s
Builds on the existing A2AServer / build_*_ssl_context foundation:

- agent/a2a_mtls.py:
  - Add A2AMTLSServer: routing-based HTTPS server with add_route() and
    context-manager (__enter__/__exit__) lifecycle support
  - Add A2AMTLSClient: fleet-cert-presenting HTTP client with .get() / .post()
  - Widen imports (json, Callable, Dict, urlopen)

- tests/agent/test_a2a_mtls.py:
  - Fix datetime.utcnow() deprecation — use datetime.now(timezone.utc)
  - Add TestA2AMTLSServerAndClient (9 tests): routing GET/POST, 404,
    context-manager stop, rogue-cert rejection, A2AMTLSClient, concurrency
  - Total: 11 → 20 passing tests

Refs #806
2026-04-21 15:21:10 -04:00
Alexander Whitestone
12b5d9a7fd refactor: remove redundant vision_model guard in browser_vision
All checks were successful
Lint / lint (pull_request) Successful in 10s
_get_vision_model() now always returns a non-empty string (Gemma 4 default
or configured override), so the `if vision_model:` conditional guard is
unnecessary. Replace with unconditional assignment and add a debug log
line showing which model was selected.

Refs #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 14:09:40 -04:00
Alexander Whitestone
91faf6f956 feat: A2A auth — mutual TLS between fleet agents
All checks were successful
Lint / lint (pull_request) Successful in 10s
Implements mutual TLS for secure agent-to-agent communication (#806).

- scripts/gen_fleet_ca.sh: generate fleet CA (4096-bit RSA, 10-year)
- scripts/gen_agent_cert.sh: per-agent cert signed by fleet CA (timmy, allegro, ezra)
- agent/a2a_mtls.py: A2AServer requiring client cert verification (CERT_REQUIRED),
  build_server_ssl_context / build_client_ssl_context helpers, server_from_env()
- ansible/roles/fleet_mtls_certs/: distribute CA + per-agent certs to fleet nodes,
  write /etc/hermes/a2a.env, notify hermes-a2a service on change
- ansible/fleet_mtls.yml + ansible/inventory/fleet.ini.example: playbook + example inventory
- tests/agent/test_a2a_mtls.py: 11 tests — authorized agent accepted (200/202),
  self-signed cert rejected, no-cert rejected, lifecycle, env-var wiring

Fixes #806

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 13:28:28 -04:00
Alexander Whitestone
b6398b8b0d feat: wire Gemma 4 vision into browser_tool for screenshot analysis
All checks were successful
Lint / lint (pull_request) Successful in 19s
Default browser screenshot analysis now uses Gemma 4 27B
(google/gemma-4-27b-it) instead of deferring to the auxiliary router's
auto-detection.  Gemma 4 is natively multimodal — the same model family
already in use for text tasks — which avoids cold-start model-switching
overhead and improves context continuity.

Resolution order for _get_vision_model():
  1. BROWSER_VISION_MODEL env var (browser-specific override)
  2. auxiliary.browser_vision.model in config.yaml
  3. AUXILIARY_VISION_MODEL env var (shared/legacy override)
  4. google/gemma-4-27b-it (new default)

- Add _BROWSER_VISION_DEFAULT_MODEL constant to browser_tool.py
- Document auxiliary.browser_vision config key in cli-config.yaml.example
- Add 10 unit tests covering all resolution steps

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 12:49:46 -04:00
a2a40429bd Merge pull request '[claude] Poka-yoke: auto-revert incomplete skill edits (#923)' (#946) from claude/issue-923 into main
All checks were successful
Lint / lint (push) Successful in 10s
2026-04-21 16:38:24 +00:00
ee61c5fa9d Merge pull request 'feat: Add queue health check script' (#912) from feat/queue-health-check into main
All checks were successful
Lint / lint (push) Successful in 34s
2026-04-21 15:37:59 +00:00
Alexander Whitestone
1fece10569 feat: poka-yoke auto-revert for incomplete skill edits (#923)
All checks were successful
Lint / lint (pull_request) Successful in 32s
Implement a transactional write-validate-commit-or-rollback pattern for
all skill_manage write operations (edit, patch, write_file):

- _backup_skill_file: timestamped .bak.{ts} snapshot before every write
- _validate_written_file: re-reads from disk after write to catch truncation,
  encoding errors, and broken YAML frontmatter
- _revert_from_backup: restores original content (or removes the corrupted
  file) on any validation failure
- _cleanup_old_backups: prunes to MAX_BACKUPS_PER_FILE (3) after success;
  failed edits keep their .bak file as a debugging aid

Also fixes pre-existing issue where _patch_skill error returns lacked a
`suggestion` field expected by test_skill_manager_error_context.py tests.

Adds 21 tests in test_skill_manager_autorevert.py covering every component
and an end-to-end simulation of mid-write failure + auto-revert.

Fixes #923

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 11:37:55 -04:00
46668505bc Merge pull request 'feat: tool fixation detection — break repetitive loops (#886)' (#914) from fix/886 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:35:08 +00:00
cac0c8224e Merge pull request 'fix: circuit breaker for error cascading (2.33x amplification)' (#927) from fix/885-circuit-breaker into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:35:04 +00:00
f38a64455d Merge pull request '[claude] Gateway config debt: add validation tests and API_SERVER_KEY warning (#892)' (#915) from claude/issue-892 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:33:19 +00:00
1b35a5a0d2 Merge pull request 'feat: Poka-yoke — hardcoded path guard (#921)' (#928) from fix/921-hardcoded-path-guard into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:33:14 +00:00
9172131b25 Merge pull request 'docs: tool investigation report from awesome-ai-tools (#926)' (#931) from fix/926 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:33:12 +00:00
407eab3331 Merge pull request 'feat: session deterministic seeding & marathon limits' (#919) from feat/session-management-1776700585635 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:29:44 +00:00
cf090a966d Merge pull request 'fix: Poka-yoke — detect and block tool hallucination before API calls (#922)' (#935) from fix/922 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:29:35 +00:00
b65be9b12c Merge pull request '[claude] Add tool investigation report: top 5 awesome-ai-tools recommendations (#926)' (#936) from claude/issue-926 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:29:32 +00:00
3c1cff255e Merge pull request 'ci: integrate hardcoded path linter into CI workflow' (#938) from fix/865-ci-path-linter into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:29:30 +00:00
690d100afc Merge pull request 'feat: Poka-yoke token budget — progressive context overflow guard (#925)' (#943) from burn/925-1776770102 into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been skipped
Nix / nix (ubuntu-latest) (push) Failing after 5s
Tests / e2e (push) Successful in 5m8s
Tests / test (push) Failing after 30m13s
Nix / nix (macos-latest) (push) Has been cancelled
2026-04-21 15:29:02 +00:00
c6f0831738 Merge pull request 'feat: Python syntax validation before execute_code (#913)' (#917) from fix/913-syntax-validation into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
Tests / e2e (push) Has been cancelled
2026-04-21 15:27:05 +00:00