7caaf49a34
feat: deep dive integration of tests/test_shield_multilingual.py
2026-04-21 00:41:53 +00:00
e52f6d2cde
feat: deep dive integration of tools/shield/detector.py
2026-04-21 00:41:52 +00:00
000d64deed
feat: deep dive integration of agent/input_sanitizer.py
2026-04-21 00:41:50 +00:00
d527cb569b
feat: deep dive integration of agent/shield.py
2026-04-21 00:41:49 +00:00
44ada06fd4
feat: update agent/privacy_filter.py for deep dive security
2026-04-21 00:41:48 +00:00
c6f2855745
fix: restore _format_error helper for test compatibility ( #916 )
...
Docker Build and Publish / build-and-push (push) Has been skipped
Nix / nix (ubuntu-latest) (push) Failing after 2s
Tests / e2e (push) Successful in 2m47s
Tests / test (push) Failing after 27m41s
Build Skills Index / build-index (push) Has been skipped
Build Skills Index / deploy-with-index (push) Has been skipped
Nix / nix (macos-latest) (push) Has been cancelled
fix: restore _format_error helper for test compatibility (#916 )
2026-04-20 23:56:27 +00:00
05f8c2d188
Merge PR #899
...
Merged PR #899 : feat: Allegro worker deliverables
2026-04-17 01:52:11 +00:00
ff2ce95ade
feat(research): Allegro worker deliverables — fleet research reports + skill manager test
...
Tests / e2e (pull_request) Successful in 1m39s
Tests / test (pull_request) Failing after 1h7m45s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Successful in 24s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 28s
Research reports:
- Vector DB research
- Workflow orchestration research
- Fleet knowledge graph SOTA research
- LLM inference optimization
- Local model crisis quality
- Memory systems SOTA
- Multi-agent coordination
- R5 vs E2E gap analysis
- Text-to-music-video
Test:
- test_skill_manager_error_context.py
[Allegro] Forge workers — 2026-04-16
2026-04-16 15:04:28 +00:00
Hermes Merge Bot
aedebfdf58
Merge PR #848
2026-04-16 02:12:13 -04:00
Hermes Merge Bot
adf49b1809
Merge PR #849
2026-04-16 02:11:21 -04:00
Hermes Merge Bot
52ea3a8935
Merge PR #850
2026-04-16 02:09:00 -04:00
Hermes Merge Bot
43246d6cb4
Merge PR #852
2026-04-16 02:08:06 -04:00
Hermes Merge Bot
20c5e237a7
Merge PR #861
2026-04-16 02:06:36 -04:00
Hermes Merge Bot
a0f4d10a7f
Merge PR #855
2026-04-16 02:06:17 -04:00
Hermes Merge Bot
bc5d1cf6ff
Merge PR #863
2026-04-16 02:05:44 -04:00
Hermes Merge Bot
dff451081d
Merge PR #856
2026-04-16 02:05:42 -04:00
Hermes Merge Bot
5509b157c5
Merge PR #864
2026-04-16 02:05:05 -04:00
Hermes Merge Bot
fcc322fb81
Merge PR #867
2026-04-16 02:03:23 -04:00
Hermes Merge Bot
9bba9ecc40
Merge PR #866
2026-04-16 02:02:43 -04:00
Hermes Merge Bot
05086e58ea
Merge PR #871
2026-04-16 02:00:55 -04:00
Hermes Merge Bot
7af6889767
Merge PR #869
2026-04-16 02:00:49 -04:00
5022db9d7b
Merge pull request 'feat: self-modifying agent that improves its own prompts ( #813 )' ( #897 ) from fix/813 into main
2026-04-16 05:29:11 +00:00
0f61474b74
Merge pull request 'feat: MCP server — expose hermes tools to fleet peers ( #803 )' ( #896 ) from fix/803 into main
...
Auto-merged PR #896 : feat: MCP server — expose hermes tools to fleet peers (#803 )
2026-04-16 05:24:27 +00:00
Alexander Whitestone
a528bd5b1b
fix: use .get() for env_vars key in _show_tool_availability_warnings
...
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 24s
Tests / test (pull_request) Failing after 1h2m1s
Tests / e2e (pull_request) Successful in 1m38s
Fixes KeyError: 'missing_vars' crash on CLI startup when toolsets are
unavailable. registry.py returns dicts with 'env_vars' key, but
_show_tool_availability_warnings() was accessing 'missing_vars' directly.
Now uses .get("env_vars") or .get("missing_vars") to handle both key
names, consistent with how doctor.py already handles this.
Fixes #834
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-16 01:23:48 -04:00
Alexander Whitestone
e63cdaf16f
feat: self-modifying agent that improves its own prompts ( #813 )
...
Docker Build and Publish / build-and-push (pull_request) Has been cancelled
Contributor Attribution Check / check-attribution (pull_request) Has been cancelled
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Has been cancelled
Tests / test (pull_request) Has been cancelled
Tests / e2e (pull_request) Has been cancelled
Resolves #813 . Agent analyzes session transcripts for failure
patterns and generates prompt patches to prevent future failures.
agent/self_modify.py (PromptLearner class):
- analyze_session(): detects 5 failure types from transcripts:
retry_loop, timeout, hallucination, context_loss, tool_failure
- generate_patches(): converts patterns to prompt patches with
confidence scoring (frequency-based)
- apply_patches(): appends learned rules to system prompt with
backup and rollback support
- learn_from_session(): full cycle analyze → patch → apply
Failures → patterns → patches → improved prompts → fewer failures.
Safety: patches only ADD rules (append-only), never remove.
Rollback: restores from timestamped backup.
2026-04-16 01:23:48 -04:00
Alexander Whitestone
2b7b12baf9
feat: MCP server — expose hermes tools to fleet peers ( #803 )
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 44s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Tests / test (pull_request) Has been cancelled
Tests / e2e (pull_request) Has been cancelled
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 19m48s
Resolves #803 . Standalone MCP server that exposes safe hermes
tools to other fleet agents.
scripts/mcp_server.py:
- Exposes: terminal, file_read, file_search, web_search, session_search
- Blocks: approval, delegate, memory, config, cron, send_message
- Terminal uses approval.py dangerous command detection
- Auth via Bearer token (MCP_AUTH_KEY)
- HTTP endpoints: GET /mcp/tools, POST /mcp/tools/call, GET /health
Usage:
python scripts/mcp_server.py --port 8081 --auth-key SECRET
curl http://localhost:8081/mcp/tools
curl -X POST http://localhost:8081/mcp/tools/call -d {"name":"file_read","arguments":{"path":"README.md"}}
2026-04-16 01:10:00 -04:00
Alexander Whitestone
6b40c5db7a
fix: use env_vars key in _show_tool_availability_warnings to prevent KeyError
...
registry.py:check_tool_availability() returns unavailable dicts with key
"env_vars", but _show_tool_availability_warnings() in cli.py was accessing
u["missing_vars"] causing a KeyError crashing CLI startup whenever any
toolset was disabled.
Fix matches how doctor.py already handles the same data.
Fixes #834
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-16 00:42:03 -04:00
5a24894f78
fix: update hermes_cli/web_server.py for agent card discovery
Contributor Attribution Check / check-attribution (pull_request) Successful in 43s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Nix / nix (ubuntu-latest) (pull_request) Failing after 5s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 38s
Tests / test (pull_request) Failing after 10m58s
Tests / e2e (pull_request) Successful in 1m32s
Nix / nix (macos-latest) (pull_request) Has been cancelled
2026-04-16 03:45:04 +00:00
a474eb8459
fix: add agent/agent_card.py for agent card discovery
2026-04-16 03:45:01 +00:00
Alexander Whitestone
3238cf4eb1
feat: Tool investigation report + Mem0 local provider ( #842 )
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 38s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 32s
Tests / test (pull_request) Failing after 43m54s
Tests / e2e (pull_request) Successful in 2m5s
## Investigation Report
- docs/tool-investigation-2026-04-15.md: Full report analyzing 414 tools
from awesome-ai-tools. Top 5 recommendations with integration paths.
- docs/plans/awesome-ai-tools-integration.md: Implementation tracking plan.
## Mem0 Local Provider (P1)
- plugins/memory/mem0_local/: New ChromaDB-backed memory provider.
No API key required - fully sovereign. Compatible tool schemas with
cloud Mem0 (mem0_profile, mem0_search, mem0_conclude).
- Pattern-based fact extraction from conversations.
- Deterministic dedup via content hashing.
- Circuit breaker for resilience.
- tests/plugins/memory/test_mem0_local.py: Full test coverage.
## Issues Filed
- #857 : LightRAG integration (P2)
- #858 : n8n workflow orchestration (P3)
- #859 : RAGFlow document understanding (P4)
- #860 : tensorzero LLMOps evaluation (P3)
Closes #842
2026-04-15 23:04:41 -04:00
eed87e454e
test: Benchmark Gemma 4 vision accuracy vs current approach ( #817 )
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 26s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 26s
Tests / e2e (pull_request) Successful in 2m38s
Tests / test (pull_request) Failing after 47m49s
Vision benchmark suite comparing Gemma 4 (google/gemma-4-27b-it) vs
current Gemini 3 Flash Preview (google/gemini-3-flash-preview).
Metrics:
- OCR accuracy (character + word overlap)
- Description completeness (keyword coverage)
- Structural quality (length, sentences, numbers)
- Latency (ms per image)
- Token usage
- Consistency across runs
Features:
- 24 diverse test images (screenshots, diagrams, photos, charts)
- Category-specific evaluation prompts
- Automated verdict with composite scoring
- JSON + markdown report output
- 28 unit tests passing
Usage:
python benchmarks/vision_benchmark.py --images benchmarks/test_images.json
python benchmarks/vision_benchmark.py --url https://example.com/img.png
python benchmarks/vision_benchmark.py --generate-dataset
Closes #817 .
2026-04-15 23:02:02 -04:00
Alexander Whitestone
f03709aa29
test: crisis hook integration tests with agent loop ( #707 )
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 16s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 15s
Tests / e2e (pull_request) Failing after 12m38s
Tests / test (pull_request) Failing after 25m58s
10 integration tests verifying crisis detection works correctly
when called from the agent conversation flow:
- scan_user_message detects CRITICAL/HIGH/MEDIUM/LOW levels
- Safe messages pass through without triggering
- Tool handler returns valid JSON
- Compassion injection includes 988 lifeline for CRITICAL/HIGH
- Case insensitive detection
- Empty/None text handled gracefully
- False positive resistance on common non-crisis phrases
- Config check returns bool
- Callable from agent context (not just isolation tests)
2026-04-15 23:00:12 -04:00
Alexander Whitestone
4d8e004b5f
fix: extend JSON repair to remaining json.loads sites in run_agent.py
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 42s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Nix / nix (ubuntu-latest) (pull_request) Failing after 4s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 36s
Tests / test (pull_request) Failing after 1h13m6s
Tests / e2e (pull_request) Successful in 1m32s
Nix / nix (macos-latest) (pull_request) Has been cancelled
Adds `repair_and_load_json()` to utils.py using the `json_repair` library
as a fallback when `json.loads()` fails. Replaces 8 non-hot-path json.loads
sites identified in issue #809 :
- L2250: trajectory/sanitization message content parsing
- L2500: tool_call dict reconstruction in trajectory conversion
- L2535: tool_content parsing (JSON-like strings in tool responses)
- L2888: session log file loading (with warning on unrecoverable parse)
- L3119: todo content parsing in message processing
- L5963: vision result_json parsing
- L6761: memory flush tool call argument parsing
- L8300: cache serialization tool call args normalization
Each site uses an appropriate default ({} for tool args, None/continue for
content parsing) and a context label for debug tracing.
Fixes #809
2026-04-15 22:56:39 -04:00
85a654348a
feat: poka-yoke — prevent hardcoded ~/.hermes paths ( closes #835 )
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 27s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 19s
Tests / e2e (pull_request) Successful in 1m55s
Tests / test (pull_request) Failing after 56m41s
scripts/lint_hardcoded_paths.py (new):
- Scans Python files for hardcoded home-directory paths
- Detects: Path.home()/.hermes without env fallback, /Users/<name>/, /home/<name>/
- Excludes: comments, docstrings, test files, skills, plugins, docs
- Excludes correct patterns: profiles_parent, current_default, native_home
- Supports --staged (git pre-commit), --fix (suggestions), --json output
scripts/pre-commit-hardcoded-paths.sh (new):
- Pre-commit hook that runs lint_hardcoded_paths.py --staged
- Blocks commits containing hardcoded path violations
tools/confirmation_daemon.py (fixed):
- Replaced Path.home() / '.hermes' / 'approval_whitelist.json'
with get_hermes_home() / 'approval_whitelist.json'
- Added import of get_hermes_home from hermes_constants
tests/test_hardcoded_paths.py (new):
- 11 tests: detection, exclusion, fallback patterns, clean files
2026-04-15 22:56:32 -04:00
fc0d8fe5e9
fix: extend JSON repair to ALL remaining json.loads sites ( #809 )
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Nix / nix (ubuntu-latest) (pull_request) Failing after 2s
Contributor Attribution Check / check-attribution (pull_request) Successful in 26s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 26s
Tests / e2e (pull_request) Successful in 2m50s
Tests / test (pull_request) Failing after 1h17m49s
Nix / nix (macos-latest) (pull_request) Has been cancelled
2026-04-16 02:53:41 +00:00
Alexander Whitestone
13ef670c05
feat: session compaction with fact extraction ( #748 )
...
Contributor Attribution Check / check-attribution (pull_request) Successful in 29s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 33s
Tests / e2e (pull_request) Successful in 3m26s
Tests / test (pull_request) Failing after 1h28m50s
Before compressing conversation context, extract durable facts
(user preferences, corrections, project details) and save to
fact store so they survive compression.
New agent/session_compactor.py:
- extract_facts_from_messages(): scans user messages for
preferences, corrections, project/infra facts using regex
- 3 pattern categories: user_pref (5 patterns), correction
(3 patterns), project (4 patterns)
- ExtractedFact: category, entity, content, confidence, source_turn
- save_facts_to_store(): saves to fact store (callback or auto-detect)
- extract_and_save_facts(): one-call extraction + persistence
- Deduplication by category+content
- Skips tool results, short messages, system messages
- format_facts_summary(): human-readable summary
Tests: tests/test_session_compactor.py (9 tests)
Closes #748
2026-04-15 22:41:54 -04:00
4752a0085e
fix: extend JSON repair to remaining json.loads sites in run_agent.py ( #809 )
2026-04-16 02:40:51 +00:00
b26a6ec23b
feat: add repair_and_load_json() to utils.py ( #809 )
2026-04-16 02:38:01 +00:00
Alexander Whitestone
9f0c410481
feat: batch tool execution with parallel safety checks ( #749 )
...
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Successful in 35s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 37s
Tests / e2e (pull_request) Successful in 1m48s
Tests / test (pull_request) Failing after 36m13s
Centralized safety classification for tool call batches:
tools/batch_executor.py (new):
- classify_tool_calls() — classifies batch into parallel_safe,
path_scoped, sequential, never_parallel tiers
- BatchExecutionPlan — structured plan with parallel and sequential batches
- Path conflict detection — write_file + patch on same file go sequential
- Destructive command detection — rm, mv, sed -i, redirects
- execute_parallel_batch() — ThreadPoolExecutor for concurrent execution
tools/registry.py (enhanced):
- ToolEntry.parallel_safe field — tools can declare parallel safety
- registry.register() accepts parallel_safe=True parameter
- registry.get_parallel_safe_tools() — query registry-declared safe tools
Safety tiers:
- parallel_safe: read_file, web_search, search_files, etc.
- path_scoped: write_file, patch (concurrent when paths don't overlap)
- sequential: terminal, delegate_task, unknown tools
- never_parallel: clarify (requires user interaction)
19 tests passing.
2026-04-15 22:17:16 -04:00
b34b5b293d
test: add tests for tool hallucination prevention ( #836 )
Contributor Attribution Check / check-attribution (pull_request) Successful in 24s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 22s
Tests / e2e (pull_request) Successful in 3m6s
Tests / test (pull_request) Failing after 41m24s
2026-04-16 02:15:59 +00:00
05f9d2b009
feat: integrate poka-yoke validation into tool dispatch ( #836 )
...
- Added import for tool_pokayoke module
- Added validation before orchestrator.dispatch calls
- Auto-corrects tool names and parameters
- Returns structured errors with suggestions
- Circuit breaker for consecutive failures
Closes #836
2026-04-16 02:15:17 +00:00
Timmy Time
fb7464995c
fix: Ultraplan Mode for daily autonomous planning ( closes #840 )
Contributor Attribution Check / check-attribution (pull_request) Successful in 37s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 39s
Tests / test (pull_request) Failing after 1h15m33s
Tests / e2e (pull_request) Successful in 2m20s
2026-04-15 22:14:16 -04:00
7c71b7e73a
test: parallel tool calling — 2+ tools per response ( #798 )
Contributor Attribution Check / check-attribution (pull_request) Successful in 45s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 1m16s
Tests / e2e (pull_request) Successful in 3m17s
Tests / test (pull_request) Failing after 1h30m54s
2026-04-16 02:13:00 +00:00
4a3068b3b5
test: add regression tests for issue #834 KeyError fix
Contributor Attribution Check / check-attribution (pull_request) Successful in 39s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 44s
Tests / e2e (pull_request) Successful in 2m53s
Tests / test (pull_request) Failing after 1h28m32s
2026-04-16 02:12:36 +00:00
a8300ceb43
fix: KeyError 'missing_vars' in _show_tool_availability_warnings ( #834 )
2026-04-16 02:11:08 +00:00
8ef766beac
feat: add tool hallucination prevention module ( #836 )
...
- Validates tool names against registered tools
- Auto-corrects parameter names within Levenshtein distance 1
- Circuit breaker for consecutive failures (threshold: 3)
- Structured error messages with suggestions
Closes #836
2026-04-16 02:10:39 +00:00
db72e908f7
Merge pull request 'feat(security): implement Vitalik's secure LLM patterns — privacy filter + confirmation daemon [resolves merge conflict]' ( #830 ) from feat/vitalik-secure-llm-1776303263 into main
...
Vitalik's secure LLM patterns — privacy filter + confirmation daemon
Clean rebase of #397 onto current main. Resolves merge conflicts in tools/approval.py.
2026-04-16 01:36:58 +00:00
b82b760d5d
feat: add Vitalik's threat model patterns to DANGEROUS_PATTERNS
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 41s
Contributor Attribution Check / check-attribution (pull_request) Successful in 51s
Tests / e2e (pull_request) Successful in 5m21s
Tests / test (pull_request) Failing after 45m7s
2026-04-16 01:35:49 +00:00
d8d7846897
feat: add tests/tools/test_confirmation_daemon.py from PR #397
2026-04-16 01:35:24 +00:00
6840d05554
feat: add tests/agent/test_privacy_filter.py from PR #397
2026-04-16 01:35:21 +00:00