hermes-agent

Author	SHA1	Message	Date
teknium1	c1775de56f	feat: filesystem checkpoints and /rollback command Automatic filesystem snapshots before destructive file operations, with user-facing rollback. Inspired by PR #559 (by @alireza78a). Architecture: - Shadow git repos at ~/.hermes/checkpoints/{hash}/ via GIT_DIR - CheckpointManager: take/list/restore, turn-scoped dedup, pruning - Transparent — the LLM never sees it, no tool schema, no tokens - Once per turn — only first write_file/patch triggers a snapshot Integration: - Config: checkpoints.enabled + checkpoints.max_snapshots - CLI flag: hermes --checkpoints - Trigger: run_agent.py _execute_tool_calls() before write_file/patch - /rollback slash command in CLI + gateway (list, restore by number) - Pre-rollback snapshot auto-created on restore (undo the undo) Safety: - Never blocks file operations — all errors silently logged - Skips root dir, home dir, dirs >50K files - Disables gracefully when git not installed - Shadow repo completely isolated from project git Tests: 35 new tests, all passing (2798 total suite) Docs: feature page, config reference, CLI commands reference	2026-03-10 00:49:15 -07:00
teknium1	5212644861	fix(security): prevent shell injection in tilde-username path expansion Validate that the username portion of ~username paths contains only valid characters (alphanumeric, dot, hyphen, underscore) before passing to shell echo for expansion. Previously, paths like '~; rm -rf /' would be passed unquoted to self._exec(f'echo {path}'), allowing arbitrary command execution. The approach validates the username rather than using shlex.quote(), which would prevent tilde expansion from working at all since echo '~user' outputs the literal string instead of expanding it. Added tests for injection blocking and valid ~username/path expansion. Credit to @alireza78a for reporting (PR #442, issue #442).	2026-03-09 17:33:19 -07:00
teknium1	2d44ed1c5b	test: add comprehensive tests for vision_tools (42 tests) Covers PR #428 changes and existing vision_tools functionality: - _validate_image_url: 20 tests for urlparse-based validation - _determine_mime_type: 6 tests for MIME type detection - _image_to_base64_data_url: 3 tests for base64 conversion - _handle_vision_analyze: 5 tests for type hints, prompt building, AUXILIARY_VISION_MODEL env var override - Error logging exc_info: 3 async tests verifying stack traces are logged on download failure, analysis error, and cleanup error - check_vision_requirements & get_debug_session_info: 2 basic tests - Registry integration: 3 tests for tool registration	2026-03-09 15:32:02 -07:00
Teknium	654e16187e	feat(mcp): add sampling support — server-initiated LLM requests (#753 ) Add MCP sampling/createMessage capability via SamplingHandler class. Text-only sampling + tool use in sampling with governance (rate limits, model whitelist, token caps, tool loop limits). Per-server audit metrics. Based on concept from PR #366 by eren-karakus0. Restructured as class-based design with bug fixes and tests using real MCP SDK types. 50 new tests, 2600 total passing.	2026-03-09 03:37:38 -07:00
teknium1	7af33accf1	fix: apply secret redaction to file tool outputs Terminal output was already redacted via redact_sensitive_text() but read_file and search_files returned raw content. Now both tools redact secrets before returning results to the LLM. Based on PR #372 by @teyrebaz33 (closes #363) — applied manually due to branch conflicts with the current codebase.	2026-03-09 00:49:46 -07:00
teknium1	a8bf414f4a	feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill New browser capabilities and a built-in skill for agent-driven web QA. ## New tool: browser_console Returns console messages (log/warn/error/info) AND uncaught JavaScript exceptions in a single call. Uses agent-browser's 'console' and 'errors' commands through the existing session plumbing. Supports --clear to reset buffers. Verified working in both local and Browserbase cloud modes. ## Enhanced tool: browser_vision(annotate=True) New boolean parameter on browser_vision. When true, agent-browser overlays numbered [N] labels on interactive elements — each [N] maps to ref @eN. Annotation data (element name, role, bounding box) returned alongside the vision analysis. Useful for QA reports and spatial reasoning. ## Config: browser.record_sessions Auto-record browser sessions as WebM video files when enabled: - Starts recording on first browser_navigate - Stops and saves on browser_close - Saves to ~/.hermes/browser_recordings/ - Works in both local and cloud modes (verified) - Disabled by default ## Built-in skill: dogfood Systematic exploratory QA testing for web applications. Teaches the agent a 5-phase workflow: 1. Plan — accept URL, create output dirs, set scope 2. Explore — systematic crawl with annotated screenshots 3. Collect Evidence — screenshots, console errors, JS exceptions 4. Categorize — severity (Critical/High/Medium/Low) and category (Functional/Visual/Accessibility/Console/UX/Content) 5. Report — structured markdown with per-issue evidence Includes: - skills/dogfood/SKILL.md — full workflow instructions - skills/dogfood/references/issue-taxonomy.md — severity/category defs - skills/dogfood/templates/dogfood-report-template.md — report template ## Tests 21 new tests covering: - browser_console message/error parsing, clear flag, empty/failed states - browser_console schema registration - browser_vision annotate schema and flag passing - record_sessions config defaults and recording lifecycle - Dogfood skill file existence and content validation Addresses #315.	2026-03-08 21:28:12 -07:00
teknium1	491605cfea	feat: add high-value tool result hints for patch and search_files (#722 ) Add contextual [Hint: ...] suffixes to tool results where they save real iterations: - patch (no match): suggests read_file/search_files to verify content before retrying — addresses the common pattern where the agent retries with stale old_string instead of re-reading the file. - search_files (truncated): provides explicit next offset and suggests narrowing the search — clearer than relying on total_count inference. Other hints proposed in #722 (terminal, web_search, web_extract, browser_snapshot, search zero-results, search content-matches) were evaluated and found to be low-value: either already covered by existing mechanisms (read_file pagination, similar-files, schema descriptions) or guidance the agent already follows from its own reasoning. 5 new tests covering hint presence/absence for both tools.	2026-03-08 17:46:28 -07:00
teknium1	c0520223fd	fix: clipboard BMP conversion file loss and broken test Source code (hermes_cli/clipboard.py): - _convert_to_png() lost the file when both Pillow and ImageMagick were unavailable: path.rename(tmp) moved the file to .bmp, then subprocess.run raised FileNotFoundError, but the file was never renamed back. The final fallback 'return path.exists()' returned False. - Fix: restore the original file in both except handlers by renaming tmp back to path when the original is missing. Test (tests/tools/test_clipboard.py): - test_file_still_usable_when_no_converter expected 'from PIL import Image' to raise an Exception, but Pillow is installed so pytest.raises fired 'DID NOT RAISE'. The test also never called _convert_to_png(). - Fix: properly mock PIL unavailability via patch.dict(sys.modules), actually call _convert_to_png(), and assert the correct result.	2026-03-08 17:22:27 -07:00
teknium1	3fb8938cd3	fix: search_files now reports error for non-existent paths instead of silent empty results Previously, search_files would silently return 0 results when the search path didn't exist (e.g., /root/.hermes/... when HOME is /home/user). The path was passed to rg/grep/find which would fail silently, and the empty stdout was parsed as 'no matches found'. Changes: - Add path existence check at the top of search() using test -e. Returns SearchResult with a clear error message when path doesn't exist. - Add exit code 2 checks in _search_with_rg() and _search_with_grep() as secondary safety net for other error types (bad regex, permissions). - Add 4 new tests covering: nonexistent path (content mode), nonexistent path (files mode), existing path proceeds normally, rg error exit code. Tests: 37 → 41 in test_file_operations.py, full suite 2330 passed.	2026-03-08 16:47:20 -07:00
teknium1	cf810c2950	fix: pre-process CLI clipboard images through vision tool instead of raw embedding Images pasted in the CLI were embedded as raw base64 image_url content parts in the conversation history, which only works with vision-capable models. If the main model (e.g. Nous API) doesn't support vision, this breaks the request and poisons all subsequent messages. Now the CLI uses the same approach as the messaging gateway: images are pre-processed through the auxiliary vision model (Gemini Flash via OpenRouter or Nous Portal) and converted to text descriptions. The local file path is included so the agent can re-examine via vision_analyze if needed. Works with any model. Fixes #638.	2026-03-08 06:22:00 -07:00
Teknium	b8120df860	Revert "feat: skill prerequisites — hide skills with unmet runtime dependencies"	2026-03-08 03:58:13 -07:00
kshitij	f210510276	feat: add prerequisites field to skill spec — hide skills with unmet dependencies Skills can now declare runtime prerequisites (env vars, CLI binaries) via YAML frontmatter. Skills with unmet prerequisites are excluded from the system prompt so the agent never claims capabilities it can't deliver, and skill_view() warns the agent about what's missing. Three layers of defense: - build_skills_system_prompt() filters out unavailable skills - _find_all_skills() flags unmet prerequisites in metadata - skill_view() returns prerequisites_warning with actionable details Tagged 12 bundled skills that have hard runtime dependencies: gif-search (TENOR_API_KEY), notion (NOTION_API_KEY), himalaya, imessage, apple-notes, apple-reminders, openhue, duckduckgo-search, codebase-inspection, blogwatcher, songsee, mcporter. Closes #658 Fixes #630	2026-03-08 13:19:32 +05:30
teknium1	24f6a193e7	fix: remove stale 'model' assertion from delegate_task schema test The 'model' property was removed from DELEGATE_TASK_SCHEMA but the test still asserted its presence, causing CI to fail.	2026-03-07 11:29:55 -08:00
teknium1	f668e9fc75	feat: platform-conditional skill loading + Apple/macOS skills Add a 'platforms' field to SKILL.md frontmatter that restricts skills to specific operating systems. Skills with platforms: [macos] only appear in the system prompt, skills_list(), and slash commands on macOS. Skills without the field load everywhere (backward compatible). Implementation: - skill_matches_platform() in tools/skills_tool.py — core filter - Wired into all 3 discovery paths: prompt_builder.py, skills_tool.py, skill_commands.py - 28 new tests across 3 test files New bundled Apple/macOS skills (all platforms: [macos]): - imessage — Send/receive iMessages via imsg CLI - apple-reminders — Manage Reminders via remindctl CLI - apple-notes — Manage Notes via memo CLI - findmy — Track devices/AirTags via AppleScript + screen capture Docs updated: CONTRIBUTING.md, AGENTS.md, creating-skills.md, skills.md (user guide)	2026-03-07 00:47:54 -08:00
0xbyt4	211b55815e	fix: prevent data loss in skills sync on copy/update failure Two bugs in sync_skills(): 1. Failed copytree poisons manifest: when shutil.copytree fails (disk full, permission error), the skill is still recorded in the manifest. On the next sync, the skill appears as "in manifest but not on disk" which is interpreted as "user deliberately deleted it" — the skill is never retried. Fix: only write to manifest on successful copy. 2. Failed update destroys user copy: rmtree deletes the existing skill directory before copytree runs. If copytree then fails, the user's skill is gone with no way to recover. Fix: move to .bak before copying, restore from backup if copytree fails. Both bugs are proven by new regression tests that fail on the old code and pass on the fix.	2026-03-07 03:58:32 +03:00
teknium1	4f56e31dc7	fix: track origin hashes in skills manifest to preserve user modifications Upgrade skills_sync manifest to v2 format (name:origin_hash). The origin hash records the MD5 of the bundled skill at the time it was last synced. On update, the user's copy is compared against the origin hash: - User copy == origin hash → unmodified → safe to update from bundled - User copy != origin hash → user customized → skip (preserve changes) v1 manifests (plain names) are auto-migrated: the user's current hash becomes the baseline, so future syncs can detect modifications. Output now shows user-modified skills: ~ whisper (user-modified, skipping) 27 tests covering all scenarios including v1→v2 migration, user modification detection, update after migration, and origin hash tracking. 2009 tests pass.	2026-03-06 16:13:58 -08:00
teknium1	ab0f4126cf	fix: restore all removed bundled skills + fix skills sync system - Restored 21 skills removed in commits `757d012` and `740dd92`: accelerate, audiocraft, code-review, faiss, flash-attention, gguf, grpo-rl-training, guidance, llava, nemo-curator, obliteratus, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, stable-diffusion, tensorrt-llm, torchtitan, trl-fine-tuning, whisper - Rewrote sync_skills() with proper update semantics: * New skills (not in manifest): copied to user dir * Existing skills (in manifest + on disk): updated via hash comparison * User-deleted skills (in manifest, not on disk): respected, not re-added * Stale manifest entries (removed from bundled): cleaned from manifest - Added sync_skills() to CLI startup (cmd_chat) and gateway startup (start_gateway) — previously only ran during 'hermes update' - Updated cmd_update output to show new/updated/cleaned counts - Rewrote tests: 20 tests covering manifest CRUD, dir hashing, fresh install, user deletion respect, update detection, stale cleanup, and name collision handling 75 bundled skills total. 2002 tests pass.	2026-03-06 15:57:30 -08:00
teknium1	b89eb29174	fix: correct mock tool name 'search' → 'search_files' in test_code_execution The mock handler checked for function_name == 'search' but the RPC sends 'search_files'. Any test exercising search_files through the mock would get 'Unknown tool' instead of the canned response.	2026-03-06 03:53:43 -08:00
teknium1	3982fcf095	fix: sync execute_code sandbox stubs with real tool schemas The _TOOL_STUBS dict in code_execution_tool.py was out of sync with the actual tool schemas, causing TypeErrors when the LLM used parameters it sees in its system prompt but the sandbox stubs didn't accept: search_files: - Added missing params: context, offset, output_mode - Fixed target default: 'grep' → 'content' (old value was obsolete) patch: - Added missing params: mode, patch (V4A multi-file patch support) Also added 4 drift-detection tests (TestStubSchemaDrift) that will catch future divergence between stubs and real schemas: - test_stubs_cover_all_schema_params: every schema param in stub - test_stubs_pass_all_params_to_rpc: every stub param sent over RPC - test_search_files_target_uses_current_values: no obsolete values - test_generated_module_accepts_all_params: generated code compiles All 28 tests pass.	2026-03-06 03:40:06 -08:00
teknium1	39299e2de4	Merge PR #451 : feat: Add Daytona environment backend Authored by rovle. Adds Daytona as the sixth terminal execution backend with cloud sandboxes, persistent workspaces, and full CLI/gateway integration. Includes 24 unit tests and 8 integration tests.	2026-03-06 03:32:40 -08:00
teknium1	efec4fcaab	feat(execute_code): add json_parse, shell_quote, retry helpers to sandbox The execute_code sandbox generates a hermes_tools.py stub module for LLM scripts. Three common failure modes keep tripping up scripts: 1. json.loads(strict=True) rejects control chars in terminal() output (e.g., GitHub issue bodies with literal tabs/newlines) 2. Shell backtick/quote interpretation when interpolating dynamic content into terminal() commands (markdown with backticks gets eaten by bash) 3. No retry logic for transient network failures (API timeouts, rate limits) Adds three convenience helpers to the generated hermes_tools module: - json_parse(text) — json.loads with strict=False for tolerant parsing - shell_quote(s) — shlex.quote() for safe shell interpolation - retry(fn, max_attempts=3, delay=2) — exponential backoff wrapper Also updates the EXECUTE_CODE_SCHEMA description to document these helpers so LLMs know they're available without importing anything extra. Includes 7 new tests (unit + integration) covering all three helpers.	2026-03-06 01:52:46 -08:00
teknium1	2317d115cd	fix: clipboard image paste on WSL2, Wayland, and VSCode terminal The original implementation only supported xclip (X11), which silently fails on WSL2 (can't access Windows clipboard for images), Wayland desktops (xclip is X11-only), and VSCode terminal on WSL2. Clipboard backend changes (hermes_cli/clipboard.py): - WSL2: detect via /proc/version, use powershell.exe with .NET System.Windows.Forms.Clipboard to extract images as base64 PNG - Wayland: use wl-paste with MIME type detection, auto-convert BMP to PNG for WSLg environments (via Pillow or ImageMagick) - Dispatch order: WSL → Wayland → X11 (xclip), with fallthrough - New has_clipboard_image() for lightweight clipboard checks - Cache WSL detection result per-process CLI changes (cli.py): - /paste command: explicit clipboard image check for terminals where BracketedPaste doesn't fire (image-only clipboard in VSCode/WinTerm) - Ctrl+V keybinding: fallback for Linux terminals where Ctrl+V sends raw byte instead of triggering bracketed paste Tests: 80 tests (up from 37) covering WSL, Wayland, X11 dispatch, BMP conversion, has_clipboard_image, and /paste command.	2026-03-05 20:22:44 -08:00
teknium1	8253b54be9	test: strengthen assertions in skill_manager + memory_tool (batch 3) test_skill_manager_tool.py (20 weak → 0): - Validation error messages verified against exact strings - Name validation: checks specific invalid name echoed in error - Frontmatter validation: exact error text for missing fields, unclosed markers, empty content, invalid YAML - File path validation: traversal, disallowed dirs, root-level test_memory_tool.py (13 weak → 0): - Security scan tests verify both 'Blocked' prefix AND specific threat pattern ID (prompt_injection, exfil_curl, etc.) - Invisible unicode tests verify exact codepoint strings - Snapshot test verifies type, header, content, and isolation	2026-03-05 18:51:43 -08:00
teknium1	5c867fd79f	test: strengthen assertions across 3 more test files (batch 2) test_run_agent.py (2 weak → 0, +13 assertions): - Session ID validated against actual YYYYMMDD_HHMMSS_hex format - API failure verifies error message propagation - Invalid JSON args verifies empty dict fallback + message structure - Context compression verifies final_response + completed flag - Invalid tool name retry verifies api_calls count - Invalid response verifies completed/failed/error structure test_model_tools.py (3 weak → 0): - Unknown tool error includes tool name in message - Exception returns dict with 'error' key + non-empty message - get_all_tool_names verifies both web_search AND terminal present test_approval.py (1 weak → 0, assert ratio 1.1 → 2.2): - Dangerous commands verify description content (delete, shell, drop, etc.) - Safe commands explicitly assert key AND desc are None - Pre/post condition checks for state management	2026-03-05 18:46:30 -08:00
teknium1	e9f05b3524	test: comprehensive tests for model metadata + firecrawl config model_metadata tests (61 tests, was 39): - Token estimation: concrete value assertions, unicode, tool_call messages, vision multimodal content, additive verification - Context length resolution: cache-over-API priority, no-base_url skips cache, missing context_length key in API response - API metadata fetch: canonical_slug aliasing, TTL expiry with time mock, stale cache fallback on API failure, malformed JSON resilience - Probe tiers: above-max returns 2M, zero returns None - Error parsing: Anthropic format ('X > Y maximum'), LM Studio, empty string, unreasonably large numbers — also fixed parser to handle Anthropic format - Cache: corruption resilience (garbage YAML, wrong structure), value updates, special chars in model names Firecrawl config tests (8 tests, was 4): - Singleton caching (core purpose — verified constructor called once) - Constructor failure recovery (retry after exception) - Return value actually asserted (not just constructor args) - Empty string env vars treated as absent - Proper setup/teardown for env var isolation	2026-03-05 18:22:39 -08:00
teknium1	e2a834578d	refactor: extract clipboard methods + comprehensive tests (37 tests) Refactored image paste internals for testability: - Extracted _try_attach_clipboard_image() method (clipboard → state) - Extracted _build_multimodal_content() method (images → OpenAI format) - chat() now delegates to these instead of inline logic Tests organized in 4 levels: Level 1 (19 tests): Clipboard module — every platform path with realistic subprocess simulation (tools writing files, timeouts, empty files, cleanup on failure) Level 2 (8 tests): _build_multimodal_content — base64 encoding, MIME types (png/jpg/webp/unknown), missing files, multiple images, default question for empty text Level 3 (5 tests): _try_attach_clipboard_image — state management, counter increment/rollback, naming convention, mixed success/failure Level 4 (5 tests): Queue routing — tuple unpacking, command detection, images-only payloads, text-only payloads	2026-03-05 18:07:53 -08:00
teknium1	ffc752a79e	test: improve clipboard tests with realistic scenarios and multimodal coverage Rewrote clipboard tests from 11 shallow mocks to 21 realistic tests: - Success paths now simulate tools actually writing files (not pre-created) - osascript: success with PNG, success with TIFF, extraction-fail cases - pngpaste: empty file rejection edge case - Linux: extraction failure cleanup verification - New TestMultimodalConversion class: base64 encoding, MIME types, multiple images, missing file handling, default question fallback	2026-03-05 17:58:06 -08:00
teknium1	399562a7d1	feat: clipboard image paste in CLI (Cmd+V / Ctrl+V) Copy an image to clipboard (screenshot, browser, etc.) and paste into the Hermes CLI. The image is saved to ~/.hermes/images/, shown as a badge above the input ([📎 Image #1]), and sent to the model as a base64-encoded OpenAI vision multimodal content block. Implementation: - hermes_cli/clipboard.py: clean module with platform-specific extraction - macOS: pngpaste (if installed) → osascript fallback (always available) - Linux: xclip (apt install xclip) - cli.py: BracketedPaste key handler checks clipboard on every paste, image bar widget shows attached images, chat() converts to multimodal content format, Ctrl+C clears attachments Inspired by @m0at's fork (https://github.com/m0at/hermes-agent) which implemented image paste support for local vision models. Reimplemented cleanly as a separate module with tests.	2026-03-05 17:55:41 -08:00
teknium1	363633e2ba	fix: allow self-hosted Firecrawl without API key + add self-hosting docs On top of PR #460: self-hosted Firecrawl instances don't require an API key (USE_DB_AUTHENTICATION=false), so don't force users to set a dummy FIRECRAWL_API_KEY when FIRECRAWL_API_URL is set. Also adds a proper self-hosting section to the configuration docs explaining what you get, what you lose, and how to set it up (Docker stack, tradeoffs vs cloud). Added 2 more tests (URL-only without key, neither-set raises).	2026-03-05 16:44:21 -08:00
caentzminger	d7d10b14cd	feat(tools): add support for self-hosted firecrawl Adds optional FIRECRAWL_API_URL environment variable to support self-hosted Firecrawl deployments alongside the cloud service. - Add FIRECRAWL_API_URL to optional env vars in hermes_cli/config.py - Update _get_firecrawl_client() in tools/web_tools.py to accept custom API URL - Add tests for client initialization with/without URL - Document new env var in installation and config guides	2026-03-05 16:16:18 -06:00
rovle	a6499b6107	fix(daytona): use shell timeout wrapper instead of broken SDK exec timeout The Daytona SDK's process.exec(timeout=N) parameter is not enforced — the server-side timeout never fires and the SDK has no client-side fallback, causing commands to hang indefinitely. Fix: wrap commands with timeout N sh -c '...' (coreutils) which reliably kills the process and returns exit code 124. Added shlex.quote for proper shell escaping and a secondary deadline (timeout + 10s) that force-stops the sandbox if the shell timeout somehow fails. Signed-off-by: rovle <lovre.pesut@gmail.com>	2026-03-05 13:12:41 -08:00
rovle	efc7a7b957	fix(daytona): don't guess /root on cwd probe failure, keep constructor default; update tests to reflect this Signed-off-by: rovle <lovre.pesut@gmail.com>	2026-03-05 11:49:35 -08:00
rovle	577da79a47	fix(daytona): make disk cap visible and use SDK enum for sandbox state - Replace logger.warning with warnings.warn for the disk cap so users actually see it (logger was suppressed by CLI's log level config) - Use SandboxState enum instead of string literals in _ensure_sandbox_ready Signed-off-by: rovle <lovre.pesut@gmail.com>	2026-03-05 11:03:39 -08:00
rovle	d5efb82c7c	test(daytona): add unit and integration tests for Daytona backend Unit tests cover cwd resolution, sandbox persistence/resume, cleanup, command execution, resource conversion, interrupt handling, retry exhaustion, and sandbox readiness checks. Integration tests verify basic commands, filesystem ops, session persistence, and task isolation against a live Daytona API. Signed-off-by: rovle <lovre.pesut@gmail.com>	2026-03-05 10:26:22 -08:00
teknium1	ad9c26afb8	Merge PR #293 : fix: eliminate shell noise from terminal output and fix test failures Authored by 0xbyt4. Wraps commands with unique fence markers to isolate real output from shell init/exit noise (oh-my-zsh, macOS session restore, etc.). Falls back to expanded pattern-based cleaning. Also fixes BSD find fallback and test module shadowing.	2026-03-05 08:48:26 -08:00
teknium1	b4b426c69d	test: add coverage for tee, process substitution, and full-path rm patterns Tests for the three new dangerous command patterns added in PR #280: - TestProcessSubstitutionPattern: 7 tests (bash/sh/zsh/ksh + safe commands) - TestTeePattern: 7 tests (sensitive paths + safe destinations) - TestFindExecFullPathRm: 4 tests (/bin/rm, /usr/bin/rm, bare rm, safe find)	2026-03-05 01:58:33 -08:00
teknium1	e1baab90f7	Merge PR #201 : fix skills hub dedup to prefer higher trust levels Authored by 0xbyt4. The dedup logic in GitHubSource.search() and unified_search() used 'r.trust_level == "trusted"' which let trusted results overwrite builtin ones. Now uses ranked comparison: builtin (2) > trusted (1) > community (0).	2026-03-04 19:40:41 -08:00
teknium1	b336980229	Merge PR #193 : add unit tests for 5 security/logic-critical modules (batch 4) Authored by 0xbyt4. 144 new tests covering gateway/pairing.py, tools/skill_manager_tool.py, tools/skills_tool.py, honcho_integration/session.py, and agent/auxiliary_client.py.	2026-03-04 19:35:01 -08:00
teknium1	7128f95621	Merge PR #390 : fix hidden directory filter broken on Windows Authored by Farukest. Fixes #389. Replaces hardcoded forward-slash string checks ('/.git/', '/.hub/') with Path.parts membership test in _find_all_skills() and scan_skill_commands(). On Windows, str(Path) uses backslashes so the old filter never matched, causing quarantined skills to appear as installed.	2026-03-04 19:22:43 -08:00
teknium1	ffc6d767ec	Merge PR #388 : fix --force bypassing dangerous verdict in should_allow_install Authored by Farukest. Fixes #387. Removes 'and not force' from the dangerous verdict check so --force can never install skills with critical security findings (reverse shells, data exfiltration, etc). The docstring already documented this behavior but the code didn't enforce it.	2026-03-04 19:19:57 -08:00
teknium1	44a2d0c01f	Merge PR #386 : fix symlink boundary check prefix confusion in skills_guard Authored by Farukest. Fixes #385. Replaces startswith() with Path.is_relative_to() in _check_structure() symlink escape check — same fix pattern as skill_view() (PR #352). Prevents symlinks escaping to sibling directories with shared name prefixes.	2026-03-04 19:13:21 -08:00
teknium1	093acd72dd	fix: catch exceptions from check_fn in is_toolset_available() get_definitions() already wrapped check_fn() calls in try/except, but is_toolset_available() did not. A failing check (network error, missing import, bad config) would propagate uncaught and crash the CLI banner, agent startup, and tools-info display. Now is_toolset_available() catches all exceptions and returns False, matching the existing pattern in get_definitions(). Added 4 tests covering exception handling in is_toolset_available(), check_toolset_requirements(), get_definitions(), and check_tool_availability(). Closes #402	2026-03-04 14:22:30 -08:00
Farukest	f93b48226c	fix: use Path.parts for hidden directory filter in skill listing The hidden directory filter used hardcoded forward-slash strings like '/.git/' and '/.hub/' to exclude internal directories. On Windows, Path returns backslash-separated strings, so the filter never matched. This caused quarantined skills in .hub/quarantine/ to appear as installed skills and available slash commands on Windows. Replaced string-based checks with Path.parts membership test which works on both Windows and Unix.	2026-03-04 18:34:16 +03:00
Farukest	4805be0119	fix: prevent --force from overriding dangerous verdict in should_allow_install The docstring states --force should never override dangerous verdicts, but the condition `if result.verdict == "dangerous" and not force` allowed force=True to skip the early return. Execution then fell through to `if force: return True`, bypassing the policy block. Removed `and not force` so dangerous skills are always blocked regardless of the --force flag.	2026-03-04 18:10:18 +03:00
Farukest	a3ca71fe26	fix: use is_relative_to() for symlink boundary check in skills_guard The symlink escape check in _check_structure() used startswith() without a trailing separator. A symlink resolving to a sibling directory with a shared prefix (e.g. 'axolotl-backdoor') would pass the check for 'axolotl' since the string prefix matched. Replaced with Path.is_relative_to() which correctly handles directory boundaries and is consistent with the skill_view path check.	2026-03-04 17:23:23 +03:00
teknium1	70a0a5ff4a	fix: exclude current session from session_search results session_search was returning the current session if it matched the query, which is redundant — the agent already has the current conversation context. This wasted an LLM summarization call and a result slot. Added current_session_id parameter to session_search(). The agent passes self.session_id and the search filters out any results where either the raw or parent-resolved session ID matches. Both the raw match and the parent-resolved match are checked to handle child sessions from delegation. Two tests added verifying the exclusion works and that other sessions are still returned.	2026-03-04 06:06:40 -08:00
teknium1	79871c2083	refactor: use Path.is_relative_to() for skill_view boundary check Replace the string-based startswith + os.sep approach with Path.is_relative_to() (Python 3.9+, we require 3.10+). This is the idiomatic pathlib way to check path containment — it handles separators, case sensitivity, and the equal-path case natively without string manipulation. Simplified tests to match: removed the now-unnecessary test_separator_is_os_native test since is_relative_to doesn't depend on separator choice.	2026-03-04 05:30:43 -08:00
Farukest	e86f391cac	fix: use os.sep in skill_view path boundary check for Windows compatibility	2026-03-04 06:50:06 +03:00
0xbyt4	aefc330b8f	merge: resolve conflict with main (add mcp + homeassistant extras)	2026-03-03 14:52:22 +03:00
0xbyt4	f967471758	merge: resolve conflict with main (keep fence markers + _find_shell)	2026-03-03 14:50:45 +03:00

1 2

97 Commits