hermes-agent

Author	SHA1	Message	Date
Teknium	f84230527c	docs(skill): add split, merge, search examples to ocr-and-documents skill (#2461 ) * fix: respect DashScope v1 runtime mode for alibaba Remove the hardcoded Alibaba branch from resolve_runtime_provider() that forced api_mode='anthropic_messages' regardless of the base URL. Alibaba now goes through the generic API-key provider path, which auto-detects the protocol from the URL: - /apps/anthropic → anthropic_messages (via endswith check) - /v1 → chat_completions (default) This fixes Alibaba setup with OpenAI-compatible DashScope endpoints (e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because runtime always forced Anthropic mode even when setup saved a /v1 URL. Based on PR #2024 by @kshitijk4poor. * docs(skill): add split, merge, search examples to ocr-and-documents skill Adds pymupdf examples for PDF splitting, merging, and text search to the existing ocr-and-documents skill. No new dependencies — pymupdf already covers all three operations natively. --------- Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-22 04:31:22 -07:00
Test	672e9752a0	docs: align venv path to match installer (venv/ not .venv/) The install script creates venv/ but several docs referenced .venv/, causing agents to fail with 'No such file or directory' when following AGENTS.md instructions. Fixes #2066	2026-03-19 18:16:26 -07:00
Test	7e30e97a59	chore: trim redundant trigger sentence from huggingface-hub description	2026-03-18 04:18:13 -07:00
Test	adf188c439	chore: add search to huggingface-hub skill description	2026-03-18 04:15:03 -07:00
Test	947827bba0	chore: tighten huggingface-hub skill description	2026-03-18 04:11:33 -07:00
Test	56ca84f243	feat: add huggingface-hub bundled skill Adds the Hugging Face CLI (hf) reference as a built-in skill under mlops/. Covers downloading/uploading models and datasets, repo management, SQL queries on datasets, inference endpoints, Spaces, buckets, and more. Based on the official HF skill from huggingface/skills.	2026-03-18 04:07:41 -07:00
Test	764825bbff	feat: expand hermes-agent-setup skill + tell agent about it in STT notes Skill now covers full CLI usage (hermes setup, hermes skills, hermes tools, hermes config, session management, etc.), config file reference, and expanded gateway commands. Agent context notes for STT failure now mention the hermes-agent-setup skill is available to help users configure Hermes features.	2026-03-18 03:05:17 -07:00
Test	9c0f346258	fix: direct user message on STT failure + hermes-agent-setup skill When a user sends a voice message and STT isn't configured, the gateway now sends a clear message directly to the user explaining how to set up voice transcription, rather than relying on the agent to relay an injected context note (which often gets misinterpreted). Also adds a hermes-agent-setup bundled skill covering STT/TTS setup, tool configuration, dependency installation, and troubleshooting.	2026-03-18 03:01:41 -07:00
Teknium	d132a3dfbb	feat(skills): add inference.sh skill (terminal-based, no custom tools) (#1686 ) Add inference.sh as a built-in skill that uses the terminal tool to run infsh CLI commands. No custom tools or tool registration — the skill teaches the agent how to use the infsh binary via terminal. Covers 150+ AI apps: image gen (FLUX, Reve, Seedream), video (Veo, Wan, Seedance), LLMs, search (Tavily, Exa), 3D, avatars, and more. Includes reference docs for authentication, app discovery, running apps, and CLI command reference. Based on PR #1021 by @okaris, reworked as a skill-only integration. Co-authored-by: okaris <okaris@users.noreply.github.com>	2026-03-17 03:06:53 -07:00
Teknium	c3d626eb07	Revert "feat: add inference.sh integration (infsh tool + skill) (#1682 )" (#1684 ) This reverts commit `6020db0243`.	2026-03-17 03:01:30 -07:00
Teknium	6020db0243	feat: add inference.sh integration (infsh tool + skill) (#1682 ) Add inference.sh CLI (infsh) as a tool integration, giving agents access to 150+ AI apps through a single CLI — image gen (FLUX, Reve, Seedream), video (Veo, Wan, Seedance), LLMs, search (Tavily, Exa), 3D, avatar/lipsync, and more. One API key manages all services. Tools: - infsh: run any infsh CLI command (app list, app run, etc.) - infsh_install: install the CLI if not present Registered as an 'inference' toolset (opt-in, not in core tools). Includes comprehensive skill docs with examples for all app categories. Changes from original PR: - NOT added to _HERMES_CORE_TOOLS (available via --toolsets inference) - Added 12 tests covering tool registration, command execution, error handling, timeout, JSON parsing, and install flow Inspired by PR #1021 by @okaris. Co-authored-by: okaris <okaris@users.noreply.github.com>	2026-03-17 02:59:21 -07:00
SHL0MS	63635744bf	Refactor ascii-video skill: creative-first SKILL.md, consolidate reference files	2026-03-16 20:11:12 -04:00
Teknium	dd7921d514	fix(honcho): isolate session routing for multi-user gateway (#1500 ) Salvaged from PR #1470 by adavyas. Core fix: Honcho tool calls in a multi-session gateway could route to the wrong session because honcho_tools.py relied on process-global state. Now threads session context through the call chain: AIAgent._invoke_tool() → handle_function_call() → registry.dispatch() → handler **kw → _resolve_session_context() Changes: - Add _resolve_session_context() to prefer per-call context over globals - Plumb honcho_manager + honcho_session_key through handle_function_call - Add sync_honcho=False to run_conversation() for synthetic flush turns - Pass honcho_session_key through gateway memory flush lifecycle - Harden gateway PID detection when /proc cmdline is unreadable - Make interrupt test scripts import-safe for pytest-xdist - Wrap BibTeX examples in Jekyll raw blocks for docs build - Fix thread-order-dependent assertion in client lifecycle test - Expand Honcho docs: session isolation, lifecycle, routing internals Dropped from original PR: - Indentation change in _create_request_openai_client that would move client creation inside the lock (causes unnecessary contention) Co-authored-by: adavyas <adavyas@users.noreply.github.com>	2026-03-16 00:23:47 -07:00
teknium1	4524cddc72	fix: persist google oauth pkce for headless auth Store the pending OAuth state and code verifier between --auth-url and --auth-code so the manual headless flow can reuse Flow.fetch_token() without disabling PKCE.	2026-03-14 22:11:34 -07:00
Teknium	b14a07315b	fix: save /plan output in workspace (#1381 )	2026-03-14 21:28:51 -07:00
Teknium	ff3473a37c	feat: add /plan command (#1372 ) * feat: add /plan command * refactor: back /plan with bundled skill * docs: document /plan skill	2026-03-14 21:18:17 -07:00
teknium1	3229e434b8	Merge origin/main into hermes/hermes-5d160594	2026-03-14 19:34:05 -07:00
teknium1	a6dc73fa07	docs: finish cron terminology cleanup	2026-03-14 19:20:58 -07:00
Teknium	6d2cfc24e9	Merge pull request #953 from JackTheGit/fix/docs-typos-batch4 Fix several documentation typos across training references	2026-03-14 10:26:15 -07:00
SHL0MS	66f8c2d5e8	ascii-video README: add missing sections (value fields, SDFs, coordinate transforms, temporal coherence, feedback buffer, masking, OKLAB, design patterns)	2026-03-14 11:08:10 -04:00
teknium1	d2869de477	docs: tighten Parallel CLI skill guidance Clarify that Parallel is an optional paid vendor workflow, add headless auth and context-chaining guidance, and align command examples more closely with upstream docs before salvaging PR #985.	2026-03-14 06:18:04 -07:00
kshitij	8d61ebe183	feat: add Parallel CLI research skill	2026-03-14 06:15:16 -07:00
Teknium	9525db913f	feat(skills): add X/Twitter xitter skill via upstream x-cli (#1285 ) * feat(skills): salvage xitter skill from PR #1065 Adapt the X/Twitter skill onto current main without vendoring an external CLI. Use upstream x-cli installation instructions, add a social-media category, and align credential/setup guidance with Hermes conventions. * docs(skills): explain X credential requirements in xitter skill Clarify why the official X flow needs five credentials and call out the setup/cost friction explicitly.	2026-03-14 04:00:27 -07:00
Teknium	25481d4286	feat: restore ACP server implementation from PR #949 (#1254 ) Restore the ACP editor-integration implementation that was present on the original PR branch but did not actually land in main. Includes: - acp_adapter/ server, session manager, event bridge, auth, permissions, and tool helpers - hermes acp subcommand and hermes-acp entry point - hermes-acp curated toolset - ACP registry manifest, setup guide, and ACP test suite - jupyter-live-kernel data science skill from the original branch Also updates the revived ACP code for current main by: - resolving runtime providers through the modern shared provider router - binding ACP sessions to per-session cwd task overrides - tracking duplicate same-name tool calls with FIFO IDs - restoring terminal approval callbacks after prompts - normalizing supporting docs/skill metadata Validated with tests/acp and the full pytest suite (-n0).	2026-03-14 00:09:05 -07:00
Teknium	2bf6b7ad1a	feat(skills): add Linear project management skill (#1230 ) Comprehensive Linear GraphQL API skill with API key auth (no OAuth needed). Includes all common queries (issues, projects, teams, search, filters) and mutations (create, update, assign, comment, status changes). Addresses user pain point: Linear MCP server OAuth flow is unreliable in headless agent sessions. This skill uses personal API keys which work reliably without browser-based auth flows. Requires: LINEAR_API_KEY env var (personal API key from Linear settings)	2026-03-13 21:20:32 -07:00
SHL0MS	6733a9a538	Update README	2026-03-13 19:31:29 -04:00
SHL0MS	cda5910ab0	update ascii-video skill: design patterns, local time, examples - New references/design-patterns.md: layer hierarchy (bg/content/accent), directional parameter arcs, scene concepts and visual metaphors, counter-rotating systems, wave collision, progressive fragmentation, entropy/consumption, staggered crescendo buildup, scene ordering - New references/examples.md: copy-paste-ready scenes at every complexity - Update scenes.md: local time convention (t=0 at scene start) - Update SKILL.md: add design-patterns.md to reference table - Add README.md to hermes-agent copy - Sync all reference docs with canonical source (SHL0MS/ascii-video)	2026-03-13 19:13:12 -04:00
Teknium	9f676d1394	feat(skills): add bundled opencode autonomous-agent skill Cherry-picked from PR #880 by @arceus77-7, rebased onto current main with corrections. Adds opencode skill under skills/autonomous-ai-agents/ with: - One-shot opencode run workflow - Interactive/background TUI session workflow - PR review workflow (including opencode pr command) - Parallel work patterns - TUI keybindings reference - Session/cost management - Smoke verification Tested with OpenCode v1.2.25. Fixed /exit bug (not a valid command), added missing flags (--file, --thinking, --variant), expanded docs. Co-authored-by: arceus77-7 <261276524+arceus77-7@users.noreply.github.com>	2026-03-13 08:39:21 -07:00
kshitijk4poor	ccfbf42844	feat: secure skill env setup on load (core #688 ) When a skill declares required_environment_variables in its YAML frontmatter, missing env vars trigger a secure TUI prompt (identical to the sudo password widget) when the skill is loaded. Secrets flow directly to ~/.hermes/.env, never entering LLM context. Key changes: - New required_environment_variables frontmatter field for skills - Secure TUI widget (masked input, 120s timeout) - Gateway safety: messaging platforms show local setup guidance - Legacy prerequisites.env_vars normalized into new format - Remote backend handling: conservative setup_needed=True - Env var name validation, file permissions hardened to 0o600 - Redact patterns extended for secret-related JSON fields - 12 existing skills updated with prerequisites declarations - ~48 new tests covering skip, timeout, gateway, remote backends - Dynamic panel widget sizing (fixes hardcoded width from original PR) Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main with conflict resolution. Fixes #688 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-13 03:14:04 -07:00
JackTheGit	2eb778119d	Fix checkpoint_id typos and add StorageMeta example in checkpoint storage docs	2026-03-12 09:59:17 +00:00
JackTheGit	a182d12778	Fix several documentation typos across training references	2026-03-11 15:49:00 +00:00
teknium1	82113f1f1e	docs: conditional skill activation — tag duckduckgo-search as web fallback and add documentation - Tag duckduckgo-search skill with fallback_for_toolsets: [web] so it auto-hides when Firecrawl is available and auto-shows when it isn't - Add 'Conditional Activation' section to CONTRIBUTING.md with full spec, semantics, and examples for all 4 frontmatter fields - Add 'Conditional Activation (Fallback Skills)' section to the user- facing skills docs with field reference table and practical example - Update SKILL.md format examples in both docs to show the new fields Follow-up to PR #785 (conditional skill activation feature).	2026-03-11 08:47:01 -07:00
Teknium	c69adfbb17	Merge pull request #825 from JackTheGit/fix/docs-typos-batch2 Fix several documentation typos	2026-03-11 07:13:24 -07:00
SHL0MS	c358af7861	Add ASCII video skill to creative category	2026-03-10 15:54:38 -04:00
teknium1	36ac91c902	Merge PR #598 : feat(skill): expand duckduckgo-search with DDGS Python API coverage Authored by areu01or00. Adds Python DDGS library examples for text, news, images, and video search with structured return field docs.	2026-03-10 04:08:53 -07:00
JackTheGit	1db8609ac9	Fix several documentation typos	2026-03-10 08:10:16 +00:00
teknium1	6ab3ebf195	Add hermes-atropos-environments skill (bundled) Add comprehensive skill for building, testing, and debugging Hermes Agent RL environments for Atropos training. Includes: - SKILL.md: Full guide covering HermesAgentBaseEnv interface, required methods, config class, CLI modes (serve/process/evaluate), reward function patterns, common pitfalls, and minimum implementation checklist - New 'Inference Setup' section: instructs the agent to always ask the user for their inference provider (OpenRouter + model choice, self-hosted VLLM endpoint, or other OpenAI-compatible API) before running tests - references/agentresult-fields.md: AgentResult dataclass field reference - references/atropos-base-env.md: Atropos BaseEnv API reference - references/usage-patterns.md: Step-by-step patterns for process, evaluate, serve, and smoke test modes Will be auto-synced to ~/.hermes/skills/ via skills_sync.	2026-03-09 23:04:17 -07:00
teknium1	0ff7fe3ee2	Merge PR #439 : docs: fix spelling of 'publicly' Authored by JackTheGit. Simple typo fix: publically → publicly in axolotl reference docs.	2026-03-09 20:55:37 -07:00
teknium1	b9d55d5719	feat: add pokemon-player skill with battle-tested gameplay tips Comprehensive skill for playing Pokemon Red/Blue via the pokemon-agent package (NousResearch/pokemon-agent). Includes: - Full startup procedure (uv venv, server, localhost.run dashboard tunnel) - Save/load lifecycle and naming conventions - Gameplay loop with emphasis on frequent vision checks - Hard-learned navigation tips: - Use vision every 2-4 steps (RAM state is blind to obstacles) - Wait 2-3 seconds after door/stair warps for map transitions - Sidestep after exiting buildings to avoid re-entering - Hold B to speed Gen 1's slow text scrolling - Ledges are one-way — use vision to find gaps - Battle strategy, type chart, Gen 1 quirks - Memory conventions with PKM: prefix - Progression milestones through all 8 gyms + Elite Four	2026-03-09 20:29:38 -07:00
teknium1	c6b75baad0	feat: find-nearby skill and Telegram location support Adds a 'find-nearby' skill for discovering nearby places using OpenStreetMap (Overpass + Nominatim). No API keys needed. Works with: - Coordinates (from Telegram location pins) - Addresses, cities, zip codes, landmarks (auto-geocoded) - Multiple place types (restaurant, cafe, bar, pharmacy, etc.) Returns names, distances, cuisine, hours, addresses, and Google Maps links (pin + directions). 184-line stdlib-only script. Also adds Telegram location message handling: - New MessageType.LOCATION in gateway base - Telegram adapter handles LOCATION and VENUE messages - Injects lat/lon coordinates into conversation context - Prompts agent to ask what the user wants nearby Inspired by PR #422 (reimplemented with simpler script and broader skill scope — addresses/cities/zips, not just Telegram coordinates).	2026-03-09 05:31:10 -07:00
teknium1	0dafdcab86	Merge: skill reorganization + sub-category support - Sub-category support in prompt_builder.py (backwards-compatible) - Split mlops (40 skills) into 7 logical sub-categories - Merged 8 singleton categories into logical parents - Fixed 2 misplaced skills (code-review, ml-paper-writing)	2026-03-09 03:40:11 -07:00
Teknium	654e16187e	feat(mcp): add sampling support — server-initiated LLM requests (#753 ) Add MCP sampling/createMessage capability via SamplingHandler class. Text-only sampling + tool use in sampling with governance (rate limits, model whitelist, token caps, tool loop limits). Per-server audit metrics. Based on concept from PR #366 by eren-karakus0. Restructured as class-based design with bug fixes and tests using real MCP SDK types. 50 new tests, 2600 total passing.	2026-03-09 03:37:38 -07:00
teknium1	732c66b0f3	refactor: reorganize skills into sub-categories The skills directory was getting disorganized — mlops alone had 40 skills in a flat list, and 12 categories were singletons with just one skill each. Code change: - prompt_builder.py: Support sub-categories in skill scanner. skills/mlops/training/axolotl/SKILL.md now shows as category 'mlops/training' instead of just 'mlops'. Backwards-compatible with existing flat structure. Split mlops (40 skills) into 7 sub-categories: - mlops/training (12): accelerate, axolotl, flash-attention, grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, torchtitan, trl-fine-tuning, unsloth - mlops/inference (8): gguf, guidance, instructor, llama-cpp, obliteratus, outlines, tensorrt-llm, vllm - mlops/models (6): audiocraft, clip, llava, segment-anything, stable-diffusion, whisper - mlops/vector-databases (4): chroma, faiss, pinecone, qdrant - mlops/evaluation (5): huggingface-tokenizers, lm-evaluation-harness, nemo-curator, saelens, weights-and-biases - mlops/cloud (2): lambda-labs, modal - mlops/research (1): dspy Merged singleton categories: - gifs → media (gif-search joins youtube-content) - music-creation → media (heartmula, songsee) - diagramming → creative (excalidraw joins ascii-art) - ocr-and-documents → productivity - domain → research (domain-intel) - feeds → research (blogwatcher) - market-data → research (polymarket) Fixed misplaced skills: - mlops/code-review → software-development (not ML-specific) - mlops/ml-paper-writing → research (academic writing) Added DESCRIPTION.md files for all new/updated categories.	2026-03-09 03:35:53 -07:00
teknium1	c21d77ca08	Merge: OBLITERATUS skill v2.0 + unified gateway compression OBLITERATUS skill (PR #408 updated): - 9 CLI methods, 28 analysis modules, 116 model presets - Default method: advanced (multi-direction SVD, norm-preserving) - Live-tested: Qwen2.5-3B 75%→0% refusal, Qwen2.5-0.5B 60%→20% - References, templates, and real-world pitfalls included Gateway compression fix (PR #739): - Unified session hygiene with agent compression config - Uses model context length × compression.threshold from config.yaml - Removed hardcoded 100k/200-msg thresholds	2026-03-09 02:59:41 -07:00
teknium1	d6c710706f	docs: add real-world testing findings to OBLITERATUS skill Added pitfalls discovered during live abliteration testing: - Models < 1B have fragmented refusal, respond poorly (0.5B: 60%→20%) - Models 3B+ work much better (3B: 75%→0% with advanced defaults) - aggressive method can backfire on small models (made it worse) - Spectral certification RED is common even when refusal rate is 0% - Fixed torch property: total_mem → total_memory	2026-03-09 02:52:54 -07:00
teknium1	a6d3becd6a	feat: update OBLITERATUS skill to v2.0 — match current repo state Major updates to reflect the current OBLITERATUS codebase: - Change default recommendation from 'informed' (experimental) to 'advanced' (reliable, well-tested multi-direction SVD) - Add new CLI commands: tourney, recommend, strategies, report, aggregate, abliterate (alias) - Add --direction-method flag (diff_means, svd, leace) - Add strategies module (embedding/FFN ablation, head pruning, layer removal) - Add evaluation module with LM Eval Harness integration - Expand analysis modules from 15 to 28 - Add Apple Silicon (MLX) support - Add study presets (quick, jailbreak, knowledge, etc.) - Add --contribute, --verify-sample-size, --preset flags - Add complete CLI command reference table - Fix torch property name: total_mem -> total_memory (caught during live testing) Tested: Successfully abliterated Qwen2.5-0.5B-Instruct using 'advanced' method — refusal rate 0.4%, coherence 1.0, model responds without refusal to test prompts.	2026-03-09 02:39:03 -07:00
teknium1	eb0b01de7b	chore: move agentmail skill to optional-skills, add API key docs AgentMail requires a third-party API key (free tier available, paid plans from $20/mo) — not appropriate for bundled skills that show up in every user's system prompt. Added a Requirements section at the top with clear instructions to add AGENTMAIL_API_KEY to ~/.hermes/.env. Streamlined setup steps to avoid duplicating the key in both .env and config.yaml.	2026-03-08 23:33:05 -07:00
teknium1	5b1528519c	Merge PR #330 : feat: add AgentMail skill for agent-owned email inboxes Authored by teyrebaz33. Closes #329.	2026-03-08 23:32:26 -07:00
teknium1	a8bf414f4a	feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill New browser capabilities and a built-in skill for agent-driven web QA. ## New tool: browser_console Returns console messages (log/warn/error/info) AND uncaught JavaScript exceptions in a single call. Uses agent-browser's 'console' and 'errors' commands through the existing session plumbing. Supports --clear to reset buffers. Verified working in both local and Browserbase cloud modes. ## Enhanced tool: browser_vision(annotate=True) New boolean parameter on browser_vision. When true, agent-browser overlays numbered [N] labels on interactive elements — each [N] maps to ref @eN. Annotation data (element name, role, bounding box) returned alongside the vision analysis. Useful for QA reports and spatial reasoning. ## Config: browser.record_sessions Auto-record browser sessions as WebM video files when enabled: - Starts recording on first browser_navigate - Stops and saves on browser_close - Saves to ~/.hermes/browser_recordings/ - Works in both local and cloud modes (verified) - Disabled by default ## Built-in skill: dogfood Systematic exploratory QA testing for web applications. Teaches the agent a 5-phase workflow: 1. Plan — accept URL, create output dirs, set scope 2. Explore — systematic crawl with annotated screenshots 3. Collect Evidence — screenshots, console errors, JS exceptions 4. Categorize — severity (Critical/High/Medium/Low) and category (Functional/Visual/Accessibility/Console/UX/Content) 5. Report — structured markdown with per-issue evidence Includes: - skills/dogfood/SKILL.md — full workflow instructions - skills/dogfood/references/issue-taxonomy.md — severity/category defs - skills/dogfood/templates/dogfood-report-template.md — report template ## Tests 21 new tests covering: - browser_console message/error parsing, clear flag, empty/failed states - browser_console schema registration - browser_vision annotate schema and flag passing - record_sessions config defaults and recording lifecycle - Dogfood skill file existence and content validation Addresses #315.	2026-03-08 21:28:12 -07:00
teknium1	99f7582175	chore: move Solana skill to optional-skills/ Solana blockchain queries are a niche use case — not needed by every user. Moved from skills/ (bundled) to optional-skills/ (installable via Skills Hub).	2026-03-08 18:52:02 -07:00

1 2

100 Commits