hermes-agent

Author	SHA1	Message	Date
teknium1	08ff1c1aa8	More major refactor/tech debt removal!	2026-02-21 20:22:33 -08:00
teknium1	6134939882	refactor: deduplicate toolsets, unify async bridging, fix approval race condition, harden security - Replace 4 copy-pasted messaging platform toolsets with shared _HERMES_CORE_TOOLS list - Consolidate 5 ad-hoc async-bridging patterns into single _run_async() in model_tools.py - Removes deprecated get_event_loop()/set_event_loop() calls - Makes all tool handlers self-protecting regardless of caller's event loop state - RL handler refactored from if/elif chain to dispatch dict - Fix exec approval race condition: replace module-level globals with thread-safe per-session tools/approval.py (submit_pending, pop_pending, approve_session, is_approved) - Session A approving "rm" no longer approves it for all other sessions - Fix config deep merge: user overriding tts.elevenlabs.voice_id no longer clobbers tts.elevenlabs.model_id; migration detection now recurses to arbitrary depth - Gateway default-deny: unauthenticated users denied unless GATEWAY_ALLOW_ALL_USERS=true - Add 10 dangerous command patterns: rm --recursive, bash -c, python -e, curl|bash, xargs rm, find -delete - Sanitize gateway error messages: users see generic message, full traceback goes to logs	2026-02-21 18:28:49 -08:00
teknium1	7cb6427dea	refactor: streamline cron job handling and update CLI commands - Removed legacy cron daemon functionality, integrating cron job execution directly into the gateway process for improved efficiency. - Updated CLI commands to reflect changes, replacing `hermes cron daemon` with `hermes cron status` and enhancing documentation for cron job management. - Clarified messaging in the README and other documentation regarding the gateway's role in managing cron jobs. - Removed obsolete terminal_hecate tool and related configurations to simplify the codebase.	2026-02-21 16:21:19 -08:00
teknium1	79b62497d1	enable cronjobs in messaging platforms	2026-02-21 12:46:18 -08:00
teknium1	0729ef7353	fix: refine environment creation condition in terminal_tool - Updated the environment creation condition to specifically check for "singularity" instead of allowing "local", ensuring more precise handling of environment types during task execution.	2026-02-21 12:43:56 -08:00
teknium1	8f6788474b	feat: enhance logging in AIAgent for quiet mode - Added functionality to suppress logging noise from specific modules when in quiet mode, improving user experience in CLI. - Updated terminal_tool.py to change the log level for fallback directory usage from warning to debug, providing clearer context without cluttering logs.	2026-02-21 12:41:05 -08:00
teknium1	c98ee98525	feat: implement interactive prompts for sudo password and command approval in CLI - Added methods for handling sudo password and dangerous command approval prompts using a callback mechanism in cli.py. - Integrated these prompts with the prompt_toolkit UI for improved user experience. - Updated terminal_tool.py to support callback registration for interactive prompts, enhancing the CLI's interactivity. - Introduced a background thread for API calls in run_agent.py to allow for interrupt handling during long-running operations. - Enhanced error handling for interrupted API calls, ensuring graceful degradation of user experience.	2026-02-21 12:15:40 -08:00
teknium1	7ee7221af1	refactor: consolidate debug logging across tools with shared DebugSession class - Introduced a new DebugSession class in tools/debug_helpers.py to centralize debug logging functionality, replacing duplicated code across various tool modules. - Updated image_generation_tool.py, mixture_of_agents_tool.py, vision_tools.py, web_tools.py, and others to utilize the new DebugSession for logging tool calls and saving debug logs. - Enhanced maintainability and consistency in debug logging practices across the codebase.	2026-02-21 03:53:24 -08:00
teknium1	748fd3db88	refactor: enhance error handling with structured logging across multiple modules - Updated various modules including cli.py, run_agent.py, gateway, and tools to replace silent exception handling with structured logging. - Improved error messages to provide more context, aiding in debugging and monitoring. - Ensured consistent logging practices throughout the codebase, enhancing traceability and maintainability.	2026-02-21 03:32:11 -08:00
teknium1	a885d2f240	refactor: implement structured logging across multiple modules - Introduced logging functionality in cli.py, run_agent.py, scheduler.py, and various tool modules to replace print statements with structured logging. - Enhanced error handling and informational messages to improve debugging and monitoring capabilities. - Ensured consistent logging practices across the codebase, facilitating better traceability and maintenance.	2026-02-21 03:11:11 -08:00
teknium1	b6247b71b5	refactor: update tool descriptions for clarity and conciseness - Revised descriptions for various tools in model_tools.py, browser_tool.py, code_execution_tool.py, delegate_tool.py, and terminal_tool.py to enhance clarity and reduce verbosity. - Improved consistency in terminology and formatting across tool descriptions, ensuring users have a clearer understanding of tool functionalities and usage.	2026-02-21 02:41:30 -08:00
teknium1	a54a27595b	fix: update browser command connection instructions to prevent session conflicts - Clarified the usage of the --cdp flag when connecting to an existing Browserbase session. - Emphasized the importance of not using --session with --cdp to avoid creating a local browser instance in agent-browser >=0.13. - Updated comments to reflect changes in per-task isolation management with AGENT_BROWSER_SOCKET_DIR.	2026-02-21 00:54:01 -08:00
teknium1	7283b9f6cf	feat: extend browser session management with improved thread safety and timeout configuration - Increased the default session inactivity timeout from 2 to 5 minutes to accommodate LLM reasoning during multi-step tasks. - Enhanced thread safety by implementing locks around session activity tracking and cleanup processes, allowing concurrent access by multiple subagents. - Removed the stale daemon cleanup function, as it is no longer necessary with the updated session management approach. - Updated logging and session cleanup logic to ensure proper handling of active sessions and associated resources.	2026-02-21 00:44:25 -08:00
teknium1	5b3f708fcb	feat: enhance stale daemon cleanup and improve error logging in browser tool - Updated the stale daemon cleanup function to support multiple patterns for identifying orphaned agent-browser processes, improving reliability across different versions. - Added logging for stderr output during browser command execution to aid in diagnostics, particularly for capturing warnings from the agent-browser. - Implemented a warning for empty snapshots returned from the agent-browser, indicating potential issues with stale daemons or CDP connections.	2026-02-21 00:27:35 -08:00
teknium1	c48817f69b	chore: update agent-browser dependency and clean up stale daemon processes - Upgraded the agent-browser dependency from version 0.7.6 to 0.13.0 in package.json. - Added functionality to kill stale agent-browser daemon processes in browser_tool.py to prevent orphaned instances from previous runs.	2026-02-20 23:40:42 -08:00
teknium1	70dd3a16dc	Cleanup time!	2026-02-20 23:23:32 -08:00
teknium1	630bd3d789	feat: improve password prompt handling in terminal tool - Replaced getpass with direct reading from /dev/tty to enhance password input handling without echoing. - Updated threading logic for password input to ensure proper cleanup and error handling. - Improved visual feedback during password prompt, including clearer separation and timeout messaging. - Enhanced user experience by providing immediate feedback on password input status.	2026-02-20 21:26:31 -08:00
teknium1	ba07d9d5e3	feat: enhance task delegation with spinner updates and progress display - Added a spinner to visually indicate task delegation progress in quiet mode, improving user experience during batch processing. - Implemented a method to update spinner text dynamically based on remaining tasks, providing real-time feedback. - Enhanced the `delegate_task` function to include per-task completion messages, ensuring clarity on task status during execution. - Updated the KawaiiSpinner class to allow message updates while running, facilitating better interaction during long-running tasks.	2026-02-20 03:23:23 -08:00
teknium1	90e5211128	feat: implement subagent delegation for task management - Introduced the `delegate_task` tool, allowing the main agent to spawn child AIAgent instances with isolated context for complex tasks. - Supported both single-task and batch processing (up to 3 concurrent tasks) to enhance task management capabilities. - Updated configuration options for delegation, including maximum iterations and default toolsets for subagents. - Enhanced documentation to provide clear guidance on using the delegation feature and its configuration. - Added comprehensive tests to ensure the functionality and reliability of the delegation logic.	2026-02-20 03:15:53 -08:00
teknium1	c0d412a736	refactor: update search tool parameters and documentation for clarity - Changed the target parameter from "content" and "files" to "grep" and "find" to better represent their functionality. - Revised descriptions in the tool definitions and execution code schema to enhance understanding of search modes and output formats. - Ensured consistency in the handling of search operations across the codebase.	2026-02-20 02:46:30 -08:00
teknium1	f9eb5edb96	refactor: rename search tool for clarity and consistency - Updated the tool name from "search" to "search_files" across multiple files to better reflect its functionality. - Adjusted related documentation and descriptions to ensure clarity in usage and expected behavior. - Enhanced the toolset definitions and mappings to incorporate the new naming convention, improving overall consistency in the codebase.	2026-02-20 02:43:57 -08:00
teknium1	ba8b80a163	refactor: improve memory entry handling and file operations - Replaced file locking with atomic file operations using temporary files to prevent race conditions during read/write. - Added deduplication of memory and user entries to avoid exact duplicates in the memory store. - Enhanced error handling for duplicate entries and improved logic for managing multiple matches in memory operations. - Updated docstrings to clarify the behavior of file reading and writing methods, ensuring better understanding of the implementation.	2026-02-20 02:32:15 -08:00
teknium1	3b90fa5c9b	fix: increase default timeout for code execution sandbox - Updated the default timeout for sandbox script execution from 120 seconds to 300 seconds (5 minutes) to allow longer-running scripts. - Enhanced comments in the code execution tool to clarify the timeout duration. - Suppressed stdout and stderr output from internal tool handlers during execution to prevent clutter in the CLI interface.	2026-02-20 01:29:53 -08:00
teknium1	273b367f05	fix: update documentation and return types for web tools - Revised docstrings for `web_search` and `web_extract` functions to clarify return types and structure. - Updated the execution code schema documentation to reflect changes in the output format for both tools, ensuring consistency and improved understanding for users.	2026-02-19 23:30:01 -08:00
teknium1	783acd712d	feat: implement code execution sandbox for programmatic tool calling - Introduced a new `execute_code` tool that allows the agent to run Python scripts that call Hermes tools via RPC, reducing the number of round trips required for tool interactions. - Added configuration options for timeout and maximum tool calls in the sandbox environment. - Updated the toolset definitions to include the new code execution capabilities, ensuring integration across platforms. - Implemented comprehensive tests for the code execution sandbox, covering various scenarios including tool call limits and error handling. - Enhanced the CLI and documentation to reflect the new functionality, providing users with clear guidance on using the code execution tool.	2026-02-19 23:23:43 -08:00
teknium1	9350e26e68	feat: introduce clarifying questions tool for interactive user engagement - Added a new `clarify_tool` to enable the agent to ask structured multiple-choice or open-ended questions to users. - Implemented callback functionality for user interaction, allowing the platform to handle UI presentation. - Updated the CLI and agent to support clarify questions, including timeout handling and response management. - Enhanced toolset definitions and requirements to include the clarify tool, ensuring availability across platforms.	2026-02-19 20:06:14 -08:00
teknium1	4d5f29c74c	feat: introduce skill management tool for agent-created skills and skills migration to ~/.hermes - Added a new `skill_manager_tool` to enable agents to create, update, and delete their own skills, enhancing procedural memory capabilities. - Updated the skills directory structure to support user-created skills in `~/.hermes/skills/`, allowing for better organization and management. - Enhanced the CLI and documentation to reflect the new skill management functionalities, including detailed instructions on creating and modifying skills. - Implemented a manifest-based syncing mechanism for bundled skills to ensure user modifications are preserved during updates.	2026-02-19 18:25:53 -08:00
teknium1	d070b8698d	fix: escape file glob patterns in ShellFileOperations - Updated the file glob and include filters in the ShellFileOperations class to escape shell arguments, preventing unintended shell expansion. - Added comments to clarify the necessity of quoting for file glob patterns.	2026-02-19 15:12:02 -08:00
teknium1	057d3e1810	feat: enhance search functionality in ShellFileOperations - Updated the `_search_with_rg` and `_search_with_grep` methods to include filename in the output and improve result handling. - Adjusted result fetching to account for context lines, ensuring accurate total counts and pagination. - Enhanced parsing logic for matches and context lines, improving the accuracy of search results. - Refactored result slicing to maintain consistency across output modes, ensuring users receive the correct number of results.	2026-02-19 15:10:17 -08:00
teknium1	d49af633f0	feat: enhance command execution with stdin support - Modified the `_exec` method in `ShellFileOperations` to accept `stdin_data`, allowing large content to be piped directly to commands, bypassing ARG_MAX limitations. - Updated the `execute` method in various environment classes (`_LocalEnvironment`, `_SingularityEnvironment`, `_SSHEnvironment`, `_DockerEnvironment`) to support `stdin_data`, improving command execution flexibility. - Removed the unique marker generation for heredoc in favor of direct stdin piping, simplifying file writing operations and enhancing performance for large files.	2026-02-19 14:50:51 -08:00
teknium1	4f57d7116d	Improved stdout handling in the terminal tool to prevent deadlocks by implementing a background thread to continuously drain output, ensuring smooth command execution without blocking.	2026-02-19 09:26:31 -08:00
teknium1	56ee8a5cc6	refactor: remove 'read' action from memory tool and agent logging - Eliminated the 'read' action from the memory tool and related logging in the agent, streamlining the available actions to 'add', 'replace', and 'remove'. - Updated error messages and documentation to reflect the removal of the 'read' action, ensuring clarity in the API's usage.	2026-02-19 01:03:08 -08:00
teknium1	440c244cac	feat: add persistent memory system + SQLite session store Two-part implementation: Part A - Curated Bounded Memory: - New memory tool (tools/memory_tool.py) with MEMORY.md + USER.md stores - Character-limited (2200/1375 chars), § delimited entries - Frozen snapshot injected into system prompt at session start - Model manages pruning via replace/remove with substring matching - Usage indicator shown in system prompt header Part B - SQLite Session Store: - New hermes_state.py with SessionDB class, FTS5 full-text search - Gateway session.py rewritten to dual-write SQLite + legacy JSONL - Compression-triggered session splitting with parent_session_id chains - New session_search tool with Gemini Flash summarization of matched sessions - CLI session lifecycle (create on launch, close on exit) Also: - System prompt now cached per session, only rebuilt on compression (fixes prefix cache invalidation from date/time changes every turn) - Config version bumped to 3, hermes doctor checks for new artifacts - Disabled in batch_runner and RL environments	2026-02-19 00:57:31 -08:00
teknium1	14e59706b7	Add Skills Hub — universal skill search, install, and management from online registries Implements the Hermes Skills Hub with agentskills.io spec compliance, multi-registry skill discovery, security scanning, and user-driven management via CLI and /skills slash command. Core features: - Security scanner (tools/skills_guard.py): 120 threat patterns across 12 categories, trust-aware install policy (builtin/trusted/community), structural checks, unicode injection detection, LLM audit pass - Hub client (tools/skills_hub.py): GitHub, ClawHub, Claude Code marketplace, and LobeHub source adapters with shared GitHubAuth (PAT + gh CLI + GitHub App), lock file provenance tracking, quarantine flow, and unified search across all sources - CLI interface (hermes_cli/skills_hub.py): search, install, inspect, list, audit, uninstall, publish (GitHub PR), snapshot export/import, and tap management — powers both `hermes skills` and `/skills` Spec conformance (Phase 0): - Upgraded frontmatter parser to yaml.safe_load with fallback - Migrated 39 SKILL.md files: tags/related_skills to metadata.hermes.* - Added assets/ directory support and compatibility/metadata fields - Excluded .hub/ from skill discovery in skills_tool.py Updated 13 config/doc files including README, AGENTS.md, .env.example, setup wizard, doctor, status, pyproject.toml, and docs.	2026-02-18 16:09:05 -08:00
teknium1	e184f5ab3a	Add todo tool for agent task planning and management Single `todo` tool that reads (no params) or writes (provide todos array with merge flag). In-memory TodoStore on AIAgent, no system prompt mutation, behavioral guidance in tool description only. State re-injected after context compression events. Gateway sessions hydrate from conversation history. Added to all platform toolsets. Also wired into RL agent_loop.py with per-run TodoStore and fixed browser_snapshot user_task passthrough from first user message.	2026-02-17 17:02:33 -08:00
teknium1	ec59d71e60	Update PTY write handling in ProcessRegistry to ensure data is encoded as bytes before writing. This change improves compatibility with string inputs and clarifies the expected data type in comments.	2026-02-17 03:14:47 -08:00
teknium1	bdac541d1e	Rename OPENAI_API_KEY to HERMES_OPENAI_API_KEY in configuration and codebase for clarity and to avoid conflicts. Update related documentation and error messages to reflect the new key name, ensuring backward compatibility with existing setups.	2026-02-17 03:11:17 -08:00
teknium1	061fa70907	Add background process management with process tool, wait, PTY, and stdin support New process registry and tool for managing long-running background processes across all terminal backends (local, Docker, Singularity, Modal, SSH). Process Registry (tools/process_registry.py): - ProcessSession tracking with rolling 200KB output buffer - spawn_local() with optional PTY via ptyprocess for interactive CLIs - spawn_via_env() for non-local backends (runs inside sandbox, never on host) - Background reader threads per process (Popen stdout or PTY) - wait() with timeout clamping, interrupt support, and transparent limit reporting - JSON checkpoint to ~/.hermes/processes.json for gateway crash recovery - Module-level singleton shared across agent loop, gateway, and RL Process Tool (model_tools.py): - 7 actions: list, poll, log, wait, kill, write, submit - Paired with terminal in all toolsets (CLI, messaging, RL) - Timeout clamping with transparent notes in response Terminal Tool Updates (tools/terminal_tool.py): - Replaced nohup background mode with registry spawn (returns session_id) - Added workdir parameter for per-command working directory - Added check_interval parameter for gateway auto-check watchers - Added pty parameter for interactive CLI tools (Codex, Claude Code) - Updated TERMINAL_TOOL_DESCRIPTION with full background workflow docs - Cleanup thread now respects active background processes (won't reap sandbox) Gateway Integration (gateway/run.py, session.py, config.py): - Session reset protection: sessions with active processes exempt from reset - Default idle timeout increased from 2 hours to 24 hours - from_dict fallback aligned to match (was 120, now 1440) - session_key env var propagated to process registry for session mapping - Crash recovery on gateway startup via checkpoint probe - check_interval watcher: asyncio task polls process, delivers updates to platform RL Safety (environments/): - tool_context.py cleanup() kills background 
processes on episode end - hermes_base_env.py warns when enabled_toolsets is None (loads all tools) - Process tool safe in RL via wait() blocking the agent loop Also: - Added ptyprocess as optional dependency (in pyproject.toml [pty] extra + [all]) - Fixed pre-existing bug: rl_test_inference missing from TOOL_TO_TOOLSET_MAP - Updated AGENTS.md with process management docs and project structure - Updated README.md terminal section with process management overview	2026-02-17 02:51:31 -08:00
teknium1	c33feb6dc9	Fix host CWD leaking into non-local terminal backends When using Modal, Docker, SSH, or Singularity as the terminal backend from the CLI, the agent resolved cwd: "." to the host machine's local path (e.g. /Users/rewbs/code/hermes-agent) and passed it to the remote sandbox, where it doesn't exist. All commands failed with "No such file or directory". Root cause: cli.py unconditionally resolved "." to os.getcwd() and wrote it to TERMINAL_CWD regardless of backend type. Every tool then used that host-local path as the working directory inside the remote environment. Fixes: - cli.py: only resolve "." to os.getcwd() for the local backend. For all remote backends (ssh, docker, modal, singularity), leave TERMINAL_CWD unset so the tool layer uses per-backend defaults (/root, /, ~, etc.) - terminal_tool.py: added sanity check -- if TERMINAL_CWD contains a host-local prefix (/Users/, /home/, C:\) for a non-local backend, log a warning and fall back to the backend's default - terminal_tool.py: SSH default CWD is now ~ instead of os.getcwd() - file_operations.py: last-resort CWD fallback changed from os.getcwd() to "/" so host paths never leak into remote file operations	2026-02-16 22:30:04 -08:00
teknium1	8117d0adab	Refactor file operations and environment management in file_tools and terminal_tool - Improved the caching mechanism for ShellFileOperations to ensure stale entries are invalidated when environments are cleaned up. - Enhanced thread safety by refining the use of locks during environment creation and cleanup processes. - Streamlined the cleanup of inactive environments to prevent blocking other tool calls, ensuring efficient resource management. - Added error handling and messaging improvements for better user feedback during environment cleanup.	2026-02-16 19:37:40 -08:00
teknium1	01a3a6ab0d	Implement cleanup guard to prevent multiple executions on exit - Introduced a new cleanup function that ensures terminal and browser sessions are cleaned up only once during application exit. - Updated atexit registration to use the new cleanup function, enhancing resource management and preventing potential issues from multiple cleanup calls. - Modified terminal cleanup messaging to only display when environments are cleaned, improving user feedback.	2026-02-16 02:43:45 -08:00
teknium1	69aa35a51c	Add messaging platform enhancements: STT, stickers, Discord UX, Slack, pairing, hooks Major feature additions inspired by OpenClaw/ClawdBot integration analysis: Voice Message Transcription (STT): - Auto-transcribe voice/audio messages via OpenAI Whisper API - Download voice to ~/.hermes/audio_cache/ on Telegram/Discord/WhatsApp - Inject transcript as text so all models can understand voice input - Configurable model (whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe) Telegram Sticker Understanding: - Describe static stickers via vision tool with JSON-backed cache - Cache keyed by file_unique_id avoids redundant API calls - Animated/video stickers get emoji-based fallback description Discord Rich UX: - Native slash commands (/ask, /reset, /status, /stop) via app_commands - Button-based exec approvals (Allow Once / Always Allow / Deny) - ExecApprovalView with user authorization and timeout handling Slack Integration: - Full SlackAdapter using slack-bolt with Socket Mode - DMs, channel messages (mention-gated), /hermes slash command - File attachment handling with bot-token-authenticated downloads DM Pairing System: - Code-based user authorization as alternative to static allowlists - 8-char codes from unambiguous alphabet, 1-hour expiry - Rate limiting, lockout after failed attempts, chmod 0600 on data - CLI: hermes pairing list/approve/revoke/clear-pending Event Hook System: - File-based hook discovery from ~/.hermes/hooks/ - HOOK.yaml + handler.py per hook, sync/async handler support - Events: gateway:startup, session:start/reset, agent:start/step/end - Wildcard matching (command:* catches all command events) Cross-Channel Messaging: - send_message agent tool for delivering to any connected platform - Enables cron job delivery and cross-platform notifications Human-Like Response Pacing: - Configurable delays between message chunks (off/natural/custom) - HERMES_HUMAN_DELAY_MODE env var with min/max ms settings Warm Injection Message Style: - 
Retrofitted image vision messages with friendly kawaii-consistent tone - All new injection messages (STT, stickers, errors) use warm style Also: updated config migration to prompt for optional keys interactively, bumped config version, updated README, AGENTS.md, .env.example, cli-config.yaml.example, install scripts, pyproject.toml, and toolsets.	2026-02-15 21:38:59 -08:00
teknium1	5404a8fcd8	Enhance image handling and analysis capabilities across platforms - Updated the vision tool to accept both HTTP/HTTPS URLs and local file paths for image analysis. - Implemented caching of user-uploaded images in local directories to ensure reliable access for the vision tool, addressing issues with ephemeral URLs. - Enhanced platform adapters (Discord, Telegram, WhatsApp) to download and cache images, allowing for immediate analysis and enriched message context. - Added a new method to auto-analyze images attached by users, enriching the conversation with detailed descriptions. - Improved documentation for image handling processes and updated related functions for clarity and efficiency.	2026-02-15 16:10:50 -08:00
teknium1	ff9ea6c4b1	Enhance TTS tool to support platform-specific audio formats - Added detection of the platform from the environment variable to determine the appropriate audio output format. - Implemented logic to output Opus (.ogg) files for Telegram when using compatible TTS providers, while defaulting to MP3 for others.	2026-02-14 16:13:26 -08:00
teknium1	f5be6177b2	Add Text-to-Speech (TTS) functionality with multiple providers - Add tool previews - Add AGENTS and SOUL.md support - Add Exec Approval	2026-02-12 10:05:08 -08:00
teknium	f23856df8e	Add kill_modal script to manage Modal applications and better handling of file and terminal tools - Introduced a new script, `kill_modal.sh`, to facilitate stopping running Modal apps, including the ability to stop all apps or specific swe-rex sandboxes. - Enhanced user experience with clear usage instructions and feedback during the stopping process. - Improved error handling to ensure smooth execution even if some apps fail to stop.	2026-02-12 05:37:14 +00:00
teknium1	153cd5bb44	Refactor skills tool integration and enhance system prompt - Removed the skills_categories tool from the skills toolset, streamlining the skills functionality to focus on skills_list and skill_view. - Updated the system prompt to dynamically build a compact skills index, allowing the model to quickly reference available skills without additional tool calls. - Cleaned up related code and documentation to reflect the removal of skills_categories, ensuring clarity and consistency across the codebase.	2026-02-10 19:48:38 -08:00
teknium1	cfe2f3fe15	Implement interrupt handling for long-running tool executions in AIAgent - Added functionality to signal and terminate long-running terminal commands when a new user message is received, allowing for immediate agent response. - Introduced a global interrupt event in the terminal tool to facilitate early termination of subprocesses. - Updated the AIAgent class to handle interrupts gracefully, ensuring that remaining tool calls are skipped and appropriate messages are returned to maintain valid message sequences.	2026-02-10 16:34:27 -08:00
teknium	999a28062d	Implement graceful exit cleanup for terminal tool - Added a new `_atexit_cleanup` function to handle cleanup of active environments and stop the cleanup thread upon program exit. - Enhanced logging to inform users about the number of remaining sandboxes being shut down during cleanup.	2026-02-10 22:53:44 +00:00
teknium	35ad3146a8	Add new environments and enhance tool context functionality - Introduced new environments: Terminal Test Environment and SWE Environment, each with default configurations for testing and software engineering tasks. - Added TerminalBench 2.0 evaluation environment with comprehensive setup for agentic LLMs, including task execution and verification. - Enhanced ToolContext with methods for uploading and downloading files, ensuring binary-safe operations. - Updated documentation across environments to reflect new features and usage instructions. - Refactored existing environment configurations for consistency and clarity.	2026-02-10 19:39:05 +00:00
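
Note on commit 6134939882 (config deep merge): the message says that overriding tts.elevenlabs.voice_id no longer clobbers tts.elevenlabs.model_id. The log does not show the code, so the following is only a minimal sketch of how a recursive merge can produce that behavior. The function name `deep_merge` and the sample values are hypothetical, chosen for illustration, and are not taken from the repository.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict with overrides merged into defaults at any depth.

    Hypothetical helper for illustration; not the project's actual code.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: recurse so sibling keys survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or type mismatches: the override wins outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical default and user values, for illustration only.
defaults = {
    "tts": {"elevenlabs": {"voice_id": "default-voice", "model_id": "default-model"}}
}
user = {"tts": {"elevenlabs": {"voice_id": "my-voice"}}}

config = deep_merge(defaults, user)
print(config["tts"]["elevenlabs"])
```

A shallow `dict.update` on the top-level key would replace the whole `tts` subtree with the user's partial one, which is the clobbering the commit describes fixing. Recursing only when both sides are dicts keeps untouched siblings such as `model_id` in place.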

99 Commits