Commit Graph

600 Commits

Author SHA1 Message Date
teknium1
39bfd226b8 Merge PR #225: fix: preserve empty content in ReadResult.to_dict()
Authored by Farukest. Fixes #224.
2026-03-02 03:13:31 -08:00
teknium1
234b67f5fd fix: mock time in retry exhaustion tests to prevent backoff sleep
The TestRetryExhaustion tests from PR #223 didn't mock time.sleep/time.time,
causing the retry backoff loops (275s+ total) to run in real time. Tests would
time out instead of running quickly.

Added _make_fast_time_mock() helper that creates a mock time module where
time.time() advances 500s per call (so sleep_end is always in the past) and
time.sleep() is a no-op. Both tests now complete in <1s.
2026-03-02 02:59:41 -08:00
teknium1
e27e3a4f8a Merge PR #223: fix: correct off-by-one in retry exhaustion checks
Authored by Farukest. Fixes #222.
2026-03-02 02:54:10 -08:00
teknium1
7a11ff95a9 Merge PR #277: fix: handle None message content across codebase
Fixes #276. Replace msg.get('content', '') with msg.get('content') or ''
in 4 vulnerable message-processing paths.
2026-03-02 02:42:35 -08:00
teknium1
33ab5cec82 fix: handle None message content across codebase (fixes #276)
The OpenAI API returns content: null on assistant messages with tool
calls. msg.get('content', '') returns None when the key exists with
value None, causing TypeError on len(), string concatenation, and
.strip() in downstream code paths.

Fixed 4 locations that process conversation messages:
- agent/auxiliary_client.py:84 — None passed to API calls
- cli.py:1288 — crash on content[:200] and len(content)
- run_agent.py:3444 — crash on None.strip()
- honcho_integration/session.py:445 — 'None' rendered in transcript

13 other instances were verified safe (already protected, only process
user/tool messages, or use the safe pattern).

Pattern: msg.get('content', '') → msg.get('content') or ''

Fixes #276
2026-03-02 02:23:53 -08:00
teknium1
1cb2311bad fix(security): block path traversal in skill_view file_path (fixes #220)
skill_view accepted arbitrary file_path values like '../../.env' and
would read files outside the skill directory, exposing API keys and
other sensitive data.

Added two layers of defense:
1. Reject paths with '..' components (fast, catches obvious traversal)
2. resolve() containment check with trailing '/' to prevent prefix
   collisions (catches symlinks and edge cases)

Fix approach from PR #242 (@Bartok9). Vulnerability reported by
@Farukest (#220, PR #221). Tests rewritten to properly mock SKILLS_DIR.

Closes #220
2026-03-02 02:00:09 -08:00
teknium1
25c65bc99e fix(agent): handle None content in context compressor (fixes #211)
The OpenAI API returns content: null on assistant messages that only
contain tool calls. msg.get('content', '') returns None (not '') when
the key exists with value None, causing TypeError on len() and string
concatenation in _generate_summary and compress.

Fix: msg.get('content') or '' — handles both missing keys and None.

Tests from PR #216 (@Farukest). Fix also in PR #215 (@cutepawss).
Both PRs had stale branches and couldn't be merged directly.

Closes #211
2026-03-02 01:35:52 -08:00
teknium1
afb680b50d fix(cli): fix max_turns comment and test for correct priority order
Priority is: CLI arg > config file > env var > default
(not env var > config file as the old comment stated)

The test failed because config.yaml had max_turns at both root level
and inside agent section. The test cleared agent.max_turns but the
root-level value still took precedence over the env var. Fixed the
test to clear both, and corrected the comment to match the intended
priority order.
2026-03-02 01:18:52 -08:00
teknium1
866fd9476b fix(docker): remove --read-only and allow exec on /tmp for package installs
The Docker sandbox previously used --read-only on the root filesystem and
noexec on /tmp. This broke 30+ skills that need to install packages:
- npm install -g (codex, claude-code, mcporter, powerpoint)
- pip install (20+ mlops/media/productivity skills)
- apt install (minecraft-modpack-server, ml-paper-writing)
- Build tools that compile in /tmp (pip wheels, node-gyp)

The container is already fully isolated from the host. Industry standard
(E2B, Docker Sandboxes, OpenAI Codex) does not use --read-only — the
container itself is the security boundary.

Retained security hardening:
- --cap-drop ALL (zero capabilities)
- --security-opt no-new-privileges (no escalation)
- --pids-limit 256 (no fork bombs)
- Size-limited tmpfs for /tmp, /var/tmp, /run
- nosuid on all tmpfs mounts
- noexec on /var/tmp and /run (rarely need exec there)
- Resource limits (CPU, memory, disk)
- Ephemeral containers (destroyed after use)

Fixes #189.
2026-03-02 01:09:34 -08:00
teknium1
e265006fd6 test: add coverage for chat_topic in SessionSource and session context prompt
Tests added:
- Roundtrip serialization of chat_topic via to_dict/from_dict
- chat_topic defaults to None when missing from dict
- Channel Topic line appears in session context prompt when set
- Channel Topic line is omitted when chat_topic is None

Follow-up to PR #248 (feat: Discord channel topic in session context).
2026-03-02 00:53:21 -08:00
teknium1
6bf3aad62e fix(delegate_tool): update max_iterations in documentation and example config to reflect default value of 50 2026-03-02 00:52:01 -08:00
teknium1
3a840a130c Merge PR #248: feat(gateway): include Discord channel topic in session context
Authored by Bartok9. Fixes #163.

Surfaces Discord channel topics in the agent's session context prompt,
allowing the agent to adapt its behavior based on the channel's purpose.
2026-03-02 00:51:20 -08:00
teknium1
14396e3fe7 fix(delegate_tool): update max_iterations default from 25 to 50 for improved task handling 2026-03-02 00:51:10 -08:00
teknium1
1ad930cbd0 fix(delegate_tool): increase DEFAULT_MAX_ITERATIONS from 25 to 50 to enhance processing capabilities 2026-03-02 00:51:01 -08:00
Sertug17
7a0b37712f fix(agent): strip finish_reason from assistant messages to fix Mistral 422 errors (#253)
* fix(agent): skip reasoning param for Mistral API to prevent 422 errors

* fix(agent): strip finish_reason from assistant messages to fix Mistral 422 errors
2026-03-02 00:35:03 -08:00
teknium1
e2b8740fcf fix: load_cli_config() now carries over non-default config keys
load_cli_config() only merged keys present in its hardcoded defaults
dict, silently dropping user-added keys like platform_toolsets (saved
by 'hermes tools'), provider_routing, memory, honcho, etc.

Added a second pass to carry over all file_config keys that aren't in
defaults, so 'hermes tools' changes actually take effect in CLI mode.

The gateway was unaffected (reads YAML directly via yaml.safe_load).
2026-03-02 00:32:28 -08:00
teknium1
45d132d098 fix(agent): remove preview truncation in assistant message output
Updated the AIAgent class to print the full content of assistant messages without truncation, enhancing visibility of the messages during runtime. This change improves the clarity of communication from the agent.
2026-03-02 00:32:06 -08:00
teknium1
719f2eef32 Merge branch 'pr-217'
# Conflicts:
#	gateway/session.py
2026-03-02 00:18:41 -08:00
teknium1
698b35933e fix: /retry, /undo, /compress, and /reset gateway commands (#210)
- /retry, /undo, /compress were setting a non-existent conversation_history
  attribute on SessionEntry (a @dataclass with no such field). The dangling
  attribute was silently created but never read — transcript was reloaded
  from DB on next interaction, making all three commands no-ops.

- /reset accessed self.session_store._sessions (non-existent) instead of
  self.session_store._entries, causing AttributeError caught by a bare
  except, silently skipping the pre-reset memory flush.

Fix:
- Add SessionDB.clear_messages() to delete messages and reset counters
- Add SessionStore.rewrite_transcript() to atomically replace transcript
  in both SQLite and legacy JSONL storage
- Replace all dangling attr assignments with rewrite_transcript() calls
- Fix _sessions → _entries in /reset handler

Closes #210
2026-03-02 00:14:49 -08:00
teknium1
0512ada793 feat(agent): include tools in agent status output
Added the tools attribute to the AIAgent class's status output, ensuring that the current tools used by the agent are included in the status information. This enhancement improves the visibility of the agent's capabilities during runtime.
2026-03-02 00:13:41 -08:00
teknium1
47289ba6f1 feat(agent): include system prompt in agent status output
Added the system prompt to the AIAgent class's status output, ensuring that the current system prompt is included in the agent's status information. This enhancement improves visibility into the agent's configuration during runtime.
2026-03-01 23:50:54 -08:00
teknium1
7b38afc179 fix(auth): handle session expiration and re-authentication in Nous Portal
Enhanced error handling in the _model_flow_nous function to detect session expiration and prompt for re-authentication with the Nous Portal. Added logic to manage re-login attempts and provide user feedback on success or failure, improving the overall user experience during authentication issues.
2026-03-01 20:20:30 -08:00
teknium1
e5893075f9 feat(agent): add summary handling for reasoning items
Enhanced the AIAgent class to capture and normalize summary information for reasoning items. Implemented logic to handle summaries as lists, ensuring proper formatting for API interactions. Updated tests to validate the inclusion of summaries in reasoning items, both for existing and default cases.
2026-03-01 20:03:03 -08:00
teknium1
5e598a588f refactor(auth): transition Codex OAuth tokens to Hermes auth store
Updated the authentication mechanism to store Codex OAuth tokens in the Hermes auth store located at ~/.hermes/auth.json instead of the previous ~/.codex/auth.json. This change includes refactoring related functions for reading and saving tokens, ensuring better management of authentication states and preventing conflicts between different applications. Adjusted tests to reflect the new storage structure and improved error handling for missing or malformed tokens.
2026-03-01 19:59:24 -08:00
teknium1
8bc2de4ab6 feat(provider-routing): add OpenRouter provider routing configuration
Introduced a new `provider_routing` section in the CLI configuration to control how requests are routed across providers when using OpenRouter. This includes options for sorting providers by throughput, latency, or price, as well as allowing or ignoring specific providers, setting the order of provider attempts, and managing data collection policies. Updated relevant classes and documentation to support these features, enhancing flexibility in provider selection.
2026-03-01 18:24:27 -08:00
teknium1
75a92a3f82 refactor(cli): improve header formatting and description truncation
Updated the CLI header formatting for tool and configuration displays to center titles within their respective widths. Enhanced the display of command descriptions to include an ellipsis for longer texts, ensuring better readability. This refactor improves the overall user interface of the CLI.
2026-03-01 16:37:16 -08:00
teknium1
72963e9ccb fix(install): prevent interactive prompts during non-interactive installs
Updated the install.sh script to set DEBIAN_FRONTEND and NEEDRESTART_MODE environment variables for non-interactive package installations on Ubuntu and Debian. This change ensures that prompts from needrestart and whiptail do not block the installation process, improving automation for system package installations.
2026-03-01 16:18:35 -08:00
teknium1
92da8e7e62 feat(agent): enhance reasoning handling and configuration
Added support for processing encrypted reasoning content within the AIAgent class. Introduced logic to determine reasoning effort and enable/disable reasoning based on configuration settings. Updated the kwargs to reflect these changes, ensuring proper handling of reasoning parameters during agent execution.
2026-03-01 16:15:20 -08:00
teknium1
c84d5ce738 refactor(terminal_tool): clarify foreground and background process usage
Updated documentation within terminal_tool.py to emphasize the appropriate use of foreground and background processes. Enhanced descriptions for the timeout setting and background execution to guide users towards optimal configurations for scripts, builds, and long-running tasks. Adjusted the default timeout value from 60 to 180 seconds for improved handling of longer operations.
2026-03-01 16:15:05 -08:00
teknium1
dda9f3e734 fix(process_registry): ensure unbuffered output for subprocesses
Updated the environment variables for subprocess execution in the ProcessRegistry class to set PYTHONUNBUFFERED to "1". This change ensures that output from Python scripts is unbuffered, allowing for real-time visibility of progress during background execution. Adjusted both the pty and background process spawning methods to use the new environment configuration.
2026-03-01 16:14:57 -08:00
teknium1
834e25a662 feat(batch_runner): enhance prompt processing with optional container image support
Updated the _process_single_prompt function to accept an optional 'image' field in prompt_data, allowing for per-prompt container image overrides. Implemented checks for Docker image accessibility and added logic to register task environment overrides for Docker, Modal, and Singularity. This improves flexibility in managing containerized environments for prompt execution.
2026-03-01 16:14:36 -08:00
teknium1
11f5c1ecf0 fix(tests): use bare @pytest.mark.asyncio for hook emit tests
Remove loop_scope="function" parameter from async test decorators in
test_hooks.py. This matches the existing convention in the repo
(test_telegram_documents.py) and avoids requiring pytest-asyncio 0.23+.

All 144 new tests from PR #191 now pass.
2026-03-01 05:28:55 -08:00
0xbyt4
3b745633e4 test: add unit tests for 8 untested modules (batch 3) (#191)
* test: add unit tests for 8 untested modules (batch 3)

New test files (143 tests total):
- tools/debug_helpers.py: DebugSession enable/disable, log, save, session info
- tools/skills_guard.py: scan_file, scan_skill, trust levels, install policy, structural checks
- tools/skills_sync.py: manifest read/write, skill discovery, sync logic
- gateway/sticker_cache.py: cache CRUD, sticker injection text builders
- gateway/channel_directory.py: channel resolution, display formatting, session building
- gateway/hooks.py: hook discovery, sync/async emit, wildcard matching
- gateway/mirror.py: session lookup, JSONL append, mirror_to_session
- honcho_integration/client.py: config from env/file, session name resolution, linked workspaces

Also documents a gap in skills_guard: multi-word prompt injection
variants like "ignore all prior instructions" bypass the regex scanner.

* test: strengthen sticker injection tests with exact format assertions

Replace loose "contains" checks with exact output matching for
build_sticker_injection and build_animated_sticker_injection.
Add edge cases: set_name without emoji, empty description, empty emoji.

* test: remove skills_guard gap-documenting test to avoid conflict with fix PR
2026-03-01 05:28:12 -08:00
Bartok Moltbot
54147474d3 feat(gateway): include Discord channel topic in session context
Fixes #163

- Add chat_topic field to SessionSource dataclass
- Update to_dict/from_dict for serialization support
- Add chat_topic parameter to build_source helper
- Extract channel.topic in Discord adapter for messages and slash commands
- Display Channel Topic in system prompt when available
- Normalize empty topics to None
2026-03-01 03:48:24 -05:00
teknium1
4d6f380bd1 docs: update README and CLI documentation for new commands
Enhanced the README and CLI documentation to include the newly added `/compress` and `/usage` commands for managing conversation context and monitoring token usage. Updated log descriptions to clarify the contents of log files and ensured that sensitive information is automatically redacted. This improves user understanding of available features and log management.
2026-03-01 00:28:07 -08:00
teknium1
93f5fd80b8 feat(gateway): add /compress and /usage commands for conversation management
Implemented the /compress command to allow users to manually compress conversation context, ensuring sufficient history is available before execution. The /usage command was also added to display token usage statistics for the current session, including prompt and completion tokens. Updated command documentation to reflect these new features.
2026-03-01 00:25:44 -08:00
teknium1
177be32b7f feat(cli): add /usage command to display session token usage
Introduced a new command "/usage" in the CLI to show cumulative token usage for the current session. This includes details on prompt tokens, completion tokens, total tokens, API calls, and context state. Updated command documentation to reflect this addition. Enhanced the AIAgent class to track token usage throughout the session.
2026-03-01 00:23:19 -08:00
teknium1
30efc263ff feat(cli): add /compress command for manual conversation context compression
Introduced a new command "/compress" to the CLI, allowing users to manually trigger context compression on the current conversation. The method checks for sufficient conversation history and active agent status before performing compression, providing feedback on the number of messages and tokens before and after the operation. Updated command documentation accordingly.
2026-03-01 00:16:38 -08:00
teknium1
41d8a80226 fix(display): fix subagent progress tree-view visual nits
Two fixes to the subagent progress display from PR #186:

1. Task index prefix: show 1-indexed prefix ([1], [2], ...) for ALL
   tasks in batch mode (task_count > 1). Single tasks get no prefix.
   Previously task 0 had no prefix while others did, making batch
   output confusing.

2. Completion indicator: use spinner.print_above() instead of raw
   print() for per-task completion lines (✓ [1/2] ...). Raw print
   collided with the active spinner, mushing the completion text
   onto the spinner line. Now prints cleanly above.

Added task_count parameter to _build_child_progress_callback and
_run_single_child. Updated tests accordingly.
2026-02-28 23:29:49 -08:00
teknium1
4ec386cc72 fix(display): use spaces instead of ANSI \033[K in print_above() for prompt_toolkit compat
print_above() used \033[K (erase-to-end-of-line) to clear the spinner
line before printing text above it. This causes garbled escape codes when
prompt_toolkit's patch_stdout is active in CLI mode.

Switched to the same spaces-based clearing approach used by stop() —
overwrite with blanks, then carriage return back to start of line.

Updated test assertion to match the new clearing method.
2026-02-28 23:19:23 -08:00
lila
dd69f16c3e feat(gateway): expose subagent tool calls and thinking to user (fixes #169) (#186)
When subagents run via delegate_task, the user now sees real-time
progress instead of silence:

CLI: tree-view activity lines print above the delegation spinner
  🔀 Delegating: research quantum computing
     ├─ 💭 "I'll search for papers first..."
     ├─ 🔍 web_search  "quantum computing"
     ├─ 📖 read_file  "paper.pdf"
     └─ ⠹ working... (18.2s)

Gateway (Telegram/Discord): batched progress summaries sent every
5 tool calls to avoid message spam. Remaining tools flushed on
subagent completion.

Changes:
- agent/display.py: add KawaiiSpinner.print_above() to print
  status lines above an active spinner without disrupting animation.
  Uses captured stdout (self._out) so it works inside the child's
  redirect_stdout(devnull).

- tools/delegate_tool.py: add _build_child_progress_callback()
  that creates a per-child callback relaying tool calls and
  thinking events to the parent's spinner (CLI) or progress
  queue (gateway). Each child gets its own callback instance,
  so parallel subagents don't share state. Includes _flush()
  for gateway batch completion.

- run_agent.py: fire tool_progress_callback with '_thinking'
  event when the model produces text content. Guarded by
  _delegate_depth > 0 so only subagents fire this (prevents
  gateway spam from main agent). REASONING_SCRATCHPAD/think/
  reasoning XML tags are stripped before display.

Tests: 21 new tests covering print_above, callback builder,
thinking relay, SCRATCHPAD filtering, batching, flush, thread
isolation, delegate_depth guard, and prefix handling.
2026-02-28 23:18:00 -08:00
teknium1
1db5598294 feat(tests): add live integration tests for file operations and shell noise filtering
- Introduce a new test suite in `test_file_tools_live.py` to validate file operations and ensure accurate command execution in a real environment.
- Implement assertions to check for shell noise contamination in outputs, enhancing the reliability of command results.
- Create fixtures for setting up a local environment and populating directories with known file contents for comprehensive testing.
- Refactor shell noise handling in `process_registry.py` and `local.py` to support multiple noise patterns, improving output cleanliness.
2026-02-28 22:57:58 -08:00
teknium1
23d0b7af6a feat(logging): implement persistent error logging for tool failures
- Introduce a separate error log for capturing warnings and errors related to tool execution, ensuring detailed inspection of issues post-failure.
- Enhance error handling in the AIAgent class to log exceptions with stack traces for better debugging.
- Add a similar error logging mechanism in the gateway to streamline debugging processes.
2026-02-28 22:49:58 -08:00
teknium1
a7c2b9e280 fix(display): enhance memory error detection for tool failures
- Implement logic to distinguish between "full" memory errors and actual failures in the `_detect_tool_failure` function.
- Add JSON parsing to identify specific error messages related to memory limits, improving error handling for memory-related tools.
2026-02-28 22:49:52 -08:00
teknium1
70dfec9638 test(redact): add sensitive text redaction
- Introduce a new test suite for the `redact_sensitive_text` function, covering various sensitive data formats including API keys, tokens, and environment variables.
- Ensure that sensitive information is properly masked in logs and outputs while non-sensitive data remains unchanged.
- Add tests for different scenarios including JSON fields, authorization headers, and environment variable assignments.
- Implement a redacting formatter for logging to enhance security during log output.
2026-02-28 21:56:27 -08:00
teknium1
95b0610f36 refactor(cli, auth): Add Codex/OpenAI OAuth Support - finalized
- Replace `hermes login` with `hermes model` for selecting providers and managing authentication.
- Update documentation and CLI commands to reflect the new provider selection process.
- Introduce a new redaction system for logging sensitive information.
- Enhance Codex model discovery by integrating API fetching and local cache.
- Adjust max turns configuration logic for better clarity and precedence.
- Improve error handling and user feedback during authentication processes.
2026-02-28 21:56:27 -08:00
teknium1
500f0eab4a refactor(cli): Finalize OpenAI Codex Integration with OAuth
- Enhanced Codex model discovery by fetching available models from the API, with fallback to local cache and defaults.
- Updated the context compressor's summary target tokens to 2500 for improved performance.
- Added external credential detection for Codex CLI to streamline authentication.
- Refactored various components to ensure consistent handling of authentication and model selection across the application.
2026-02-28 21:47:51 -08:00
Teknium
86b1db0598 Merge pull request #43 from grp06/codex/align-codex-provider-conventions-mainrepo
Enable ChatGPT subscription Codex support end-to-end
2026-02-28 18:13:58 -08:00
Teknium
5a79e423fe Merge branch 'main' into codex/align-codex-provider-conventions-mainrepo 2026-02-28 18:13:38 -08:00
teknium1
7f7643cf63 feat(hooks): introduce event hooks system for lifecycle management
Add a new hooks system allowing users to run custom code at key lifecycle points in the agent's operation. This includes support for events such as `gateway:startup`, `session:start`, `agent:step`, and more. Documentation for creating hooks and available events has been added to `README.md` and a new `hooks.md` file. Additionally, integrate step callbacks in the agent to facilitate hook execution during tool-calling iterations.
2026-02-28 17:09:26 -08:00