Commit Graph

76 Commits

Author SHA1 Message Date
teknium
b78076cac7 Enhance trajectory_compressor.py with new input options and sampling functionality
- Updated the main function to accept both single JSONL files and directories for compression.
- Added support for sampling a percentage of trajectories before compression.
- Improved usage documentation with detailed examples for various compression scenarios.
- Enhanced error handling for input validation and dry run mode.
- Streamlined output handling to manage temporary files during processing.
2026-01-29 06:04:13 +00:00
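The input handling and sampling described in this commit could be sketched roughly as follows. This is a minimal illustration, not the code in `trajectory_compressor.py`: the function names `collect_jsonl_files`, `load_trajectories`, and `sample_trajectories`, the `seed` parameter, and the rounding rule are all assumptions.

```python
import json
import random
from pathlib import Path


def collect_jsonl_files(input_path):
    """Return the JSONL files for either a single file or a directory."""
    path = Path(input_path)
    if path.is_file():
        return [path]
    if path.is_dir():
        return sorted(path.glob("*.jsonl"))
    raise FileNotFoundError(f"Input not found: {input_path}")


def load_trajectories(files):
    """Read one JSON object per non-empty line from each file."""
    trajectories = []
    for file in files:
        with open(file, encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    trajectories.append(json.loads(line))
    return trajectories


def sample_trajectories(trajectories, percent, seed=0):
    """Keep roughly `percent` percent of the trajectories, chosen at random."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    if not trajectories:
        return []
    # Hypothetical rounding rule: always keep at least one trajectory.
    count = max(1, round(len(trajectories) * percent / 100))
    return random.Random(seed).sample(trajectories, count)
```

Sampling before compression keeps the expensive step proportional to the sample size, and a fixed seed makes a dry run and the real run select the same trajectories.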
teknium
ba19d530ad Update environment configuration and enhance terminal tool integration
- Updated `.env.example` to include new API keys and configuration options for the mini-swe-agent backend, including support for local, Docker, and Modal environments.
- Added `.gitmodules` to include mini-swe-agent as a submodule for easier integration.
- Refactored `mini_swe_runner.py` to use the updated model format and default to OpenRouter for API calls.
- Enhanced `model_tools.py` to support the new terminal tool definitions and ensure compatibility with the mini-swe-agent backend.
- Updated `README.md` to reflect changes in setup instructions and environment variable configurations.
- Improved `terminal_tool.py` to manage execution environments and lifecycle, ensuring proper cleanup and error handling.
- Introduced `terminal_hecate.py` for executing commands on MorphCloud VMs, providing an alternative backend for terminal operations.
2026-01-23 12:26:53 +00:00
teknium
47555602d7 Add mini-swe-agent runner and trajectory compressor
- Introduced mini_swe_runner.py for executing tasks using mini-swe-agent environments (local, Docker, Modal) and outputting trajectories in Hermes format.
- Implemented trajectory_compressor.py to post-process agent trajectories, compressing them within a target token budget while preserving essential content.
- Added trajectory_compression.yaml configuration file for customizable compression settings.
- Created sample_and_compress.py script to download, sample, and compress trajectories from HuggingFace datasets.
- Enhanced logging and error handling across new modules for improved usability and debugging.
2026-01-23 00:52:46 +00:00
teknium
6eb76c7c1a Enhance batch processing and image generation tools
- Updated batch processing to include robust resume functionality by scanning completed prompts based on content rather than indices, improving recovery from failures.
- Implemented retry logic for image downloads with exponential backoff to handle transient failures effectively.
- Refined image generation tool to utilize the FLUX 2 Pro model, updating descriptions and parameters for clarity and consistency.
- Added new configuration scripts for GLM 4.7 and Imagen tasks, enhancing usability and logging capabilities.
- Removed outdated scripts and test files to streamline the codebase.
2026-01-18 10:11:59 +00:00
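Resuming by content rather than by index, as the first bullet of this commit describes, might look like the sketch below. It assumes each result record stores the original text under a `prompt` key; that field name and the helper names `prompt_key`, `completed_keys`, and `pending_prompts` are hypothetical.

```python
import hashlib
import json


def prompt_key(prompt):
    """Stable fingerprint of a prompt's text, independent of its position."""
    return hashlib.sha256(prompt.strip().encode("utf-8")).hexdigest()


def completed_keys(result_lines):
    """Collect fingerprints of prompts that already have a saved result."""
    keys = set()
    for line in result_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # a crash can leave a partially written last line
        if isinstance(record, dict) and isinstance(record.get("prompt"), str):
            keys.add(prompt_key(record["prompt"]))
    return keys


def pending_prompts(all_prompts, result_lines):
    """Return the prompts that still need to run, in their original order."""
    done = completed_keys(result_lines)
    return [p for p in all_prompts if prompt_key(p) not in done]
```

Matching on content means a reordered or re-batched dataset still resumes correctly, whereas index-based checkpoints silently skip or repeat work when positions shift.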
teknium
b32cc4b09d Refactor batch processing with rich progress tracking and update logging in AIAgent
- Replaced tqdm with rich for enhanced visual progress tracking in batch processing.
- Adjusted logging levels in AIAgent to suppress asyncio debug messages.
- Modified datagen script to reduce number of workers for improved performance.
2026-01-14 14:02:59 +00:00
teknium
6e3dbb8d8b Enhance batch processing with progress tracking and update AIAgent for OpenRouter detection
- Integrated tqdm for progress tracking in batch processing, replacing map with imap_unordered for improved performance.
- Added base_url attribute in AIAgent to facilitate OpenRouter detection.
2026-01-14 13:46:16 +00:00
teknium
b66c093316 add default datagen example script
2026-01-14 13:41:09 +00:00
teknium
13d360030f Enhance tool normalization and API integration across modules
- Introduced normalization functions for tool statistics and error counts to ensure consistent schema across all trajectory entries, facilitating compatibility with HuggingFace datasets.
- Updated batch processing to utilize normalized tool stats and error counts, improving data integrity.
- Refactored vision tools and mixture of agents tool to integrate with OpenRouter API, replacing Nous Research API references and updating model configurations.
- Enabled reasoning capabilities in API calls for enhanced response quality across various tools.
- Improved error handling and API key validation for OpenRouter integration.
2026-01-14 13:40:10 +00:00
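The schema normalization mentioned in the first bullet of this commit can be illustrated with a minimal sketch. The tool names in `ALL_TOOLS` and the `count`, `success`, and `failure` fields are placeholders chosen for the example, not the repository's actual schema.

```python
# Hypothetical tool names; the real list would come from the tool registry.
ALL_TOOLS = ("terminal", "web_search", "vision_analyze")


def normalize_tool_stats(raw_stats, tool_names=ALL_TOOLS):
    """Return stats with one entry per known tool and identical keys in each."""
    raw_stats = raw_stats or {}
    normalized = {}
    for name in tool_names:
        entry = raw_stats.get(name) or {}
        normalized[name] = {
            "count": int(entry.get("count", 0)),
            "success": int(entry.get("success", 0)),
            "failure": int(entry.get("failure", 0)),
        }
    return normalized


def normalize_error_counts(raw_counts, tool_names=ALL_TOOLS):
    """Return an error count for every known tool, defaulting to zero."""
    raw_counts = raw_counts or {}
    return {name: int(raw_counts.get(name, 0)) for name in tool_names}
```

Filling in zeros for unused tools gives every trajectory the same nested keys, which matters for dataset loaders that infer a single schema across all rows.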
teknium
66daebe88f Implement enhanced response handling and tool call validation in run_agent
- Added methods to check for meaningful content after <think> blocks and to retrieve messages up to the last complete assistant turn.
- Introduced retry logic for handling truncated responses and invalid JSON arguments in tool calls, with a maximum retry limit.
- Improved logging for invalid JSON and empty responses, ensuring better error tracking and handling.
- Updated the batch data generation script to adjust dataset file, batch size, and ephemeral system prompt for improved context management.
2026-01-10 13:04:43 +00:00
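A check for meaningful content after `<think>` blocks, as named in the first bullet of this commit, could be sketched as below. The function name `has_content_after_think` and the treatment of an unclosed `<think>` tag as a truncated response are assumptions, not the logic in `run_agent`.

```python
import re

# Matches a complete reasoning block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def has_content_after_think(text):
    """True if non-whitespace text remains once reasoning blocks are removed."""
    if not text:
        return False
    visible = THINK_BLOCK.sub("", text)
    # Assumption: an unclosed <think> means the response was cut off
    # mid-reasoning, so everything from that tag onward is discarded.
    visible = visible.split("<think>", 1)[0]
    return bool(visible.strip())
```

A caller could use a `False` result as the trigger for the retry path, since a response that contains only reasoning has nothing to show the user or pass to a tool.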
teknium
4071ba29da Enhance batch processing and tool validation
- Added support for tracking partial results and tool error counts in batch processing.
- Implemented filtering of corrupted entries during batch file combination based on valid tool names.
- Updated terminal tool to improve command execution and error handling, including retry logic for transient failures.
- Refactored model tools to use a simple terminal tool with no session persistence.
- Improved logging and error messages for invalid API responses and tool calls.
- Introduced chunked processing for large content in web tools to manage size limitations effectively.
2026-01-10 05:56:26 +00:00
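Filtering corrupted entries by valid tool names while combining batch files might work along the lines of this sketch. It assumes each entry records per-tool data under a `tool_stats` key; that field name and the helpers `is_valid_entry` and `combine_batches` are hypothetical.

```python
import json


def is_valid_entry(entry, valid_tools):
    """Keep an entry only if every tool it records is a known tool name."""
    stats = entry.get("tool_stats", {})
    return isinstance(stats, dict) and all(name in valid_tools for name in stats)


def combine_batches(batch_lines, valid_tools):
    """Merge JSONL lines from several batch files, dropping corrupted entries."""
    kept, dropped = [], 0
    for line in batch_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            dropped += 1  # unparseable line
            continue
        if isinstance(entry, dict) and is_valid_entry(entry, valid_tools):
            kept.append(entry)
        else:
            dropped += 1  # references a tool name outside the allowed set
    return kept, dropped
```

Returning the dropped count alongside the kept entries lets the combining step report how much data was filtered out instead of discarding it silently.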
Teknium
21f9e2df40 Merge pull request #14 from NousResearch/speed-upgrades
updates for stability and speed
2026-01-08 01:04:15 -08:00
Teknium
80d326310e Merge branch 'main' into speed-upgrades
2026-01-08 01:03:34 -08:00
Teknium
53fc705b13 Merge pull request #8 from NousResearch/update-snapshot-id
Update snapshot id for ipython
2026-01-08 01:00:24 -08:00
Teknium
d5af53888a Merge pull request #3 from NousResearch/architecture-planning
Architecture planning
2026-01-08 01:00:00 -08:00
Teknium
a7a37249f7 Merge branch 'main' into architecture-planning
2026-01-08 00:59:51 -08:00
teknium
6af6ff2a0a updates for stability and speed
2026-01-08 08:57:51 +00:00
Teknium
30ca282594 Merge pull request #11 from NousResearch/simplify-terminal
Add simple terminal
2025-11-22 02:26:01 -08:00
hjc-puro
ab7293bed6 don't log exit code !=0 as terminal failure
2025-11-17 18:39:16 -05:00
hjc-puro
1614c15bb1 rate limits
2025-11-17 18:35:36 -05:00
hjc-puro
f813959750 add simple terminal
2025-11-17 01:14:31 -05:00
teknium
f957ec2267 update distribution and gitignore
2025-11-16 01:03:23 +00:00
Teknium
92e3074c10 Merge pull request #9 from NousResearch/tc-logging
Add logging for first 100 chars of the tool call args json / tool response
2025-11-15 14:03:24 -08:00
hjc-puro
0c618482c4 add logging of prefix of tool call and tool response
2025-11-07 14:43:44 -05:00
hjc-puro
2d8f6c46f1 log first 20 chars
2025-11-07 14:08:06 -05:00
hjc-puro
0fbc0475f3 update snapshot id for ipython
2025-11-05 02:11:25 -05:00
teknium
c27787f09f fix gitignore again
2025-11-05 06:43:03 +00:00
teknium
d90fcd4e2b update gitignore
2025-11-05 06:43:03 +00:00
Teknium
69fd0ca9aa Merge pull request #7 from NousResearch/test
some cleanups
2025-11-04 19:54:49 -08:00
Teknium
4135cf4682 Merge branch 'main' into test
2025-11-04 19:54:40 -08:00
teknium
c82741c3d8 some cleanups
2025-11-05 03:47:17 +00:00
Teknium
9573b2ac2d Merge pull request #6 from NousResearch/fix-leakage
Fix VM instance sharing across tasks
2025-11-04 02:15:32 -08:00
hjc-puro
fbd3a2fdb8 prevent leakage of morph instances between tasks
2025-11-04 03:32:43 -05:00
hjc-puro
a4db3fdee5 fix leakage
2025-11-03 17:42:23 -05:00
Teknium
ab5c9fc37b Merge pull request #5 from NousResearch/update-snapshot
Update snapshot
2025-11-02 21:30:08 -08:00
hjc-puro
0ca3e0aaa9 update snapshot
2025-11-02 23:13:49 -05:00
teknium
f6f75cbe2b update webtools
2025-11-02 06:03:21 +00:00
Teknium
d4544f08c5 Merge pull request #4 from NousResearch/fix-terminal
Fix terminal interactivity
2025-11-01 22:39:21 -07:00
hjc-puro
a6ec79730c terminal tool
2025-11-02 08:57:04 +08:00
hjc-puro
faecbddd9b fix terminal interactivity
2025-11-02 08:52:05 +08:00
teknium
de9c0edc51 some bugfixes
2025-10-15 18:07:06 +00:00
teknium
8d256779d8 Update vision_tools.py to include image downloading and base64 conversion features.
Also exclude temporary image downloads in .gitignore.
2025-10-08 02:38:04 +00:00
teknium
d36790de91 Add ephemeral system prompt support in batch and agent runners. Update README with usage examples and documentation for the new feature. Ensure prompt is not saved to trajectories.
2025-10-08 02:33:58 +00:00
teknium
a398d320b7 update gitignore
2025-10-07 14:09:37 +00:00
teknium
22b6d5866c Fix some issues around async and tool constraints
2025-10-07 14:08:46 +00:00
teknium
0e2e69a71d Add batch processing capabilities with checkpointing and statistics tracking, along with toolset distribution management. Update README and add test scripts for validation.
2025-10-06 03:17:58 +00:00
teknium
bc5f0e62d9 Add support for enabling all toolsets with 'all' or '*' alias in README and toolset resolution logic
2025-10-03 13:45:29 +00:00
teknium
6fac6fecde Enhance import handling for Hecate in terminal_tool.py to manage local folder shadowing and improve error reporting for import failures.
2025-10-03 09:46:44 +00:00
teknium
c42d9055ed Move test run back to repo root. Weirdness occurred.
2025-10-02 20:05:09 +00:00
teknium
a7ff4d49e9 A bit of restructuring for simplicity and organization
2025-10-01 23:29:25 +00:00
teknium
0411ca1880 Add environment configuration file, restructure tool imports, and enhance README setup instructions
2025-10-01 09:54:17 +00:00