- Introduced `run_browser_tasks.sh` for executing browser-focused data generation tasks with specific guidelines for automation.
- Added `run_eval_glm4.7_newterm.sh` for evaluating terminal tasks using the GLM 4.7 model, including logging and configuration for terminal environments.
- Created `run_eval_terminal.sh` for terminal-only evaluations with Modal sandboxes, ensuring proper logging and environment setup.
- Developed `run_mixed_tasks.sh` for running mixed browser and terminal tasks, integrating capabilities for both environments.
- Implemented `run_terminal_tasks.sh` for terminal-focused data generation, with detailed instructions for task execution and logging.
- All scripts include timestamped logging for better tracking of task execution and outputs.
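
The timestamped logging mentioned in the last item could look roughly like the sketch below. This is an illustration only, not the code from the scripts themselves: the `LOG_DIR` variable, the `run_<timestamp>.log` naming, and the `log` helper are all assumptions made for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical log directory; the actual scripts may write elsewhere.
LOG_DIR="${LOG_DIR:-./logs}"
mkdir -p "$LOG_DIR"

# Sortable timestamp so each run gets its own log file.
TIMESTAMP="$(date +%Y%m%d_%H%M%S)"
LOG_FILE="$LOG_DIR/run_${TIMESTAMP}.log"

# Hypothetical helper: prefix a message with the current time,
# print it to the terminal, and append it to the log file.
log() {
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" | tee -a "$LOG_FILE"
}

log "Starting task run"
log "Finished task run"
```

Keeping the timestamp in the file name means repeated runs do not overwrite each other, and the per-line prefix makes it easier to line up output with task execution.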
2.5 KiB
Executable File