Add browser automation tools and enhance environment configuration

- Introduced new browser automation tools in `browser_tool.py` for navigating, interacting with, and extracting content from web pages using the agent-browser CLI and Browserbase cloud execution.
- Updated `.env.example` to include new configuration options for Browserbase API keys and session settings.
- Enhanced `model_tools.py` and `toolsets.py` to integrate browser tools into the existing tool framework, ensuring consistent access across toolsets.
- Updated `README.md` with setup instructions for browser tools and their usage examples.
- Added new test script `test_modal_terminal.py` to validate Modal terminal backend functionality.
- Improved `run_agent.py` to support browser tool integration and logging enhancements for better tracking of API responses.
2026-01-29 06:10:24 +00:00
parent 54ca0997ee
commit 248acf715e
12 changed files with 2626 additions and 134 deletions
--- a/tools/vision_tools.py
+++ b/tools/vision_tools.py
@@ -155,10 +155,14 @@ async def _download_image(image_url: str, destination: Path, max_retries: int =
    for attempt in range(max_retries):
        try:
            # Download the image with appropriate headers using async httpx
-            async with httpx.AsyncClient(timeout=30.0) as client:
+            # Enable follow_redirects to handle image CDNs that redirect (e.g., Imgur, Picsum)
+            async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
                response = await client.get(
                    image_url,
-                    headers={"User-Agent": "hermes-agent-vision/1.0"},
+                    headers={
+                        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
+                        "Accept": "image/*,*/*;q=0.8",
+                    },
                )
                response.raise_for_status()