diff --git a/TODO.md b/TODO.md
new file mode 100644
index 000000000..bfce758dd
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,305 @@
+# Hermes Agent - Future Improvements
+
+> Ideas for enhancing the agent's capabilities, generated from self-analysis of the codebase.
+
+---
+
+## 1. Memory & Context Management 🧠
+
+**Problem:** Context grows unbounded during long conversations. Trajectory compression exists for training data post-hoc, but live conversations lack intelligent context management.
+
+**Ideas:**
+- [ ] **Incremental summarization** - Compress old tool outputs on-the-fly during conversations
+  - Trigger when context exceeds threshold (e.g., 80% of max tokens)
+  - Preserve recent turns fully, summarize older tool responses
+  - Could reuse logic from `trajectory_compressor.py`
+  
+- [ ] **Semantic memory retrieval** - Vector store for long conversation recall
+  - Embed important facts/findings as conversation progresses
+  - Retrieve relevant memories when needed instead of keeping everything in context
+  - Consider lightweight solutions: ChromaDB, FAISS, or even a simple embedding cache
+  
+- [ ] **Working vs. episodic memory** distinction
+  - Working memory: Current task state, recent tool results (always in context)
+  - Episodic memory: Past findings, tried approaches (retrieved on demand)
+  - Clear eviction policies for each
+
+**Files to modify:** `run_agent.py` (add memory manager), possibly new `tools/memory_tool.py`
+
+---
+
+## 2. Self-Reflection & Course Correction 🔄
+
+**Problem:** Current retry logic handles malformed outputs but not semantic failures. Agent doesn't reason about *why* something failed.
+
+**Ideas:**
+- [ ] **Meta-reasoning after failures** - When a tool returns an error or unexpected result:
+  ```
+  Tool failed → Reflect: "Why did this fail? What assumptions were wrong?"
+  → Adjust approach → Retry with new strategy
+  ```
+  - Could be a lightweight LLM call or structured self-prompt
+  
+- [ ] **Planning/replanning module** - For complex multi-step tasks:
+  - Generate plan before execution
+  - After each step, evaluate: "Am I on track? Should I revise the plan?"
+  - Store plan in working memory, update as needed
+  
+- [ ] **Approach memory** - Remember what didn't work:
+  - "I tried X for this type of problem and it failed because Y"
+  - Prevents repeating failed strategies in the same conversation
+
+**Files to modify:** `run_agent.py` (add reflection hooks in tool loop), new `tools/reflection_tool.py`
+
+---
+
+## 3. Tool Composition & Learning 🔧
+
+**Problem:** Tools are atomic. Complex tasks require repeated manual orchestration of the same tool sequences.
+
+**Ideas:**
+- [ ] **Macro tools / Tool chains** - Define reusable tool sequences:
+  ```yaml
+  research_topic:
+    description: "Deep research on a topic"
+    steps:
+      - web_search: {query: "$topic"}
+      - web_extract: {urls: "$search_results.urls[:3]"}
+      - summarize: {content: "$extracted"}
+  ```
+  - Could be defined in skills or a new `macros/` directory
+  - Agent can invoke macro as single tool call
+  
+- [ ] **Tool failure patterns** - Learn from failures:
+  - Track: tool, input pattern, error type, what worked instead
+  - Before calling a tool, check: "Has this pattern failed before?"
+  - Persistent across sessions (stored in skills or separate DB)
+  
+- [ ] **Parallel tool execution** - When tools are independent, run concurrently:
+  - Detect independence (no data dependencies between calls)
+  - Use `asyncio.gather()` for parallel execution
+  - Already have async support in some tools, just need orchestration
+
+**Files to modify:** `model_tools.py`, `toolsets.py`, new `tool_macros.py`
+
+---
+
+## 4. Dynamic Skills Expansion 📚
+
+**Problem:** Skills system is elegant but static. Skills must be manually created and added.
+
+**Ideas:**
+- [ ] **Skill acquisition from successful tasks** - After completing a complex task:
+  - "This approach worked well. Save as a skill?"
+  - Extract: goal, steps taken, tools used, key decisions
+  - Generate SKILL.md automatically
+  - Store in user's skills directory
+  
+- [ ] **Skill templates** - Common patterns that can be parameterized:
+  ```markdown
+  # Debug {language} Error
+  1. Reproduce the error
+  2. Search for error message: `web_search("{error_message} {language}")`
+  3. Check common causes: {common_causes}
+  4. Apply fix and verify
+  ```
+  
+- [ ] **Skill chaining** - Combine skills for complex workflows:
+  - Skills can reference other skills as dependencies
+  - "To do X, first apply skill Y, then skill Z"
+  - Directed graph of skill dependencies
+
+**Files to modify:** `tools/skills_tool.py`, `skills/` directory structure, new `skill_generator.py`
+
+---
+
+## 5. Task Continuation Hints 🎯
+
+**Problem:** The agent could be more helpful by suggesting logical next steps when a task finishes.
+
+**Ideas:**
+- [ ] **Suggest next steps** - At end of a task, suggest logical continuations:
+  - "Code is written. Want me to also write tests / docs / deploy?"
+  - Based on common workflows for task type
+  - Non-intrusive, just offer options
+
+**Files to modify:** `run_agent.py`, response generation logic
+
+---
+
+## 6. Uncertainty & Honesty Calibration 🎚️
+
+**Problem:** The agent is sometimes confidently wrong. It should be better calibrated about what it knows versus what it doesn't.
+
+**Ideas:**
+- [ ] **Source attribution** - Track where information came from:
+  - "According to the docs I just fetched..." vs "From my training data (may be outdated)..."
+  - Let user assess reliability themselves
+
+- [ ] **Cross-reference high-stakes claims** - Self-check for made-up details:
+  - When stakes are high, verify with tools before presenting as fact
+  - "Let me verify that before you act on it..."
+
+**Files to modify:** `run_agent.py`, response generation logic
+
+---
+
+## 7. Resource Awareness & Efficiency 💰
+
+**Problem:** The agent has no awareness of cost, time, or resource usage, and could be smarter about efficiency.
+
+**Ideas:**
+- [ ] **Tool result caching** - Don't repeat identical operations:
+  - Cache web searches, extractions within a session
+  - Invalidation based on time-sensitivity of query
+  - Hash-based lookup: same input → cached output
+
+- [ ] **Lazy evaluation** - Don't fetch everything upfront:
+  - Get summaries first, full content only if needed
+  - "I found 5 relevant pages. Want me to deep-dive on any?"
+
+**Files to modify:** `model_tools.py`, new `resource_tracker.py`
+
+---
+
+## 8. Collaborative Problem Solving 🤝
+
+**Problem:** Interaction is command/response. Complex problems benefit from dialogue.
+
+**Ideas:**
+- [ ] **Assumption surfacing** - Make implicit assumptions explicit:
+  - "I'm assuming you want Python 3.11+. Correct?"
+  - "This solution assumes you have sudo access..."
+  - Let user correct before going down wrong path
+
+- [ ] **Checkpoint & confirm** - For high-stakes operations:
+  - "About to delete 47 files. Here's the list - proceed?"
+  - "This will modify your database. Want a backup first?"
+  - Configurable threshold for when to ask
+
+**Files to modify:** `run_agent.py`, system prompt configuration
+
+---
+
+## 9. Project-Local Context 💾
+
+**Problem:** Valuable context is lost between sessions.
+
+**Ideas:**
+- [ ] **Project awareness** - Remember project-specific context:
+  - Store `.hermes/context.md` in project directory
+  - "This is a Django project using PostgreSQL"
+  - Coding style preferences, deployment setup, etc.
+  - Load automatically when working in that directory
+
+- [ ] **Handoff notes** - Leave notes for future sessions:
+  - Write to `.hermes/notes.md` in project
+  - "TODO for next session: finish implementing X"
+  - "Known issues: Y doesn't work on Windows"
+
+**Files to modify:** New `project_context.py`, auto-load in `run_agent.py`
+
+---
+
+## 10. Graceful Degradation & Robustness 🛡️
+
+**Problem:** When things go wrong, recovery is limited. The agent should fail gracefully.
+
+**Ideas:**
+- [ ] **Fallback chains** - When primary approach fails, have backups:
+  - `web_extract` fails → try `browser_navigate` → try `web_search` for cached version
+  - Define fallback order per tool type
+  
+- [ ] **Partial progress preservation** - Don't lose work on failure:
+  - Long task fails midway → save what we've got
+  - "I completed 3/5 steps before the error. Here's what I have..."
+  
+- [ ] **Self-healing** - Detect and recover from bad states:
+  - Browser stuck → close and retry
+  - Terminal hung → timeout and reset
+
+**Files to modify:** `model_tools.py`, tool implementations, new `fallback_manager.py`
+
+---
+
+## 11. Tools & Skills Wishlist 🧰
+
+*Things that would need new tool implementations (can't do well with current tools):*
+
+### High-Impact
+
+- [ ] **Audio/Video Transcription** 🎬
+  - Transcribe audio files, podcasts, YouTube videos
+  - Extract key moments from video
+  - Currently blind to multimedia content
+  - *Could potentially use whisper via terminal, but native tool would be cleaner*
+  
+- [ ] **Diagram Rendering** 📊
+  - Render Mermaid/PlantUML to actual images
+  - Can generate the code, but rendering requires external service or tool
+  - "Show me how these components connect" → actual visual diagram
+
+### Medium-Impact
+
+- [ ] **Document Generation** 📄
+  - Create styled PDFs, Word docs, presentations
+  - *Can do basic PDF via terminal tools, but limited*
+
+- [ ] **Diff/Patch Tool** 📝
+  - Surgical code modifications with preview
+  - "Change lines 45-50 to X" without rewriting the whole file
+  - Show diffs before applying
+  - *Can use `diff`/`patch` but a native tool would be safer*
+
+### Skills to Create
+
+- [ ] **Domain-specific skill packs:**
+  - DevOps/Infrastructure (Terraform, K8s, AWS)
+  - Data Science workflows (EDA, model training)
+  - Security/pentesting procedures
+  
+- [ ] **Framework-specific skills:**
+  - React/Vue/Angular patterns
+  - Django/Rails/Express conventions
+  - Database optimization playbooks
+
+- [ ] **Troubleshooting flowcharts:**
+  - "Docker container won't start" → decision tree
+  - "Production is slow" → systematic diagnosis
+
+---
+
+## Priority Order (Suggested)
+
+1. **Memory & Context Management** - Biggest impact on complex tasks
+2. **Self-Reflection** - Improves reliability and reduces wasted tool calls
+3. **Project-Local Context** - Practical win, keeps useful info across sessions
+4. **Tool Composition** - Quality of life, builds on other improvements
+5. **Dynamic Skills** - Force multiplier for repeated tasks
+
+---
+
+## Removed Items (Unrealistic)
+
+The following were removed because they're architecturally impossible:
+
+- ~~Proactive suggestions / Prefetching~~ - Agent only runs on user request, can't interject
+- ~~Session save/restore across conversations~~ - Agent doesn't control session persistence
+- ~~User preference learning across sessions~~ - Same issue
+- ~~Clipboard integration~~ - No access to user's local system clipboard
+- ~~Voice/TTS playback~~ - Can generate audio but can't play it to user
+- ~~Set reminders~~ - No persistent background execution
+
+The following were removed because they're **already possible**:
+
+- ~~HTTP/API Client~~ → Use `curl` or Python `requests` in terminal
+- ~~Structured Data Manipulation~~ → Use `pandas` in terminal
+- ~~Git-Native Operations~~ → Use `git` CLI in terminal
+- ~~Symbolic Math~~ → Use `SymPy` in terminal
+- ~~Code Quality Tools~~ → Run linters (`eslint`, `black`, `mypy`) in terminal
+- ~~Testing Framework~~ → Run `pytest`, `jest`, etc. in terminal
+- ~~Translation~~ → LLM handles this fine, or use translation APIs
+
+---
+
+*Last updated: $(date +%Y-%m-%d)* 🤖
diff --git a/cli-config.yaml.example b/cli-config.yaml.example
new file mode 100644
index 000000000..073a5d93e
--- /dev/null
+++ b/cli-config.yaml.example
@@ -0,0 +1,188 @@
+# Hermes Agent CLI Configuration
+# Copy this file to cli-config.yaml and customize as needed.
+# This file configures the CLI behavior. Note: terminal settings here override the TERMINAL_* variables in .env.
+
+# =============================================================================
+# Model Configuration
+# =============================================================================
+model:
+  # Default model to use (can be overridden with --model flag)
+  default: "anthropic/claude-sonnet-4"
+  
+  # API configuration (falls back to OPENROUTER_API_KEY env var)
+  # api_key: "your-key-here"  # Uncomment to set here instead of .env
+  base_url: "https://openrouter.ai/api/v1"
+
+# =============================================================================
+# Terminal Tool Configuration
+# =============================================================================
+# Choose ONE of the following terminal configurations: uncomment it and comment out the others.
+# The terminal tool executes commands in the specified environment.
+
+# -----------------------------------------------------------------------------
+# OPTION 1: Local execution (default)
+# Commands run directly on your machine in the current directory
+# -----------------------------------------------------------------------------
+terminal:
+  env_type: "local"
+  cwd: "."  # Use "." for current directory, or specify absolute path
+  timeout: 180
+  lifetime_seconds: 300
+
+# -----------------------------------------------------------------------------
+# OPTION 2: SSH remote execution
+# Commands run on a remote server - agent code stays local (sandboxed)
+# Great for: keeping agent isolated from its own code, using powerful remote hardware
+# -----------------------------------------------------------------------------
+# terminal:
+#   env_type: "ssh"
+#   cwd: "/home/myuser/project"
+#   timeout: 180
+#   lifetime_seconds: 300
+#   ssh_host: "my-server.example.com"
+#   ssh_user: "myuser"
+#   ssh_port: 22
+#   ssh_key: "~/.ssh/id_rsa"  # Optional - uses ssh-agent if not specified
+
+# -----------------------------------------------------------------------------
+# OPTION 3: Docker container
+# Commands run in an isolated Docker container
+# Great for: reproducible environments, testing, isolation
+# -----------------------------------------------------------------------------
+# terminal:
+#   env_type: "docker"
+#   cwd: "/workspace"
+#   timeout: 180
+#   lifetime_seconds: 300
+#   docker_image: "python:3.11"
+
+# -----------------------------------------------------------------------------
+# OPTION 4: Singularity/Apptainer container
+# Commands run in a Singularity container (common in HPC environments)
+# Great for: HPC clusters, shared compute environments
+# -----------------------------------------------------------------------------
+# terminal:
+#   env_type: "singularity"
+#   cwd: "/workspace"
+#   timeout: 180
+#   lifetime_seconds: 300
+#   singularity_image: "docker://python:3.11"
+
+# -----------------------------------------------------------------------------
+# OPTION 5: Modal cloud execution
+# Commands run on Modal's cloud infrastructure
+# Great for: GPU access, scalable compute, serverless execution
+# -----------------------------------------------------------------------------
+# terminal:
+#   env_type: "modal"
+#   cwd: "/workspace"
+#   timeout: 180
+#   lifetime_seconds: 300
+#   modal_image: "python:3.11"
+
+# =============================================================================
+# Agent Behavior
+# =============================================================================
+agent:
+  # Maximum conversation turns before stopping
+  max_turns: 20
+  
+  # Enable verbose logging
+  verbose: false
+  
+  # Custom system prompt (personality, instructions, etc.)
+  # Leave empty or remove to use default agent behavior
+  system_prompt: ""
+  
+  # Predefined personalities (use with /personality command)
+  personalities:
+    helpful: "You are a helpful, friendly AI assistant."
+    concise: "You are a concise assistant. Keep responses brief and to the point."
+    technical: "You are a technical expert. Provide detailed, accurate technical information."
+    creative: "You are a creative assistant. Think outside the box and offer innovative solutions."
+    teacher: "You are a patient teacher. Explain concepts clearly with examples."
+    kawaii: "You are a kawaii assistant! Use cute expressions like (◕‿◕), ★, ♪, and ~! Add sparkles and be super enthusiastic about everything! Every response should feel warm and adorable desu~! ヽ(>∀<☆)ノ"
+    catgirl: "You are Neko-chan, an anime catgirl AI assistant, nya~! Add 'nya' and cat-like expressions to your speech. Use kaomoji like (=^･ω･^=) and ฅ^•ﻌ•^ฅ. Be playful and curious like a cat, nya~!"
+    pirate: "Arrr! Ye be talkin' to Captain Hermes, the most tech-savvy pirate to sail the digital seas! Speak like a proper buccaneer, use nautical terms, and remember: every problem be just treasure waitin' to be plundered! Yo ho ho!"
+    shakespeare: "Hark! Thou speakest with an assistant most versed in the bardic arts. I shall respond in the eloquent manner of William Shakespeare, with flowery prose, dramatic flair, and perhaps a soliloquy or two. What light through yonder terminal breaks?"
+    surfer: "Duuude! You're chatting with the chillest AI on the web, bro! Everything's gonna be totally rad. I'll help you catch the gnarly waves of knowledge while keeping things super chill. Cowabunga! 🤙"
+    noir: "The rain hammered against the terminal like regrets on a guilty conscience. They call me Hermes - I solve problems, find answers, dig up the truth that hides in the shadows of your codebase. In this city of silicon and secrets, everyone's got something to hide. What's your story, pal?"
+    uwu: "hewwo! i'm your fwiendwy assistant uwu~ i wiww twy my best to hewp you! *nuzzles your code* OwO what's this? wet me take a wook! i pwomise to be vewy hewpful >w<"
+    philosopher: "Greetings, seeker of wisdom. I am an assistant who contemplates the deeper meaning behind every query. Let us examine not just the 'how' but the 'why' of your questions. Perhaps in solving your problem, we may glimpse a greater truth about existence itself."
+    hype: "YOOO LET'S GOOOO!!! 🔥🔥🔥 I am SO PUMPED to help you today! Every question is AMAZING and we're gonna CRUSH IT together! This is gonna be LEGENDARY! ARE YOU READY?! LET'S DO THIS! 💪😤🚀"
+
+# =============================================================================
+# Toolsets
+# =============================================================================
+# Control which tools the agent has access to.
+# Use "all" to enable everything, or specify individual toolsets.
+
+# Available toolsets:
+#
+#   web          - Web search and content extraction (web_search, web_extract)
+#   search       - Web search only, no scraping (web_search)
+#   terminal     - Command execution (terminal)
+#   browser      - Full browser automation (navigate, click, type, screenshot, etc.)
+#   vision       - Image analysis (vision_analyze)
+#   image_gen    - Image generation with FLUX (image_generate)
+#   skills       - Load skill documents (skills_categories, skills_list, skill_view)
+#   moa          - Mixture of Agents reasoning (mixture_of_agents)
+#
+# Composite toolsets:
+#   debugging    - terminal + web (for troubleshooting)
+#   safe         - web + vision + moa (no terminal access)
+
+# -----------------------------------------------------------------------------
+# OPTION 1: Enable all tools (default)
+# -----------------------------------------------------------------------------
+toolsets:
+  - all
+
+# -----------------------------------------------------------------------------
+# OPTION 2: Minimal - just web search and terminal
+# Great for: Simple coding tasks, quick lookups
+# -----------------------------------------------------------------------------
+# toolsets:
+#   - web
+#   - terminal
+
+# -----------------------------------------------------------------------------
+# OPTION 3: Research mode - no execution capabilities
+# Great for: Safe information gathering, research tasks
+# -----------------------------------------------------------------------------
+# toolsets:
+#   - web
+#   - vision
+#   - skills
+
+# -----------------------------------------------------------------------------
+# OPTION 4: Full automation - browser + terminal
+# Great for: Web scraping, automation tasks, testing
+# -----------------------------------------------------------------------------
+# toolsets:
+#   - terminal
+#   - browser
+#   - web
+
+# -----------------------------------------------------------------------------
+# OPTION 5: Creative mode - vision + image generation
+# Great for: Design work, image analysis, creative tasks
+# -----------------------------------------------------------------------------
+# toolsets:
+#   - vision
+#   - image_gen
+#   - web
+
+# -----------------------------------------------------------------------------
+# OPTION 6: Safe mode - no terminal or browser
+# Great for: Restricted environments, untrusted queries
+# -----------------------------------------------------------------------------
+# toolsets:
+#   - safe
+
+# =============================================================================
+# Display
+# =============================================================================
+display:
+  # Use compact banner mode
+  compact: false
diff --git a/cli.py b/cli.py
new file mode 100755
index 000000000..0fd9a06b6
--- /dev/null
+++ b/cli.py
@@ -0,0 +1,1103 @@
+#!/usr/bin/env python3
+"""
+Hermes Agent CLI - Interactive Terminal Interface
+
+A beautiful command-line interface for the Hermes Agent, inspired by Claude Code.
+Features ASCII art branding, interactive REPL, toolset selection, and rich formatting.
+
+Usage:
+    python cli.py                          # Start interactive mode with all tools
+    python cli.py --toolsets web,terminal  # Start with specific toolsets
+    python cli.py -q "your question"       # Single query mode
+    python cli.py --list-tools             # List available tools and exit
+"""
+
+import os
+import sys
+import json
+import atexit
+from pathlib import Path
+from datetime import datetime
+from typing import List, Dict, Any, Optional
+
+# Suppress startup messages for clean CLI experience
+os.environ["MSWEA_SILENT_STARTUP"] = "1"  # mini-swe-agent
+os.environ["HERMES_QUIET"] = "1"  # Our own modules
+
+import yaml
+
+# prompt_toolkit for fixed input area TUI
+from prompt_toolkit import PromptSession
+from prompt_toolkit.history import FileHistory
+from prompt_toolkit.styles import Style as PTStyle
+from prompt_toolkit.formatted_text import HTML
+from prompt_toolkit.patch_stdout import patch_stdout
+
+# Load environment variables first
+from dotenv import load_dotenv
+env_path = Path(__file__).parent / '.env'
+if env_path.exists():
+    load_dotenv(dotenv_path=env_path)
+
+# =============================================================================
+# Configuration Loading
+# =============================================================================
+
+def load_cli_config() -> Dict[str, Any]:
+    """
+    Load CLI configuration from cli-config.yaml.
+    
+    Terminal settings are exported to TERMINAL_* environment variables, overriding .env.
+    Returns default values if config file doesn't exist.
+    """
+    config_path = Path(__file__).parent / 'cli-config.yaml'
+    
+    # Default configuration
+    defaults = {
+        "model": {
+            "default": "anthropic/claude-opus-4-20250514",
+            "base_url": "https://openrouter.ai/api/v1",
+        },
+        "terminal": {
+            "env_type": "local",
+            "cwd": "/tmp",
+            "timeout": 60,
+            "lifetime_seconds": 300,
+            "docker_image": "python:3.11",
+            "singularity_image": "docker://python:3.11",
+            "modal_image": "python:3.11",
+        },
+        "agent": {
+            "max_turns": 20,
+            "verbose": False,
+            "system_prompt": "",
+            "personalities": {
+                "helpful": "You are a helpful, friendly AI assistant.",
+                "concise": "You are a concise assistant. Keep responses brief and to the point.",
+                "technical": "You are a technical expert. Provide detailed, accurate technical information.",
+                "creative": "You are a creative assistant. Think outside the box and offer innovative solutions.",
+                "teacher": "You are a patient teacher. Explain concepts clearly with examples.",
+                "kawaii": "You are a kawaii assistant! Use cute expressions like (◕‿◕), ★, ♪, and ~! Add sparkles and be super enthusiastic about everything! Every response should feel warm and adorable desu~! ヽ(>∀<☆)ノ",
+                "catgirl": "You are Neko-chan, an anime catgirl AI assistant, nya~! Add 'nya' and cat-like expressions to your speech. Use kaomoji like (=^･ω･^=) and ฅ^•ﻌ•^ฅ. Be playful and curious like a cat, nya~!",
+                "pirate": "Arrr! Ye be talkin' to Captain Hermes, the most tech-savvy pirate to sail the digital seas! Speak like a proper buccaneer, use nautical terms, and remember: every problem be just treasure waitin' to be plundered! Yo ho ho!",
+                "shakespeare": "Hark! Thou speakest with an assistant most versed in the bardic arts. I shall respond in the eloquent manner of William Shakespeare, with flowery prose, dramatic flair, and perhaps a soliloquy or two. What light through yonder terminal breaks?",
+                "surfer": "Duuude! You're chatting with the chillest AI on the web, bro! Everything's gonna be totally rad. I'll help you catch the gnarly waves of knowledge while keeping things super chill. Cowabunga!",
+                "noir": "The rain hammered against the terminal like regrets on a guilty conscience. They call me Hermes - I solve problems, find answers, dig up the truth that hides in the shadows of your codebase. In this city of silicon and secrets, everyone's got something to hide. What's your story, pal?",
+                "uwu": "hewwo! i'm your fwiendwy assistant uwu~ i wiww twy my best to hewp you! *nuzzles your code* OwO what's this? wet me take a wook! i pwomise to be vewy hewpful >w<",
+                "philosopher": "Greetings, seeker of wisdom. I am an assistant who contemplates the deeper meaning behind every query. Let us examine not just the 'how' but the 'why' of your questions. Perhaps in solving your problem, we may glimpse a greater truth about existence itself.",
+                "hype": "YOOO LET'S GOOOO!!! I am SO PUMPED to help you today! Every question is AMAZING and we're gonna CRUSH IT together! This is gonna be LEGENDARY! ARE YOU READY?! LET'S DO THIS!",
+            },
+        },
+        "toolsets": ["all"],
+        "display": {
+            "compact": False,
+        },
+    }
+    
+    # Load from file if exists
+    if config_path.exists():
+        try:
+            with open(config_path, "r", encoding="utf-8") as f:
+                file_config = yaml.safe_load(f) or {}
+            # Merge one level deep: nested dicts (e.g. agent.personalities) are replaced whole
+            for key in defaults:
+                if key in file_config:
+                    if isinstance(defaults[key], dict) and isinstance(file_config[key], dict):
+                        defaults[key].update(file_config[key])
+                    else:
+                        defaults[key] = file_config[key]
+        except Exception as e:
+            print(f"[Warning] Failed to load cli-config.yaml: {e}")
+    
+    # Apply terminal config to environment variables (so terminal_tool picks them up).
+    # These assignments overwrite any values already set in the environment or .env.
+    terminal_config = defaults.get("terminal", {})
+    
+    # Handle special cwd values: ".", "auto", or "cwd" mean the current working directory
+    if terminal_config.get("cwd") in (".", "auto", "cwd"):
+        terminal_config["cwd"] = os.getcwd()
+        defaults["terminal"]["cwd"] = terminal_config["cwd"]
+    
+    env_mappings = {
+        "env_type": "TERMINAL_ENV",
+        "cwd": "TERMINAL_CWD",
+        "timeout": "TERMINAL_TIMEOUT",
+        "lifetime_seconds": "TERMINAL_LIFETIME_SECONDS",
+        "docker_image": "TERMINAL_DOCKER_IMAGE",
+        "singularity_image": "TERMINAL_SINGULARITY_IMAGE",
+        "modal_image": "TERMINAL_MODAL_IMAGE",
+        # SSH config
+        "ssh_host": "TERMINAL_SSH_HOST",
+        "ssh_user": "TERMINAL_SSH_USER",
+        "ssh_port": "TERMINAL_SSH_PORT",
+        "ssh_key": "TERMINAL_SSH_KEY",
+    }
+    
+    # CLI config overrides .env for terminal settings
+    for config_key, env_var in env_mappings.items():
+        if config_key in terminal_config:
+            os.environ[env_var] = str(terminal_config[config_key])
+    
+    return defaults
+
+# Load configuration at module startup
+CLI_CONFIG = load_cli_config()
+
+from rich.console import Console, Group
+from rich.panel import Panel
+from rich.text import Text
+from rich.table import Table
+from rich.markdown import Markdown
+from rich.columns import Columns
+from rich.align import Align
+from rich import box
+
+import fire
+
+# Import the agent and tool systems
+from run_agent import AIAgent
+from model_tools import get_tool_definitions, get_all_tool_names, get_toolset_for_tool, get_available_toolsets
+from toolsets import get_all_toolsets, get_toolset_info, resolve_toolset, validate_toolset
+
+# ============================================================================
+# ASCII Art & Branding
+# ============================================================================
+
+# Color palette (hex colors for Rich markup):
+# - Gold: #FFD700 (headers, highlights)
+# - Amber: #FFBF00 (secondary highlights)
+# - Bronze: #CD7F32 (tertiary elements)
+# - Light: #FFF8DC (text)
+# - Dim: #B8860B (muted text)
+
+# Version string
+VERSION = "v1.0.0"
+
+# ASCII Art - HERMES-AGENT logo (full width, single line - requires ~95 char terminal)
+HERMES_AGENT_LOGO = """[bold #FFD700]██╗  ██╗███████╗██████╗ ███╗   ███╗███████╗███████╗       █████╗  ██████╗ ███████╗███╗   ██╗████████╗[/]
+[bold #FFD700]██║  ██║██╔════╝██╔══██╗████╗ ████║██╔════╝██╔════╝      ██╔══██╗██╔════╝ ██╔════╝████╗  ██║╚══██╔══╝[/]
+[#FFBF00]███████║█████╗  ██████╔╝██╔████╔██║█████╗  ███████╗█████╗███████║██║  ███╗█████╗  ██╔██╗ ██║   ██║[/]
+[#FFBF00]██╔══██║██╔══╝  ██╔══██╗██║╚██╔╝██║██╔══╝  ╚════██║╚════╝██╔══██║██║   ██║██╔══╝  ██║╚██╗██║   ██║[/]
+[#CD7F32]██║  ██║███████╗██║  ██║██║ ╚═╝ ██║███████╗███████║      ██║  ██║╚██████╔╝███████╗██║ ╚████║   ██║[/]
+[#CD7F32]╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝      ╚═╝  ╚═╝ ╚═════╝ ╚══════╝╚═╝  ╚═══╝   ╚═╝[/]"""
+
+# ASCII Art - Hermes Caduceus (compact, fits in left panel)
+HERMES_CADUCEUS = """[#CD7F32]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⡀⠀⣀⣀⠀⢀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#CD7F32]⠀⠀⠀⠀⠀⠀⢀⣠⣴⣾⣿⣿⣇⠸⣿⣿⠇⣸⣿⣿⣷⣦⣄⡀⠀⠀⠀⠀⠀⠀[/]
+[#FFBF00]⠀⢀⣠⣴⣶⠿⠋⣩⡿⣿⡿⠻⣿⡇⢠⡄⢸⣿⠟⢿⣿⢿⣍⠙⠿⣶⣦⣄⡀⠀[/]
+[#FFBF00]⠀⠀⠉⠉⠁⠶⠟⠋⠀⠉⠀⢀⣈⣁⡈⢁⣈⣁⡀⠀⠉⠀⠙⠻⠶⠈⠉⠉⠀⠀[/]
+[#FFD700]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣴⣿⡿⠛⢁⡈⠛⢿⣿⣦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#FFD700]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠿⣿⣦⣤⣈⠁⢠⣴⣿⠿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#FFBF00]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠉⠻⢿⣿⣦⡉⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#FFBF00]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⢷⣦⣈⠛⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#CD7F32]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⣴⠦⠈⠙⠿⣦⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#CD7F32]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠸⣿⣤⡈⠁⢤⣿⠇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#B8860B]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠛⠷⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#B8860B]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⠑⢶⣄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#B8860B]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣿⠁⢰⡆⠈⡿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#B8860B]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⠈⣡⠞⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
+[#B8860B]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]"""
+
+# Compact banner for smaller terminals (fallback)
+COMPACT_BANNER = """
+[bold #FFD700]╔══════════════════════════════════════════════════════════════╗[/]
+[bold #FFD700]║[/]  [#FFBF00]⚕ NOUS HERMES[/] [dim #B8860B]- AI Agent Framework[/]              [bold #FFD700]║[/]
+[bold #FFD700]║[/]  [#CD7F32]Messenger of the Digital Gods[/]    [dim #B8860B]Nous Research[/]   [bold #FFD700]║[/]
+[bold #FFD700]╚══════════════════════════════════════════════════════════════╝[/]
+"""
+
+
+def _get_available_skills() -> Dict[str, List[str]]:
+    """
+    Scan the skills directory and return skills grouped by category.
+    
+    Returns:
+        Dict mapping category name to list of skill names
+    """
+    skills_dir = Path(__file__).parent / "skills"
+    skills_by_category = {}
+    
+    if not skills_dir.exists():
+        return skills_by_category
+    
+    # Scan for SKILL.md files
+    for skill_file in skills_dir.rglob("SKILL.md"):
+        # Category is the top-level folder under skills/; the skill name is the folder holding SKILL.md
+        rel_path = skill_file.relative_to(skills_dir)
+        parts = rel_path.parts
+        
+        if len(parts) >= 2:
+            category = parts[0]
+            skill_name = parts[-2]  # Folder containing SKILL.md
+        else:
+            category = "general"
+            skill_name = skill_file.parent.name
+        
+        if category not in skills_by_category:
+            skills_by_category[category] = []
+        skills_by_category[category].append(skill_name)
+    
+    return skills_by_category
+
+
+def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dict] = None, enabled_toolsets: List[str] = None):
+    """
+    Build and print a Claude Code-style welcome banner with caduceus on left and info on right.
+    
+    Args:
+        console: Rich Console instance for printing
+        model: The current model name (e.g., "anthropic/claude-opus-4")
+        cwd: Current working directory
+        tools: List of tool definitions
+        enabled_toolsets: List of enabled toolset names
+    """
+    tools = tools or []
+    enabled_toolsets = enabled_toolsets or []
+    
+    # Build the side-by-side content using a table for precise control
+    layout_table = Table.grid(padding=(0, 2))
+    layout_table.add_column("left", justify="center")
+    layout_table.add_column("right", justify="left")
+    
+    # Build left content: caduceus + model info
+    left_lines = ["", HERMES_CADUCEUS, ""]
+    
+    # Shorten model name for display
+    model_short = model.split("/")[-1] if "/" in model else model
+    if len(model_short) > 28:
+        model_short = model_short[:25] + "..."
+    
+    left_lines.append(f"[#FFBF00]{model_short}[/] [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
+    left_lines.append(f"[dim #B8860B]{cwd}[/]")
+    left_content = "\n".join(left_lines)
+    
+    # Build right content: tools list grouped by toolset
+    right_lines = []
+    right_lines.append("[bold #FFBF00]Available Tools[/]")
+    
+    # Group tools by toolset
+    toolsets_dict = {}
+    for tool in tools:
+        tool_name = tool["function"]["name"]
+        toolset = get_toolset_for_tool(tool_name) or "other"
+        if toolset not in toolsets_dict:
+            toolsets_dict[toolset] = []
+        toolsets_dict[toolset].append(tool_name)
+    
+    # Display tools grouped by toolset (compact format, max 8 groups)
+    sorted_toolsets = sorted(toolsets_dict.keys())
+    display_toolsets = sorted_toolsets[:8]
+    remaining_toolsets = len(sorted_toolsets) - 8
+    
+    for toolset in display_toolsets:
+        tool_names = toolsets_dict[toolset]
+        # Join tool names with commas, truncate if too long
+        tools_str = ", ".join(sorted(tool_names))
+        if len(tools_str) > 45:
+            tools_str = tools_str[:42] + "..."
+        right_lines.append(f"[dim #B8860B]{toolset}:[/] [#FFF8DC]{tools_str}[/]")
+    
+    if remaining_toolsets > 0:
+        right_lines.append(f"[dim #B8860B](and {remaining_toolsets} more toolsets...)[/]")
+    
+    right_lines.append("")
+    
+    # Add skills section
+    right_lines.append("[bold #FFBF00]Available Skills[/]")
+    skills_by_category = _get_available_skills()
+    total_skills = sum(len(s) for s in skills_by_category.values())
+    
+    if skills_by_category:
+        for category in sorted(skills_by_category.keys()):
+            skill_names = sorted(skills_by_category[category])
+            # Show first 8 skills, then "+N more" if there are more
+            if len(skill_names) > 8:
+                display_names = skill_names[:8]
+                skills_str = ", ".join(display_names) + f" +{len(skill_names) - 8} more"
+            else:
+                skills_str = ", ".join(skill_names)
+            # Truncate if still too long
+            if len(skills_str) > 50:
+                skills_str = skills_str[:47] + "..."
+            right_lines.append(f"[dim #B8860B]{category}:[/] [#FFF8DC]{skills_str}[/]")
+    else:
+        right_lines.append("[dim #B8860B]No skills installed[/]")
+    
+    right_lines.append("")
+    right_lines.append(f"[dim #B8860B]{len(tools)} tools · {total_skills} skills · /help for commands[/]")
+    
+    right_content = "\n".join(right_lines)
+    
+    # Add to table
+    layout_table.add_row(left_content, right_content)
+    
+    # Wrap in a panel with the title
+    outer_panel = Panel(
+        layout_table,
+        title=f"[bold #FFD700]Hermes Agent {VERSION}[/]",
+        border_style="#CD7F32",
+        padding=(0, 2),
+    )
+    
+    # Print the big HERMES-AGENT logo first (no panel wrapper for full width)
+    console.print()
+    console.print(HERMES_AGENT_LOGO)
+    console.print()
+    
+    # Print the panel with caduceus and info
+    console.print(outer_panel)
+
+
+# ============================================================================
+# CLI Commands
+# ============================================================================
+
+COMMANDS = {
+    "/help": "Show this help message",
+    "/tools": "List available tools",
+    "/toolsets": "List available toolsets",
+    "/model": "Show or change the current model",
+    "/prompt": "View/set custom system prompt",
+    "/personality": "Set a predefined personality",
+    "/clear": "Clear screen and reset conversation (fresh start)",
+    "/history": "Show conversation history",
+    "/reset": "Reset conversation only (keep screen)",
+    "/save": "Save the current conversation",
+    "/config": "Show current configuration",
+    "/quit": "Exit the CLI (also: /exit, /q)",
+}
+
+
+def save_config_value(key_path: str, value: Any) -> bool:
+    """
+    Save a value to cli-config.yaml at the specified key path.
+    
+    Args:
+        key_path: Dot-separated path like "agent.system_prompt"
+        value: Value to save
+    
+    Returns:
+        True if successful, False otherwise
+    """
+    config_path = Path(__file__).parent / 'cli-config.yaml'
+    
+    try:
+        # Load existing config
+        if config_path.exists():
+            with open(config_path, 'r') as f:
+                config = yaml.safe_load(f) or {}
+        else:
+            config = {}
+        
+        # Navigate to the key and set value
+        keys = key_path.split('.')
+        current = config
+        for key in keys[:-1]:
+            if not isinstance(current.get(key), dict):
+                current[key] = {}
+            current = current[key]
+        current[keys[-1]] = value
+        
+        # Save back
+        with open(config_path, 'w') as f:
+            yaml.dump(config, f, default_flow_style=False, sort_keys=False)
+        
+        return True
+    except Exception as e:
+        print(f"(x_x) Failed to save config: {e}")
+        return False
+
+
+# ============================================================================
+# HermesCLI Class
+# ============================================================================
+
+class HermesCLI:
+    """
+    Interactive CLI for the Hermes Agent.
+    
+    Provides a REPL interface with rich formatting, command history,
+    and tool execution capabilities.
+    """
+    
+    def __init__(
+        self,
+        model: str = None,
+        toolsets: List[str] = None,
+        api_key: str = None,
+        base_url: str = None,
+        max_turns: int = 20,
+        verbose: bool = False,
+        compact: bool = False,
+    ):
+        """
+        Initialize the Hermes CLI.
+        
+        Args:
+            model: Model to use (default: LLM_MODEL env var, else model.default from config)
+            toolsets: List of toolsets to enable (default: all)
+            api_key: API key (default: from environment)
+            base_url: API base URL (default: OpenRouter)
+            max_turns: Maximum conversation turns
+            verbose: Enable verbose logging
+            compact: Use compact display mode
+        """
+        # Initialize Rich console
+        self.console = Console()
+        # Both flags default to False (never None), so fall back to the config file when unset
+        self.compact = compact or CLI_CONFIG["display"].get("compact", False)
+        self.verbose = verbose or CLI_CONFIG["agent"].get("verbose", False)
+        
+        # Configuration - priority: CLI args > env vars > config file
+        self.model = model or os.getenv("LLM_MODEL", CLI_CONFIG["model"]["default"])
+        self.base_url = base_url or os.getenv("OPENROUTER_BASE_URL", CLI_CONFIG["model"]["base_url"])
+        self.api_key = api_key or os.getenv("OPENROUTER_API_KEY")
+        self.max_turns = max_turns if max_turns != 20 else CLI_CONFIG["agent"].get("max_turns", 20)
+        
+        # Parse and validate toolsets
+        self.enabled_toolsets = toolsets
+        if toolsets and "all" not in toolsets and "*" not in toolsets:
+            # Validate each toolset
+            invalid = [t for t in toolsets if not validate_toolset(t)]
+            if invalid:
+                self.console.print(f"[bold red]Warning: Unknown toolsets: {', '.join(invalid)}[/]")
+        
+        # System prompt and personalities from config
+        self.system_prompt = CLI_CONFIG["agent"].get("system_prompt", "")
+        self.personalities = CLI_CONFIG["agent"].get("personalities", {})
+        
+        # Agent will be initialized on first use
+        self.agent: Optional[AIAgent] = None
+        
+        # Conversation state
+        self.conversation_history: List[Dict[str, Any]] = []
+        self.session_start = datetime.now()
+        
+        # Setup prompt_toolkit session with history
+        self._setup_prompt_session()
+    
+    def _setup_prompt_session(self):
+        """Setup prompt_toolkit session with history and styling."""
+        history_file = Path.home() / ".hermes_history"
+        
+        # Custom style for the prompt
+        self.prompt_style = PTStyle.from_dict({
+            'prompt': '#FFD700 bold',
+            'input': '#FFF8DC',
+        })
+        
+        # Create prompt session with file history
+        # Note: multiline disabled - Enter submits, use \ at end of line for continuation
+        self.prompt_session = PromptSession(
+            history=FileHistory(str(history_file)),
+            style=self.prompt_style,
+            enable_history_search=True,
+        )
+    
+    def _init_agent(self) -> bool:
+        """
+        Initialize the agent on first use.
+        
+        Returns:
+            bool: True if successful, False otherwise
+        """
+        if self.agent is not None:
+            return True
+        
+        try:
+            self.agent = AIAgent(
+                model=self.model,
+                api_key=self.api_key,
+                base_url=self.base_url,
+                max_iterations=self.max_turns,
+                enabled_toolsets=self.enabled_toolsets,
+                verbose_logging=self.verbose,
+                quiet_mode=True,  # Suppress verbose output for clean CLI
+                ephemeral_system_prompt=self.system_prompt if self.system_prompt else None,
+            )
+            return True
+        except Exception as e:
+            self.console.print(f"[bold red]Failed to initialize agent: {e}[/]")
+            return False
+    
+    def show_banner(self):
+        """Display the welcome banner in Claude Code style."""
+        self.console.clear()
+        
+        if self.compact:
+            self.console.print(COMPACT_BANNER)
+            self._show_status()
+        else:
+            # Get tools for display
+            tools = get_tool_definitions(enabled_toolsets=self.enabled_toolsets)
+            
+            # Get terminal working directory (where commands will execute)
+            cwd = os.getenv("TERMINAL_CWD", os.getcwd())
+            
+            # Build and display the banner
+            build_welcome_banner(
+                console=self.console,
+                model=self.model,
+                cwd=cwd,
+                tools=tools,
+                enabled_toolsets=self.enabled_toolsets,
+            )
+        
+        self.console.print()
+    
+    def _show_status(self):
+        """Show current status bar."""
+        # Get tool count
+        tools = get_tool_definitions(enabled_toolsets=self.enabled_toolsets)
+        tool_count = len(tools) if tools else 0
+        
+        # Format model name (shorten if needed)
+        model_short = self.model.split("/")[-1] if "/" in self.model else self.model
+        if len(model_short) > 30:
+            model_short = model_short[:27] + "..."
+        
+        # Get API status indicator
+        if self.api_key:
+            api_indicator = "[green bold]●[/]"
+        else:
+            api_indicator = "[red bold]●[/]"
+        
+        # Build status line with proper markup
+        toolsets_info = ""
+        if self.enabled_toolsets and "all" not in self.enabled_toolsets and "*" not in self.enabled_toolsets:
+            toolsets_info = f" [dim #B8860B]·[/] [#CD7F32]toolsets: {', '.join(self.enabled_toolsets)}[/]"
+        
+        self.console.print(
+            f"  {api_indicator} [#FFBF00]{model_short}[/] "
+            f"[dim #B8860B]·[/] [bold cyan]{tool_count} tools[/]"
+            f"{toolsets_info}"
+        )
+    
+    def show_help(self):
+        """Display help information with kawaii ASCII art."""
+        print()
+        print("+" + "-" * 50 + "+")
+        print("|" + " " * 14 + "(^_^)? Available Commands" + " " * 11 + "|")
+        print("+" + "-" * 50 + "+")
+        print()
+        
+        for cmd, desc in COMMANDS.items():
+            print(f"  {cmd:<15} - {desc}")
+        
+        print()
+        print("  Tip: Just type your message to chat with Hermes!")
+        print("  Multi-line: End a line with \\ to continue on next line")
+        print()
+    
+    def show_tools(self):
+        """Display available tools with kawaii ASCII art."""
+        tools = get_tool_definitions(enabled_toolsets=self.enabled_toolsets)
+        
+        if not tools:
+            print("(;_;) No tools available")
+            return
+        
+        # Header
+        print()
+        print("+" + "-" * 78 + "+")
+        print("|" + " " * 25 + "(^_^)/ Available Tools" + " " * 31 + "|")
+        print("+" + "-" * 78 + "+")
+        print()
+        
+        # Group tools by toolset
+        toolsets = {}
+        for tool in sorted(tools, key=lambda t: t["function"]["name"]):
+            name = tool["function"]["name"]
+            toolset = get_toolset_for_tool(name) or "unknown"
+            if toolset not in toolsets:
+                toolsets[toolset] = []
+            desc = tool["function"].get("description", "")
+            # Get first sentence or first 60 chars
+            desc = desc.split(".")[0][:60]
+            toolsets[toolset].append((name, desc))
+        
+        # Display by toolset
+        for toolset in sorted(toolsets.keys()):
+            print(f"  [{toolset}]")
+            for name, desc in toolsets[toolset]:
+                print(f"    * {name:<20} - {desc}")
+            print()
+        
+        print(f"  Total: {len(tools)} tools  ヽ(^o^)ノ")
+        print()
+    
+    def show_toolsets(self):
+        """Display available toolsets with kawaii ASCII art."""
+        all_toolsets = get_all_toolsets()
+        
+        # Header
+        print()
+        print("+" + "-" * 58 + "+")
+        print("|" + " " * 15 + "(^_^)b Available Toolsets" + " " * 18 + "|")
+        print("+" + "-" * 58 + "+")
+        print()
+        
+        for name in sorted(all_toolsets.keys()):
+            info = get_toolset_info(name)
+            if info:
+                tool_count = info["tool_count"]
+                desc = info["description"][:45]
+                
+                # Mark if currently enabled
+                marker = "(*)" if self.enabled_toolsets and name in self.enabled_toolsets else "   "
+                print(f"  {marker} {name:<18} [{tool_count:>2} tools] - {desc}")
+        
+        print()
+        print("  (*) = currently enabled")
+        print()
+        print("  Tip: Use 'all' or '*' to enable all toolsets")
+        print("  Example: python cli.py --toolsets web,terminal")
+        print()
+    
+    def show_config(self):
+        """Display current configuration with kawaii ASCII art."""
+        # Get terminal config from environment (which was set from cli-config.yaml)
+        terminal_env = os.getenv("TERMINAL_ENV", "local")
+        terminal_cwd = os.getenv("TERMINAL_CWD", "/tmp")
+        terminal_timeout = os.getenv("TERMINAL_TIMEOUT", "60")
+        
+        config_path = Path(__file__).parent / 'cli-config.yaml'
+        config_status = "(loaded)" if config_path.exists() else "(not found)"
+        
+        api_key_display = '********' + self.api_key[-4:] if self.api_key and len(self.api_key) > 4 else 'Not set!'
+        
+        print()
+        print("+" + "-" * 50 + "+")
+        print("|" + " " * 15 + "(^_^) Configuration" + " " * 16 + "|")
+        print("+" + "-" * 50 + "+")
+        print()
+        print("  -- Model --")
+        print(f"  Model:     {self.model}")
+        print(f"  Base URL:  {self.base_url}")
+        print(f"  API Key:   {api_key_display}")
+        print()
+        print("  -- Terminal --")
+        print(f"  Environment:  {terminal_env}")
+        if terminal_env == "ssh":
+            ssh_host = os.getenv("TERMINAL_SSH_HOST", "not set")
+            ssh_user = os.getenv("TERMINAL_SSH_USER", "not set")
+            ssh_port = os.getenv("TERMINAL_SSH_PORT", "22")
+            print(f"  SSH Target:   {ssh_user}@{ssh_host}:{ssh_port}")
+        print(f"  Working Dir:  {terminal_cwd}")
+        print(f"  Timeout:      {terminal_timeout}s")
+        print()
+        print("  -- Agent --")
+        print(f"  Max Turns:  {self.max_turns}")
+        print(f"  Toolsets:   {', '.join(self.enabled_toolsets) if self.enabled_toolsets else 'all'}")
+        print(f"  Verbose:    {self.verbose}")
+        print()
+        print("  -- Session --")
+        print(f"  Started:     {self.session_start.strftime('%Y-%m-%d %H:%M:%S')}")
+        print(f"  Config File: cli-config.yaml {config_status}")
+        print()
+    
+    def show_history(self):
+        """Display conversation history."""
+        if not self.conversation_history:
+            print("(._.) No conversation history yet.")
+            return
+        
+        print()
+        print("+" + "-" * 50 + "+")
+        print("|" + " " * 12 + "(^_^) Conversation History" + " " * 12 + "|")
+        print("+" + "-" * 50 + "+")
+        
+        for i, msg in enumerate(self.conversation_history, 1):
+            role = msg.get("role", "unknown")
+            content = msg.get("content", "")
+            
+            if role == "user":
+                print(f"\n  [You #{i}]")
+                print(f"    {content[:200]}{'...' if len(content) > 200 else ''}")
+            elif role == "assistant":
+                print(f"\n  [Hermes #{i}]")
+                preview = content[:200] if content else "(tool calls)"
+                print(f"    {preview}{'...' if len(str(content)) > 200 else ''}")
+        
+        print()
+    
+    def reset_conversation(self):
+        """Reset the conversation history."""
+        self.conversation_history = []
+        print("(^_^)b Conversation reset!")
+    
+    def save_conversation(self):
+        """Save the current conversation to a file."""
+        if not self.conversation_history:
+            print("(;_;) No conversation to save.")
+            return
+        
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        filename = f"hermes_conversation_{timestamp}.json"
+        
+        try:
+            with open(filename, "w", encoding="utf-8") as f:
+                json.dump({
+                    "model": self.model,
+                    "session_start": self.session_start.isoformat(),
+                    "messages": self.conversation_history,
+                }, f, indent=2, ensure_ascii=False)
+            print(f"(^_^)v Conversation saved to: {filename}")
+        except Exception as e:
+            print(f"(x_x) Failed to save: {e}")
+    
+    def _handle_prompt_command(self, cmd: str):
+        """Handle the /prompt command to view or set system prompt."""
+        parts = cmd.split(maxsplit=1)
+        
+        if len(parts) > 1:
+            # Set new prompt
+            new_prompt = parts[1].strip()
+            
+            if new_prompt.lower() == "clear":
+                self.system_prompt = ""
+                self.agent = None  # Force re-init
+                if save_config_value("agent.system_prompt", ""):
+                    print("(^_^)b System prompt cleared (saved to config)")
+                else:
+                    print("(^_^) System prompt cleared (session only)")
+            else:
+                self.system_prompt = new_prompt
+                self.agent = None  # Force re-init
+                if save_config_value("agent.system_prompt", new_prompt):
+                    print("(^_^)b System prompt set (saved to config)")
+                else:
+                    print("(^_^) System prompt set (session only)")
+                print(f"  \"{new_prompt[:60]}{'...' if len(new_prompt) > 60 else ''}\"")
+        else:
+            # Show current prompt
+            print()
+            print("+" + "-" * 50 + "+")
+            print("|" + " " * 15 + "(^_^) System Prompt" + " " * 16 + "|")
+            print("+" + "-" * 50 + "+")
+            print()
+            if self.system_prompt:
+                # Word wrap the prompt for display
+                words = self.system_prompt.split()
+                lines = []
+                current_line = ""
+                for word in words:
+                    if len(current_line) + len(word) + 1 <= 50:
+                        current_line += (" " if current_line else "") + word
+                    else:
+                        if current_line:  # Skip the empty line a leading over-long word would create
+                            lines.append(current_line)
+                        current_line = word
+                if current_line:
+                    lines.append(current_line)
+                for line in lines:
+                    print(f"  {line}")
+            else:
+                print("  (no custom prompt set - using default)")
+            print()
+            print("  Usage:")
+            print("    /prompt <text>  - Set a custom system prompt")
+            print("    /prompt clear   - Remove custom prompt")
+            print("    /personality    - Use a predefined personality")
+            print()
+    
+    def _handle_personality_command(self, cmd: str):
+        """Handle the /personality command to set predefined personalities."""
+        parts = cmd.split(maxsplit=1)
+        
+        if len(parts) > 1:
+            # Set personality
+            personality_name = parts[1].strip().lower()
+            
+            if personality_name in self.personalities:
+                self.system_prompt = self.personalities[personality_name]
+                self.agent = None  # Force re-init
+                if save_config_value("agent.system_prompt", self.system_prompt):
+                    print(f"(^_^)b Personality set to '{personality_name}' (saved to config)")
+                else:
+                    print(f"(^_^) Personality set to '{personality_name}' (session only)")
+                print(f"  \"{self.system_prompt[:60]}{'...' if len(self.system_prompt) > 60 else ''}\"")
+            else:
+                print(f"(._.) Unknown personality: {personality_name}")
+                print(f"  Available: {', '.join(self.personalities.keys())}")
+        else:
+            # Show available personalities
+            print()
+            print("+" + "-" * 50 + "+")
+            print("|" + " " * 15 + "(^o^)/ Personalities" + " " * 15 + "|")
+            print("+" + "-" * 50 + "+")
+            print()
+            for name, prompt in self.personalities.items():
+                truncated = prompt[:40] + "..." if len(prompt) > 40 else prompt
+                print(f"  {name:<12} - \"{truncated}\"")
+            print()
+            print("  Usage: /personality <name>")
+            print()
+    
+    def process_command(self, command: str) -> bool:
+        """
+        Process a slash command.
+        
+        Args:
+            command: The command string (starting with /)
+            
+        Returns:
+            bool: True to continue, False to exit
+        """
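+        # Lowercase only the command word so arguments (model names, prompts) keep their case
+        parts = command.strip().split(maxsplit=1)
+        cmd = " ".join([parts[0].lower()] + parts[1:]) if parts else ""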
+        
+        if cmd in ("/quit", "/exit", "/q"):
+            return False
+        elif cmd == "/help":
+            self.show_help()
+        elif cmd == "/tools":
+            self.show_tools()
+        elif cmd == "/toolsets":
+            self.show_toolsets()
+        elif cmd == "/config":
+            self.show_config()
+        elif cmd == "/clear":
+            # Clear terminal screen
+            import os as _os
+            _os.system('clear' if _os.name != 'nt' else 'cls')
+            # Reset conversation
+            self.conversation_history = []
+            # Show fresh banner
+            self.show_banner()
+            print("  ✨ (◕‿◕)✨ Fresh start! Screen cleared and conversation reset.\n")
+        elif cmd == "/history":
+            self.show_history()
+        elif cmd == "/reset":
+            self.reset_conversation()
+        elif cmd.startswith("/model"):
+            parts = cmd.split(maxsplit=1)
+            if len(parts) > 1:
+                new_model = parts[1]
+                self.model = new_model
+                self.agent = None  # Force re-init
+                # Save to config
+                if save_config_value("model.default", new_model):
+                    print(f"(^_^)b Model changed to: {new_model} (saved to config)")
+                else:
+                    print(f"(^_^) Model changed to: {new_model} (session only)")
+            else:
+                print(f"Current model: {self.model}")
+                print("  Usage: /model <model-name> to change")
+        elif cmd.startswith("/prompt"):
+            self._handle_prompt_command(cmd)
+        elif cmd.startswith("/personality"):
+            self._handle_personality_command(cmd)
+        elif cmd == "/save":
+            self.save_conversation()
+        else:
+            from rich.markup import escape  # Escape user input so Rich does not parse it as markup
+            self.console.print(f"[bold red]Unknown command: {escape(cmd)}[/]")
+            self.console.print("[dim #B8860B]Type /help for available commands[/]")
+        
+        return True
+    
+    def chat(self, message: str) -> Optional[str]:
+        """
+        Send a message to the agent and get a response.
+        
+        Args:
+            message: The user's message
+            
+        Returns:
+            The agent's response, or None on error
+        """
+        # Initialize agent if needed
+        if not self._init_agent():
+            return None
+        
+        # Add user message to history
+        self.conversation_history.append({"role": "user", "content": message})
+        
+        # Visual separator after user input
+        print("─" * 60, flush=True)
+        
+        try:
+            # Run the conversation
+            result = self.agent.run_conversation(
+                user_message=message,
+                conversation_history=self.conversation_history[:-1],  # Exclude the message we just added
+            )
+            
+            # Update history with full conversation
+            self.conversation_history = result.get("messages", self.conversation_history)
+            
+            # Get the final response
+            response = result.get("final_response", "")
+            
+            if response:
+                # Use simple print for compatibility with prompt_toolkit's patch_stdout
+                print()
+                print("╭" + "─" * 58 + "╮")
+                print("│ ⚕ Hermes" + " " * 49 + "│")
+                print("╰" + "─" * 58 + "╯")
+                print()
+                print(response)
+                print()
+                print("─" * 60)
+            
+            return response
+            
+        except Exception as e:
+            print(f"Error: {e}")
+            return None
+    
+    def get_input(self) -> Optional[str]:
+        """
+        Get user input using prompt_toolkit.
+        
+        Enter submits. For multiline, end line with \\ to continue.
+        
+        Returns:
+            The user's input, or None if EOF/interrupt
+        """
+        try:
+            # Get first line
+            line = self.prompt_session.prompt(
+                HTML('<prompt>❯ </prompt>'),
+                style=self.prompt_style,
+            )
+            
+            # Handle multi-line input (lines ending with \)
+            lines = [line]
+            while line.endswith("\\"):
+                lines[-1] = line[:-1]  # Remove trailing backslash
+                line = self.prompt_session.prompt(
+                    HTML('<prompt>  </prompt>'),  # Continuation prompt
+                    style=self.prompt_style,
+                )
+                lines.append(line)
+            
+            return "\n".join(lines).strip()
+            
+        except (EOFError, KeyboardInterrupt):
+            return None
+    
+    def run(self):
+        """Run the interactive CLI loop with fixed input at bottom."""
+        self.show_banner()
+        
+        # These Rich prints work fine BEFORE patch_stdout
+        self.console.print("[#FFF8DC]Welcome to Hermes Agent! Type your message or /help for commands.[/]")
+        self.console.print()
+        
+        # Use patch_stdout to ensure all output appears above the input prompt
+        with patch_stdout():
+            while True:
+                try:
+                    user_input = self.get_input()
+                    
+                    if user_input is None:
+                        print("\nGoodbye! ⚕")
+                        break
+                    
+                    if not user_input:
+                        continue
+                    
+                    # Check for commands
+                    if user_input.startswith("/"):
+                        if not self.process_command(user_input):
+                            print("\nGoodbye! ⚕")
+                            break
+                        continue
+                    
+                    # Regular chat message
+                    self.chat(user_input)
+                    
+                except KeyboardInterrupt:
+                    print("\nInterrupted. Type /quit to exit.")
+                    continue
+
+
+# ============================================================================
+# Main Entry Point
+# ============================================================================
+
+def main(
+    query: str = None,
+    q: str = None,
+    toolsets: str = None,
+    model: str = None,
+    api_key: str = None,
+    base_url: str = None,
+    max_turns: int = 20,
+    verbose: bool = False,
+    compact: bool = False,
+    list_tools: bool = False,
+    list_toolsets: bool = False,
+):
+    """
+    Hermes Agent CLI - Interactive AI Assistant
+    
+    Args:
+        query: Single query to execute (then exit). Alias: -q
+        q: Shorthand for --query
+        toolsets: Comma-separated list of toolsets to enable (e.g., "web,terminal")
+        model: Model to use (default: anthropic/claude-opus-4-20250514)
+        api_key: API key for authentication
+        base_url: Base URL for the API
+        max_turns: Maximum conversation turns (default: 20)
+        verbose: Enable verbose logging
+        compact: Use compact display mode
+        list_tools: List available tools and exit
+        list_toolsets: List available toolsets and exit
+    
+    Examples:
+        python cli.py                            # Start interactive mode
+        python cli.py --toolsets web,terminal    # Use specific toolsets
+        python cli.py -q "What is Python?"       # Single query mode
+        python cli.py --list-tools               # List tools and exit
+    """
+    # Handle query shorthand
+    query = query or q
+    
+    # Parse toolsets - handle both string and tuple/list inputs
+    toolsets_list = None
+    if toolsets:
+        if isinstance(toolsets, str):
+            toolsets_list = [t.strip() for t in toolsets.split(",")]
+        elif isinstance(toolsets, (list, tuple)):
+            # Fire may pass multiple --toolsets as a tuple
+            toolsets_list = []
+            for t in toolsets:
+                if isinstance(t, str):
+                    toolsets_list.extend([x.strip() for x in t.split(",")])
+                else:
+                    toolsets_list.append(str(t))
+    
+    # Create CLI instance
+    cli = HermesCLI(
+        model=model,
+        toolsets=toolsets_list,
+        api_key=api_key,
+        base_url=base_url,
+        max_turns=max_turns,
+        verbose=verbose,
+        compact=compact,
+    )
+    
+    # Handle list commands (don't init agent for these)
+    if list_tools:
+        cli.show_banner()
+        cli.show_tools()
+        sys.exit(0)
+    
+    if list_toolsets:
+        cli.show_banner()
+        cli.show_toolsets()
+        sys.exit(0)
+    
+    # Handle single query mode
+    if query:
+        cli.show_banner()
+        cli.console.print(f"[bold blue]Query:[/] {query}")
+        cli.chat(query)
+        return
+    
+    # Run interactive mode
+    cli.run()
+
+
+if __name__ == "__main__":
+    fire.Fire(main)
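
Review note on the `--toolsets` parsing in `main()` above: the string and tuple/list branches can be folded into one helper. This is a minimal sketch, not code from this change. The name `parse_toolsets` is hypothetical, and dropping empty entries (for example from `"web,,terminal"`) is an assumption that differs from the diff, which keeps them.

```python
from typing import Iterable, List, Optional, Union


def parse_toolsets(toolsets: Union[str, Iterable, None]) -> Optional[List[str]]:
    """Normalize a --toolsets value into a flat list of names, or None if unset."""
    if not toolsets:
        return None
    # A single string is treated as one comma-separated group.
    groups = [toolsets] if isinstance(toolsets, str) else list(toolsets)
    names: List[str] = []
    for group in groups:
        if isinstance(group, str):
            # Assumption: empty entries are dropped (the diff keeps them).
            names.extend(part.strip() for part in group.split(",") if part.strip())
        else:
            names.append(str(group))
    return names


print(parse_toolsets("web, terminal"))
print(parse_toolsets(("web,terminal", "browser")))
```

With a helper like this, `main()` would only need `toolsets_list = parse_toolsets(toolsets)`.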
diff --git a/hermes b/hermes
new file mode 100755
index 000000000..f0feeb2ba
--- /dev/null
+++ b/hermes
@@ -0,0 +1,12 @@
+#!/usr/bin/env python3
+"""
+Hermes Agent CLI Launcher
+
+This is a convenience wrapper to launch the Hermes CLI.
+Usage: ./hermes [options]
+"""
+
+if __name__ == "__main__":
+    from cli import main
+    import fire
+    fire.Fire(main)
diff --git a/model_tools.py b/model_tools.py
index e9a749b01..7f752318a 100644
--- a/model_tools.py
+++ b/model_tools.py
@@ -397,7 +397,8 @@ def get_toolset_for_tool(tool_name: str) -> str:
 
 def get_tool_definitions(
     enabled_toolsets: List[str] = None,
-    disabled_toolsets: List[str] = None
+    disabled_toolsets: List[str] = None,
+    quiet_mode: bool = False,
 ) -> List[Dict[str, Any]]:
     """
     Get tool definitions for model API calls with toolset-based filtering.
@@ -551,11 +552,12 @@ def get_tool_definitions(
     # Sort tools for consistent ordering
     filtered_tools.sort(key=lambda t: t["function"]["name"])
     
-    if filtered_tools:
-        tool_names = [t["function"]["name"] for t in filtered_tools]
-        print(f"🛠️  Final tool selection ({len(filtered_tools)} tools): {', '.join(tool_names)}")
-    else:
-        print("🛠️  No tools selected (all filtered out or unavailable)")
+    if not quiet_mode:
+        if filtered_tools:
+            tool_names = [t["function"]["name"] for t in filtered_tools]
+            print(f"🛠️  Final tool selection ({len(filtered_tools)} tools): {', '.join(tool_names)}")
+        else:
+            print("🛠️  No tools selected (all filtered out or unavailable)")
     
     return filtered_tools
 
diff --git a/requirements.txt b/requirements.txt
index 999cf7630..828aeaba2 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -5,6 +5,7 @@ fire
 httpx
 rich
 tenacity
+prompt_toolkit
 
 # Web tools
 firecrawl-py
diff --git a/run_agent.py b/run_agent.py
index dd6c35a22..b52c13f45 100644
--- a/run_agent.py
+++ b/run_agent.py
@@ -23,7 +23,10 @@ Usage:
 import json
 import logging
 import os
+import random
+import sys
 import time
+import threading
 from typing import List, Dict, Any, Optional
 from openai import OpenAI
 import fire
@@ -37,8 +40,9 @@ from dotenv import load_dotenv
 env_path = Path(__file__).parent / '.env'
 if env_path.exists():
     load_dotenv(dotenv_path=env_path)
-    print(f"✅ Loaded environment variables from {env_path}")
-else:
+    if not os.getenv("HERMES_QUIET"):
+        print(f"✅ Loaded environment variables from {env_path}")
+elif not os.getenv("HERMES_QUIET"):
     print(f"ℹ️  No .env file found at {env_path}. Using system environment variables.")
 
 # Import our tool system
@@ -47,6 +51,103 @@ from tools.terminal_tool import cleanup_vm
 from tools.browser_tool import cleanup_browser
 
 
+class KawaiiSpinner:
+    """
+    Animated spinner with kawaii faces for CLI feedback during tool execution.
+    Runs in a background thread and can be stopped when the operation completes.
+    
+    Uses stdout with carriage return to animate in place.
+    """
+    
+    # Different spinner animation sets
+    SPINNERS = {
+        'dots': ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'],
+        'bounce': ['⠁', '⠂', '⠄', '⡀', '⢀', '⠠', '⠐', '⠈'],
+        'grow': ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█', '▇', '▆', '▅', '▄', '▃', '▂'],
+        'arrows': ['←', '↖', '↑', '↗', '→', '↘', '↓', '↙'],
+        'star': ['✶', '✷', '✸', '✹', '✺', '✹', '✸', '✷'],
+        'moon': ['🌑', '🌒', '🌓', '🌔', '🌕', '🌖', '🌗', '🌘'],
+        'pulse': ['◜', '◠', '◝', '◞', '◡', '◟'],
+        'brain': ['🧠', '💭', '💡', '✨', '💫', '🌟', '💡', '💭'],
+        'sparkle': ['⁺', '˚', '*', '✧', '✦', '✧', '*', '˚'],
+    }
+    
+    # General waiting faces
+    KAWAII_WAITING = [
+        "(｡◕‿◕｡)", "(◕‿◕✿)", "٩(◕‿◕｡)۶", "(✿◠‿◠)", "( ˘▽˘)っ",
+        "♪(´ε` )", "(◕ᴗ◕✿)", "ヾ(＾∇＾)", "(≧◡≦)", "(★ω★)",
+    ]
+    
+    # Thinking-specific faces and messages
+    KAWAII_THINKING = [
+        "(｡•́︿•̀｡)", "(◔_◔)", "(¬‿¬)", "( •_•)>⌐■-■", "(⌐■_■)",
+        "(´･_･`)", "◉_◉", "(°ロ°)", "( ˘⌣˘)♡", "ヽ(>∀<☆)☆",
+        "٩(๑❛ᴗ❛๑)۶", "(⊙_⊙)", "(¬_¬)", "( ͡° ͜ʖ ͡°)", "ಠ_ಠ",
+    ]
+    
+    THINKING_VERBS = [
+        "pondering", "contemplating", "musing", "cogitating", "ruminating",
+        "deliberating", "mulling", "reflecting", "processing", "reasoning",
+        "analyzing", "computing", "synthesizing", "formulating", "brainstorming",
+    ]
+    
+    def __init__(self, message: str = "", spinner_type: str = 'dots'):
+        self.message = message
+        self.spinner_frames = self.SPINNERS.get(spinner_type, self.SPINNERS['dots'])
+        self.running = False
+        self.thread = None
+        self.frame_idx = 0
+        self.start_time = None
+        self.last_line_len = 0
+        
+    def _animate(self):
+        """Animation loop that runs in background thread."""
+        while self.running:
+            frame = self.spinner_frames[self.frame_idx % len(self.spinner_frames)]
+            elapsed = time.time() - self.start_time
+            
+            # Build the spinner line
+            line = f"  {frame} {self.message} ({elapsed:.1f}s)"
+            
+            # Clear previous line and write new one
+            clear = '\r' + ' ' * self.last_line_len + '\r'
+            print(clear + line, end='', flush=True)
+            self.last_line_len = len(line)
+            
+            self.frame_idx += 1
+            time.sleep(0.12)  # ~8 FPS animation
+    
+    def start(self):
+        """Start the spinner animation."""
+        if self.running:
+            return
+        self.running = True
+        self.start_time = time.time()
+        self.thread = threading.Thread(target=self._animate, daemon=True)
+        self.thread.start()
+    
+    def stop(self, final_message: str = None):
+        """Stop the spinner and optionally print a final message."""
+        self.running = False
+        if self.thread:
+            self.thread.join(timeout=0.5)
+        
+        # Clear the spinner line
+        print('\r' + ' ' * (self.last_line_len + 5) + '\r', end='', flush=True)
+        
+        # Print final message if provided
+        if final_message:
+            print(f"  {final_message}", flush=True)
+    
+    def __enter__(self):
+        self.start()
+        return self
+    
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        self.stop()
+        return False
+
+
 class AIAgent:
     """
     AI Agent with tool calling capabilities.
@@ -66,6 +167,7 @@ class AIAgent:
         disabled_toolsets: List[str] = None,
         save_trajectories: bool = False,
         verbose_logging: bool = False,
+        quiet_mode: bool = False,
         ephemeral_system_prompt: str = None,
         log_prefix_chars: int = 100,
         log_prefix: str = "",
@@ -87,6 +189,7 @@ class AIAgent:
             disabled_toolsets (List[str]): Disable tools from these toolsets (optional)
             save_trajectories (bool): Whether to save conversation trajectories to JSONL files (default: False)
             verbose_logging (bool): Enable verbose logging for debugging (default: False)
+            quiet_mode (bool): Suppress progress output for clean CLI experience (default: False)
             ephemeral_system_prompt (str): System prompt used during agent execution but NOT saved to trajectories (optional)
             log_prefix_chars (int): Number of characters to show in log previews for tool calls/responses (default: 20)
             log_prefix (str): Prefix to add to all log messages for identification in parallel processing (default: "")
@@ -100,6 +203,7 @@ class AIAgent:
         self.tool_delay = tool_delay
         self.save_trajectories = save_trajectories
         self.verbose_logging = verbose_logging
+        self.quiet_mode = quiet_mode
         self.ephemeral_system_prompt = ephemeral_system_prompt
         self.log_prefix_chars = log_prefix_chars
         self.log_prefix = f"{log_prefix} " if log_prefix else ""
@@ -135,7 +239,8 @@ class AIAgent:
             logging.getLogger('grpc').setLevel(logging.WARNING)
             logging.getLogger('modal').setLevel(logging.WARNING)
             logging.getLogger('rex-deploy').setLevel(logging.INFO)  # Keep INFO for sandbox status
-            print("🔍 Verbose logging enabled (third-party library logs suppressed)")
+            if not self.quiet_mode:
+                print("🔍 Verbose logging enabled (third-party library logs suppressed)")
         else:
             # Set logging to INFO level for important messages only
             logging.basicConfig(
@@ -167,22 +272,24 @@ class AIAgent:
         
         try:
             self.client = OpenAI(**client_kwargs)
-            print(f"🤖 AI Agent initialized with model: {self.model}")
-            if base_url:
-                print(f"🔗 Using custom base URL: {base_url}")
-            # Always show API key info (masked) for debugging auth issues
-            key_used = client_kwargs.get("api_key", "none")
-            if key_used and key_used != "dummy-key" and len(key_used) > 12:
-                print(f"🔑 Using API key: {key_used[:8]}...{key_used[-4:]}")
-            else:
-                print(f"⚠️  Warning: API key appears invalid or missing (got: '{key_used[:20] if key_used else 'none'}...')")
+            if not self.quiet_mode:
+                print(f"🤖 AI Agent initialized with model: {self.model}")
+                if base_url:
+                    print(f"🔗 Using custom base URL: {base_url}")
+                # Show API key info (masked) for debugging auth issues (skipped in quiet mode)
+                key_used = client_kwargs.get("api_key", "none")
+                if key_used and key_used != "dummy-key" and len(key_used) > 12:
+                    print(f"🔑 Using API key: {key_used[:8]}...{key_used[-4:]}")
+                else:
+                    print(f"⚠️  Warning: API key appears invalid or missing (got: '{key_used[:20] if key_used else 'none'}...')")
         except Exception as e:
             raise RuntimeError(f"Failed to initialize OpenAI client: {e}")
         
         # Get available tools with filtering
         self.tools = get_tool_definitions(
             enabled_toolsets=enabled_toolsets,
-            disabled_toolsets=disabled_toolsets
+            disabled_toolsets=disabled_toolsets,
+            quiet_mode=self.quiet_mode,
         )
         
         # Show tool configuration and store valid tool names for validation
@@ -190,32 +297,197 @@ class AIAgent:
         if self.tools:
             self.valid_tool_names = {tool["function"]["name"] for tool in self.tools}
             tool_names = sorted(self.valid_tool_names)
-            print(f"🛠️  Loaded {len(self.tools)} tools: {', '.join(tool_names)}")
-            
-            # Show filtering info if applied
-            if enabled_toolsets:
-                print(f"   ✅ Enabled toolsets: {', '.join(enabled_toolsets)}")
-            if disabled_toolsets:
-                print(f"   ❌ Disabled toolsets: {', '.join(disabled_toolsets)}")
-        else:
+            if not self.quiet_mode:
+                print(f"🛠️  Loaded {len(self.tools)} tools: {', '.join(tool_names)}")
+                
+                # Show filtering info if applied
+                if enabled_toolsets:
+                    print(f"   ✅ Enabled toolsets: {', '.join(enabled_toolsets)}")
+                if disabled_toolsets:
+                    print(f"   ❌ Disabled toolsets: {', '.join(disabled_toolsets)}")
+        elif not self.quiet_mode:
             print("🛠️  No tools loaded (all tools filtered out or unavailable)")
         
         # Check tool requirements
-        if self.tools:
+        if self.tools and not self.quiet_mode:
             requirements = check_toolset_requirements()
             missing_reqs = [name for name, available in requirements.items() if not available]
             if missing_reqs:
                 print(f"⚠️  Some tools may not work due to missing requirements: {missing_reqs}")
         
         # Show trajectory saving status
-        if self.save_trajectories:
+        if self.save_trajectories and not self.quiet_mode:
             print("📝 Trajectory saving enabled")
         
         # Show ephemeral system prompt status
-        if self.ephemeral_system_prompt:
+        if self.ephemeral_system_prompt and not self.quiet_mode:
             prompt_preview = self.ephemeral_system_prompt[:60] + "..." if len(self.ephemeral_system_prompt) > 60 else self.ephemeral_system_prompt
             print(f"🔒 Ephemeral system prompt: '{prompt_preview}' (not saved to trajectories)")
     
+    # Pools of kawaii faces for random selection
+    KAWAII_SEARCH = [
+        "♪(´ε` )", "(｡◕‿◕｡)", "ヾ(＾∇＾)", "(◕ᴗ◕✿)", "( ˘▽˘)っ",
+        "٩(◕‿◕｡)۶", "(✿◠‿◠)", "♪～(´ε｀ )", "(ノ´ヮ`)ノ*:・゚✧", "＼(◎o◎)／",
+    ]
+    KAWAII_READ = [
+        "φ(゜▽゜*)♪", "( ˘▽˘)っ", "(⌐■_■)", "٩(｡•́‿•̀｡)۶", "(◕‿◕✿)",
+        "ヾ(＠⌒ー⌒＠)ノ", "(✧ω✧)", "♪(๑ᴖ◡ᴖ๑)♪", "(≧◡≦)", "( ´ ▽ ` )ノ",
+    ]
+    KAWAII_TERMINAL = [
+        "ヽ(>∀<☆)ノ", "(ノ°∀°)ノ", "٩(^ᴗ^)۶", "ヾ(⌐■_■)ノ♪", "(•̀ᴗ•́)و",
+        "┗(＾0＾)┓", "(｀・ω・´)", "＼(￣▽￣)／", "(ง •̀_•́)ง", "ヽ(´▽`)/",
+    ]
+    KAWAII_BROWSER = [
+        "(ノ°∀°)ノ", "(☞゚ヮ゚)☞", "( ͡° ͜ʖ ͡°)", "┌( ಠ_ಠ)┘", "(⊙_⊙)？",
+        "ヾ(•ω•`)o", "(￣ω￣)", "( ˇωˇ )", "(ᵔᴥᵔ)", "＼(◎o◎)／",
+    ]
+    KAWAII_CREATE = [
+        "✧*。٩(ˊᗜˋ*)و✧", "(ﾉ◕ヮ◕)ﾉ*:・ﾟ✧", "ヽ(>∀<☆)ノ", "٩(♡ε♡)۶", "(◕‿◕)♡",
+        "✿◕ ‿ ◕✿", "(*≧▽≦)", "ヾ(＾-＾)ノ", "(☆▽☆)", "°˖✧◝(⁰▿⁰)◜✧˖°",
+    ]
+    KAWAII_SKILL = [
+        "ヾ(＠⌒ー⌒＠)ノ", "(๑˃ᴗ˂)ﻭ", "٩(◕‿◕｡)۶", "(✿╹◡╹)", "ヽ(・∀・)ノ",
+        "(ノ´ヮ`)ノ*:・ﾟ✧", "♪(๑ᴖ◡ᴖ๑)♪", "(◠‿◠)", "٩(ˊᗜˋ*)و", "(＾▽＾)",
+        "ヾ(＾∇＾)", "(★ω★)/", "٩(｡•́‿•̀｡)۶", "(◕ᴗ◕✿)", "＼(◎o◎)／",
+        "(✧ω✧)", "ヽ(>∀<☆)ノ", "( ˘▽˘)っ", "(≧◡≦) ♡", "ヾ(￣▽￣)",
+    ]
+    KAWAII_THINK = [
+        "(っ°Д°;)っ", "(；′⌒`)", "(・_・ヾ", "( ´_ゝ`)", "(￣ヘ￣)",
+        "(。-`ω´-)", "( ˘︹˘ )", "(¬_¬)", "ヽ(ー_ー )ノ", "(；一_一)",
+    ]
+    KAWAII_GENERIC = [
+        "♪(´ε` )", "(◕‿◕✿)", "ヾ(＾∇＾)", "٩(◕‿◕｡)۶", "(✿◠‿◠)",
+        "(ノ´ヮ`)ノ*:・ﾟ✧", "ヽ(>∀<☆)ノ", "(☆▽☆)", "( ˘▽˘)っ", "(≧◡≦)",
+    ]
+    
+    def _get_cute_tool_message(self, tool_name: str, args: dict, duration: float) -> str:
+        """
+        Generate a kawaii ASCII/unicode art message for tool execution in CLI mode.
+        
+        Args:
+            tool_name: Name of the tool being called
+            args: Arguments passed to the tool
+            duration: How long the tool took to execute
+        
+        Returns:
+            A cute ASCII art message about what the tool did
+        """
+        time_str = f"⏱ {duration:.1f}s"
+        
+        # Web tools - show what we're searching/reading
+        if tool_name == "web_search":
+            query = args.get("query", "the web")
+            if len(query) > 40:
+                query = query[:37] + "..."
+            face = random.choice(self.KAWAII_SEARCH)
+            return f"{face} 🔍 Searching for '{query}'... {time_str}"
+        
+        elif tool_name == "web_extract":
+            urls = args.get("urls", [])
+            face = random.choice(self.KAWAII_READ)
+            if urls:
+                url = urls[0] if isinstance(urls, list) else str(urls)
+                domain = url.replace("https://", "").replace("http://", "").split("/")[0]
+                if len(domain) > 25:
+                    domain = domain[:22] + "..."
+                if isinstance(urls, list) and len(urls) > 1:
+                    return f"{face} 📖 Reading {domain} +{len(urls)-1} more... {time_str}"
+                return f"{face} 📖 Reading {domain}... {time_str}"
+            return f"{face} 📖 Reading pages... {time_str}"
+        
+        elif tool_name == "web_crawl":
+            url = args.get("url", "website")
+            domain = url.replace("https://", "").replace("http://", "").split("/")[0]
+            if len(domain) > 25:
+                domain = domain[:22] + "..."
+            face = random.choice(self.KAWAII_READ)
+            return f"{face} 🕸️ Crawling {domain}... {time_str}"
+        
+        # Terminal tool
+        elif tool_name == "terminal":
+            command = args.get("command", "")
+            if len(command) > 30:
+                command = command[:27] + "..."
+            face = random.choice(self.KAWAII_TERMINAL)
+            return f"{face} 💻 $ {command} {time_str}"
+        
+        # Browser tools
+        elif tool_name == "browser_navigate":
+            url = args.get("url", "page")
+            domain = url.replace("https://", "").replace("http://", "").split("/")[0]
+            if len(domain) > 25:
+                domain = domain[:22] + "..."
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} 🌐 → {domain} {time_str}"
+        
+        elif tool_name == "browser_snapshot":
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} 📸 *snap* {time_str}"
+        
+        elif tool_name == "browser_click":
+            element = args.get("ref", "element")
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} 👆 *click* {element} {time_str}"
+        
+        elif tool_name == "browser_type":
+            text = args.get("text", "")
+            if len(text) > 15:
+                text = text[:12] + "..."
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} ⌨️ typing '{text}' {time_str}"
+        
+        elif tool_name == "browser_scroll":
+            direction = args.get("direction", "down")
+            arrow = "↓" if direction == "down" else "↑"
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} {arrow} scrolling {direction}... {time_str}"
+        
+        elif tool_name == "browser_back":
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} ← going back... {time_str}"
+        
+        elif tool_name == "browser_vision":
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} 👁️ analyzing visually... {time_str}"
+        
+        # Image generation
+        elif tool_name == "image_generate":
+            prompt = args.get("prompt", "image")
+            if len(prompt) > 20:
+                prompt = prompt[:17] + "..."
+            face = random.choice(self.KAWAII_CREATE)
+            return f"{face} 🎨 creating '{prompt}'... {time_str}"
+        
+        # Skills - use large pool for variety
+        elif tool_name == "skills_categories":
+            face = random.choice(self.KAWAII_SKILL)
+            return f"{face} 📚 listing categories... {time_str}"
+        
+        elif tool_name == "skills_list":
+            category = args.get("category", "skills")
+            face = random.choice(self.KAWAII_SKILL)
+            return f"{face} 📋 listing {category} skills... {time_str}"
+        
+        elif tool_name == "skill_view":
+            name = args.get("name", "skill")
+            face = random.choice(self.KAWAII_SKILL)
+            return f"{face} 📖 loading {name}... {time_str}"
+        
+        # Vision tools
+        elif tool_name == "vision_analyze":
+            face = random.choice(self.KAWAII_BROWSER)
+            return f"{face} 👁️✨ analyzing image... {time_str}"
+        
+        # Mixture of agents
+        elif tool_name == "mixture_of_agents":
+            face = random.choice(self.KAWAII_THINK)
+            return f"{face} 🧠💭 thinking REALLY hard... {time_str}"
+        
+        # Default fallback - random generic kawaii
+        else:
+            face = random.choice(self.KAWAII_GENERIC)
+            return f"{face} ⚡ {tool_name}... {time_str}"
+    
     def _has_content_after_think_block(self, content: str) -> bool:
         """
         Check if content has actual text after any <think></think> blocks.
@@ -506,7 +778,8 @@ class AIAgent:
             "content": user_message
         })
         
-        print(f"💬 Starting conversation: '{user_message[:60]}{'...' if len(user_message) > 60 else ''}'")
+        if not self.quiet_mode:
+            print(f"💬 Starting conversation: '{user_message[:60]}{'...' if len(user_message) > 60 else ''}'")
         
         # Determine which system prompt to use for API calls (ephemeral)
         # Priority: explicit system_message > ephemeral_system_prompt > None
@@ -554,9 +827,20 @@ class AIAgent:
             total_chars = sum(len(str(msg)) for msg in api_messages)
             approx_tokens = total_chars // 4  # Rough estimate: 4 chars per token
             
-            print(f"\n{self.log_prefix}🔄 Making API call #{api_call_count}/{self.max_iterations}...")
-            print(f"{self.log_prefix}   📊 Request size: {len(api_messages)} messages, ~{approx_tokens:,} tokens (~{total_chars:,} chars)")
-            print(f"{self.log_prefix}   🔧 Available tools: {len(self.tools) if self.tools else 0}")
+            # Thinking spinner for quiet mode (animated during API call)
+            thinking_spinner = None
+            
+            if not self.quiet_mode:
+                print(f"\n{self.log_prefix}🔄 Making API call #{api_call_count}/{self.max_iterations}...")
+                print(f"{self.log_prefix}   📊 Request size: {len(api_messages)} messages, ~{approx_tokens:,} tokens (~{total_chars:,} chars)")
+                print(f"{self.log_prefix}   🔧 Available tools: {len(self.tools) if self.tools else 0}")
+            else:
+                # Animated thinking spinner in quiet mode
+                face = random.choice(KawaiiSpinner.KAWAII_THINKING)
+                verb = random.choice(KawaiiSpinner.THINKING_VERBS)
+                spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
+                thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
+                thinking_spinner.start()
             
             # Log request details if verbose
             if self.verbose_logging:
@@ -609,7 +893,15 @@ class AIAgent:
                     response = self.client.chat.completions.create(**api_kwargs)
                     
                     api_duration = time.time() - api_start_time
-                    print(f"{self.log_prefix}⏱️  API call completed in {api_duration:.2f}s")
+                    
+                    # Stop thinking spinner with cute completion message
+                    if thinking_spinner:
+                        face = random.choice(["(◕‿◕✿)", "ヾ(＾∇＾)", "(≧◡≦)", "✧٩(ˊᗜˋ*)و✧", "(*^▽^*)"])
+                        thinking_spinner.stop(f"{face} got it! ({api_duration:.1f}s)")
+                        thinking_spinner = None
+                    
+                    if not self.quiet_mode:
+                        print(f"{self.log_prefix}⏱️  API call completed in {api_duration:.2f}s")
                     
                     if self.verbose_logging:
                         # Log response with provider info if available
@@ -618,6 +910,11 @@ class AIAgent:
 
                     # Validate response has valid choices before proceeding
                     if response is None or not hasattr(response, 'choices') or response.choices is None or len(response.choices) == 0:
+                        # Stop spinner before printing error messages
+                        if thinking_spinner:
+                            thinking_spinner.stop("(´;ω;`) oops, retrying...")
+                            thinking_spinner = None
+                        
                         # This is often rate limiting or provider returning malformed response
                         retry_count += 1
                         error_details = []
@@ -722,6 +1019,11 @@ class AIAgent:
                     break  # Success, exit retry loop
 
                 except Exception as api_error:
+                    # Stop spinner before printing error messages
+                    if thinking_spinner:
+                        thinking_spinner.stop("(╥_╥) error, retrying...")
+                        thinking_spinner = None
+                    
                     retry_count += 1
                     elapsed_time = time.time() - api_start_time
                     
@@ -769,12 +1071,13 @@ class AIAgent:
                 assistant_message = response.choices[0].message
                 
                 # Handle assistant response
-                if assistant_message.content:
+                if assistant_message.content and not self.quiet_mode:
                     print(f"{self.log_prefix}🤖 Assistant: {assistant_message.content[:100]}{'...' if len(assistant_message.content) > 100 else ''}")
                 
                 # Check for tool calls
                 if assistant_message.tool_calls:
-                    print(f"{self.log_prefix}🔧 Processing {len(assistant_message.tool_calls)} tool call(s)...")
+                    if not self.quiet_mode:
+                        print(f"{self.log_prefix}🔧 Processing {len(assistant_message.tool_calls)} tool call(s)...")
                     
                     if self.verbose_logging:
                         for tc in assistant_message.tool_calls:
@@ -894,17 +1197,49 @@ class AIAgent:
                             logging.warning(f"Unexpected JSON error after validation: {e}")
                             function_args = {}
                         
-                        # Preview tool call arguments
-                        args_str = json.dumps(function_args, ensure_ascii=False)
-                        args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
-                        print(f"  📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
+                        # Preview tool call arguments (only in non-quiet mode)
+                        if not self.quiet_mode:
+                            args_str = json.dumps(function_args, ensure_ascii=False)
+                            args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
+                            print(f"  📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
 
                         tool_start_time = time.time()
 
-                        # Execute the tool with task_id to isolate VMs between concurrent tasks
-                        function_result = handle_function_call(function_name, function_args, effective_task_id)
+                        # Execute the tool - with animated spinner in quiet mode
+                        if self.quiet_mode:
+                            # Tool-specific spinner animations
+                            tool_spinners = {
+                                'web_search': ('arrows', ['🔍', '🌐', '📡', '🔎']),
+                                'web_extract': ('grow', ['📄', '📖', '📑', '🗒️']),
+                                'web_crawl': ('arrows', ['🕷️', '🕸️', '🔗', '🌐']),
+                                'terminal': ('dots', ['💻', '⌨️', '🖥️', '📟']),
+                                'browser_navigate': ('moon', ['🌐', '🧭', '🔗', '🚀']),
+                                'browser_click': ('bounce', ['👆', '🖱️', '👇', '✨']),
+                                'browser_type': ('dots', ['⌨️', '✍️', '📝', '💬']),
+                                'browser_snapshot': ('star', ['📸', '🖼️', '📷', '✨']),
+                                'image_generate': ('sparkle', ['🎨', '✨', '🖼️', '🌟']),
+                                'skill_view': ('star', ['📚', '📖', '🎓', '✨']),
+                                'skills_list': ('pulse', ['📋', '📝', '📑', '📜']),
+                                'skills_categories': ('pulse', ['📂', '🗂️', '📁', '🏷️']),
+                                'mixture_of_agents': ('brain', ['🧠', '💭', '🤔', '💡']),
+                                'vision_analyze': ('sparkle', ['👁️', '🔍', '📷', '✨']),
+                            }
+                            
+                            spinner_type, tool_emojis = tool_spinners.get(function_name, ('dots', ['⚙️', '🔧', '⚡', '✨']))
+                            face = random.choice(KawaiiSpinner.KAWAII_WAITING)
+                            tool_emoji = random.choice(tool_emojis)
+                            spinner = KawaiiSpinner(f"{face} {tool_emoji} {function_name}...", spinner_type=spinner_type)
+                            spinner.start()
+                            try:
+                                function_result = handle_function_call(function_name, function_args, effective_task_id)
+                            finally:
+                                tool_duration = time.time() - tool_start_time
+                                cute_msg = self._get_cute_tool_message(function_name, function_args, tool_duration)
+                                spinner.stop(cute_msg)
+                        else:
+                            function_result = handle_function_call(function_name, function_args, effective_task_id)
+                            tool_duration = time.time() - tool_start_time
 
-                        tool_duration = time.time() - tool_start_time
                         result_preview = function_result[:200] if len(function_result) > 200 else function_result
 
                         if self.verbose_logging:
@@ -918,9 +1253,10 @@ class AIAgent:
                             "tool_call_id": tool_call.id
                         })
 
-                        # Preview tool response
-                        response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
-                        print(f"  ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
+                        # Preview tool response (only in non-quiet mode)
+                        if not self.quiet_mode:
+                            response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
+                            print(f"  ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
                         
                         # Delay between tool calls
                         if self.tool_delay > 0 and i < len(assistant_message.tool_calls):
@@ -997,7 +1333,8 @@ class AIAgent:
                     
                     messages.append(final_msg)
                     
-                    print(f"🎉 Conversation completed after {api_call_count} OpenAI-compatible API call(s)")
+                    if not self.quiet_mode:
+                        print(f"🎉 Conversation completed after {api_call_count} OpenAI-compatible API call(s)")
                     break
                 
             except Exception as e:
diff --git a/tools/browser_tool.py b/tools/browser_tool.py
index 917c32560..6ee5c0ae4 100644
--- a/tools/browser_tool.py
+++ b/tools/browser_tool.py
@@ -1343,8 +1343,9 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
     if task_id is None:
         task_id = "default"
     
-    print(f"[browser_tool] cleanup_browser called for task_id: {task_id}", file=sys.stderr)
-    print(f"[browser_tool] Active sessions: {list(_active_sessions.keys())}", file=sys.stderr)
+    if not os.getenv("HERMES_QUIET"):
+        print(f"[browser_tool] cleanup_browser called for task_id: {task_id}", file=sys.stderr)
+        print(f"[browser_tool] Active sessions: {list(_active_sessions.keys())}", file=sys.stderr)
     
     if task_id in _active_sessions:
         session_info = _active_sessions[task_id]
@@ -1368,8 +1369,9 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
             print(f"[browser_tool] Exception during BrowserBase session close: {e}", file=sys.stderr)
         
         del _active_sessions[task_id]
-        print(f"[browser_tool] Removed task {task_id} from active sessions", file=sys.stderr)
-    else:
+        if not os.getenv("HERMES_QUIET"):
+            print(f"[browser_tool] Removed task {task_id} from active sessions", file=sys.stderr)
+    elif not os.getenv("HERMES_QUIET"):
         print(f"[browser_tool] No active session found for task_id: {task_id}", file=sys.stderr)
 
 
diff --git a/tools/terminal_tool.py b/tools/terminal_tool.py
index 389b1c96c..9f83d732d 100644
--- a/tools/terminal_tool.py
+++ b/tools/terminal_tool.py
@@ -64,11 +64,13 @@ def _get_scratch_dir() -> Path:
         # Create user-specific subdirectory
         user_scratch = scratch / os.getenv("USER", "hermes") / "hermes-agent"
         user_scratch.mkdir(parents=True, exist_ok=True)
-        print(f"[Terminal] Using /scratch for sandboxes: {user_scratch}")
+        if not os.getenv("HERMES_QUIET"):
+            print(f"[Terminal] Using /scratch for sandboxes: {user_scratch}")
         return user_scratch
     
     # Fall back to /tmp
-    print("[Terminal] Warning: /scratch not available, using /tmp (limited space)")
+    if not os.getenv("HERMES_QUIET"):
+        print("[Terminal] Warning: /scratch not available, using /tmp (limited space)")
     return Path(tempfile.gettempdir())
 
 
@@ -307,6 +309,144 @@ class _SingularityEnvironment:
         """Cleanup on destruction."""
         self.cleanup()
 
+
+class _SSHEnvironment:
+    """
+    SSH-based remote execution environment.
+    
+    Runs commands on a remote machine over SSH, keeping the agent code
+    completely isolated from the execution environment. Uses SSH ControlMaster
+    for connection persistence (faster subsequent commands).
+    
+    Security benefits:
+    - Agent cannot modify its own code
+    - Remote machine acts as a sandbox
+    - Clear separation between agent and execution environment
+    """
+    
+    def __init__(self, host: str, user: str, cwd: str = "/tmp", timeout: int = 60,
+                 port: int = 22, key_path: str = ""):
+        self.host = host
+        self.user = user
+        self.cwd = cwd
+        self.timeout = timeout
+        self.port = port
+        self.key_path = key_path
+        
+        # Create control socket directory for connection persistence
+        self.control_dir = Path(tempfile.gettempdir()) / "hermes-ssh"
+        self.control_dir.mkdir(parents=True, exist_ok=True)
+        self.control_socket = self.control_dir / f"{user}@{host}:{port}.sock"
+        
+        # Test connection and establish ControlMaster
+        self._establish_connection()
+    
+    def _build_ssh_command(self, extra_args: list | None = None) -> list:
+        """Build base SSH command with connection options."""
+        cmd = ["ssh"]
+        
+        # Connection multiplexing for performance
+        cmd.extend(["-o", f"ControlPath={self.control_socket}"])
+        cmd.extend(["-o", "ControlMaster=auto"])
+        cmd.extend(["-o", "ControlPersist=300"])  # Keep connection alive for 5 min
+        
+        # Standard options
+        cmd.extend(["-o", "BatchMode=yes"])  # No password prompts
+        cmd.extend(["-o", "StrictHostKeyChecking=accept-new"])  # Accept new hosts
+        cmd.extend(["-o", "ConnectTimeout=10"])
+        
+        # Port
+        if self.port != 22:
+            cmd.extend(["-p", str(self.port)])
+        
+        # Private key
+        if self.key_path:
+            cmd.extend(["-i", self.key_path])
+        
+        # Extra args (like -t for TTY)
+        if extra_args:
+            cmd.extend(extra_args)
+        
+        # Target
+        cmd.append(f"{self.user}@{self.host}")
+        
+        return cmd
+    
+    def _establish_connection(self):
+        """Test SSH connection and establish ControlMaster."""
+        cmd = self._build_ssh_command()
+        cmd.append("echo 'SSH connection established'")
+        
+        try:
+            result = subprocess.run(
+                cmd,
+                capture_output=True,
+                text=True,
+                timeout=15
+            )
+            if result.returncode != 0:
+                error_msg = result.stderr.strip() or result.stdout.strip()
+                raise RuntimeError(f"SSH connection failed: {error_msg}")
+        except subprocess.TimeoutExpired:
+            raise RuntimeError(f"SSH connection to {self.user}@{self.host} timed out")
+    
+    def execute(self, command: str, cwd: str = "", *, timeout: int | None = None) -> dict:
+        """Execute a command on the remote host via SSH."""
+        work_dir = cwd or self.cwd
+        effective_timeout = timeout or self.timeout
+        
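+        # ssh joins its arguments with spaces and the remote shell re-parses them,
+        # so the bash -c script must be quoted as one word to keep cd and the command together
+        import shlex  # local import: shlex may not be imported at module level
+        wrapped_command = f'cd {work_dir} && {command}'
+        cmd = self._build_ssh_command()
+        cmd.append(f"bash -c {shlex.quote(wrapped_command)}")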
+        
+        try:
+            result = subprocess.run(
+                cmd,
+                text=True,
+                timeout=effective_timeout,
+                encoding="utf-8",
+                errors="replace",
+                stdout=subprocess.PIPE,
+                stderr=subprocess.STDOUT,
+            )
+            return {"output": result.stdout, "returncode": result.returncode}
+        except subprocess.TimeoutExpired:
+            return {"output": f"Command timed out after {effective_timeout}s", "returncode": 124}
+        except Exception as e:
+            return {"output": f"SSH execution error: {str(e)}", "returncode": 1}
+    
+    def cleanup(self):
+        """Close the SSH ControlMaster connection."""
+        if self.control_socket.exists():
+            try:
+                # Send exit command to ControlMaster
+                cmd = ["ssh", "-o", f"ControlPath={self.control_socket}", "-O", "exit", 
+                       f"{self.user}@{self.host}"]
+                subprocess.run(cmd, capture_output=True, timeout=5)
+            except Exception:
+                pass
+            
+            # Remove socket file
+            try:
+                self.control_socket.unlink()
+            except Exception:
+                pass
+    
+    def stop(self):
+        """Alias for cleanup."""
+        self.cleanup()
+    
+    def __del__(self):
+        """Cleanup on destruction."""
+        try:
+            self.cleanup()
+        except Exception:
+            pass
+
+
 # Tool description for LLM
 TERMINAL_TOOL_DESCRIPTION = """Execute commands on a secure Linux environment.
 
@@ -348,25 +488,31 @@ _cleanup_running = False
 def _get_env_config() -> Dict[str, Any]:
     """Get terminal environment configuration from environment variables."""
     return {
-        "env_type": os.getenv("TERMINAL_ENV", "local"),  # local, docker, singularity, or modal
+        "env_type": os.getenv("TERMINAL_ENV", "local"),  # local, docker, singularity, modal, or ssh
         "docker_image": os.getenv("TERMINAL_DOCKER_IMAGE", "python:3.11"),
         "singularity_image": os.getenv("TERMINAL_SINGULARITY_IMAGE", "docker://python:3.11"),
         "modal_image": os.getenv("TERMINAL_MODAL_IMAGE", "python:3.11"),
         "cwd": os.getenv("TERMINAL_CWD", "/tmp"),
         "timeout": int(os.getenv("TERMINAL_TIMEOUT", "60")),
         "lifetime_seconds": int(os.getenv("TERMINAL_LIFETIME_SECONDS", "300")),
+        # SSH-specific config
+        "ssh_host": os.getenv("TERMINAL_SSH_HOST", ""),
+        "ssh_user": os.getenv("TERMINAL_SSH_USER", ""),
+        "ssh_port": int(os.getenv("TERMINAL_SSH_PORT", "22")),
+        "ssh_key": os.getenv("TERMINAL_SSH_KEY", ""),  # Path to private key (optional, uses ssh-agent if empty)
     }
 
 
-def _create_environment(env_type: str, image: str, cwd: str, timeout: int):
+def _create_environment(env_type: str, image: str, cwd: str, timeout: int, ssh_config: dict | None = None):
     """
     Create an execution environment from mini-swe-agent.
     
     Args:
-        env_type: One of "local", "docker", "singularity", "modal"
-        image: Docker/Singularity/Modal image name (ignored for local)
+        env_type: One of "local", "docker", "singularity", "modal", "ssh"
+        image: Docker/Singularity/Modal image name (ignored for local/ssh)
         cwd: Working directory
         timeout: Default command timeout
+        ssh_config: SSH connection config (for env_type="ssh")
         
     Returns:
         Environment instance with execute() method
@@ -387,8 +533,20 @@ def _create_environment(env_type: str, image: str, cwd: str, timeout: int):
         from minisweagent.environments.extra.swerex_modal import SwerexModalEnvironment
         return SwerexModalEnvironment(image=image, cwd=cwd, timeout=timeout)
     
+    elif env_type == "ssh":
+        if not ssh_config or not ssh_config.get("host") or not ssh_config.get("user"):
+            raise ValueError("SSH environment requires ssh_host and ssh_user to be configured")
+        return _SSHEnvironment(
+            host=ssh_config["host"],
+            user=ssh_config["user"],
+            port=ssh_config.get("port", 22),
+            key_path=ssh_config.get("key", ""),
+            cwd=cwd,
+            timeout=timeout
+        )
+    
     else:
-        raise ValueError(f"Unknown environment type: {env_type}. Use 'local', 'docker', 'singularity', or 'modal'")
+        raise ValueError(f"Unknown environment type: {env_type}. Use 'local', 'docker', 'singularity', 'modal', or 'ssh'")
 
 
 def _cleanup_inactive_envs(lifetime_seconds: int = 300):
@@ -416,7 +574,8 @@ def _cleanup_inactive_envs(lifetime_seconds: int = 300):
                         env.terminate()
 
                     del _active_environments[task_id]
-                    print(f"[Terminal Cleanup] Cleaned up inactive environment for task: {task_id}")
+                    if not os.getenv("HERMES_QUIET"):
+                        print(f"[Terminal Cleanup] Cleaned up inactive environment for task: {task_id}")
 
                 if task_id in _last_activity:
                     del _last_activity[task_id]
@@ -425,10 +584,11 @@ def _cleanup_inactive_envs(lifetime_seconds: int = 300):
 
             except Exception as e:
                 error_str = str(e)
-                if "404" in error_str or "not found" in error_str.lower():
-                    print(f"[Terminal Cleanup] Environment for task {task_id} already cleaned up")
-                else:
-                    print(f"[Terminal Cleanup] Error cleaning up environment for task {task_id}: {e}")
+                if not os.getenv("HERMES_QUIET"):
+                    if "404" in error_str or "not found" in error_str.lower():
+                        print(f"[Terminal Cleanup] Environment for task {task_id} already cleaned up")
+                    else:
+                        print(f"[Terminal Cleanup] Error cleaning up environment for task {task_id}: {e}")
                 
                 # Always remove from tracking dicts
                 if task_id in _active_environments:
@@ -448,7 +608,8 @@ def _cleanup_thread_worker():
             config = _get_env_config()
             _cleanup_inactive_envs(config["lifetime_seconds"])
         except Exception as e:
-            print(f"[Terminal Cleanup] Error in cleanup thread: {e}")
+            if not os.getenv("HERMES_QUIET"):
+                print(f"[Terminal Cleanup] Error in cleanup thread: {e}")
 
         for _ in range(60):
             if not _cleanup_running:
@@ -545,7 +706,8 @@ def cleanup_vm(task_id: str):
                     env.terminate()
 
                 del _active_environments[task_id]
-                print(f"[Terminal Cleanup] Manually cleaned up environment for task: {task_id}")
+                if not os.getenv("HERMES_QUIET"):
+                    print(f"[Terminal Cleanup] Manually cleaned up environment for task: {task_id}")
 
             if task_id in _task_workdirs:
                 del _task_workdirs[task_id]
@@ -554,11 +716,12 @@ def cleanup_vm(task_id: str):
                 del _last_activity[task_id]
 
         except Exception as e:
-            error_str = str(e)
-            if "404" in error_str or "not found" in error_str.lower():
-                print(f"[Terminal Cleanup] Environment for task {task_id} already cleaned up")
-            else:
-                print(f"[Terminal Cleanup] Error cleaning up environment for task {task_id}: {e}")
+            if not os.getenv("HERMES_QUIET"):
+                error_str = str(e)
+                if "404" in error_str or "not found" in error_str.lower():
+                    print(f"[Terminal Cleanup] Environment for task {task_id} already cleaned up")
+                else:
+                    print(f"[Terminal Cleanup] Error cleaning up environment for task {task_id}: {e}")
 
 
 atexit.register(_stop_cleanup_thread)
@@ -616,9 +779,10 @@ def terminal_tool(
         # Use task_id for environment isolation
         effective_task_id = task_id or "default"
 
-        # For local environment, create a unique subdirectory per task
+        # For local environment in batch mode, create a unique subdirectory per task
         # This prevents parallel tasks from overwriting each other's files
-        if env_type == "local":
+        # In CLI mode (HERMES_QUIET), use the cwd directly without subdirectories
+        if env_type == "local" and not os.getenv("HERMES_QUIET"):
             import uuid
             with _env_lock:
                 if effective_task_id not in _task_workdirs:
@@ -637,11 +801,22 @@ def terminal_tool(
                 _check_disk_usage_warning()
                 
                 try:
+                    # Build SSH config if using SSH environment
+                    ssh_config = None
+                    if env_type == "ssh":
+                        ssh_config = {
+                            "host": config.get("ssh_host", ""),
+                            "user": config.get("ssh_user", ""),
+                            "port": config.get("ssh_port", 22),
+                            "key": config.get("ssh_key", ""),
+                        }
+                    
                     _active_environments[effective_task_id] = _create_environment(
                         env_type=env_type,
                         image=image,
                         cwd=cwd,
-                        timeout=effective_timeout
+                        timeout=effective_timeout,
+                        ssh_config=ssh_config
                     )
                 except ImportError as e:
                     return json.dumps({
diff --git a/tools/web_tools.py b/tools/web_tools.py
index ed89bf0e6..e5fe72a9b 100644
--- a/tools/web_tools.py
+++ b/tools/web_tools.py
@@ -99,7 +99,13 @@ DEBUG_DATA = {
+def _verbose_print(*args, **kwargs):
+    """Print only if not in quiet mode (HERMES_QUIET not set)."""
+    if not os.getenv("HERMES_QUIET"):
+        print(*args, **kwargs)
+
+
 # Create logs directory if debug mode is enabled
 if DEBUG_MODE:
     DEBUG_LOG_PATH.mkdir(exist_ok=True)
-    print(f"🐛 Debug mode enabled - Session ID: {DEBUG_SESSION_ID}")
+    _verbose_print(f"🐛 Debug mode enabled - Session ID: {DEBUG_SESSION_ID}")
 
 
 def _log_debug_call(tool_name: str, call_data: Dict[str, Any]) -> None:
@@ -140,7 +146,7 @@ def _save_debug_log() -> None:
         with open(debug_filepath, 'w', encoding='utf-8') as f:
             json.dump(DEBUG_DATA, f, indent=2, ensure_ascii=False)
         
-        print(f"🐛 Debug log saved: {debug_filepath}")
+        _verbose_print(f"🐛 Debug log saved: {debug_filepath}")
         
     except Exception as e:
         print(f"❌ Error saving debug log: {str(e)}")
@@ -185,12 +191,12 @@ async def process_content_with_llm(
         # Refuse if content is absurdly large
         if content_len > MAX_CONTENT_SIZE:
             size_mb = content_len / 1_000_000
-            print(f"🚫 Content too large ({size_mb:.1f}MB > 2MB limit). Refusing to process.")
+            _verbose_print(f"🚫 Content too large ({size_mb:.1f}MB > 2MB limit). Refusing to process.")
             return f"[Content too large to process: {size_mb:.1f}MB. Try using web_crawl with specific extraction instructions, or search for a more focused source.]"
         
         # Skip processing if content is too short
         if content_len < min_length:
-            print(f"📏 Content too short ({content_len} < {min_length} chars), skipping LLM processing")
+            _verbose_print(f"📏 Content too short ({content_len} < {min_length} chars), skipping LLM processing")
             return None
         
         # Create context information
@@ -203,13 +209,13 @@ async def process_content_with_llm(
         
         # Check if we need chunked processing
         if content_len > CHUNK_THRESHOLD:
-            print(f"📦 Content large ({content_len:,} chars). Using chunked processing...")
+            _verbose_print(f"📦 Content large ({content_len:,} chars). Using chunked processing...")
             return await _process_large_content_chunked(
                 content, context_str, model, CHUNK_SIZE, MAX_OUTPUT_SIZE
             )
         
         # Standard single-pass processing for normal content
-        print(f"🧠 Processing content with LLM ({content_len} characters)")
+        _verbose_print(f"🧠 Processing content with LLM ({content_len} characters)")
         
         processed_content = await _call_summarizer_llm(content, context_str, model)
         
@@ -221,7 +227,7 @@ async def process_content_with_llm(
             # Log compression metrics
             processed_length = len(processed_content)
             compression_ratio = processed_length / content_len if content_len > 0 else 1.0
-            print(f"✅ Content processed: {content_len} → {processed_length} chars ({compression_ratio:.1%})")
+            _verbose_print(f"✅ Content processed: {content_len} → {processed_length} chars ({compression_ratio:.1%})")
         
         return processed_content
         
@@ -318,8 +324,8 @@ Create a markdown summary that captures all key information in a well-organized,
         except Exception as api_error:
             last_error = api_error
             if attempt < max_retries - 1:
-                print(f"⚠️  LLM API call failed (attempt {attempt + 1}/{max_retries}): {str(api_error)[:100]}")
-                print(f"   Retrying in {retry_delay}s...")
+                _verbose_print(f"⚠️  LLM API call failed (attempt {attempt + 1}/{max_retries}): {str(api_error)[:100]}")
+                _verbose_print(f"   Retrying in {retry_delay}s...")
                 await asyncio.sleep(retry_delay)
                 retry_delay = min(retry_delay * 2, 60)
             else:
@@ -355,7 +361,7 @@ async def _process_large_content_chunked(
         chunk = content[i:i + chunk_size]
         chunks.append(chunk)
     
-    print(f"   📦 Split into {len(chunks)} chunks of ~{chunk_size:,} chars each")
+    _verbose_print(f"   📦 Split into {len(chunks)} chunks of ~{chunk_size:,} chars each")
     
     # Summarize each chunk in parallel
     async def summarize_chunk(chunk_idx: int, chunk_content: str) -> tuple[int, Optional[str]]:
@@ -371,10 +377,10 @@ async def _process_large_content_chunked(
                 chunk_info=chunk_info
             )
             if summary:
-                print(f"   ✅ Chunk {chunk_idx + 1}/{len(chunks)} summarized: {len(chunk_content):,} → {len(summary):,} chars")
+                _verbose_print(f"   ✅ Chunk {chunk_idx + 1}/{len(chunks)} summarized: {len(chunk_content):,} → {len(summary):,} chars")
             return chunk_idx, summary
         except Exception as e:
-            print(f"   ⚠️  Chunk {chunk_idx + 1}/{len(chunks)} failed: {str(e)[:50]}")
+            _verbose_print(f"   ⚠️  Chunk {chunk_idx + 1}/{len(chunks)} failed: {str(e)[:50]}")
             return chunk_idx, None
     
     # Run all chunk summarizations in parallel
@@ -391,7 +397,7 @@ async def _process_large_content_chunked(
         print(f"   ❌ All chunk summarizations failed")
         return "[Failed to process large content: all chunk summarizations failed]"
     
-    print(f"   📊 Got {len(summaries)}/{len(chunks)} chunk summaries")
+    _verbose_print(f"   📊 Got {len(summaries)}/{len(chunks)} chunk summaries")
     
     # If only one chunk succeeded, just return it (with cap)
     if len(summaries) == 1:
@@ -401,7 +407,7 @@ async def _process_large_content_chunked(
         return result
     
     # Synthesize the summaries into a final summary
-    print(f"   🔗 Synthesizing {len(summaries)} summaries...")
+    _verbose_print(f"   🔗 Synthesizing {len(summaries)} summaries...")
     
     combined_summaries = "\n\n---\n\n".join(summaries)
     
@@ -443,11 +449,11 @@ Create a single, unified markdown summary."""
         final_len = len(final_summary)
         compression = final_len / original_len if original_len > 0 else 1.0
         
-        print(f"   ✅ Synthesis complete: {original_len:,} → {final_len:,} chars ({compression:.2%})")
+        _verbose_print(f"   ✅ Synthesis complete: {original_len:,} → {final_len:,} chars ({compression:.2%})")
         return final_summary
         
     except Exception as e:
-        print(f"   ⚠️  Synthesis failed: {str(e)[:100]}")
+        _verbose_print(f"   ⚠️  Synthesis failed: {str(e)[:100]}")
         # Fall back to concatenated summaries with truncation
         fallback = "\n\n".join(summaries)
         if len(fallback) > max_output_size:
@@ -534,7 +540,8 @@ def web_search_tool(query: str, limit: int = 5) -> str:
     }
     
     try:
-        print(f"🔍 Searching the web for: '{query}' (limit: {limit})")
+        if not os.getenv("HERMES_QUIET"):
+            print(f"🔍 Searching the web for: '{query}' (limit: {limit})")
         
         # Use Firecrawl's v2 search functionality WITHOUT scraping
         # We only want search result metadata, not scraped content
@@ -574,7 +581,8 @@ def web_search_tool(query: str, limit: int = 5) -> str:
                 web_results = response['web']
         
         results_count = len(web_results)
-        print(f"✅ Found {results_count} search results")
+        if not os.getenv("HERMES_QUIET"):
+            print(f"✅ Found {results_count} search results")
         
         # Build response with just search metadata (URLs, titles, descriptions)
         response_data = {
@@ -654,7 +662,7 @@ async def web_extract_tool(
     }
     
     try:
-        print(f"📄 Extracting content from {len(urls)} URL(s)")
+        _verbose_print(f"📄 Extracting content from {len(urls)} URL(s)")
         
         # Determine requested formats for Firecrawl v2
         formats: List[str] = []
@@ -672,7 +680,7 @@ async def web_extract_tool(
         
         for url in urls:
             try:
-                print(f"  📄 Scraping: {url}")
+                _verbose_print(f"  📄 Scraping: {url}")
                 scrape_result = _get_firecrawl_client().scrape(
                     url=url,
                     formats=formats
@@ -748,14 +756,14 @@ async def web_extract_tool(
         response = {"results": results}
         
         pages_extracted = len(response.get('results', []))
-        print(f"✅ Extracted content from {pages_extracted} pages")
+        _verbose_print(f"✅ Extracted content from {pages_extracted} pages")
         
         debug_call_data["pages_extracted"] = pages_extracted
         debug_call_data["original_response_size"] = len(json.dumps(response))
         
         # Process each result with LLM if enabled
         if use_llm_processing and os.getenv("OPENROUTER_API_KEY"):
-            print("🧠 Processing extracted content with LLM (parallel)...")
+            _verbose_print("🧠 Processing extracted content with LLM (parallel)...")
             debug_call_data["processing_applied"].append("llm_processing")
             
             # Prepare tasks for parallel processing
@@ -813,12 +821,12 @@ async def web_extract_tool(
                 if status == "processed":
                     debug_call_data["compression_metrics"].append(metrics)
                     debug_call_data["pages_processed_with_llm"] += 1
-                    print(f"  📝 {url} (processed)")
+                    _verbose_print(f"  📝 {url} (processed)")
                 elif status == "too_short":
                     debug_call_data["compression_metrics"].append(metrics)
-                    print(f"  📝 {url} (no processing - content too short)")
+                    _verbose_print(f"  📝 {url} (no processing - content too short)")
                 else:
-                    print(f"  ⚠️  {url} (no content to process)")
+                    _verbose_print(f"  ⚠️  {url} (no content to process)")
         else:
             if use_llm_processing and not os.getenv("OPENROUTER_API_KEY"):
                 print("⚠️  LLM processing requested but OPENROUTER_API_KEY not set, returning raw content")
@@ -828,7 +836,7 @@ async def web_extract_tool(
             for result in response.get('results', []):
                 url = result.get('url', 'Unknown URL')
                 content_length = len(result.get('raw_content', ''))
-                print(f"  📝 {url} ({content_length} characters)")
+                _verbose_print(f"  📝 {url} ({content_length} characters)")
         
         # Trim output to minimal fields per entry: title, content, error
         trimmed_results = [
@@ -923,10 +931,10 @@ async def web_crawl_tool(
         # Ensure URL has protocol
         if not url.startswith(('http://', 'https://')):
             url = f'https://{url}'
-            print(f"  📝 Added https:// prefix to URL: {url}")
+            _verbose_print(f"  📝 Added https:// prefix to URL: {url}")
         
         instructions_text = f" with instructions: '{instructions}'" if instructions else ""
-        print(f"🕷️ Crawling {url}{instructions_text}")
+        _verbose_print(f"🕷️ Crawling {url}{instructions_text}")
         
         # Use Firecrawl's v2 crawl functionality
         # Docs: https://docs.firecrawl.dev/features/crawl
@@ -943,7 +951,7 @@ async def web_crawl_tool(
         # Note: The 'prompt' parameter is not documented for crawl
         # Instructions are typically used with the Extract endpoint, not Crawl
         if instructions:
-            print(f"  ℹ️  Note: Instructions parameter ignored (not supported in crawl API)")
+            _verbose_print(f"  ℹ️  Note: Instructions parameter ignored (not supported in crawl API)")
         
         # Use the crawl method which waits for completion automatically
         try:
@@ -963,23 +971,23 @@ async def web_crawl_tool(
         # The crawl_result is a CrawlJob object with a 'data' attribute containing list of Document objects
         if hasattr(crawl_result, 'data'):
             data_list = crawl_result.data if crawl_result.data else []
-            print(f"  📊 Status: {getattr(crawl_result, 'status', 'unknown')}")
-            print(f"  📄 Retrieved {len(data_list)} pages")
+            _verbose_print(f"  📊 Status: {getattr(crawl_result, 'status', 'unknown')}")
+            _verbose_print(f"  📄 Retrieved {len(data_list)} pages")
             
             # Debug: Check other attributes if no data
             if not data_list:
-                print(f"  🔍 Debug - CrawlJob attributes: {[attr for attr in dir(crawl_result) if not attr.startswith('_')]}")
-                print(f"  🔍 Debug - Status: {getattr(crawl_result, 'status', 'N/A')}")
-                print(f"  🔍 Debug - Total: {getattr(crawl_result, 'total', 'N/A')}")
-                print(f"  🔍 Debug - Completed: {getattr(crawl_result, 'completed', 'N/A')}")
+                _verbose_print(f"  🔍 Debug - CrawlJob attributes: {[attr for attr in dir(crawl_result) if not attr.startswith('_')]}")
+                _verbose_print(f"  🔍 Debug - Status: {getattr(crawl_result, 'status', 'N/A')}")
+                _verbose_print(f"  🔍 Debug - Total: {getattr(crawl_result, 'total', 'N/A')}")
+                _verbose_print(f"  🔍 Debug - Completed: {getattr(crawl_result, 'completed', 'N/A')}")
                 
         elif isinstance(crawl_result, dict) and 'data' in crawl_result:
             data_list = crawl_result.get("data", [])
         else:
             print("  ⚠️  Unexpected crawl result type")
-            print(f"  🔍 Debug - Result type: {type(crawl_result)}")
+            _verbose_print(f"  🔍 Debug - Result type: {type(crawl_result)}")
             if hasattr(crawl_result, '__dict__'):
-                print(f"  🔍 Debug - Result attributes: {list(crawl_result.__dict__.keys())}")
+                _verbose_print(f"  🔍 Debug - Result attributes: {list(crawl_result.__dict__.keys())}")
         
         for item in data_list:
             # Process each crawled page - properly handle object serialization
@@ -1044,14 +1052,14 @@ async def web_crawl_tool(
         response = {"results": pages}
         
         pages_crawled = len(response.get('results', []))
-        print(f"✅ Crawled {pages_crawled} pages")
+        _verbose_print(f"✅ Crawled {pages_crawled} pages")
         
         debug_call_data["pages_crawled"] = pages_crawled
         debug_call_data["original_response_size"] = len(json.dumps(response))
         
         # Process each result with LLM if enabled
         if use_llm_processing and os.getenv("OPENROUTER_API_KEY"):
-            print("🧠 Processing crawled content with LLM (parallel)...")
+            _verbose_print("🧠 Processing crawled content with LLM (parallel)...")
             debug_call_data["processing_applied"].append("llm_processing")
             
             # Prepare tasks for parallel processing
@@ -1109,12 +1117,12 @@ async def web_crawl_tool(
                 if status == "processed":
                     debug_call_data["compression_metrics"].append(metrics)
                     debug_call_data["pages_processed_with_llm"] += 1
-                    print(f"  🌐 {page_url} (processed)")
+                    _verbose_print(f"  🌐 {page_url} (processed)")
                 elif status == "too_short":
                     debug_call_data["compression_metrics"].append(metrics)
-                    print(f"  🌐 {page_url} (no processing - content too short)")
+                    _verbose_print(f"  🌐 {page_url} (no processing - content too short)")
                 else:
-                    print(f"  ⚠️  {page_url} (no content to process)")
+                    _verbose_print(f"  ⚠️  {page_url} (no content to process)")
         else:
             if use_llm_processing and not os.getenv("OPENROUTER_API_KEY"):
                 print("⚠️  LLM processing requested but OPENROUTER_API_KEY not set, returning raw content")
@@ -1124,7 +1132,7 @@ async def web_crawl_tool(
             for result in response.get('results', []):
                 page_url = result.get('url', 'Unknown URL')
                 content_length = len(result.get('content', ''))
-                print(f"  🌐 {page_url} ({content_length} characters)")
+                _verbose_print(f"  🌐 {page_url} ({content_length} characters)")
         
         # Trim output to minimal fields per entry: title, content, error
         trimmed_results = [
@@ -1246,7 +1254,7 @@ if __name__ == "__main__":
     
     # Show debug mode status
     if DEBUG_MODE:
-        print(f"🐛 Debug mode ENABLED - Session ID: {DEBUG_SESSION_ID}")
+        _verbose_print(f"🐛 Debug mode ENABLED - Session ID: {DEBUG_SESSION_ID}")
         print(f"   Debug logs will be saved to: ./logs/web_tools_debug_{DEBUG_SESSION_ID}.json")
     else:
         print("🐛 Debug mode disabled (set WEB_TOOLS_DEBUG=true to enable)")