# Performance Hotspots Quick Reference

## Critical Files to Optimize

### 1. run_agent.py (8,317 lines, 419KB)
```
Lines 460-1000: Massive __init__ - 50+ params, slow startup
Lines 2158-2222: _save_session_log - blocking I/O every turn
Lines 2269-2297: _hydrate_todo_store - O(n) history scan
Lines 3759-3826: _anthropic_messages_create - blocking API calls
Lines 3827-3920: _interruptible_api_call - sync/async bridge overhead
```

**Fix Priority: CRITICAL**
- Split into modules
- Add async session logging
- Cache history hydration

---

### 2. gateway/run.py (6,016 lines, 274KB)
```
Lines 406-413: _agent_cache - unbounded growth, memory leak
Lines 464-493: _get_or_create_gateway_honcho - blocking init
Lines 2800+: run_agent_sync - blocks event loop
```

**Fix Priority: HIGH**
- Implement LRU cache
- Use asyncio.to_thread()

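
The `asyncio.to_thread()` suggestion could look roughly like the sketch below. The real signature of `run_agent_sync` is not shown in this document, so the version here is a stand-in that just sleeps, and `handle_message` and `main` are hypothetical names. `asyncio.to_thread()` requires Python 3.9 or later. The point of the pattern is that the blocking call runs on a worker thread, so two requests overlap instead of queuing behind each other on the event loop.

```python
import asyncio
import time


def run_agent_sync(prompt: str) -> str:
    """Hypothetical stand-in for the blocking agent call."""
    time.sleep(0.05)  # simulates blocking model/tool work
    return f"reply to {prompt!r}"


async def handle_message(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop
    # stays free to serve other coroutines in the meantime.
    return await asyncio.to_thread(run_agent_sync, prompt)


async def main() -> list[str]:
    # Both calls overlap instead of running back to back.
    return await asyncio.gather(handle_message("a"), handle_message("b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```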
---

### 3. gateway/stream_consumer.py
```
Lines 88-147: Busy-wait loop with 50ms sleep
Max 20 updates/sec throughput
```

**Fix Priority: MEDIUM**
- Use asyncio.Event for signaling
- Adaptive back-off

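
A minimal sketch of event-based signaling, as an alternative to polling every 50 ms. `StreamBuffer`, `push`, `close`, and `consume` are hypothetical names, not the actual API of `gateway/stream_consumer.py`. The consumer sleeps only while there is nothing to do and wakes as soon as the producer calls `set()`. Note that `asyncio.Event` is not thread-safe: if the producer runs on another thread, it would need to signal through `loop.call_soon_threadsafe`.

```python
import asyncio


class StreamBuffer:
    """Hypothetical buffer: the producer signals, the consumer waits."""

    def __init__(self) -> None:
        self._chunks: list[str] = []
        self._ready = asyncio.Event()
        self._closed = False

    def push(self, chunk: str) -> None:
        self._chunks.append(chunk)
        self._ready.set()  # wake the consumer immediately

    def close(self) -> None:
        self._closed = True
        self._ready.set()

    async def consume(self) -> list[str]:
        """Collect chunks until close(), sleeping only while idle."""
        seen: list[str] = []
        while True:
            await self._ready.wait()  # no fixed polling interval
            self._ready.clear()
            seen.extend(self._chunks)
            self._chunks.clear()
            if self._closed:
                return seen


async def demo() -> list[str]:
    buf = StreamBuffer()
    consumer = asyncio.create_task(buf.consume())
    for chunk in ("a", "b", "c"):
        buf.push(chunk)
        await asyncio.sleep(0)  # yield so the consumer can run
    buf.close()
    return await consumer
```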
---

### 4. tools/web_tools.py (1,843 lines)
```
Lines 171-188: _tavily_request - sync httpx call, 60s timeout
Lines 256-301: process_content_with_llm - sync LLM call
```

**Fix Priority: CRITICAL**
- Convert to async
- Add connection pooling

---

### 5. tools/browser_tool.py (1,955 lines)
```
Lines 194-208: _resolve_cdp_override - sync requests call
Lines 234-257: _get_cloud_provider - blocking config read
```

**Fix Priority: HIGH**
- Async HTTP client
- Cache config reads

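
One way to cache config reads is to key the cache on the file's modification time, sketched below. The config format and location used by `_get_cloud_provider` are not stated in this document, so the JSON file and the `load_config` helper are assumptions for illustration. Each call still costs one `os.stat`, but the read and parse happen only when the file has changed.

```python
import json
import os

# path -> (mtime, parsed config)
_config_cache: dict[str, tuple[float, dict]] = {}


def load_config(path: str) -> dict:
    """Return the parsed JSON config, re-reading only when the file changes."""
    mtime = os.stat(path).st_mtime
    cached = _config_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: skip the read and the parse
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    _config_cache[path] = (mtime, config)
    return config
```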
---

### 6. tools/terminal_tool.py (1,358 lines)
```
Lines 66-92: _check_disk_usage_warning - blocking glob walk
Lines 167-289: _prompt_for_sudo_password - thread creation per call
```

**Fix Priority: MEDIUM**
- Async disk check
- Thread pool reuse

---

### 7. tools/file_tools.py (563 lines)
```
Lines 53-62: _read_tracker - unbounded dict growth
Lines 195-262: read_file_tool - sync file I/O
```

**Fix Priority: MEDIUM**
- TTL-based cleanup
- aiofiles for async I/O

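
TTL-based cleanup could be sketched as below. What `_read_tracker` actually stores is not shown in this document; the sketch assumes a per-path read counter, and `TTLTracker` is a hypothetical name. The clock is injectable so the expiry logic can be tested without sleeping. The purge here is a linear scan on every `record()` call, which is fine for small trackers; a larger one might purge on a timer instead.

```python
import time


class TTLTracker:
    """Hypothetical tracker whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock=time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        # key -> (last_seen, count)
        self._entries: dict[str, tuple[float, int]] = {}

    def record(self, key: str) -> int:
        """Count a read of `key` and return the running count."""
        self._purge()
        _, count = self._entries.get(key, (0.0, 0))
        self._entries[key] = (self._clock(), count + 1)
        return count + 1

    def _purge(self) -> None:
        # Drop entries not touched within the TTL window.
        cutoff = self._clock() - self._ttl
        stale = [k for k, (seen, _) in self._entries.items() if seen < cutoff]
        for k in stale:
            del self._entries[k]

    def __len__(self) -> int:
        return len(self._entries)
```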
---

### 8. agent/context_compressor.py (676 lines)
```
Lines 250-369: _generate_summary - expensive LLM call
Lines 490-500: _find_tail_cut_by_tokens - O(n) token counting
```

**Fix Priority: HIGH**
- Background compression task
- Cache summaries

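
Summary caching might be keyed on a hash of the exact messages being summarized, as in this sketch. `cached_summary` and `_messages_key` are hypothetical helpers, and the `summarize` callable stands in for `_generate_summary`, whose real signature is not shown here. The sketch assumes messages are JSON-serializable dicts. The cache dict is unbounded for brevity; in practice it would need a size or TTL limit to avoid recreating the growth problem described for other caches in this document.

```python
import hashlib
import json
from typing import Callable

_summary_cache: dict[str, str] = {}


def _messages_key(messages: list[dict]) -> str:
    # Stable fingerprint of the exact message content being summarized.
    blob = json.dumps(messages, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def cached_summary(
    messages: list[dict], summarize: Callable[[list[dict]], str]
) -> str:
    """Call `summarize` only for message spans not summarized before."""
    key = _messages_key(messages)
    if key not in _summary_cache:
        _summary_cache[key] = summarize(messages)  # the expensive call
    return _summary_cache[key]
```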
---

### 9. hermes_state.py (1,274 lines)
```
Lines 116-215: _execute_write - global lock, 15 retries
Lines 143-156: SQLite with WAL but single connection
```

**Fix Priority: HIGH**
- Connection pooling
- Batch writes

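
Batched writes could be sketched with the standard-library `sqlite3` module as below. The `messages` table, its columns, and the `BatchWriter` helper are hypothetical; the actual schema in `hermes_state.py` is not shown in this document. Rows are buffered and written with a single `executemany` inside one transaction, so there is one commit per batch instead of one per row. The trade-off is that rows still in the buffer are lost if the process dies before `flush()`.

```python
import sqlite3


class BatchWriter:
    """Hypothetical helper: buffer rows, flush them in one transaction."""

    def __init__(self, conn: sqlite3.Connection, batch_size: int = 100) -> None:
        self._conn = conn
        self._batch_size = batch_size
        self._pending: list[tuple[str, str]] = []

    def add(self, session_id: str, payload: str) -> None:
        self._pending.append((session_id, payload))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._pending:
            return
        # One transaction (one commit) for the whole batch.
        with self._conn:
            self._conn.executemany(
                "INSERT INTO messages (session_id, payload) VALUES (?, ?)",
                self._pending,
            )
        self._pending.clear()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, payload TEXT)")
writer = BatchWriter(conn, batch_size=3)
for i in range(7):
    writer.add("s1", f"msg {i}")
writer.flush()  # write the final partial batch
```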
---

### 10. model_tools.py (472 lines)
```
Lines 81-126: _run_async - creates ThreadPool per call!
Lines 132-170: _discover_tools - imports ALL tools at startup
```

**Fix Priority: CRITICAL**
- Persistent thread pool
- Lazy tool loading

---

## Quick Fixes (Copy-Paste Ready)

### Fix 1: LRU Cache for Agent Cache
```python
# In gateway/run.py (inside the class that owns _agent_cache)
# Requires the third-party cachetools package.
from cachetools import TTLCache

# Bounded cache: entries expire after one hour, and once 100 entries are
# held the least recently used one is evicted to make room.
self._agent_cache: TTLCache = TTLCache(maxsize=100, ttl=3600)
```

### Fix 2: Async HTTP Client
```python
# In tools/web_tools.py
from typing import Optional

import httpx

# Shared client so connections are pooled and reused across requests.
_http_client: Optional[httpx.AsyncClient] = None

async def get_http_client() -> httpx.AsyncClient:
    """Return the shared client, creating it on first use."""
    global _http_client
    if _http_client is None:
        _http_client = httpx.AsyncClient(timeout=60)
    return _http_client
```

### Fix 3: Connection Pool for DB
```python
# In hermes_state.py
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Pooling helps concurrent readers; SQLite still allows only one
# writer at a time, even in WAL mode.
engine = create_engine(
    'sqlite:///state.db',
    poolclass=QueuePool,
    pool_size=5,
    max_overflow=10,
)
```

### Fix 4: Lazy Tool Loading
```python
# In model_tools.py
from functools import lru_cache

@lru_cache(maxsize=1)
def _get_discovered_tools():
    """Run tool discovery on the first call and cache the result."""
    _discover_tools()
    # Assumed: `registry` is the module-level registry that
    # _discover_tools() populates.
    return registry
```

### Fix 5: Non-Blocking Session Writes
```python
# In run_agent.py
import asyncio

async def _save_session_log_async(self, messages):
    """Non-blocking session save"""
    # Run the blocking writer in the default thread pool executor.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, self._save_session_log, messages)
```

---

## Performance Metrics to Track

```python
# Add these metrics (assumes the prometheus_client package)
from prometheus_client import Gauge, Histogram

IMPORT_TIME = Gauge('import_time_seconds', 'Module import time')
AGENT_INIT_TIME = Gauge('agent_init_seconds', 'AIAgent init time')
TOOL_EXECUTION_TIME = Histogram('tool_duration_seconds', 'Tool execution', ['tool_name'])
DB_WRITE_TIME = Histogram('db_write_seconds', 'Database write time')
API_LATENCY = Histogram('api_latency_seconds', 'API call latency', ['provider'])
MEMORY_USAGE = Gauge('memory_usage_bytes', 'Process memory')
CACHE_HIT_RATE = Gauge('cache_hit_rate', 'Cache hit rate', ['cache_name'])
```

---

## One-Liner Profiling Commands

```bash
# Find slow imports
python -X importtime -c "from run_agent import AIAgent" 2>&1 | head -50

# Find blocking I/O
sudo strace -e trace=openat,read,write -c python run_agent.py 2>&1

# Memory profiling
pip install memory_profiler && python -m memory_profiler run_agent.py

# CPU profiling
pip install py-spy && py-spy record -o profile.svg -- python run_agent.py

# Find all sleep calls
grep -rn "time.sleep\|asyncio.sleep" --include="*.py" | wc -l

# Find all JSON calls
grep -rn "json.loads\|json.dumps" --include="*.py" | wc -l

# Find all locks
grep -rn "threading.Lock\|threading.RLock\|asyncio.Lock" --include="*.py"
```

---

## Expected Performance After Fixes

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Startup time | 3-5s | 1-2s | 3x faster |
| API latency | 500ms | 200ms | 2.5x faster |
| Concurrent requests | 10/s | 100/s | 10x throughput |
| Memory per agent | 50MB | 30MB | 40% reduction |
| DB writes/sec | 50 | 500 | 10x throughput |
| Import time | 2s | 0.5s | 4x faster |