fix: move PERFORMANCE_HOTSPOTS_QUICKREF.md to docs/reports/
This commit is contained in:
241
docs/reports/PERFORMANCE_HOTSPOTS_QUICKREF.md
Normal file
241
docs/reports/PERFORMANCE_HOTSPOTS_QUICKREF.md
Normal file
@@ -0,0 +1,241 @@
|
||||
# Performance Hotspots Quick Reference
|
||||
|
||||
## Critical Files to Optimize
|
||||
|
||||
### 1. run_agent.py (8,317 lines, 419KB)
|
||||
```
|
||||
Lines 460-1000: Massive __init__ - 50+ params, slow startup
|
||||
Lines 2158-2222: _save_session_log - blocking I/O every turn
|
||||
Lines 2269-2297: _hydrate_todo_store - O(n) history scan
|
||||
Lines 3759-3826: _anthropic_messages_create - blocking API calls
|
||||
Lines 3827-3920: _interruptible_api_call - sync/async bridge overhead
|
||||
```
|
||||
|
||||
**Fix Priority: CRITICAL**
|
||||
- Split into modules
|
||||
- Add async session logging
|
||||
- Cache history hydration
|
||||
|
||||
---
|
||||
|
||||
### 2. gateway/run.py (6,016 lines, 274KB)
|
||||
```
|
||||
Lines 406-413: _agent_cache - unbounded growth, memory leak
|
||||
Lines 464-493: _get_or_create_gateway_honcho - blocking init
|
||||
Lines 2800+: run_agent_sync - blocks event loop
|
||||
```
|
||||
|
||||
**Fix Priority: HIGH**
|
||||
- Implement LRU cache
|
||||
- Use asyncio.to_thread()
|
||||
|
||||
---
|
||||
|
||||
### 3. gateway/stream_consumer.py
|
||||
```
|
||||
Lines 88-147: Busy-wait loop with 50ms sleep
|
||||
Max 20 updates/sec throughput
|
||||
```
|
||||
|
||||
**Fix Priority: MEDIUM**
|
||||
- Use asyncio.Event for signaling
|
||||
- Adaptive back-off
|
||||
|
||||
---
|
||||
|
||||
### 4. tools/web_tools.py (1,843 lines)
|
||||
```
|
||||
Lines 171-188: _tavily_request - sync httpx call, 60s timeout
|
||||
Lines 256-301: process_content_with_llm - sync LLM call
|
||||
```
|
||||
|
||||
**Fix Priority: CRITICAL**
|
||||
- Convert to async
|
||||
- Add connection pooling
|
||||
|
||||
---
|
||||
|
||||
### 5. tools/browser_tool.py (1,955 lines)
|
||||
```
|
||||
Lines 194-208: _resolve_cdp_override - sync requests call
|
||||
Lines 234-257: _get_cloud_provider - blocking config read
|
||||
```
|
||||
|
||||
**Fix Priority: HIGH**
|
||||
- Async HTTP client
|
||||
- Cache config reads
|
||||
|
||||
---
|
||||
|
||||
### 6. tools/terminal_tool.py (1,358 lines)
|
||||
```
|
||||
Lines 66-92: _check_disk_usage_warning - blocking glob walk
|
||||
Lines 167-289: _prompt_for_sudo_password - thread creation per call
|
||||
```
|
||||
|
||||
**Fix Priority: MEDIUM**
|
||||
- Async disk check
|
||||
- Thread pool reuse
|
||||
|
||||
---
|
||||
|
||||
### 7. tools/file_tools.py (563 lines)
|
||||
```
|
||||
Lines 53-62: _read_tracker - unbounded dict growth
|
||||
Lines 195-262: read_file_tool - sync file I/O
|
||||
```
|
||||
|
||||
**Fix Priority: MEDIUM**
|
||||
- TTL-based cleanup
|
||||
- aiofiles for async I/O
|
||||
|
||||
---
|
||||
|
||||
### 8. agent/context_compressor.py (676 lines)
|
||||
```
|
||||
Lines 250-369: _generate_summary - expensive LLM call
|
||||
Lines 490-500: _find_tail_cut_by_tokens - O(n) token counting
|
||||
```
|
||||
|
||||
**Fix Priority: HIGH**
|
||||
- Background compression task
|
||||
- Cache summaries
|
||||
|
||||
---
|
||||
|
||||
### 9. hermes_state.py (1,274 lines)
|
||||
```
|
||||
Lines 116-215: _execute_write - global lock, 15 retries
|
||||
Lines 143-156: SQLite with WAL but single connection
|
||||
```
|
||||
|
||||
**Fix Priority: HIGH**
|
||||
- Connection pooling
|
||||
- Batch writes
|
||||
|
||||
---
|
||||
|
||||
### 10. model_tools.py (472 lines)
|
||||
```
|
||||
Lines 81-126: _run_async - creates ThreadPool per call!
|
||||
Lines 132-170: _discover_tools - imports ALL tools at startup
|
||||
```
|
||||
|
||||
**Fix Priority: CRITICAL**
|
||||
- Persistent thread pool
|
||||
- Lazy tool loading
|
||||
|
||||
---
|
||||
|
||||
## Quick Fixes (Copy-Paste Ready)
|
||||
|
||||
### Fix 1: LRU Cache for Agent Cache
|
||||
```python
|
||||
from functools import lru_cache
|
||||
from cachetools import TTLCache
|
||||
|
||||
# In gateway/run.py
|
||||
self._agent_cache: Dict[str, tuple] = TTLCache(maxsize=100, ttl=3600)
|
||||
```
|
||||
|
||||
### Fix 2: Async HTTP Client
|
||||
```python
|
||||
# In tools/web_tools.py
|
||||
import httpx
|
||||
|
||||
_http_client: Optional[httpx.AsyncClient] = None
|
||||
|
||||
async def get_http_client() -> httpx.AsyncClient:
|
||||
global _http_client
|
||||
if _http_client is None:
|
||||
_http_client = httpx.AsyncClient(timeout=60)
|
||||
return _http_client
|
||||
```
|
||||
|
||||
### Fix 3: Connection Pool for DB
|
||||
```python
|
||||
# In hermes_state.py
|
||||
from sqlalchemy import create_engine
|
||||
from sqlalchemy.pool import QueuePool
|
||||
|
||||
engine = create_engine(
|
||||
'sqlite:///state.db',
|
||||
poolclass=QueuePool,
|
||||
pool_size=5,
|
||||
max_overflow=10
|
||||
)
|
||||
```
|
||||
|
||||
### Fix 4: Lazy Tool Loading
|
||||
```python
|
||||
# In model_tools.py
|
||||
@lru_cache(maxsize=1)
|
||||
def _get_discovered_tools():
|
||||
"""Cache tool discovery after first call"""
|
||||
_discover_tools()
|
||||
return registry
|
||||
```
|
||||
|
||||
### Fix 5: Batch Session Writes
|
||||
```python
|
||||
# In run_agent.py
|
||||
async def _save_session_log_async(self, messages):
|
||||
"""Non-blocking session save"""
|
||||
loop = asyncio.get_event_loop()
|
||||
await loop.run_in_executor(None, self._save_session_log, messages)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Performance Metrics to Track
|
||||
|
||||
```python
|
||||
# Add these metrics
|
||||
IMPORT_TIME = Gauge('import_time_seconds', 'Module import time')
|
||||
AGENT_INIT_TIME = Gauge('agent_init_seconds', 'AIAgent init time')
|
||||
TOOL_EXECUTION_TIME = Histogram('tool_duration_seconds', 'Tool execution', ['tool_name'])
|
||||
DB_WRITE_TIME = Histogram('db_write_seconds', 'Database write time')
|
||||
API_LATENCY = Histogram('api_latency_seconds', 'API call latency', ['provider'])
|
||||
MEMORY_USAGE = Gauge('memory_usage_bytes', 'Process memory')
|
||||
CACHE_HIT_RATE = Gauge('cache_hit_rate', 'Cache hit rate', ['cache_name'])
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## One-Liner Profiling Commands
|
||||
|
||||
```bash
|
||||
# Find slow imports
|
||||
python -X importtime -c "from run_agent import AIAgent" 2>&1 | head -50
|
||||
|
||||
# Find blocking I/O
|
||||
sudo strace -e trace=openat,read,write -c python run_agent.py 2>&1
|
||||
|
||||
# Memory profiling
|
||||
pip install memory_profiler && python -m memory_profiler run_agent.py
|
||||
|
||||
# CPU profiling
|
||||
pip install py-spy && py-spy record -o profile.svg -- python run_agent.py
|
||||
|
||||
# Find all sleep calls
|
||||
grep -rn "time.sleep\|asyncio.sleep" --include="*.py" | wc -l
|
||||
|
||||
# Find all JSON calls
|
||||
grep -rn "json.loads\|json.dumps" --include="*.py" | wc -l
|
||||
|
||||
# Find all locks
|
||||
grep -rn "threading.Lock\|threading.RLock\|asyncio.Lock" --include="*.py"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Expected Performance After Fixes
|
||||
|
||||
| Metric | Before | After | Improvement |
|
||||
|--------|--------|-------|-------------|
|
||||
| Startup time | 3-5s | 1-2s | 3x faster |
|
||||
| API latency | 500ms | 200ms | 2.5x faster |
|
||||
| Concurrent requests | 10/s | 100/s | 10x throughput |
|
||||
| Memory per agent | 50MB | 30MB | 40% reduction |
|
||||
| DB writes/sec | 50 | 500 | 10x throughput |
|
||||
| Import time | 2s | 0.5s | 4x faster |
|
||||
Reference in New Issue
Block a user