Combines the approaches from PR #6309 (duan78) and PR #5963 (KUSH42):
Tiered warnings (from #5963):
- Replaces boolean _context_pressure_warned with float _context_pressure_warned_at
- Fires at 85% (orange) and re-fires at 95% (red/critical)
- Adds 'compacting context...' status message before compression
Gateway dedup (from #6309):
- Class-level dict _context_pressure_last_warned survives across AIAgent
instances (gateway creates a new instance per message)
- 5-minute cooldown per session prevents warning spam
- Higher-tier warnings bypass the cooldown (85% → 95% always fires)
- Compression reset clears the dedup entry for the session
- Stale entries evicted (older than 2x cooldown) to prevent memory leak
Does NOT inject into messages — purely user-facing via _safe_print (CLI)
and status_callback (gateway). Zero prompt cache impact.
Fixes#6309. Fixes#5963.