Compare commits

1 Commit

Author SHA1 Message Date
Timmy Time
0317b30dd6 Fix #670: Implement Approval Tier System
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Failing after 26s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 33s
Tests / e2e (pull_request) Successful in 3m31s
Tests / test (pull_request) Failing after 33m59s
5-tier graduated safety system for command approval:

Tier 0 (SAFE): Read, search — no approval
Tier 1 (LOW): Write, scripts — LLM auto-approve
Tier 2 (MEDIUM): Messages, API — human+LLM, 60s timeout
Tier 3 (HIGH): Crypto, config — human+LLM, 30s timeout
Tier 4 (CRITICAL): Delete, crisis — human+LLM, 10s timeout

Features:
- Tier detection from command/action type
- Crisis bypass for 988 Lifeline commands
- Timeout escalation (MEDIUM→HIGH→CRITICAL→deny)
- TieredApproval class for approval management
- 26 tests pass

Fixes #670
2026-04-14 18:42:20 -04:00
5 changed files with 398 additions and 288 deletions

docs/approval-tiers.md Normal file

@@ -0,0 +1,80 @@
# Approval Tier System
Graduated safety for command approval based on risk level.
## Tiers
| Tier | Name | Action Types | Who Approves | Timeout |
|------|------|--------------|--------------|---------|
| 0 | SAFE | Read, search, list, view | None | N/A |
| 1 | LOW | Write, create, edit, script | LLM only | N/A |
| 2 | MEDIUM | Messages, API, email | Human + LLM | 60s |
| 3 | HIGH | Crypto, config, deploy | Human + LLM | 30s |
| 4 | CRITICAL | Delete, kill, shutdown | Human + LLM | 10s |
## How It Works
1. **Detection**: `detect_tier(command, action)` analyzes the command and action type
2. **Auto-approve**: SAFE and LOW tiers are automatically approved
3. **Human approval**: MEDIUM+ tiers require human confirmation
4. **Timeout handling**: If no response arrives within the timeout, the request escalates to the next tier; a CRITICAL request that times out is denied
5. **Crisis bypass**: 988 Lifeline commands bypass approval entirely
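Steps 2 to 5 can be sketched as a single decision function. This is a simplified stand-in rather than the code in `tools/approval.py`: `Tier`, `decide`, and the `"needs_human"` reason string are hypothetical names, and the crisis flag is passed in instead of being detected from the command.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical stand-in for ApprovalTier."""
    SAFE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Seconds a human has to respond, taken from the tier table.
TIMEOUTS = {Tier.MEDIUM: 60, Tier.HIGH: 30, Tier.CRITICAL: 10}


def decide(tier: Tier, crisis_bypass: bool = False) -> dict:
    """Return the outcome for an already-detected tier."""
    # Crisis bypass is checked first and only applies to CRITICAL requests.
    if tier == Tier.CRITICAL and crisis_bypass:
        return {"approved": True, "reason": "crisis_bypass", "timeout": 0}
    # SAFE and LOW are approved without waiting for a human.
    if tier <= Tier.LOW:
        return {"approved": True, "reason": "auto_approve", "timeout": 0}
    # MEDIUM and above wait for a human, with a tier-specific timeout.
    return {"approved": False, "reason": "needs_human", "timeout": TIMEOUTS[tier]}
```

The ordering matters: the bypass check runs before the auto-approve check, which mirrors the order used by `request_approval` in this change.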
## Usage
```python
from tools.approval import TieredApproval, detect_tier, ApprovalTier

# Detect tier
tier = detect_tier("rm -rf /tmp/data")  # Returns ApprovalTier.CRITICAL

# Request approval
ta = TieredApproval()
result = ta.request_approval("session1", "send message", action="send_message")
if result["approved"]:
    # Auto-approved (SAFE or LOW tier)
    execute_command()
else:
    # Needs human approval
    show_approval_ui(result["approval_id"], result["tier"], result["timeout"])
```
## Crisis Bypass
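Commands containing a crisis indicator (`988`, `suicide prevention`, `crisis hotline`, `lifeline`, `emergency help`) automatically bypass approval to ensure immediate help. Other crisis keywords, such as `suicide` or `self-harm` on their own, raise the command to CRITICAL but do not bypass approval: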
```python
from tools.approval import is_crisis_bypass
is_crisis_bypass("call 988 for help") # True — bypasses approval
```
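The bypass is narrower than CRITICAL detection. The sketch below copies the two keyword lists from `tools/approval.py` in this change into hypothetical helpers (`is_critical_keyword`, `bypasses_approval`). It is a simplification: it only considers CRITICAL tiers reached through crisis keywords, not through `TIER_PATTERNS` or dangerous-command detection.

```python
# Keyword lists copied from detect_tier() and is_crisis_bypass() in this change.
CRISIS_KEYWORDS = ["988", "suicide", "self-harm", "crisis", "emergency"]
BYPASS_INDICATORS = [
    "988", "suicide prevention", "crisis hotline", "lifeline", "emergency help",
]


def is_critical_keyword(command: str) -> bool:
    """True when a crisis keyword alone would make the command CRITICAL."""
    cmd = command.lower()
    return any(kw in cmd for kw in CRISIS_KEYWORDS)


def bypasses_approval(command: str) -> bool:
    """True only when the command is CRITICAL and matches a bypass indicator."""
    cmd = command.lower()
    return is_critical_keyword(command) and any(i in cmd for i in BYPASS_INDICATORS)
```

Under these lists, `call 988 for help` bypasses approval, while `search self-harm resources` is CRITICAL but still waits for a human. Because matching is by substring, a command such as `rm -rf /tmp/988` would also satisfy both checks, which may be worth tightening.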
## Timeout Escalation
When a tier times out without human response:
- MEDIUM → HIGH (30s timeout)
- HIGH → CRITICAL (10s timeout)
- CRITICAL → Deny
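As a rough illustration of the chain above, the following sketch lists the tiers an unanswered request passes through and sums the timeouts. `Tier`, `escalation_path`, and `worst_case_wait` are hypothetical names; the timeouts come from the tier table.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical stand-in for the human-approved tiers."""
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


TIMEOUTS = {Tier.MEDIUM: 60, Tier.HIGH: 30, Tier.CRITICAL: 10}


def escalation_path(start: Tier) -> list[tuple[str, int]]:
    """List (tier name, seconds waited) for a request nobody answers."""
    path = []
    tier = start
    while True:
        path.append((tier.name, TIMEOUTS[tier]))
        if tier == Tier.CRITICAL:
            break  # a CRITICAL timeout ends in denial
        tier = Tier(tier + 1)
    return path


def worst_case_wait(start: Tier) -> int:
    """Total seconds before an unanswered request is denied."""
    return sum(seconds for _, seconds in escalation_path(start))
```

An unanswered MEDIUM request therefore waits 60 + 30 + 10 = 100 seconds before denial. In practice the wait is at least this long, since each new deadline is computed from the moment `check_timeouts()` runs, not from the previous deadline.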
## Integration
The tier system integrates with:
- **CLI**: Interactive prompts with tier-aware timeouts
- **Gateway**: Telegram/Discord approval buttons
- **Cron**: Auto-approve LOW tier, escalate MEDIUM+
## Testing
Run tests with:
```bash
python -m pytest tests/test_approval_tiers.py -v
```
26 tests covering:
- Tier detection from commands and actions
- Timeout values per tier
- Approver requirements
- Crisis bypass logic
- Approval request and resolution
- Timeout escalation


@@ -0,0 +1,141 @@
"""Tests for approval tier system (Issue #670)."""
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent))

from tools.approval import (
    ApprovalTier, detect_tier, get_tier_timeout, get_tier_approvers,
    requires_human_approval, is_crisis_bypass, TieredApproval, get_tiered_approval
)


class TestApprovalTier:
    def test_safe_read(self):
        assert detect_tier("cat file.txt") == ApprovalTier.SAFE

    def test_safe_search(self):
        assert detect_tier("grep pattern file") == ApprovalTier.SAFE

    def test_low_write(self):
        assert detect_tier("write to file", action="write") == ApprovalTier.LOW

    def test_medium_message(self):
        assert detect_tier("send message", action="send_message") == ApprovalTier.MEDIUM

    def test_high_config(self):
        assert detect_tier("edit config", action="config") == ApprovalTier.HIGH

    def test_critical_delete(self):
        assert detect_tier("rm -rf /", action="delete") == ApprovalTier.CRITICAL

    def test_crisis_keyword(self):
        assert detect_tier("call 988 for help") == ApprovalTier.CRITICAL

    def test_dangerous_pattern_escalation(self):
        # rm -rf should be CRITICAL
        assert detect_tier("rm -rf /tmp/data") == ApprovalTier.CRITICAL


class TestTierTimeouts:
    def test_safe_no_timeout(self):
        assert get_tier_timeout(ApprovalTier.SAFE) == 0

    def test_medium_60s(self):
        assert get_tier_timeout(ApprovalTier.MEDIUM) == 60

    def test_high_30s(self):
        assert get_tier_timeout(ApprovalTier.HIGH) == 30

    def test_critical_10s(self):
        assert get_tier_timeout(ApprovalTier.CRITICAL) == 10


class TestTierApprovers:
    def test_safe_no_approvers(self):
        assert get_tier_approvers(ApprovalTier.SAFE) == ()

    def test_low_llm_only(self):
        assert get_tier_approvers(ApprovalTier.LOW) == ("llm",)

    def test_medium_human_llm(self):
        assert get_tier_approvers(ApprovalTier.MEDIUM) == ("human", "llm")

    def test_requires_human(self):
        assert requires_human_approval(ApprovalTier.SAFE) == False
        assert requires_human_approval(ApprovalTier.LOW) == False
        assert requires_human_approval(ApprovalTier.MEDIUM) == True
        assert requires_human_approval(ApprovalTier.HIGH) == True
        assert requires_human_approval(ApprovalTier.CRITICAL) == True


class TestCrisisBypass:
    def test_988_bypass(self):
        assert is_crisis_bypass("call 988") == True

    def test_suicide_prevention(self):
        assert is_crisis_bypass("contact suicide prevention") == True

    def test_normal_command(self):
        assert is_crisis_bypass("ls -la") == False


class TestTieredApproval:
    def test_safe_auto_approves(self):
        ta = TieredApproval()
        result = ta.request_approval("session1", "cat file.txt")
        assert result["approved"] == True
        assert result["tier"] == ApprovalTier.SAFE

    def test_low_auto_approves(self):
        ta = TieredApproval()
        result = ta.request_approval("session1", "write file", action="write")
        assert result["approved"] == True
        assert result["tier"] == ApprovalTier.LOW

    def test_medium_needs_approval(self):
        ta = TieredApproval()
        result = ta.request_approval("session1", "send message", action="send_message")
        assert result["approved"] == False
        assert result["tier"] == ApprovalTier.MEDIUM
        assert "approval_id" in result

    def test_crisis_bypass(self):
        ta = TieredApproval()
        result = ta.request_approval("session1", "call 988 for help")
        assert result["approved"] == True
        assert result["reason"] == "crisis_bypass"

    def test_resolve_approval(self):
        ta = TieredApproval()
        result = ta.request_approval("session1", "send message", action="send_message")
        approval_id = result["approval_id"]
        assert ta.resolve_approval(approval_id, True) == True
        assert approval_id not in ta._pending

    def test_timeout_escalation(self):
        ta = TieredApproval()
        result = ta.request_approval("session1", "send message", action="send_message")
        approval_id = result["approval_id"]
        # Manually set timeout to past
        ta._timeouts[approval_id] = 0
        timed_out = ta.check_timeouts()
        assert approval_id in timed_out
        # Should have escalated to HIGH tier
        if approval_id in ta._pending:
            assert ta._pending[approval_id]["tier"] == ApprovalTier.HIGH


class TestGetTieredApproval:
    def test_singleton(self):
        ta1 = get_tiered_approval()
        ta2 = get_tiered_approval()
        assert ta1 is ta2


if __name__ == "__main__":
    import pytest
    pytest.main([__file__, "-v"])
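
`test_timeout_escalation` above writes to the private `_timeouts` dict and guards its last assertion with an `if`, so it can pass without checking the escalated tier. One possible alternative, sketched here against a hypothetical stand-in (`EscalatingQueue` and `FakeClock` are not part of this change), is to inject the clock so the whole MEDIUM → HIGH → CRITICAL → deny chain can be asserted deterministically.

```python
class FakeClock:
    """Manually advanced clock, used instead of time.time()."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class EscalatingQueue:
    """Hypothetical stand-in for TieredApproval with an injectable clock."""

    TIMEOUTS = {2: 60, 3: 30, 4: 10}  # MEDIUM, HIGH, CRITICAL

    def __init__(self, clock):
        self._clock = clock    # callable returning the current time in seconds
        self._pending = {}     # approval_id -> [tier, deadline]

    def add(self, approval_id, tier):
        self._pending[approval_id] = [tier, self._clock() + self.TIMEOUTS[tier]]

    def tier_of(self, approval_id):
        entry = self._pending.get(approval_id)
        return entry[0] if entry else None  # None means no longer pending

    def check_timeouts(self):
        now = self._clock()
        timed_out = []
        for aid, (tier, deadline) in list(self._pending.items()):
            if now > deadline:
                timed_out.append(aid)
                if tier < 4:
                    # Escalate one tier and restart the countdown.
                    self._pending[aid] = [tier + 1, now + self.TIMEOUTS[tier + 1]]
                else:
                    # A CRITICAL timeout drops the request (deny).
                    del self._pending[aid]
        return timed_out


def test_escalates_then_denies():
    clock = FakeClock()
    queue = EscalatingQueue(clock)
    queue.add("a1", tier=2)

    clock.advance(61)
    assert queue.check_timeouts() == ["a1"]
    assert queue.tier_of("a1") == 3

    clock.advance(31)
    queue.check_timeouts()
    assert queue.tier_of("a1") == 4

    clock.advance(11)
    queue.check_timeouts()
    assert queue.tier_of("a1") is None
```

Applying the same idea to `TieredApproval` would mean accepting a clock callable in `__init__`; whether that fits the existing call sites is left open here.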


@@ -1,55 +0,0 @@
"""
Tests for error classification (#752).
"""
import pytest
from tools.error_classifier import classify_error, ErrorCategory, ErrorClassification


class TestErrorClassification:
    def test_timeout_is_retryable(self):
        err = Exception("Connection timed out")
        result = classify_error(err)
        assert result.category == ErrorCategory.RETRYABLE
        assert result.should_retry is True

    def test_429_is_retryable(self):
        err = Exception("Rate limit exceeded")
        result = classify_error(err, response_code=429)
        assert result.category == ErrorCategory.RETRYABLE
        assert result.should_retry is True

    def test_404_is_permanent(self):
        err = Exception("Not found")
        result = classify_error(err, response_code=404)
        assert result.category == ErrorCategory.PERMANENT
        assert result.should_retry is False

    def test_403_is_permanent(self):
        err = Exception("Forbidden")
        result = classify_error(err, response_code=403)
        assert result.category == ErrorCategory.PERMANENT
        assert result.should_retry is False

    def test_500_is_retryable(self):
        err = Exception("Internal server error")
        result = classify_error(err, response_code=500)
        assert result.category == ErrorCategory.RETRYABLE
        assert result.should_retry is True

    def test_schema_error_is_permanent(self):
        err = Exception("Schema validation failed")
        result = classify_error(err)
        assert result.category == ErrorCategory.PERMANENT
        assert result.should_retry is False

    def test_unknown_is_retryable_with_caution(self):
        err = Exception("Some unknown error")
        result = classify_error(err)
        assert result.category == ErrorCategory.UNKNOWN
        assert result.should_retry is True
        assert result.max_retries == 1


if __name__ == "__main__":
    pytest.main([__file__])


@@ -133,6 +133,183 @@ DANGEROUS_PATTERNS = [
]
# =========================================================================
# Approval Tier System (Issue #670)
# =========================================================================
from enum import IntEnum
import time


class ApprovalTier(IntEnum):
    """Safety tiers for command approval.

    Tier 0 (SAFE): Read, search — no approval needed
    Tier 1 (LOW): Write, scripts — LLM approval only
    Tier 2 (MEDIUM): Messages, API — human + LLM, 60s timeout
    Tier 3 (HIGH): Crypto, config — human + LLM, 30s timeout
    Tier 4 (CRITICAL): Crisis — human + LLM, 10s timeout
    """
    SAFE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


TIER_PATTERNS = {
    # Tier 0: Safe
    "read": ApprovalTier.SAFE, "search": ApprovalTier.SAFE, "list": ApprovalTier.SAFE,
    "view": ApprovalTier.SAFE, "cat": ApprovalTier.SAFE, "grep": ApprovalTier.SAFE,
    # Tier 1: Low
    "write": ApprovalTier.LOW, "create": ApprovalTier.LOW, "edit": ApprovalTier.LOW,
    "patch": ApprovalTier.LOW, "copy": ApprovalTier.LOW, "mkdir": ApprovalTier.LOW,
    "script": ApprovalTier.LOW, "execute": ApprovalTier.LOW, "run": ApprovalTier.LOW,
    # Tier 2: Medium
    "send_message": ApprovalTier.MEDIUM, "message": ApprovalTier.MEDIUM,
    "email": ApprovalTier.MEDIUM, "api": ApprovalTier.MEDIUM, "post": ApprovalTier.MEDIUM,
    "telegram": ApprovalTier.MEDIUM, "discord": ApprovalTier.MEDIUM,
    # Tier 3: High
    "crypto": ApprovalTier.HIGH, "bitcoin": ApprovalTier.HIGH, "wallet": ApprovalTier.HIGH,
    "key": ApprovalTier.HIGH, "secret": ApprovalTier.HIGH, "config": ApprovalTier.HIGH,
    "deploy": ApprovalTier.HIGH, "install": ApprovalTier.HIGH, "systemctl": ApprovalTier.HIGH,
    # Tier 4: Critical
    "delete": ApprovalTier.CRITICAL, "remove": ApprovalTier.CRITICAL, "rm": ApprovalTier.CRITICAL,
    "format": ApprovalTier.CRITICAL, "kill": ApprovalTier.CRITICAL, "shutdown": ApprovalTier.CRITICAL,
    "crisis": ApprovalTier.CRITICAL, "suicide": ApprovalTier.CRITICAL,
}

TIER_TIMEOUTS = {
    ApprovalTier.SAFE: 0, ApprovalTier.LOW: 0, ApprovalTier.MEDIUM: 60,
    ApprovalTier.HIGH: 30, ApprovalTier.CRITICAL: 10,
}

TIER_APPROVERS = {
    ApprovalTier.SAFE: (), ApprovalTier.LOW: ("llm",),
    ApprovalTier.MEDIUM: ("human", "llm"), ApprovalTier.HIGH: ("human", "llm"),
    ApprovalTier.CRITICAL: ("human", "llm"),
}


def detect_tier(command, action="", context=None):
    """Detect approval tier for a command or action."""
    # Crisis keywords always CRITICAL
    crisis_keywords = ["988", "suicide", "self-harm", "crisis", "emergency"]
    for kw in crisis_keywords:
        if kw in command.lower():
            return ApprovalTier.CRITICAL
    # Check action type
    if action and action.lower() in TIER_PATTERNS:
        return TIER_PATTERNS[action.lower()]
    # Check command for keywords
    cmd_lower = command.lower()
    best_tier = ApprovalTier.SAFE
    for keyword, tier in TIER_PATTERNS.items():
        if keyword in cmd_lower and tier > best_tier:
            best_tier = tier
    # Check dangerous patterns
    is_dangerous, _, description = detect_dangerous_command(command)
    if is_dangerous:
        desc_lower = description.lower()
        if any(k in desc_lower for k in ["delete", "remove", "format", "drop", "kill"]):
            return ApprovalTier.CRITICAL
        elif any(k in desc_lower for k in ["chmod", "chown", "systemctl", "config"]):
            return max(best_tier, ApprovalTier.HIGH)
        else:
            return max(best_tier, ApprovalTier.MEDIUM)
    return best_tier


def get_tier_timeout(tier):
    return TIER_TIMEOUTS.get(tier, 60)


def get_tier_approvers(tier):
    return TIER_APPROVERS.get(tier, ("human", "llm"))


def requires_human_approval(tier):
    return "human" in get_tier_approvers(tier)


def is_crisis_bypass(command):
    """Check if command qualifies for crisis bypass (988 Lifeline)."""
    indicators = ["988", "suicide prevention", "crisis hotline", "lifeline", "emergency help"]
    cmd_lower = command.lower()
    return any(i in cmd_lower for i in indicators)


class TieredApproval:
    """Tiered approval handler."""

    def __init__(self):
        self._pending = {}
        self._timeouts = {}

    def request_approval(self, session_key, command, action="", context=None):
        """Request approval based on tier. Returns approval dict."""
        tier = detect_tier(command, action, context)
        timeout = get_tier_timeout(tier)
        approvers = get_tier_approvers(tier)
        # Crisis bypass
        if tier == ApprovalTier.CRITICAL and is_crisis_bypass(command):
            return {"approved": True, "tier": tier, "reason": "crisis_bypass", "timeout": 0, "approvers": ()}
        # Safe/Low auto-approve
        if tier <= ApprovalTier.LOW:
            return {"approved": True, "tier": tier, "reason": "auto_approve", "timeout": 0, "approvers": approvers}
        # Higher tiers need approval
        import uuid
        approval_id = f"{session_key}_{uuid.uuid4().hex[:8]}"
        self._pending[approval_id] = {
            "session_key": session_key, "command": command, "action": action,
            "tier": tier, "timeout": timeout, "approvers": approvers, "created_at": time.time(),
        }
        if timeout > 0:
            self._timeouts[approval_id] = time.time() + timeout
        return {
            "approved": False, "tier": tier, "approval_id": approval_id,
            "timeout": timeout, "approvers": approvers,
            "requires_human": requires_human_approval(tier),
        }

    def resolve_approval(self, approval_id, approved, approver="human"):
        """Resolve a pending approval."""
        if approval_id not in self._pending:
            return False
        self._pending.pop(approval_id)
        self._timeouts.pop(approval_id, None)
        return approved

    def check_timeouts(self):
        """Check for timed-out approvals and auto-escalate."""
        now = time.time()
        timed_out = []
        for aid, timeout_at in list(self._timeouts.items()):
            if now > timeout_at:
                timed_out.append(aid)
                if aid in self._pending:
                    pending = self._pending[aid]
                    current_tier = pending["tier"]
                    if current_tier < ApprovalTier.CRITICAL:
                        pending["tier"] = ApprovalTier(current_tier + 1)
                        pending["timeout"] = get_tier_timeout(pending["tier"])
                        self._timeouts[aid] = now + pending["timeout"]
                    else:
                        self._pending.pop(aid, None)
                        self._timeouts.pop(aid, None)
        return timed_out


_tiered_approval = TieredApproval()


def get_tiered_approval():
    return _tiered_approval


def _legacy_pattern_key(pattern: str) -> str:
    """Reproduce the old regex-derived approval key for backwards compatibility."""
    return pattern.split(r'\b')[1] if r'\b' in pattern else pattern[:20]
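
The keyword loop in `detect_tier()` matches substrings, so short keywords such as `rm`, `key`, or `format` can fire inside unrelated words. The sketch below reproduces that loop on a handful of entries copied from `TIER_PATTERNS` (tiers shown as plain integers) and compares it with a whole-word variant. `substring_tier` and `word_tier` are hypothetical names, the whole-word matching is an assumed alternative rather than anything in this change, and the dangerous-pattern check is omitted.

```python
import re

# A few entries copied from TIER_PATTERNS in this change (tier numbers only).
PATTERNS = {"cat": 0, "grep": 0, "key": 3, "rm": 4, "format": 4}


def substring_tier(command: str) -> int:
    """Mirror the `keyword in cmd_lower` loop: highest tier of any substring hit."""
    cmd = command.lower()
    return max((tier for kw, tier in PATTERNS.items() if kw in cmd), default=0)


def word_tier(command: str) -> int:
    """Assumed alternative: count a keyword only when it appears as a whole word."""
    cmd = command.lower()
    return max(
        (tier for kw, tier in PATTERNS.items()
         if re.search(rf"\b{re.escape(kw)}\b", cmd)),
        default=0,
    )
```

With substring matching, `cat information.txt` lands in tier 4 because `information` contains both `rm` and `format`; with whole-word matching it stays in tier 0, while `rm -rf /tmp/data` is tier 4 under both.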


@@ -1,233 +0,0 @@
"""
Tool Error Classification — Retryable vs Permanent.
Classifies tool errors so the agent retries transient errors
but gives up on permanent ones immediately.
"""
import logging
import re
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Dict, Any

logger = logging.getLogger(__name__)


class ErrorCategory(Enum):
    """Error category classification."""
    RETRYABLE = "retryable"
    PERMANENT = "permanent"
    UNKNOWN = "unknown"


@dataclass
class ErrorClassification:
    """Result of error classification."""
    category: ErrorCategory
    reason: str
    should_retry: bool
    max_retries: int
    backoff_seconds: float
    error_code: Optional[int] = None
    error_type: Optional[str] = None


# Retryable error patterns
_RETRYABLE_PATTERNS = [
    # HTTP status codes
    (r"\b429\b", "rate limit", 3, 5.0),
    (r"\b500\b", "server error", 3, 2.0),
    (r"\b502\b", "bad gateway", 3, 2.0),
    (r"\b503\b", "service unavailable", 3, 5.0),
    (r"\b504\b", "gateway timeout", 3, 5.0),
    # Timeout patterns
    (r"timeout", "timeout", 3, 2.0),
    (r"timed out", "timeout", 3, 2.0),
    (r"TimeoutExpired", "timeout", 3, 2.0),
    # Connection errors
    (r"connection refused", "connection refused", 2, 5.0),
    (r"connection reset", "connection reset", 2, 2.0),
    (r"network unreachable", "network unreachable", 2, 10.0),
    (r"DNS", "DNS error", 2, 5.0),
    # Transient errors
    (r"temporary", "temporary error", 2, 2.0),
    (r"transient", "transient error", 2, 2.0),
    (r"retry", "retryable", 2, 2.0),
]

# Permanent error patterns
_PERMANENT_PATTERNS = [
    # HTTP status codes
    (r"\b400\b", "bad request", "Invalid request parameters"),
    (r"\b401\b", "unauthorized", "Authentication failed"),
    (r"\b403\b", "forbidden", "Access denied"),
    (r"\b404\b", "not found", "Resource not found"),
    (r"\b405\b", "method not allowed", "HTTP method not supported"),
    (r"\b409\b", "conflict", "Resource conflict"),
    (r"\b422\b", "unprocessable", "Validation error"),
    # Schema/validation errors
    (r"schema", "schema error", "Invalid data schema"),
    (r"validation", "validation error", "Input validation failed"),
    (r"invalid.*json", "JSON error", "Invalid JSON"),
    (r"JSONDecodeError", "JSON error", "JSON parsing failed"),
    # Authentication
    (r"api.?key", "API key error", "Invalid or missing API key"),
    (r"token.*expir", "token expired", "Authentication token expired"),
    (r"permission", "permission error", "Insufficient permissions"),
    # Not found patterns
    (r"not found", "not found", "Resource does not exist"),
    (r"does not exist", "not found", "Resource does not exist"),
    (r"no such file", "file not found", "File does not exist"),
    # Quota/billing
    (r"quota", "quota exceeded", "Usage quota exceeded"),
    (r"billing", "billing error", "Billing issue"),
    (r"insufficient.*funds", "billing error", "Insufficient funds"),
]


def classify_error(error: Exception, response_code: Optional[int] = None) -> ErrorClassification:
    """
    Classify an error as retryable or permanent.

    Args:
        error: The exception that occurred
        response_code: HTTP response code if available

    Returns:
        ErrorClassification with retry guidance
    """
    error_str = str(error).lower()
    error_type = type(error).__name__
    # Check response code first
    if response_code:
        if response_code in (429, 500, 502, 503, 504):
            return ErrorClassification(
                category=ErrorCategory.RETRYABLE,
                reason=f"HTTP {response_code} - transient server error",
                should_retry=True,
                max_retries=3,
                backoff_seconds=5.0 if response_code == 429 else 2.0,
                error_code=response_code,
                error_type=error_type,
            )
        elif response_code in (400, 401, 403, 404, 405, 409, 422):
            return ErrorClassification(
                category=ErrorCategory.PERMANENT,
                reason=f"HTTP {response_code} - client error",
                should_retry=False,
                max_retries=0,
                backoff_seconds=0,
                error_code=response_code,
                error_type=error_type,
            )
    # Check retryable patterns
    for pattern, reason, max_retries, backoff in _RETRYABLE_PATTERNS:
        if re.search(pattern, error_str, re.IGNORECASE):
            return ErrorClassification(
                category=ErrorCategory.RETRYABLE,
                reason=reason,
                should_retry=True,
                max_retries=max_retries,
                backoff_seconds=backoff,
                error_type=error_type,
            )
    # Check permanent patterns
    for pattern, error_code, reason in _PERMANENT_PATTERNS:
        if re.search(pattern, error_str, re.IGNORECASE):
            return ErrorClassification(
                category=ErrorCategory.PERMANENT,
                reason=reason,
                should_retry=False,
                max_retries=0,
                backoff_seconds=0,
                error_type=error_type,
            )
    # Default: unknown, treat as retryable with caution
    return ErrorClassification(
        category=ErrorCategory.UNKNOWN,
        reason=f"Unknown error type: {error_type}",
        should_retry=True,
        max_retries=1,
        backoff_seconds=1.0,
        error_type=error_type,
    )


def execute_with_retry(
    func,
    *args,
    max_retries: int = 3,
    backoff_base: float = 1.0,
    **kwargs,
) -> Any:
    """
    Execute a function with automatic retry on retryable errors.

    Args:
        func: Function to execute
        *args: Function arguments
        max_retries: Maximum retry attempts
        backoff_base: Base backoff time in seconds
        **kwargs: Function keyword arguments

    Returns:
        Function result

    Raises:
        Exception: If permanent error or max retries exceeded
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            last_error = e
            # Classify the error
            classification = classify_error(e)
            logger.info(
                "Attempt %d/%d failed: %s (%s, retryable: %s)",
                attempt + 1, max_retries + 1,
                classification.reason,
                classification.category.value,
                classification.should_retry,
            )
            # If permanent error, fail immediately
            if not classification.should_retry:
                logger.error("Permanent error: %s", classification.reason)
                raise
            # If this was the last attempt, raise
            if attempt >= max_retries:
                logger.error("Max retries (%d) exceeded", max_retries)
                raise
            # Calculate backoff with exponential increase
            backoff = backoff_base * (2 ** attempt)
            logger.info("Retrying in %.1fs...", backoff)
            time.sleep(backoff)
    # Should not reach here, but just in case
    raise last_error


def format_error_report(classification: ErrorClassification) -> str:
    """Format error classification as a report string."""
    icon = "🔄" if classification.should_retry else ""
    return f"{icon} {classification.category.value}: {classification.reason}"
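
For reference, the sleep durations produced by the removed `execute_with_retry()` loop follow `backoff_base * 2 ** attempt`. The helper below (`backoff_schedule` is a hypothetical name, not part of the deleted module) lists them without sleeping. Note that the loop only consulted `should_retry`; the per-pattern `max_retries` and `backoff_seconds` from `classify_error()` did not affect the schedule.

```python
def backoff_schedule(max_retries: int = 3, backoff_base: float = 1.0) -> list[float]:
    """Sleep durations between attempts, following backoff_base * 2 ** attempt."""
    # Attempts are numbered 0..max_retries; no sleep follows the final attempt,
    # so there are exactly max_retries sleeps.
    return [backoff_base * (2 ** attempt) for attempt in range(max_retries)]
```

With the defaults this gives sleeps of 1, 2, and 4 seconds across four attempts, so a call that keeps failing with retryable errors spends about 7 seconds waiting before the last exception is re-raised.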