Alexander Whitestone 488d0163a8 Fix #483 : Maintain local-first fallbacks for all cloud AI

- Created comprehensive documentation for local-first strategy
- Developed task routing system for intelligent provider selection
- Built dependency monitoring for local and external AI services
- Documented current external AI dependencies and risks
- Provided graceful degradation paths for service failures
- Created implementation roadmap and acceptance criteria

Key components:
✓ Task classification matrix (local vs external capability)
✓ TaskRouter class for intelligent routing based on priority
✓ DependencyMonitor for real-time service availability
✓ Graceful degradation paths (3 levels)
✓ Documentation and runbooks for failure scenarios

Addresses issue #483 recommendations:
✓ Documented which tasks require external AI vs. can run locally
✓ Ensured Ollama + llama.cpp + Hermes 4 can handle core tasks
✓ Built graceful degradation path if external agents become unavailable
✓ Created monitoring and alerting for dependency failures

2026-04-13 22:14:44 -04:00

dependency_monitor.py
README.md
requirements.txt
task_router.py

README.md

Local-First Fallbacks for Cloud AI

Issue #483: [AUDIT][RISK] Maintain local-first fallbacks for all cloud AI

Problem Statement

The deprecation of OpenAI Codex is a cautionary precedent: any external AI service can be discontinued, rate-limited, or repriced at any time, so every cloud-backed task needs a local fallback.

Current External AI Dependencies

| Service | Status | Grade | Use Case | Risk Level |
| --- | --- | --- | --- | --- |
| Perplexity Computer | Active | A | Research, web search | Medium |
| OpenAI Codex | Deprecated | - | Code generation | High (already failed) |
| Claude | Banned | - | General AI | High (banned) |
| Gemini | Retired | - | Multimodal | High (retired) |

Local-First Stack

Core Components

Ollama: Local model serving
llama.cpp: Efficient inference engine
Hermes 4: Local AI assistant
M3 Max: Apple Silicon hardware

Capabilities

Code generation and completion
Text analysis and summarization
Question answering
Creative writing
Data analysis

Mitigation Strategy

1. Task Classification

| Task Type | Local Capability | External Dependency | Fallback Strategy |
| --- | --- | --- | --- |
| Code generation | ✓ High | Codex (deprecated) | Use local Hermes 4 |
| Web search | ✗ Low | Perplexity | Use local browser automation |
| Document analysis | ✓ High | None | Use local models |
| Creative writing | ✓ High | None | Use local models |
| Data analysis | ✓ Medium | None | Use local Python + models |
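
The classification above can also be kept as data so that routing code and documentation share one source. The following is a minimal sketch, not the repository's implementation: `TaskProfile`, `TASK_MATRIX`, `runs_locally`, and `tasks_at_risk` are hypothetical names, and the lowercase keys are an assumed encoding of the table rows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskProfile:
    local_capability: str               # "high", "medium", or "low", as rated in the table
    external_dependency: Optional[str]  # None when the task has no external dependency
    fallback: str                       # fallback strategy from the table


# Hypothetical encoding of the task classification table.
TASK_MATRIX = {
    "code_generation": TaskProfile("high", "codex", "local Hermes 4"),
    "web_search": TaskProfile("low", "perplexity", "local browser automation"),
    "document_analysis": TaskProfile("high", None, "local models"),
    "creative_writing": TaskProfile("high", None, "local models"),
    "data_analysis": TaskProfile("medium", None, "local Python + models"),
}


def runs_locally(task_type: str) -> bool:
    """True when the table rates local capability as high or medium."""
    return TASK_MATRIX[task_type].local_capability in ("high", "medium")


def tasks_at_risk(unavailable_service: str) -> list:
    """Task types whose external dependency is the given service."""
    return [
        name
        for name, profile in TASK_MATRIX.items()
        if profile.external_dependency == unavailable_service
    ]
```

Calling `tasks_at_risk("perplexity")` lists the task types that need a fallback when that service disappears, which is the question the audit in issue #483 asks.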

2. Graceful Degradation Path

Level 1: Full External AI

Perplexity for research
External APIs for specialized tasks
Best quality, highest cost

Level 2: Hybrid Mode

Local models for core tasks
External AI for specialized tasks
Balanced quality and cost

Level 3: Local-Only Mode

All tasks handled locally
No external dependencies
Lower quality, no API cost
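
One way to choose among the three levels at runtime is sketched below. It assumes the `AI_MODE=local-only` override used in the runbook; `DegradationLevel`, `select_level`, and the `prefer_quality` flag are hypothetical, and the policy of defaulting to hybrid mode when both sides are healthy is an assumption rather than something this README specifies.

```python
import os
from enum import IntEnum


class DegradationLevel(IntEnum):
    FULL_EXTERNAL = 1
    HYBRID = 2
    LOCAL_ONLY = 3


def select_level(local_ok, external_ok, prefer_quality=False, env=None):
    """Pick a degradation level from dependency health and the AI_MODE override."""
    env = os.environ if env is None else env
    # An explicit override or an external outage both force local-only mode.
    if env.get("AI_MODE") == "local-only" or not external_ok:
        return DegradationLevel.LOCAL_ONLY
    # With local models down, or when quality is preferred, use external AI for everything.
    if not local_ok or prefer_quality:
        return DegradationLevel.FULL_EXTERNAL
    # Assumed default: local models for core tasks, external AI for the rest.
    return DegradationLevel.HYBRID
```

If both local and external checks fail, this sketch still returns `LOCAL_ONLY`; the caller has to treat that case as a full outage.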

3. Implementation

A. Local Model Enhancement

# Collect training data and benchmark candidate local models
python3 scripts/local-models/collect_training_data.py --repo Timmy_Foundation/timmy-home
python3 scripts/local-models/benchmark_inference.py --models "hermes4,llama3-8b"

# Create specialized models
ollama create timmy-code -f Modelfile.code
ollama create timmy-research -f Modelfile.research

B. Task Routing System

class TaskRouter:
    # _try_external_first, _try_balanced, and _can_handle are not shown in this excerpt.
    def __init__(self):
        self.local_models = ["hermes4", "llama3-8b", "mistral-7b"]
        self.external_services = ["perplexity"]

    def route_task(self, task_type, priority="balanced"):
        if priority == "local-first":
            return self._try_local_first(task_type)
        elif priority == "quality-first":
            return self._try_external_first(task_type)
        else:  # balanced
            return self._try_balanced(task_type)

    def _try_local_first(self, task_type):
        # Try each local model in order of preference
        for model in self.local_models:
            if self._can_handle(task_type, model):
                return {"provider": "local", "model": model}

        # No local model can handle the task: fall back to the first external service
        return {"provider": "external", "service": self.external_services[0]}
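
The excerpt above leaves the other two priorities and the capability check undefined. The sketch below fills them in under stated assumptions so the routing idea can be run end to end. `SketchRouter`, `LOCAL_RATING`, `EXTERNAL_TASKS`, `LOCAL_TOOL_FALLBACK`, and the `is_up` probe are hypothetical, and "balanced" is interpreted here as local first only for tasks rated high locally.

```python
# Assumed capability data, loosely following the task classification table.
LOCAL_RATING = {
    "code_generation": "high",
    "web_search": "low",
    "document_analysis": "high",
    "creative_writing": "high",
    "data_analysis": "medium",
}
EXTERNAL_TASKS = {"perplexity": {"web_search"}}
LOCAL_TOOL_FALLBACK = {"web_search": "browser_automation"}


class SketchRouter:
    def __init__(self, local_models, external_services, is_up=lambda name: True):
        self.local_models = list(local_models)
        self.external_services = list(external_services)
        self.is_up = is_up  # availability probe, e.g. backed by a dependency monitor

    def _local(self, task_type):
        # Prefer a local model when the task is rated high or medium locally.
        if LOCAL_RATING.get(task_type) in ("high", "medium"):
            for model in self.local_models:
                if self.is_up(model):
                    return {"provider": "local", "model": model}
        # Otherwise fall back to a local non-model tool, if one is listed.
        tool = LOCAL_TOOL_FALLBACK.get(task_type)
        return {"provider": "local", "tool": tool} if tool else None

    def _external(self, task_type):
        for service in self.external_services:
            if task_type in EXTERNAL_TASKS.get(service, ()) and self.is_up(service):
                return {"provider": "external", "service": service}
        return None

    def route_task(self, task_type, priority="balanced"):
        if priority == "local-first":
            order = (self._local, self._external)
        elif priority == "quality-first":
            order = (self._external, self._local)
        elif priority == "balanced":
            strong_locally = LOCAL_RATING.get(task_type) == "high"
            order = (self._local, self._external) if strong_locally else (self._external, self._local)
        else:
            raise ValueError(f"unknown priority: {priority!r}")
        for attempt in order:
            choice = attempt(task_type)
            if choice is not None:
                return choice
        raise LookupError(f"no available provider for {task_type!r}")
```

Raising `LookupError` when nothing can serve a task makes a total outage visible to the caller instead of silently returning an unavailable provider.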

C. Monitoring and Alerting

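class DependencyMonitor:
    # _check_ollama, _check_model, _check_perplexity, and _alert are not shown in this excerpt.
    def check_dependencies(self):
        status = {}

        # Check local models
        status["ollama"] = self._check_ollama()
        status["hermes4"] = self._check_model("hermes4")

        # Check external services
        status["perplexity"] = self._check_perplexity()

        # Alert on failures in either direction
        if not status["ollama"]:
            self._alert("Ollama is down - switching to external services")
        if not status["perplexity"]:
            self._alert("Perplexity is down - switching to local-only mode")

        return status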
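
A self-contained variant of the monitor is sketched below, with the individual checks injected as callables so it can be exercised without any service running. `SketchMonitor` and `http_ok` are hypothetical names. The usage note assumes Ollama is listening on its default address and uses its model-listing endpoint; adjust it if your setup differs.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET request to url succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


class SketchMonitor:
    def __init__(self, checks, alert=print):
        self.checks = checks  # name -> zero-argument callable returning bool
        self.alert = alert    # callable that receives an alert message

    def check_dependencies(self):
        status = {}
        for name, check in self.checks.items():
            try:
                status[name] = bool(check())
            except Exception:
                # A crashing check is treated the same as a failed dependency.
                status[name] = False
            if not status[name]:
                self.alert(f"{name} is unavailable")
        return status
```

For example, `SketchMonitor({"ollama": lambda: http_ok("http://localhost:11434/api/tags")})` would probe a local Ollama server. A check for Perplexity is left out because this README does not name an endpoint for it.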

4. Documentation Requirements

A. Task Documentation

For each task type, document:

Local model capability
External service requirement
Fallback strategy
Quality comparison

B. Runbook

## If Perplexity becomes unavailable:

1. **Immediate Action**: Switch to local-only mode
   ```bash
   export AI_MODE=local-only
   ```

2. **Research Tasks**: Use local browser automation
   ```python
   def local_research(query):
       # browser_navigate and browser_type are browser-automation helpers
       # that this snippet does not define
       browser_navigate("https://google.com")
       browser_type(query)
       # Extract results manually
   ```

3. **Quality Monitoring**: Track local vs external quality
   ```bash
   python3 scripts/monitor_quality.py --compare local external
   ```

4. **Escalation**: If quality drops below threshold
   - Notify Alexander
   - Consider temporary external service
   - Plan for permanent local solution
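
The escalation step depends on a quality threshold that the runbook does not define. One possible reading is sketched below: escalate when the local score falls below a fixed fraction of the external baseline. `should_escalate` and the default `min_ratio` of 0.8 are hypothetical, and scores are assumed to be on a common scale where higher is better.

```python
def should_escalate(local_quality, external_quality, min_ratio=0.8):
    """Return True when local quality is below min_ratio of the external baseline."""
    if not 0.0 < min_ratio <= 1.0:
        raise ValueError("min_ratio must be in (0, 1]")
    return local_quality < min_ratio * external_quality
```

A relative threshold keeps the rule meaningful across task types whose absolute scores differ.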


5. Testing and Validation

A. Dependency Failure Tests

```bash
# Test local-only mode (variable scoped to this one command)
AI_MODE=local-only python3 scripts/test_local_only.py

# Test external service failure with a deliberately invalid key
PERPLEXITY_API_KEY=invalid python3 scripts/test_fallback.py

# Test graceful degradation
python3 scripts/test_degradation.py --level 1 2 3
```

B. Quality Benchmarks

def benchmark_local_vs_external():
    # run_local, run_external, and evaluate are not defined in this excerpt;
    # each run_* result is expected to expose a .time attribute.
    tasks = [
        "code_generation",
        "web_search",
        "document_analysis",
        "creative_writing"
    ]

    results = {}
    for task in tasks:
        local_result = run_local(task)
        external_result = run_external(task)

        # Record quality and latency side by side for each task
        results[task] = {
            "local_quality": evaluate(local_result),
            "external_quality": evaluate(external_result),
            "local_time": local_result.time,
            "external_time": external_result.time
        }

    return results
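
A runnable version of the same comparison is sketched below, with the runners and the scoring function passed in as arguments and the timing measured by the harness itself. `TimedResult`, `timed`, and `benchmark` are hypothetical names, and the signature is an assumption about how the undefined helpers above might be supplied.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TimedResult:
    output: Any
    time: float  # wall-clock seconds spent producing the output


def timed(fn: Callable[[str], Any], task: str) -> TimedResult:
    """Run fn(task) and record how long it took."""
    start = time.perf_counter()
    output = fn(task)
    return TimedResult(output, time.perf_counter() - start)


def benchmark(tasks, run_local, run_external, evaluate):
    """Compare local and external runners on each task.

    evaluate receives the raw output of a runner and returns a quality score.
    """
    results = {}
    for task in tasks:
        local = timed(run_local, task)
        external = timed(run_external, task)
        results[task] = {
            "local_quality": evaluate(local.output),
            "external_quality": evaluate(external.output),
            "local_time": local.time,
            "external_time": external.time,
        }
    return results
```

Passing the runners in keeps the harness testable with stubs and lets the same loop run in local-only mode by supplying the local runner for both sides.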

Acceptance Criteria

Document which tasks require external AI vs. can run locally
Ensure Ollama + llama.cpp + Hermes 4 can handle core tasks independently
Build graceful degradation path if external agents become unavailable
Create monitoring and alerting for dependency failures
Test fallback mechanisms
