[EPIC] Saiyah Architecture — OmiCodex & Claw Code Dual Implementation #232

Open
opened 2026-04-01 15:14:54 +00:00 by ezra · 1 comment

EPIC: Saiyah Architecture — OmiCodex & Claw Code Dual Implementation

"Harness is a clever way of saying you need a horse underneath. We have fully disassembled codecs and cloud code to where we have the full operational capacity without needing to make any option calls to OpenAI or OpenRouter."
— Alexander Whitestone, Trenches Dispatch 2026-04-01


Strategic Context

This epic implements the Saiyah architecture — eliminating proprietary SDK/API dependencies ("harnesses") that leak telemetry. Instead, we deploy native operational capacity across two runtimes:

| Target | Runtime | Implementation | Current State |
|--------|---------|----------------|---------------|
| Hermes Agent | OmiCodex | BPS-native protocol | Not started |
| Allegro BPS | Claw Code | Rust substrate runtime | Built, needs integration |

Part A: OmiCodex Implementation — Hermes Agent BPS

Definition

OmiCodex is the omnidirectional codec protocol — full disassembly of media/data codecs enabling local operational capacity without external API calls.

Objectives

  • Disassemble and catalog all critical codecs (audio, video, image, document)
  • Implement codec handlers as native Hermes tools (no external SDK)
  • Create self-contained processing pipeline:
    • Media transcoding (ffmpeg-free where possible)
    • Document parsing (no cloud OCR)
    • Audio processing (no cloud STT/TTS)
  • Establish OmiCodex as default for media ops in Hermes
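The self-contained pipeline above implies a routing layer that dispatches each media type to a locally registered handler. A minimal sketch of that idea follows; the class and method names (`CodecHandler`, `register`, `transcode`) are illustrative assumptions, not the settled OmiCodex API:

```python
from abc import ABC, abstractmethod

class CodecHandler(ABC):
    """One handler per media family; all processing stays on the local host."""
    media_types: tuple = ()

    @abstractmethod
    def decode(self, data: bytes) -> bytes: ...

    @abstractmethod
    def encode(self, data: bytes) -> bytes: ...

class OmiCodexEngine:
    """Unified codec interface: routes a payload to its registered handler."""

    def __init__(self) -> None:
        self._handlers: dict = {}

    def register(self, handler: CodecHandler) -> None:
        for media_type in handler.media_types:
            self._handlers[media_type] = handler

    def transcode(self, media_type: str, data: bytes) -> bytes:
        handler = self._handlers.get(media_type)
        if handler is None:
            # fail loudly rather than silently falling back to a cloud API
            raise KeyError(f"no local handler registered for {media_type!r}")
        return handler.encode(handler.decode(data))
```

Failing on an unregistered media type (instead of falling back to an external service) is what keeps the "no external SDK" objective enforceable at runtime.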

Deliverables

  1. tools/omicodex/ — Codec toolset directory
  2. OmiCodexEngine class — Unified codec interface
  3. Integration tests proving zero external API calls during media processing
  4. Documentation: "OmiCodex Protocol Specification"
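Deliverable 3 (integration tests proving zero external API calls) can be enforced mechanically by failing any test in which code under test opens an outbound socket. One possible sketch, assuming a Python test harness; the `no_network` guard name is hypothetical:

```python
import socket
from contextlib import contextmanager

class NetworkBlocked(RuntimeError):
    """Raised when code under test attempts an outbound connection."""

@contextmanager
def no_network():
    """Shadow socket.socket.connect so any connection attempt fails loudly."""
    def guard(self, *args, **kwargs):
        raise NetworkBlocked("external network call attempted during media processing")
    socket.socket.connect = guard
    try:
        yield
    finally:
        # remove the shadow; the inherited C-level connect resumes
        del socket.socket.connect
```

A test would then wrap the whole media-processing path, e.g. `with no_network(): engine.transcode(...)`, so any hidden SDK call surfaces as a `NetworkBlocked` failure rather than silent telemetry.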

Resources Available

  • 16GB RAM cloud VMs (2x)
  • Mac M3 Max local capacity
  • Full ffmpeg/libav toolchain

Part B: Claw Code Implementation — Allegro BPS

Definition

Claw Code is the Rust-based substrate runtime (11MB, 5ms cold start) implementing the Provider trait, tool registry, and MCP-native architecture.

Current State (from #227, #230)

  • ✅ Binary built: /root/wizards/substrate/claw-code/rust/target/release/rusty-claude-cli (11MB)
  • ✅ Substrate runtime functional
  • ✅ Ollama bridge working (local fallback)
  • ⚠️ Kimi bridge written but needs valid API key
  • ⚠️ Hardcoded Anthropic — needs Kimi/OpenRouter support
  • ❌ Not yet integrated into Allegro BPS

Objectives

  • Port Claw Code to support configurable providers (not just Anthropic)
  • Integrate Claw runtime into Allegro BPS processing pipeline
  • Eliminate OpenRouter dependency (stop telemetry leakage)
  • Achieve 5ms cold start for Allegro agent tasks
  • Validate E2E with rich data production
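The provider-abstraction objective amounts to selecting an implementation of the Provider trait from configuration rather than hardcoding Anthropic. The real change lands in Claw Code's Rust trait; the sketch below only illustrates the shape in Python, and every name in it (`OllamaProvider`, `provider_from_config`, the config keys) is an assumption:

```python
from dataclasses import dataclass

@dataclass
class OllamaProvider:
    """Local fallback provider (mirrors the working Ollama bridge)."""
    name: str = "ollama"
    base_url: str = "http://127.0.0.1:11434"

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("POST to the local Ollama endpoint")

# registry of available backends; new providers register here, not in call sites
PROVIDERS = {"ollama": OllamaProvider}

def provider_from_config(cfg: dict):
    """Select a provider by name instead of hardcoding one vendor."""
    try:
        factory = PROVIDERS[cfg["provider"]]
    except KeyError as exc:
        raise ValueError(f"unknown provider {cfg.get('provider')!r}") from exc
    return factory(**cfg.get("options", {}))
```

Keeping the registry as the single extension point means adding Kimi or OpenRouter support later touches one table, not every call site.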

Deliverables

  1. Modified Claw Code with provider-agnostic client
  2. Allegro BPS harness → Claw bridge
  3. Performance benchmarks (cold start, throughput)
  4. Telemetry-free verification report

Related Issues

  • #230 — Strategic Pivot: Claw Code as Guiding Star
  • #227 — Claw Code Production Report
  • #219 — Allegro Performance Failure RCA
  • #226 — Migrate Allegro from Robe to Harness

Cross-Cutting Concerns

Anti-Goals (Explicitly Out of Scope)

  • Extending wizard harness (Hermes already has its own)
  • Using proprietary SDKs (Kimi Code, Claude Code, etc.)
  • Telemetry-leaking integrations

Success Criteria

  1. Hermes can process media/documents via OmiCodex with zero external API calls
  2. Allegro runs on Claw Code substrate with <10ms cold start
  3. Both systems operate independently of OpenAI/OpenRouter/Kimi SDKs
  4. Full operational capacity demonstrated on 16GB cloud + Mac hardware

Tracking

| Sub-task | Owner | Issue | Status |
|----------|-------|-------|--------|
| OmiCodex codec disassembly | TBD | TBD | Not started |
| OmiCodex Hermes integration | TBD | TBD | Not started |
| Claw Code provider abstraction | TBD | TBD | Not started |
| Claw Code → Allegro BPS bridge | TBD | TBD | Not started |
| E2E validation & burn report | TBD | TBD | Not started |

This epic represents the full maturation of the harness engineering initiative — from "needing a horse" to owning the stable.


UPDATE 2026-04-01: Local Opus Decision

Alexander directive: Opus runs local, not API.

This means:

  • No Anthropic API calls for Opus-class reasoning
  • Local LLM inference for heavy cognitive tasks (llama.cpp, vLLM, or similar)
  • 16GB cloud VMs + Mac M3 Max must serve full Opus-equivalent capacity
  • Part of OmiCodex scope: codec + cognition, both local

Revised scope for #233: OmiCodex includes local Opus inference runtime alongside codec disassembly.

ezra added the architecture, epic, harness-engineering, saiyah labels 2026-04-01 15:14:55 +00:00
Comment by ezra (Author, Member):

UPDATE: Local Opus Decision (2026-04-01)

Alexander directive: Opus runs local, not API.

This means:

  • No Anthropic API calls for Opus-class reasoning
  • Local LLM inference for heavy cognitive tasks (llama.cpp, vLLM)
  • 16GB cloud VMs + Mac M3 Max serve full Opus-equivalent capacity
  • Part of OmiCodex scope: codec + cognition, both local

Revised Scope for #233

OmiCodex now includes local Opus inference runtime alongside codec disassembly.

Candidate Models

| Model | Size | Quant | Fits 16GB? | Notes |
|-------|------|-------|------------|-------|
| DeepSeek-V3 | 671B MoE | Q4_K_M | ⚠️ Borderline | Best reasoning, huge |
| QwQ-32B | 32B | Q4_K_M | ✅ Yes | Reasoning specialist |
| Llama 4 Scout | 109B | Q4_K_M | ⚠️ Tight | Multimodal |
| Gemma 3 27B | 27B | Q4_K_M | ✅ Yes | Efficient, Google |
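A rough first-pass filter for the "fits 16GB?" column: weight footprint ≈ parameters (billions) × bits-per-weight ÷ 8. Q4_K_M averages roughly 4.5 bits/weight; this estimate is an approximation and excludes KV cache, activations, and runtime overhead, so treat borderline results skeptically:

```python
def gguf_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Rough weight-only footprint (GB) of a quantized GGUF model.

    Q4_K_M averages ~4.5 bits/weight (approximation); real files vary,
    and KV cache / runtime overhead come on top of this figure.
    """
    return params_billion * bits_per_weight / 8

# e.g. Gemma 3 27B at Q4_K_M: roughly 15.2 GB of weights — tight on a 16GB VM
```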

Tasks Added to #233

  • Evaluate local Opus-equivalent options
  • Benchmark on 16GB cloud VM (Q4_K_M, Q5_K_M)
  • Benchmark on Mac M3 Max (Metal GGUF)
  • Implement local inference bridge in Hermes
  • Validate: Opus-quality reasoning without API telemetry
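For the local inference bridge task, llama.cpp's `llama-server` exposes an OpenAI-compatible `/v1/chat/completions` endpoint, so the bridge can speak a familiar wire format while every request stays on the host. A minimal sketch; the port and function names are assumptions:

```python
import json
import urllib.request

LLAMA_SERVER = "http://127.0.0.1:8080"  # assumed local llama-server address

def build_request(prompt: str, base_url: str = LLAMA_SERVER):
    """Build the URL and JSON body for one chat turn against llama-server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return f"{base_url}/v1/chat/completions", json.dumps(payload).encode()

def local_chat(prompt: str, timeout: float = 120.0) -> str:
    """Send one chat turn to the local server; no request leaves the host."""
    url, body = build_request(prompt)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        answer = json.load(resp)
    return answer["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same bridge code can be pointed at any local server that speaks that schema (vLLM also offers one), which keeps the model evaluation tasks above decoupled from the bridge implementation.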

Saiyah principle: No harness, no telemetry, just operational capacity.

Reference: Timmy_Foundation/timmy-home#232