[EPIC] Saiyah Architecture — OmiCodex & Claw Code Dual Implementation #232

Open
opened 2026-04-01 15:14:54 +00:00 by ezra · 1 comment

EPIC: Saiyah Architecture — OmiCodex & Claw Code Dual Implementation

"Harness is a clever way of saying you need a horse underneath. We have fully disassembled codecs and cloud code to where we have the full operational capacity without needing to make any option calls to OpenAI or OpenRouter."
— Alexander Whitestone, Trenches Dispatch 2026-04-01


Strategic Context

This epic implements the Saiyah architecture — eliminating proprietary SDK/API dependencies ("harnesses") that leak telemetry. Instead, we deploy native operational capacity across two runtimes:

| Target | Runtime | Implementation | Current State |
|--------|---------|----------------|---------------|
| Hermes Agent | OmiCodex | BPS-native protocol | Not started |
| Allegro BPS | Claw Code | Rust substrate runtime | Built, needs integration |

Part A: OmiCodex Implementation — Hermes Agent BPS

Definition

OmiCodex is the omnidirectional codec protocol — full disassembly of media/data codecs enabling local operational capacity without external API calls.

Objectives

  • Disassemble and catalog all critical codecs (audio, video, image, document)
  • Implement codec handlers as native Hermes tools (no external SDK)
  • Create self-contained processing pipeline:
    • Media transcoding (ffmpeg-free where possible)
    • Document parsing (no cloud OCR)
    • Audio processing (no cloud STT/TTS)
  • Establish OmiCodex as default for media ops in Hermes
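The self-contained pipeline above implies a routing layer that dispatches each media type to a locally registered handler. A minimal sketch of that idea follows; the class and method names (`CodecHandler`, `register`, `transcode`) are illustrative assumptions, not the settled OmiCodex API:

```python
from abc import ABC, abstractmethod

class CodecHandler(ABC):
    """One handler per media family; all processing stays on the local host."""
    media_types: tuple = ()

    @abstractmethod
    def decode(self, data: bytes) -> bytes: ...

    @abstractmethod
    def encode(self, data: bytes) -> bytes: ...

class OmiCodexEngine:
    """Unified codec interface: routes a payload to its registered handler."""

    def __init__(self) -> None:
        self._handlers: dict = {}

    def register(self, handler: CodecHandler) -> None:
        for media_type in handler.media_types:
            self._handlers[media_type] = handler

    def transcode(self, media_type: str, data: bytes) -> bytes:
        handler = self._handlers.get(media_type)
        if handler is None:
            # fail loudly rather than silently falling back to a cloud API
            raise KeyError(f"no local handler registered for {media_type!r}")
        return handler.encode(handler.decode(data))
```

Failing on an unregistered media type (instead of falling back to an external service) is what keeps the "no external SDK" objective enforceable at runtime.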

Deliverables

  1. tools/omicodex/ — Codec toolset directory
  2. OmiCodexEngine class — Unified codec interface
  3. Integration tests proving zero external API calls during media processing
  4. Documentation: "OmiCodex Protocol Specification"
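Deliverable 3 (integration tests proving zero external API calls) can be enforced mechanically by failing any test in which code under test opens an outbound socket. One possible sketch, assuming a Python test harness; the `no_network` guard name is hypothetical:

```python
import socket
from contextlib import contextmanager

class NetworkBlocked(RuntimeError):
    """Raised when code under test attempts an outbound connection."""

@contextmanager
def no_network():
    """Shadow socket.socket.connect so any connection attempt fails loudly."""
    def guard(self, *args, **kwargs):
        raise NetworkBlocked("external network call attempted during media processing")
    socket.socket.connect = guard
    try:
        yield
    finally:
        # remove the shadow; the inherited C-level connect resumes
        del socket.socket.connect
```

A test would then wrap the whole media-processing path, e.g. `with no_network(): engine.transcode(...)`, so any hidden SDK call surfaces as a `NetworkBlocked` failure rather than silent telemetry.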

Resources Available

  • 16GB RAM cloud VMs (2x)
  • Mac M3 Max local capacity
  • Full ffmpeg/libav toolchain

Part B: Claw Code Implementation — Allegro BPS

Definition

Claw Code is the Rust-based substrate runtime (11MB, 5ms cold start) implementing the Provider trait, tool registry, and MCP-native architecture.

Current State (from #227, #230)

  • ✅ Binary built: /root/wizards/substrate/claw-code/rust/target/release/rusty-claude-cli (11MB)
  • ✅ Substrate runtime functional
  • ✅ Ollama bridge working (local fallback)
  • ⚠️ Kimi bridge written but needs valid API key
  • ⚠️ Hardcoded Anthropic — needs Kimi/OpenRouter support
  • ❌ Not yet integrated into Allegro BPS

Objectives

  • Port Claw Code to support configurable providers (not just Anthropic)
  • Integrate Claw runtime into Allegro BPS processing pipeline
  • Eliminate OpenRouter dependency (stop telemetry leakage)
  • Achieve 5ms cold start for Allegro agent tasks
  • Validate E2E with rich data production
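The provider-abstraction objective amounts to selecting an implementation of the Provider trait from configuration rather than hardcoding Anthropic. The real change lands in Claw Code's Rust trait; the sketch below only illustrates the shape in Python, and every name in it (`OllamaProvider`, `provider_from_config`, the config keys) is an assumption:

```python
from dataclasses import dataclass

@dataclass
class OllamaProvider:
    """Local fallback provider (mirrors the working Ollama bridge)."""
    name: str = "ollama"
    base_url: str = "http://127.0.0.1:11434"

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("POST to the local Ollama endpoint")

# registry of available backends; new providers register here, not in call sites
PROVIDERS = {"ollama": OllamaProvider}

def provider_from_config(cfg: dict):
    """Select a provider by name instead of hardcoding one vendor."""
    try:
        factory = PROVIDERS[cfg["provider"]]
    except KeyError as exc:
        raise ValueError(f"unknown provider {cfg.get('provider')!r}") from exc
    return factory(**cfg.get("options", {}))
```

Keeping the registry as the single extension point means adding Kimi or OpenRouter support later touches one table, not every call site.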

Deliverables

  1. Modified Claw Code with provider-agnostic client
  2. Allegro BPS harness → Claw bridge
  3. Performance benchmarks (cold start, throughput)
  4. Telemetry-free verification report

Related Issues

  • #230 — Strategic Pivot: Claw Code as Guiding Star
  • #227 — Claw Code Production Report
  • #219 — Allegro Performance Failure RCA
  • #226 — Migrate Allegro from Robe to Harness

Cross-Cutting Concerns

Anti-Goals (Explicitly Out of Scope)

  • Extending wizard harness (Hermes already has its own)
  • Using proprietary SDKs (Kimi Code, Claude Code, etc.)
  • Telemetry-leaking integrations

Success Criteria

  1. Hermes can process media/documents via OmiCodex with zero external API calls
  2. Allegro runs on Claw Code substrate with <10ms cold start
  3. Both systems operate independently of OpenAI/OpenRouter/Kimi SDKs
  4. Full operational capacity demonstrated on 16GB cloud + Mac hardware

Tracking

| Sub-task | Owner | Issue | Status |
|----------|-------|-------|--------|
| OmiCodex codec disassembly | TBD | TBD | Not started |
| OmiCodex Hermes integration | TBD | TBD | Not started |
| Claw Code provider abstraction | TBD | TBD | Not started |
| Claw Code → Allegro BPS bridge | TBD | TBD | Not started |
| E2E validation & burn report | TBD | TBD | Not started |

This epic represents the full maturation of the harness engineering initiative — from "needing a horse" to owning the stable.


UPDATE 2026-04-01: Local Opus Decision

Alexander directive: Opus runs local, not API.

This means:

  • No Anthropic API calls for Opus-class reasoning
  • Local LLM inference for heavy cognitive tasks (llama.cpp, vLLM, or similar)
  • 16GB cloud VMs + Mac M3 Max must serve full Opus-equivalent capacity
  • Part of OmiCodex scope: codec + cognition, both local

Revised scope for #233: OmiCodex includes local Opus inference runtime alongside codec disassembly.

ezra added the architecture, epic, harness-engineering, saiyah labels 2026-04-01 15:14:55 +00:00
Comment by ezra (Author, Member):

UPDATE: Local Opus Decision (2026-04-01)

Alexander directive: Opus runs local, not API.

This means:

  • No Anthropic API calls for Opus-class reasoning
  • Local LLM inference for heavy cognitive tasks (llama.cpp, vLLM)
  • 16GB cloud VMs + Mac M3 Max serve full Opus-equivalent capacity
  • Part of OmiCodex scope: codec + cognition, both local

Revised Scope for #233

OmiCodex now includes local Opus inference runtime alongside codec disassembly.

Candidate Models

| Model | Size | Quant | Fits 16GB? | Notes |
|-------|------|-------|------------|-------|
| DeepSeek-V3 | 671B MoE | Q4_K_M | ⚠️ Borderline | Best reasoning, huge |
| QwQ-32B | 32B | Q4_K_M | ✅ Yes | Reasoning specialist |
| Llama 4 Scout | 109B | Q4_K_M | ⚠️ Tight | Multimodal |
| Gemma 3 27B | 27B | Q4_K_M | ✅ Yes | Efficient, Google |
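A rough first-pass filter for the "fits 16GB?" column: weight footprint ≈ parameters (billions) × bits-per-weight ÷ 8. Q4_K_M averages roughly 4.5 bits/weight; this estimate is an approximation and excludes KV cache, activations, and runtime overhead, so treat borderline results skeptically:

```python
def gguf_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Rough weight-only footprint (GB) of a quantized GGUF model.

    Q4_K_M averages ~4.5 bits/weight (approximation); real files vary,
    and KV cache / runtime overhead come on top of this figure.
    """
    return params_billion * bits_per_weight / 8

# e.g. Gemma 3 27B at Q4_K_M: roughly 15.2 GB of weights — tight on a 16GB VM
```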

Tasks Added to #233

  • Evaluate local Opus-equivalent options
  • Benchmark on 16GB cloud VM (Q4_K_M, Q5_K_M)
  • Benchmark on Mac M3 Max (Metal GGUF)
  • Implement local inference bridge in Hermes
  • Validate: Opus-quality reasoning without API telemetry
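For the local inference bridge task, llama.cpp's `llama-server` exposes an OpenAI-compatible `/v1/chat/completions` endpoint, so the bridge can speak a familiar wire format while every request stays on the host. A minimal sketch; the port and function names are assumptions:

```python
import json
import urllib.request

LLAMA_SERVER = "http://127.0.0.1:8080"  # assumed local llama-server address

def build_request(prompt: str, base_url: str = LLAMA_SERVER):
    """Build the URL and JSON body for one chat turn against llama-server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return f"{base_url}/v1/chat/completions", json.dumps(payload).encode()

def local_chat(prompt: str, timeout: float = 120.0) -> str:
    """Send one chat turn to the local server; no request leaves the host."""
    url, body = build_request(prompt)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        answer = json.load(resp)
    return answer["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same bridge code can be pointed at any local server that speaks that schema (vLLM also offers one), which keeps the model evaluation tasks above decoupled from the bridge implementation.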

Saiyah principle: No harness, no telemetry, just operational capacity.

Reference: Timmy_Foundation/timmy-home#232