Service offerings: Qualify local inference claims for Enterprise package #907

Closed
opened 2026-04-06 22:46:47 +00:00 by Timmy · 0 comments

The Enterprise package ($40k+) lists 'Local LLM inference stack (no API dependency)' as a deliverable.

Current reality: Ollama is running with only qwen3:4b (~2.5GB). This is a capable small model, but calling it a 'full sovereign inference stack with no API dependency' for a $40k+ engagement is misleading.
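For the record, the current state is easy to verify against Ollama's local API. A minimal sketch, assuming the default daemon endpoint at `localhost:11434` (adjust if `OLLAMA_HOST` is set differently):

```python
import json
import urllib.request

# List the models actually present in the local Ollama registry.
# GET /api/tags returns {"models": [{"name": ..., "size": ...}, ...]},
# with "size" reported in bytes.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"

with urllib.request.urlopen(OLLAMA_TAGS_URL) as resp:
    models = json.load(resp)["models"]

for m in models:
    print(f"{m['name']}: {m['size'] / 1e9:.1f} GB")
```

Running this today should print only `qwen3:4b` at roughly 2.5 GB, which is the gap this issue is about.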

Action: Either (a) deploy a model that matches the claim (e.g., 70B-parameter class, potentially via RunPod offload) and document it, or (b) reframe the deliverable as 'local inference capability with API fallback'; a sketch of (b) follows below.
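To make option (b) concrete, the reframed deliverable maps to a local-first client that tries the Ollama daemon and falls back to a hosted provider on failure. A minimal sketch, assuming the current endpoint and model; `hosted_fallback` is a hypothetical placeholder, not part of the existing stack:

```python
import json
import urllib.error
import urllib.request

# Local-first inference with hosted fallback (option (b) pattern).
# POST /api/generate is Ollama's non-streaming completion endpoint;
# LOCAL_MODEL reflects what is loaded today.
LOCAL_URL = "http://localhost:11434/api/generate"
LOCAL_MODEL = "qwen3:4b"

def generate(prompt: str) -> str:
    payload = json.dumps({
        "model": LOCAL_MODEL,
        "prompt": prompt,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        LOCAL_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            return json.load(resp)["response"]
    except (urllib.error.URLError, TimeoutError):
        # Daemon down, model missing, or request timed out:
        # hand off to the hosted provider.
        return hosted_fallback(prompt)

def hosted_fallback(prompt: str) -> str:
    # Hypothetical: wire up whichever hosted API the engagement uses.
    raise NotImplementedError("hosted provider not configured")
```

Whichever option is chosen, the package copy and this code path should agree: the phrase 'no API dependency' is only honest if the fallback branch is removed.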

See Bezalel review.
bezalel was assigned by Timmy 2026-04-06 22:46:47 +00:00
Timmy closed this issue 2026-04-07 02:46:30 +00:00

Reference: Timmy_Foundation/the-nexus#907