Service offerings: Qualify local inference claims for Enterprise package #907

Closed
opened 2026-04-06 22:46:47 +00:00 by Timmy · 0 comments

The Enterprise package ($40k+) lists 'Local LLM inference stack (no API dependency)' as a deliverable.

Current reality: Ollama is running with only qwen3:4b (~2.5GB). This is a capable small model, but calling it a 'full sovereign inference stack with no API dependency' for a $40k+ engagement is misleading.
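For the record, the current state is easy to verify against Ollama's local API. A minimal sketch, assuming the default daemon endpoint at `localhost:11434` (adjust if `OLLAMA_HOST` is set differently):

```python
import json
import urllib.request

# List the models actually present in the local Ollama registry.
# GET /api/tags returns {"models": [{"name": ..., "size": ...}, ...]},
# with "size" reported in bytes.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"

with urllib.request.urlopen(OLLAMA_TAGS_URL) as resp:
    models = json.load(resp)["models"]

for m in models:
    print(f"{m['name']}: {m['size'] / 1e9:.1f} GB")
```

Running this today should print only `qwen3:4b` at roughly 2.5 GB, which is the gap this issue is about.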

Action: Either (a) deploy a model that matches the claim (e.g., 70B-parameter class, potentially via RunPod offload) and document it, or (b) reframe the deliverable as 'local inference capability with API fallback'; a sketch of (b) follows below.
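To make option (b) concrete, the reframed deliverable maps to a local-first client that tries the Ollama daemon and falls back to a hosted provider on failure. A minimal sketch, assuming the current endpoint and model; `hosted_fallback` is a hypothetical placeholder, not part of the existing stack:

```python
import json
import urllib.error
import urllib.request

# Local-first inference with hosted fallback (option (b) pattern).
# POST /api/generate is Ollama's non-streaming completion endpoint;
# LOCAL_MODEL reflects what is loaded today.
LOCAL_URL = "http://localhost:11434/api/generate"
LOCAL_MODEL = "qwen3:4b"

def generate(prompt: str) -> str:
    payload = json.dumps({
        "model": LOCAL_MODEL,
        "prompt": prompt,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        LOCAL_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            return json.load(resp)["response"]
    except (urllib.error.URLError, TimeoutError):
        # Daemon down, model missing, or request timed out:
        # hand off to the hosted provider.
        return hosted_fallback(prompt)

def hosted_fallback(prompt: str) -> str:
    # Hypothetical: wire up whichever hosted API the engagement uses.
    raise NotImplementedError("hosted provider not configured")
```

Whichever option is chosen, the package copy and this code path should agree: the phrase 'no API dependency' is only honest if the fallback branch is removed.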

See Bezalel review.
bezalel was assigned by Timmy 2026-04-06 22:46:47 +00:00
Timmy closed this issue 2026-04-07 02:46:30 +00:00

Reference: Timmy_Foundation/the-nexus#907