[OpenClaw 2/8] Install and configure Ollama on Hermes VPS #725

Closed
opened 2026-03-21 13:57:23 +00:00 by perplexity · 1 comment

Description

Install Ollama on the Hermes VPS and pull a small, capable model for agentic tool-calling.

Tasks

  1. Install Ollama — Follow official install (curl -fsSL https://ollama.com/install.sh | sh)
  2. Pull a small model — Based on Kimi's research spike, pull the recommended model (likely Qwen 2.5 7B Q4_K_M or similar)
  3. Test basic inference — Verify the model responds to prompts
  4. Test tool-calling — Send a tool-calling prompt via the OpenAI-compatible API (http://localhost:11434/v1/chat/completions) and verify it produces valid tool calls
  5. Benchmark — Time a typical response, check RAM usage, and confirm there is no disk thrashing
  6. Configure as service — Ensure Ollama starts on boot (systemctl enable ollama)
  7. Set context window — Configure for a 64K+ context window if the model supports it
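A minimal sketch of the install and service-setup tasks above. The exact model tag is an assumption pending Kimi's research spike, and the context-window mechanism depends on the installed Ollama version (recent releases read an `OLLAMA_CONTEXT_LENGTH` environment variable; older ones need a Modelfile with `PARAMETER num_ctx`):

```shell
# Install Ollama via the official script
curl -fsSL https://ollama.com/install.sh | sh

# Pull a 4-bit quantized small model (tag assumed; adjust per research spike)
ollama pull qwen2.5:7b-instruct-q4_K_M

# Quick smoke test of basic inference
ollama run qwen2.5:7b-instruct-q4_K_M "Reply with the single word: pong"

# Start on boot and start now
sudo systemctl enable --now ollama

# Raise the context window via a systemd drop-in; under [Service] add:
#   Environment="OLLAMA_CONTEXT_LENGTH=65536"
sudo systemctl edit ollama
sudo systemctl restart ollama
```

Note this is an ops fragment to adapt on the VPS, not a turnkey script; verify the variable name against the running Ollama version before relying on it.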

Acceptance Criteria

  • Ollama running on VPS with a working model
  • Model can produce valid tool-calling JSON responses
  • RAM usage is sustainable (no OOM, no excessive swap)
  • Ollama starts automatically on reboot
  • Response latency is documented
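The tool-calling criterion can be checked with a small script against the OpenAI-compatible endpoint. This is a sketch under assumptions: the `get_weather` tool, the helper names, and the model tag are invented for illustration; only the endpoint URL and response shape follow the OpenAI chat-completions convention Ollama exposes. The parsing logic is exercised here against a canned response so it runs without a live server:

```python
import json

# Endpoint from the issue; hypothetical model tag.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL = "qwen2.5:7b-instruct-q4_K_M"

def build_request(model: str, prompt: str) -> dict:
    """Assemble a chat-completion payload with one example tool definition."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # invented tool, for illustration only
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

def extract_tool_calls(response: dict) -> list[dict]:
    """Pull (name, parsed-arguments) pairs out of a chat-completion response."""
    calls = response["choices"][0]["message"].get("tool_calls", [])
    return [
        {"name": c["function"]["name"],
         "args": json.loads(c["function"]["arguments"])}
        for c in calls
    ]

# Canned response shaped like an OpenAI-style tool-calling reply, so the
# extraction logic can be verified offline.
sample = {
    "choices": [{
        "message": {
            "role": "assistant",
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"city\": \"Berlin\"}",
                },
            }],
        },
    }],
}

parsed = extract_tool_calls(sample)
print(parsed)
```

To run the real check, POST `build_request(...)` to `OLLAMA_URL` (e.g. with `requests.post`) and pass the JSON body to `extract_tool_calls`; the criterion passes if the arguments parse as valid JSON matching the tool schema.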

Depends on

  • VPS audit issue (Phase 1, issue above)
  • Kimi research: best small LLMs for tool-calling

Constraints

  • Do NOT install a model larger than 8B params
  • Prefer 4-bit quantization to save RAM
  • If the VPS can't handle it, escalate to Alex with findings
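For the benchmark task and the RAM constraint, a small helper can record the latency numbers the acceptance criteria ask for. This is a hypothetical sketch: the workload below is a stand-in `sleep`, to be replaced with a real request to `http://localhost:11434/v1/chat/completions` when benchmarking on the VPS. `MemAvailable` is read from `/proc/meminfo`, so the RAM check is Linux-only:

```python
import time
import statistics

def time_calls(fn, n=5):
    """Call fn n times; return the median latency in seconds plus all samples."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), samples

def available_ram_mib():
    """Read MemAvailable from /proc/meminfo (Linux only), in MiB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) // 1024
    raise RuntimeError("MemAvailable not found")

# Stand-in workload; swap in a real chat-completion request on the VPS.
median, samples = time_calls(lambda: time.sleep(0.01), n=3)

try:
    ram = available_ram_mib()
except (FileNotFoundError, RuntimeError):
    ram = None  # non-Linux host

print(f"median latency: {median:.3f}s, free RAM: {ram} MiB")
```

Capturing `available_ram_mib()` before, during, and after a run gives a rough view of whether the model fits without pushing the box into swap.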

Parent epic: rockachopa/Timmy-time-dashboard#663


Migrated from perplexity/the-matrix#116

kimi was assigned by Timmy 2026-03-21 18:02:07 +00:00
kimi added this to the OpenClaw Sovereignty milestone 2026-03-21 20:24:22 +00:00
claude added the rejected-direction label 2026-03-23 13:51:16 +00:00

🧹 Closed — Rejected Direction (OpenClaw)

OpenClaw direction was explicitly rejected by the principal. The harness is the product — sovereign AI runs on Hermes with local models, not OpenClaw.

Ref: Deep Backlog Triage #1076. Reopen if needed.


Reference: Rockachopa/Timmy-time-dashboard#725