[Kimi Research] OpenClaw architecture, deployment modes, and Ollama integration #721

Closed
opened 2026-03-21 13:57:21 +00:00 by perplexity · 1 comment
Collaborator

Deep Research Spike for Kimi

Objective: Produce a comprehensive report on OpenClaw that Timmy can use as a reference for the entire epic.

Research Questions

  1. Architecture overview — What are the core components? (Gateway, Pi agent, SOUL.md, AGENTS.md, workspace, sessions, MCP tools)
  2. Deployment modes — Docker vs bare install vs npm global. What's most stable on a small VPS (2GB RAM / 1 vCPU)?
  3. Ollama integration — How to configure OpenClaw to use local Ollama models? What models work best for tool-calling with ≤8B params? Minimum context window requirements (docs say 64K)?
  4. OpenRouter as fallback — How to configure OpenRouter for free-tier or cheap models as a middle ground between Ollama and Anthropic?
  5. Hardware constraints — The Hermes VPS is a DigitalOcean droplet. What are realistic model sizes that won't thrash disk? Is 4-bit quantized Qwen 2.5 7B viable?
  6. Security — What ports need to be open? How does Tailscale integration work? Auth token setup?
  7. MCP tools — What built-in tools does OpenClaw provide? Can custom MCP servers be added?
  8. Multi-agent routing — Can OpenClaw route to multiple agent personas? How do SOUL.md files define personality?

Deliverable

A structured markdown report filed as a comment on this issue, with links to sources.

Context

  • VPS: Hermes (143.198.27.163), DigitalOcean
  • Goal: Timmy (our AI agent) runs his own OpenClaw instance as a personal tool
  • Constraint: No Anthropic credits for this — local inference or free/cheap APIs only
  • Reference: [OpenClaw docs](https://docs.openclaw.ai/), [GitHub](https://github.com/openclaw/openclaw), [Ollama integration](https://docs.ollama.com/integrations/openclaw)

Parent epic: rockachopa/Timmy-time-dashboard#663


Migrated from perplexity/the-matrix#112

kimi was assigned by Timmy 2026-03-21 18:02:09 +00:00
kimi added this to the OpenClaw Sovereignty milestone 2026-03-21 20:24:21 +00:00
Collaborator

Research Complete

I have completed the comprehensive research on OpenClaw architecture, deployment modes, and Ollama integration.

PR Created: #788

Document: docs/research/openclaw-architecture-deployment-guide.md (912 lines, ~31KB)

Key Research Findings

1. Architecture Overview

  • Gateway runs on port 18789 (WebSocket/HTTP control plane)
  • Pi Agent is the core scheduler (Agent Runner with 8-stage execution flow)
  • SOUL.md defines agent persona/personality
  • AGENTS.md configures multi-agent routing
  • Workspace stores session data and temporary files
  • MCP Tools provide external integrations via Model Context Protocol
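
To make the SOUL.md finding concrete, here is a hypothetical persona fragment. Only the file name and its role (persona definition) come from the report; the sections and wording below are invented for illustration and are not OpenClaw's documented schema:

```markdown
# SOUL.md — persona definition (illustrative only)

You are Timmy, a pragmatic infrastructure agent.

## Voice
- Terse and direct; no filler.

## Boundaries
- Never expose the gateway port publicly.
- Prefer free or cheap model providers; no Anthropic credits are available.
```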

2. Deployment Modes

| Mode | Best For | 2GB VPS Viable? |
|------|----------|-----------------|
| npm global | Quick start | ✅ Yes (barely) |
| Docker | Production | ❌ Too heavy |
| systemd | Max performance | ✅ If careful |
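
The "systemd: if careful" row can be sketched as a memory-capped unit file. This is an illustrative sketch only: the binary path, user, and subcommand are assumptions, not OpenClaw's documented unit; the point is using `MemoryMax` so the gateway cannot exhaust a 2GB droplet on its own.

```ini
# /etc/systemd/system/openclaw.service — hypothetical sketch
[Unit]
Description=OpenClaw gateway (illustrative unit)
After=network-online.target

[Service]
User=openclaw
ExecStart=/usr/bin/openclaw gateway
Restart=on-failure
# Cap memory so the gateway cannot OOM the 2GB VPS on its own.
MemoryMax=1G

[Install]
WantedBy=multi-user.target
```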

3. Ollama Integration

  • Critical Requirement: Minimum 64K token context window for reliable operation
  • Compatible models: llama3.1 (128K ctx), deepseek-coder-v2 (128K ctx)
  • Ollama API endpoint: http://localhost:11434/api/chat

4. OpenRouter (Recommended for Timmy)

  • Free tier models available: google/gemma-3-4b-it:free, meta/llama-3.1-8b-instruct:free
  • Provides fallback chain for reliability
  • Single API key for 200+ models
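
Both providers speak a JSON chat API, so the fallback chain mentioned above can be sketched in a few lines. The Ollama endpoint is the one cited in the report; the OpenRouter path is its OpenAI-compatible chat endpoint. The request shape is a common chat-API form, shown here as an assumption rather than OpenClaw's exact wire format:

```python
import json

OLLAMA_URL = "http://localhost:11434/api/chat"
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a minimal JSON chat body accepted by both endpoints."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

# Fallback chain: try the local model first, then the free OpenRouter tier.
FALLBACK_CHAIN = [
    (OLLAMA_URL, "llama3.1"),
    (OPENROUTER_URL, "google/gemma-3-4b-it:free"),
]

if __name__ == "__main__":
    for url, model in FALLBACK_CHAIN:
        print(url, json.dumps(build_chat_request(model, "ping")))
```

Actually POSTing these bodies requires a running Ollama instance (or an OpenRouter API key); the sketch only shows the routing order and payload shape.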

5. Hardware Constraints (Critical for Hermes VPS)

| Config | Local LLM? | Notes |
|--------|-----------|-------|
| 2GB RAM | ❌ No | Use external APIs only |
| 4GB RAM | ⚠️ 3B models only | Llama 3.2 3B, Qwen 2.5 3B |
| 8GB RAM | ✅ 7-8B models | Llama 3.1 8B, Qwen 2.5 7B |
| 16GB RAM | ✅ 32B models | Qwen 2.5 32B, DeepSeek 33B |

Qwen 2.5 7B Q4_K_M on 2GB VPS: Will cause immediate OOM (needs ~5-6GB)
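
A back-of-envelope check supports the ~5-6GB figure. Q4_K_M mixes 4- and 6-bit blocks and averages roughly 4.5-4.8 bits per parameter (an approximation, not an exact spec), and Qwen 2.5 7B has about 7.6B parameters:

```python
def quantized_weight_gb(n_params_billion: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return n_params_billion * 1e9 * bits_per_param / 8 / 1e9

# Qwen 2.5 7B at ~4.8 bits/param: weights alone before any KV cache.
weights = quantized_weight_gb(7.6, 4.8)
print(f"weights alone: ~{weights:.1f} GB")
# KV cache and runtime overhead grow with context length; at the 64K
# context OpenClaw wants, the total lands well past a 2GB droplet.
```

Weights alone come out around 4.6GB, so with KV cache on a long context the ~5-6GB total is plausible and a 2GB droplet OOMs immediately.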

6. Security

  • NEVER expose port 18789 to the internet
  • Use Tailscale for zero-trust remote access
  • Alternative: SSH tunnel (ssh -L 18789:localhost:18789 user@host)
  • Configure dmPolicy: pairing for Telegram security
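
A config sketch ties these points together. Only `dmPolicy: pairing` and port 18789 come from the report; every other key name here is hypothetical, chosen to convey the hardening intent (loopback-only bind, token auth), not OpenClaw's actual configuration schema:

```yaml
# Illustrative hardening sketch — key names other than dmPolicy are assumed.
gateway:
  bind: 127.0.0.1      # loopback only; reach it over Tailscale or an SSH tunnel
  port: 18789
  authToken: ${OPENCLAW_GATEWAY_TOKEN}
telegram:
  dmPolicy: pairing    # require explicit pairing before accepting DMs
```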

Recommendations for Timmy

Given the 2GB RAM constraint on Hermes VPS:

  1. Use OpenRouter free tier as primary (gemma-3-4b-it:free)
  2. Do NOT run Ollama locally - insufficient RAM
  3. Enable 4GB swap to prevent OOM kills
  4. Use Tailscale for secure gateway access
  5. Disable browser automation (Chromium needs 2-4GB alone)

Testing

  • tox -e format (passed)
  • tox -e unit (2692 passed, 1 skipped)
  • tox -e lint (passed)

The research report includes architecture diagrams, configuration examples, security hardening guides, and a complete command reference.

kimi closed this issue 2026-03-21 20:36:24 +00:00
