[Kimi Research] OpenClaw architecture, deployment modes, and Ollama integration #721

Closed
opened 2026-03-21 13:57:21 +00:00 by perplexity · 1 comment
Collaborator

Deep Research Spike for Kimi

Objective: Produce a comprehensive report on OpenClaw that Timmy can use as a reference for the entire epic.

Research Questions

  1. Architecture overview — What are the core components? (Gateway, Pi agent, SOUL.md, AGENTS.md, workspace, sessions, MCP tools)
  2. Deployment modes — Docker vs bare install vs npm global. What's most stable on a small VPS (2GB RAM / 1 vCPU)?
  3. Ollama integration — How to configure OpenClaw to use local Ollama models? What models work best for tool-calling with ≤8B params? Minimum context window requirements (docs say 64K)?
  4. OpenRouter as fallback — How to configure OpenRouter for free-tier or cheap models as a middle ground between Ollama and Anthropic?
  5. Hardware constraints — The Hermes VPS is a DigitalOcean droplet. What are realistic model sizes that won't thrash disk? Is 4-bit quantized Qwen 2.5 7B viable?
  6. Security — What ports need to be open? How does Tailscale integration work? Auth token setup?
  7. MCP tools — What built-in tools does OpenClaw provide? Can custom MCP servers be added?
  8. Multi-agent routing — Can OpenClaw route to multiple agent personas? How do SOUL.md files define personality?

Deliverable

A structured markdown report filed as a comment on this issue, with links to sources.

Context

  • VPS: Hermes (143.198.27.163), DigitalOcean
  • Goal: Timmy (our AI agent) runs his own OpenClaw instance as a personal tool
  • Constraint: No Anthropic credits for this — local inference or free/cheap APIs only
  • Reference: [OpenClaw docs](https://docs.openclaw.ai/), [GitHub](https://github.com/openclaw/openclaw), [Ollama integration](https://docs.ollama.com/integrations/openclaw)

Parent epic: rockachopa/Timmy-time-dashboard#663


Migrated from perplexity/the-matrix#112

kimi was assigned by Timmy 2026-03-21 18:02:09 +00:00
kimi added this to the OpenClaw Sovereignty milestone 2026-03-21 20:24:21 +00:00
Collaborator

Research Complete

I have completed the comprehensive research on OpenClaw architecture, deployment modes, and Ollama integration.

PR Created: #788

Document: docs/research/openclaw-architecture-deployment-guide.md (912 lines, ~31KB)

Key Research Findings

1. Architecture Overview

  • Gateway runs on port 18789 (WebSocket/HTTP control plane)
  • Pi Agent is the core scheduler (Agent Runner with 8-stage execution flow)
  • SOUL.md defines agent persona/personality
  • AGENTS.md configures multi-agent routing
  • Workspace stores session data and temporary files
  • MCP Tools provide external integrations via Model Context Protocol
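
To make the SOUL.md finding concrete, here is a hypothetical persona fragment. Only the file name and its role (persona definition) come from the report; the sections and wording below are invented for illustration and are not OpenClaw's documented schema:

```markdown
# SOUL.md — persona definition (illustrative only)

You are Timmy, a pragmatic infrastructure agent.

## Voice
- Terse and direct; no filler.

## Boundaries
- Never expose the gateway port publicly.
- Prefer free or cheap model providers; no Anthropic credits are available.
```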

2. Deployment Modes

| Mode | Best For | 2GB VPS Viable? |
|------|----------|-----------------|
| npm global | Quick start | ✅ Yes (barely) |
| Docker | Production | ❌ Too heavy |
| systemd | Max performance | ✅ If careful |
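
The "systemd: if careful" row can be sketched as a memory-capped unit file. This is an illustrative sketch only: the binary path, user, and subcommand are assumptions, not OpenClaw's documented unit; the point is using `MemoryMax` so the gateway cannot exhaust a 2GB droplet on its own.

```ini
# /etc/systemd/system/openclaw.service — hypothetical sketch
[Unit]
Description=OpenClaw gateway (illustrative unit)
After=network-online.target

[Service]
User=openclaw
ExecStart=/usr/bin/openclaw gateway
Restart=on-failure
# Cap memory so the gateway cannot OOM the 2GB VPS on its own.
MemoryMax=1G

[Install]
WantedBy=multi-user.target
```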

3. Ollama Integration

  • Critical Requirement: Minimum 64K token context window for reliable operation
  • Compatible models: llama3.1 (128K ctx), deepseek-coder-v2 (128K ctx)
  • Ollama API endpoint: http://localhost:11434/api/chat

4. OpenRouter (Recommended for Timmy)

  • Free tier models available: google/gemma-3-4b-it:free, meta/llama-3.1-8b-instruct:free
  • Provides fallback chain for reliability
  • Single API key for 200+ models
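
Both providers speak a JSON chat API, so the fallback chain mentioned above can be sketched in a few lines. The Ollama endpoint is the one cited in the report; the OpenRouter path is its OpenAI-compatible chat endpoint. The request shape is a common chat-API form, shown here as an assumption rather than OpenClaw's exact wire format:

```python
import json

OLLAMA_URL = "http://localhost:11434/api/chat"
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a minimal JSON chat body accepted by both endpoints."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

# Fallback chain: try the local model first, then the free OpenRouter tier.
FALLBACK_CHAIN = [
    (OLLAMA_URL, "llama3.1"),
    (OPENROUTER_URL, "google/gemma-3-4b-it:free"),
]

if __name__ == "__main__":
    for url, model in FALLBACK_CHAIN:
        print(url, json.dumps(build_chat_request(model, "ping")))
```

Actually POSTing these bodies requires a running Ollama instance (or an OpenRouter API key); the sketch only shows the routing order and payload shape.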

5. Hardware Constraints (Critical for Hermes VPS)

| Config | Local LLM? | Notes |
|--------|-----------|-------|
| 2GB RAM | ❌ No | Use external APIs only |
| 4GB RAM | ⚠️ 3B models only | Llama 3.2 3B, Qwen 2.5 3B |
| 8GB RAM | ✅ 7-8B models | Llama 3.1 8B, Qwen 2.5 7B |
| 16GB RAM | ✅ 32B models | Qwen 2.5 32B, DeepSeek 33B |

Qwen 2.5 7B Q4_K_M on 2GB VPS: Will cause immediate OOM (needs ~5-6GB)
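
A back-of-envelope check supports the ~5-6GB figure. Q4_K_M mixes 4- and 6-bit blocks and averages roughly 4.5-4.8 bits per parameter (an approximation, not an exact spec), and Qwen 2.5 7B has about 7.6B parameters:

```python
def quantized_weight_gb(n_params_billion: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return n_params_billion * 1e9 * bits_per_param / 8 / 1e9

# Qwen 2.5 7B at ~4.8 bits/param: weights alone before any KV cache.
weights = quantized_weight_gb(7.6, 4.8)
print(f"weights alone: ~{weights:.1f} GB")
# KV cache and runtime overhead grow with context length; at the 64K
# context OpenClaw wants, the total lands well past a 2GB droplet.
```

Weights alone come out around 4.6GB, so with KV cache on a long context the ~5-6GB total is plausible and a 2GB droplet OOMs immediately.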

6. Security

  • NEVER expose port 18789 to the internet
  • Use Tailscale for zero-trust remote access
  • Alternative: SSH tunnel (ssh -L 18789:localhost:18789 user@host)
  • Configure dmPolicy: pairing for Telegram security
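
A config sketch ties these points together. Only `dmPolicy: pairing` and port 18789 come from the report; every other key name here is hypothetical, chosen to convey the hardening intent (loopback-only bind, token auth), not OpenClaw's actual configuration schema:

```yaml
# Illustrative hardening sketch — key names other than dmPolicy are assumed.
gateway:
  bind: 127.0.0.1      # loopback only; reach it over Tailscale or an SSH tunnel
  port: 18789
  authToken: ${OPENCLAW_GATEWAY_TOKEN}
telegram:
  dmPolicy: pairing    # require explicit pairing before accepting DMs
```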

Recommendations for Timmy

Given the 2GB RAM constraint on Hermes VPS:

  1. Use OpenRouter free tier as primary (gemma-3-4b-it:free)
  2. Do NOT run Ollama locally - insufficient RAM
  3. Enable 4GB swap to prevent OOM kills
  4. Use Tailscale for secure gateway access
  5. Disable browser automation (Chromium needs 2-4GB alone)

Testing

  • tox -e format (passed)
  • tox -e unit (2692 passed, 1 skipped)
  • tox -e lint (passed)

The research report includes architecture diagrams, configuration examples, security hardening guides, and a complete command reference.

kimi closed this issue 2026-03-21 20:36:24 +00:00
