Compare commits

1 commit: claw-code/... → claude/iss

SHA1: 9c2341f4ca
@@ -1,2 +0,0 @@ (deleted session log, JSONL: one record per line)

{"created_at_ms":1775533542734,"session_id":"session-1775533542734-0","type":"session_meta","updated_at_ms":1775533542734,"version":1}
{"message":{"blocks":[{"text":"You are Code Claw running as the Gitea user claw-code.\n\nRepository: Timmy_Foundation/hermes-agent\nIssue: #126 — P2: Validate Documentation Audit & Apply to Our Fork\nBranch: claw-code/issue-126\n\nRead the issue and recent comments, then implement the smallest correct change.\nYou are in a git repo checkout already.\n\nIssue body:\n## Context\n\nCommit `43d468ce` is a comprehensive documentation audit — fixes stale info, expands thin pages, adds depth across all docs.\n\n## Acceptance Criteria\n\n- [ ] **Catalog all doc changes**: Run `git show 43d468ce --stat` to list all files changed, then review each for what was fixed/expanded\n- [ ] **Verify key docs are accurate**: Pick 3 docs that were previously thin (setup, deployment, plugin development), confirm they now have comprehensive content\n- [ ] **Identify stale info that was corrected**: Note at least 3 pieces of stale information that were removed or updated\n- [ ] **Apply fixes to our fork if needed**: Check if any of the doc fixes apply to our `Timmy_Foundation/hermes-agent` fork (Timmy-specific references, custom config sections)\n\n## Why This Matters\n\nAccurate documentation is critical for onboarding new agents and maintaining the fleet. Stale docs cost more debugging time than writing them initially.\n\n## Hints\n\n- Run `cd ~/.hermes/hermes-agent && git show 43d468ce --stat` to see the full scope\n- The docs likely cover: setup, plugins, deployment, MCP configuration, and tool integrations\n\n\nParent: #111\n\nRecent comments:\n## 🏷️ Automated Triage Check\n\n**Timestamp:** 2026-04-06T15:30:12.449023 \n**Agent:** Allegro Heartbeat\n\nThis issue has been identified as needing triage:\n\n### Checklist\n- [ ] Clear acceptance criteria defined\n- [ ] Priority label assigned (p0-critical / p1-important / p2-backlog)\n- [ ] Size estimate added (quick-fix / day / week / epic)\n- [ ] Owner assigned\n- [ ] Related issues linked\n\n### Context\n- No comments yet — needs engagement\n- No labels — needs categorization\n- Part of automated backlog maintenance\n\n---\n*Automated triage from Allegro 15-minute heartbeat*\n\n[BURN-DOWN] Dispatched to Code Claw (claw-code worker) as part of nightly burn-down cycle. Heartbeat active.\n\n🟠 Code Claw (OpenRouter qwen/qwen3.6-plus:free) picking up this issue via 15-minute heartbeat.\n\nTimestamp: 2026-04-07T03:45:37Z\n\nRules:\n- Make focused code/config/doc changes only if they directly address the issue.\n- Prefer the smallest proof-oriented fix.\n- Run relevant verification commands if obvious.\n- Do NOT create PRs yourself; the outer worker handles commit/push/PR.\n- If the task is too large or not code-fit, leave the tree unchanged.\n","type":"text"}],"role":"user"},"type":"message"}
@@ -1,2 +0,0 @@ (deleted session log, JSONL: one record per line)

{"created_at_ms":1775534636684,"session_id":"session-1775534636684-0","type":"session_meta","updated_at_ms":1775534636684,"version":1}
{"message":{"blocks":[{"text":"You are Code Claw running as the Gitea user claw-code.\n\nRepository: Timmy_Foundation/hermes-agent\nIssue: #151 — [CONFIG] Add Kimi model to fallback chain for Allegro and Bezalel\nBranch: claw-code/issue-151\n\nRead the issue and recent comments, then implement the smallest correct change.\nYou are in a git repo checkout already.\n\nIssue body:\n## Problem\nAllegro and Bezalel are choking because the Kimi model code is not on their fallback chain. When primary models fail or rate-limit, Kimi should be available as a fallback option but is currently missing.\n\n## Expected Behavior\nKimi model code should be at the front of the fallback chain for both Allegro and Bezalel, so they can remain responsive when primary models are unavailable.\n\n## Context\nThis was reported in Telegram by Alexander Whitestone after observing both agents becoming unresponsive. Ezra was asked to investigate the fallback chain configuration.\n\n## Related\n- timmy-config #302: [ARCH] Fallback Portfolio Runtime Wiring (general fallback framework)\n- hermes-agent #150: [BEZALEL][AUDIT] Telegram Request-to-Gitea Tracking Audit\n\n## Acceptance Criteria\n- [ ] Kimi model code is added to Allegro fallback chain\n- [ ] Kimi model code is added to Bezalel fallback chain\n- [ ] Fallback ordering places Kimi appropriately (front of chain as requested)\n- [ ] Test and confirm both agents can successfully fall back to Kimi\n- [ ] Document the fallback chain configuration for both agents\n\n/assign @ezra\n\nRecent comments:\n[BURN-DOWN] Dispatched to Code Claw (claw-code worker) as part of nightly burn-down cycle. Heartbeat active.\n\n🟠 Code Claw (OpenRouter qwen/qwen3.6-plus:free) picking up this issue via 15-minute heartbeat.\n\nTimestamp: 2026-04-07T04:03:49Z\n\nRules:\n- Make focused code/config/doc changes only if they directly address the issue.\n- Prefer the smallest proof-oriented fix.\n- Run relevant verification commands if obvious.\n- Do NOT create PRs yourself; the outer worker handles commit/push/PR.\n- If the task is too large or not code-fit, leave the tree unchanged.\n","type":"text"}],"role":"user"},"type":"message"}
@@ -1,54 +0,0 @@ (deleted CI workflow)

name: Forge CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

concurrency:
  group: forge-ci-${{ gitea.ref }}
  cancel-in-progress: true

jobs:
  smoke-and-build:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5

      - name: Set up Python 3.11
        run: uv python install 3.11

      - name: Install package
        run: |
          uv venv .venv --python 3.11
          source .venv/bin/activate
          uv pip install -e ".[all,dev]"

      - name: Smoke tests
        run: |
          source .venv/bin/activate
          python scripts/smoke_test.py
        env:
          OPENROUTER_API_KEY: ""
          OPENAI_API_KEY: ""
          NOUS_API_KEY: ""

      - name: Syntax guard
        run: |
          source .venv/bin/activate
          python scripts/syntax_guard.py

      - name: Green-path E2E
        run: |
          source .venv/bin/activate
          python -m pytest tests/test_green_path_e2e.py -q --tb=short
        env:
          OPENROUTER_API_KEY: ""
          OPENAI_API_KEY: ""
          NOUS_API_KEY: ""
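The workflow's `Smoke tests` step calls `scripts/smoke_test.py`, which is not shown in this diff. A minimal sketch of what such an import-level smoke entrypoint might look like; the `CORE_MODULES` list here uses stdlib stand-ins, and the real script would name hermes-agent's own packages and console script:

```python
#!/usr/bin/env python3
"""Minimal smoke test: verify core modules import and a CLI entrypoint answers."""
import importlib
import subprocess
import sys

# Hypothetical module list; substitute the project's actual packages.
CORE_MODULES = ["json", "argparse", "pathlib"]


def main() -> int:
    for name in CORE_MODULES:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            print(f"FAIL import {name}: {exc}")
            return 1
    # Entrypoint check: `python --version` stands in for the real console script.
    proc = subprocess.run([sys.executable, "--version"], capture_output=True)
    if proc.returncode != 0:
        print("FAIL cli entrypoint")
        return 1
    print("smoke: ok")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Keeping the check list short is what makes it a smoke test rather than a suite: it catches broken installs in seconds without adding maintenance burden.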
@@ -1,34 +1,44 @@ (Ezra model configuration rewritten; old and new versions from the side-by-side view)

Old version:

model:
  default: kimi-k2.5
  provider: kimi-coding

toolsets:
  - all

fallback_providers:
  - provider: kimi-coding
    model: kimi-k2.5
    timeout: 120
    reason: Kimi coding fallback (front of chain)
  - provider: anthropic
    model: claude-sonnet-4-20250514
    timeout: 120
    reason: Direct Anthropic fallback
  - provider: openrouter
    model: anthropic/claude-sonnet-4-20250514
    base_url: https://openrouter.ai/api/v1
    api_key_env: OPENROUTER_API_KEY
    timeout: 120
    reason: OpenRouter fallback

agent:
  max_turns: 90
  reasoning_effort: high
  verbose: false

providers:
  kimi-coding:
    base_url: https://api.kimi.com/coding/v1
    timeout: 60
    max_retries: 3
  anthropic:
    timeout: 120
  openrouter:
    base_url: https://openrouter.ai/api/v1
    timeout: 120

New version:

# Ezra Configuration - Kimi Primary
# Anthropic removed from chain entirely

# PRIMARY: Kimi for all operations
model: kimi-coding/kimi-for-coding

# Fallback chain: Only local/offline options
# NO anthropic in the chain - quota issues solved
fallback_providers:
  - provider: ollama
    model: qwen2.5:7b
    base_url: http://localhost:11434
    timeout: 120
    reason: "Local fallback when Kimi unavailable"

# Provider settings
providers:
  kimi-coding:
    timeout: 60
    max_retries: 3
    # Uses KIMI_API_KEY from .env
  ollama:
    timeout: 120
    keep_alive: true
    base_url: http://localhost:11434

# REMOVED: anthropic provider entirely
# No more quota issues, no more choking

# Toolsets - Ezra needs these
toolsets:
  - hermes-cli
  - github
  - web

# Agent settings
agent:
  max_turns: 90
  tool_use_enforcement: auto

# Display settings
display:
  show_provider_switches: true
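The `fallback_providers` list implies an ordered try-each-until-success loop at runtime. A minimal sketch of how such a chain could be walked; the provider callable here is a stub, not the actual hermes-agent implementation:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProviderSpec:
    """One entry in a fallback_providers list (provider, model, reason)."""
    provider: str
    model: str
    reason: str


def complete_with_fallback(chain: List[ProviderSpec],
                           call: Callable[[ProviderSpec, str], str],
                           prompt: str) -> str:
    """Walk the chain in order; the first provider that answers wins."""
    errors = []
    for spec in chain:
        try:
            return call(spec, prompt)
        except Exception as exc:  # a real runtime would catch narrower errors
            errors.append(f"{spec.provider}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub chain mirroring the new config: primary Kimi, then local Ollama.
chain = [
    ProviderSpec("kimi-coding", "kimi-for-coding", "primary"),
    ProviderSpec("ollama", "qwen2.5:7b", "Local fallback when Kimi unavailable"),
]


def stub_call(spec: ProviderSpec, prompt: str) -> str:
    # Simulate the primary being rate-limited so the fallback is exercised.
    if spec.provider == "kimi-coding":
        raise TimeoutError("rate limited")
    return f"{spec.provider}:{spec.model} answered"


print(complete_with_fallback(chain, stub_call, "hello"))
```

Collecting per-provider errors before raising makes "all providers failed" debuggable, which matters when the chain exists precisely because individual providers choke.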
@@ -1,56 +0,0 @@ (deleted README)

# Bezalel's Devkit — Shared Tools for the Wizard Fleet

This directory contains reusable CLI tools and Python modules for CI, testing, deployment, observability, and Gitea automation. Any wizard can invoke them via `python -m devkit.<tool>`.

## Tools

### `gitea_client` — Gitea API Client

List issues/PRs, post comments, create PRs, update issues.

```bash
python -m devkit.gitea_client issues --state open --limit 20
python -m devkit.gitea_client create-comment --number 142 --body "Update from Bezalel"
python -m devkit.gitea_client prs --state open
```

### `health` — Fleet Health Monitor

Checks system load, disk, memory, running processes, and key package versions.

```bash
python -m devkit.health --threshold-load 1.0 --threshold-disk 90.0 --fail-on-critical
```

### `notebook_runner` — Notebook Execution Wrapper

Parameterizes and executes Jupyter notebooks via Papermill with structured JSON reporting.

```bash
python -m devkit.notebook_runner task.ipynb output.ipynb -p threshold=1.0 -p hostname=forge
```

### `smoke_test` — Fast Smoke Test Runner

Runs core import checks, CLI entrypoint tests, and one bare green-path E2E.

```bash
python -m devkit.smoke_test --verbose
```

### `secret_scan` — Secret Leak Scanner

Scans the repo for API keys, tokens, and private keys.

```bash
python -m devkit.secret_scan --path . --fail-on-find
```

### `wizard_env` — Environment Validator

Checks that a wizard environment has all required binaries, env vars, Python packages, and Hermes config.

```bash
python -m devkit.wizard_env --json --fail-on-incomplete
```

## Philosophy

- **CLI-first** — Every tool is runnable as `python -m devkit.<tool>`
- **JSON output** — Easy to parse from other agents and CI pipelines
- **Zero dependencies beyond stdlib** where possible; optional heavy deps are runtime-checked
- **Fail-fast** — Exit codes are meaningful for CI gating
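The CLI-first / JSON-output philosophy means any agent can shell out to a tool and parse the result without importing it. A sketch of that consumption pattern; the inline `-c` program below is a stand-in for a real `python -m devkit.<tool>` invocation, so no devkit install is assumed:

```python
import json
import subprocess
import sys

# Stand-in for `python -m devkit.health`: a one-liner that emits JSON on stdout.
fake_tool = 'import json; print(json.dumps({"overall": "ok", "count": 0}))'
proc = subprocess.run(
    [sys.executable, "-c", fake_tool],
    capture_output=True, text=True, check=True,  # check=True enforces fail-fast exit codes
)
report = json.loads(proc.stdout)

# Gate on the structured report rather than scraping human-readable text.
if report["overall"] != "ok":
    sys.exit(1)
print("gate passed:", report)
```

Because every tool prints one JSON document and returns a meaningful exit code, the same three lines of `subprocess.run` plus `json.loads` work for all of them.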
@@ -1,9 +0,0 @@ (deleted package init)

"""
Bezalel's Devkit — Shared development tools for the wizard fleet.

A collection of CLI-accessible utilities for CI, testing, deployment,
observability, and Gitea automation. Designed to be used by any agent
via subprocess or direct Python import.
"""

__version__ = "0.1.0"
@@ -1,153 +0,0 @@ (deleted Gitea API client module)

#!/usr/bin/env python3
"""
Shared Gitea API client for wizard fleet automation.

Usage as CLI:
    python -m devkit.gitea_client issues --repo Timmy_Foundation/hermes-agent --state open
    python -m devkit.gitea_client issue --repo Timmy_Foundation/hermes-agent --number 142
    python -m devkit.gitea_client create-comment --repo Timmy_Foundation/hermes-agent --number 142 --body "Update from Bezalel"
    python -m devkit.gitea_client prs --repo Timmy_Foundation/hermes-agent --state open

Usage as module:
    from devkit.gitea_client import GiteaClient
    client = GiteaClient()
    issues = client.list_issues("Timmy_Foundation/hermes-agent", state="open")
"""

import argparse
import json
import os
import sys
from typing import Any, Dict, List, Optional

import urllib.error
import urllib.request


DEFAULT_BASE_URL = os.getenv("GITEA_URL", "https://forge.alexanderwhitestone.com")
DEFAULT_TOKEN = os.getenv("GITEA_TOKEN", "")


class GiteaClient:
    def __init__(self, base_url: str = DEFAULT_BASE_URL, token: str = DEFAULT_TOKEN):
        self.base_url = base_url.rstrip("/")
        self.token = token or ""

    def _request(
        self,
        method: str,
        path: str,
        data: Optional[Dict[str, Any]] = None,
        headers: Optional[Dict[str, str]] = None,
    ) -> Any:
        url = f"{self.base_url}/api/v1{path}"
        req_headers = {"Content-Type": "application/json", "Accept": "application/json"}
        if self.token:
            req_headers["Authorization"] = f"token {self.token}"
        if headers:
            req_headers.update(headers)

        body = json.dumps(data).encode() if data else None
        req = urllib.request.Request(url, data=body, headers=req_headers, method=method)

        try:
            with urllib.request.urlopen(req) as resp:
                return json.loads(resp.read().decode())
        except urllib.error.HTTPError as e:
            return {"error": True, "status": e.code, "body": e.read().decode()}

    def list_issues(self, repo: str, state: str = "open", limit: int = 50) -> List[Dict]:
        return self._request("GET", f"/repos/{repo}/issues?state={state}&limit={limit}") or []

    def get_issue(self, repo: str, number: int) -> Dict:
        return self._request("GET", f"/repos/{repo}/issues/{number}") or {}

    def create_comment(self, repo: str, number: int, body: str) -> Dict:
        return self._request(
            "POST", f"/repos/{repo}/issues/{number}/comments", {"body": body}
        )

    def update_issue(self, repo: str, number: int, **fields) -> Dict:
        return self._request("PATCH", f"/repos/{repo}/issues/{number}", fields)

    def list_prs(self, repo: str, state: str = "open", limit: int = 50) -> List[Dict]:
        return self._request("GET", f"/repos/{repo}/pulls?state={state}&limit={limit}") or []

    def get_pr(self, repo: str, number: int) -> Dict:
        return self._request("GET", f"/repos/{repo}/pulls/{number}") or {}

    def create_pr(self, repo: str, title: str, head: str, base: str, body: str = "") -> Dict:
        return self._request(
            "POST",
            f"/repos/{repo}/pulls",
            {"title": title, "head": head, "base": base, "body": body},
        )


def _fmt_json(obj: Any) -> str:
    return json.dumps(obj, indent=2, ensure_ascii=False)


def main(argv: Optional[List[str]] = None) -> int:
    argv = argv or sys.argv[1:]
    parser = argparse.ArgumentParser(description="Gitea CLI for wizard fleet")
    parser.add_argument("--repo", default="Timmy_Foundation/hermes-agent", help="Repository full name")
    parser.add_argument("--token", default=DEFAULT_TOKEN, help="Gitea API token")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL, help="Gitea base URL")
    sub = parser.add_subparsers(dest="cmd")

    p_issues = sub.add_parser("issues", help="List issues")
    p_issues.add_argument("--state", default="open")
    p_issues.add_argument("--limit", type=int, default=50)

    p_issue = sub.add_parser("issue", help="Get single issue")
    p_issue.add_argument("--number", type=int, required=True)

    p_prs = sub.add_parser("prs", help="List PRs")
    p_prs.add_argument("--state", default="open")
    p_prs.add_argument("--limit", type=int, default=50)

    p_pr = sub.add_parser("pr", help="Get single PR")
    p_pr.add_argument("--number", type=int, required=True)

    p_comment = sub.add_parser("create-comment", help="Post comment on issue/PR")
    p_comment.add_argument("--number", type=int, required=True)
    p_comment.add_argument("--body", required=True)

    p_update = sub.add_parser("update-issue", help="Update issue fields")
    p_update.add_argument("--number", type=int, required=True)
    p_update.add_argument("--title", default=None)
    p_update.add_argument("--body", default=None)
    p_update.add_argument("--state", default=None)

    p_create_pr = sub.add_parser("create-pr", help="Create a PR")
    p_create_pr.add_argument("--title", required=True)
    p_create_pr.add_argument("--head", required=True)
    p_create_pr.add_argument("--base", default="main")
    p_create_pr.add_argument("--body", default="")

    args = parser.parse_args(argv)
    client = GiteaClient(base_url=args.base_url, token=args.token)

    if args.cmd == "issues":
        print(_fmt_json(client.list_issues(args.repo, args.state, args.limit)))
    elif args.cmd == "issue":
        print(_fmt_json(client.get_issue(args.repo, args.number)))
    elif args.cmd == "prs":
        print(_fmt_json(client.list_prs(args.repo, args.state, args.limit)))
    elif args.cmd == "pr":
        print(_fmt_json(client.get_pr(args.repo, args.number)))
    elif args.cmd == "create-comment":
        print(_fmt_json(client.create_comment(args.repo, args.number, args.body)))
    elif args.cmd == "update-issue":
        fields = {k: v for k, v in {"title": args.title, "body": args.body, "state": args.state}.items() if v is not None}
        print(_fmt_json(client.update_issue(args.repo, args.number, **fields)))
    elif args.cmd == "create-pr":
        print(_fmt_json(client.create_pr(args.repo, args.title, args.head, args.base, args.body)))
    else:
        parser.print_help()
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
devkit/health.py (134 lines)
@@ -1,134 +0,0 @@ (deleted health monitor module)

#!/usr/bin/env python3
"""
Fleet health monitor for wizard agents.
Checks local system state and reports structured health metrics.

Usage as CLI:
    python -m devkit.health
    python -m devkit.health --threshold-load 1.0 --threshold-disk 90.0

Usage as module:
    from devkit.health import check_health
    report = check_health()
"""

import argparse
import json
import shutil
import subprocess
import sys
import time
from typing import Any, Dict, List, Optional


def _run(cmd: List[str]) -> str:
    try:
        return subprocess.check_output(cmd, stderr=subprocess.DEVNULL).decode().strip()
    except Exception as e:
        return f"error: {e}"


def check_health(threshold_load: float = 1.0, threshold_disk_percent: float = 90.0) -> Dict[str, Any]:
    gather_time = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())

    # Load average
    load_raw = _run(["cat", "/proc/loadavg"])
    load_values = []
    avg_load = None
    if load_raw.startswith("error:"):
        load_status = load_raw
    else:
        try:
            load_values = [float(x) for x in load_raw.split()[:3]]
            avg_load = sum(load_values) / len(load_values)
            load_status = "critical" if avg_load > threshold_load else "ok"
        except Exception as e:
            load_status = f"error parsing load: {e}"

    # Disk usage
    disk = shutil.disk_usage("/")
    disk_percent = (disk.used / disk.total) * 100 if disk.total else 0.0
    disk_status = "critical" if disk_percent > threshold_disk_percent else "ok"

    # Memory
    meminfo = _run(["cat", "/proc/meminfo"])
    mem_stats = {}
    for line in meminfo.splitlines():
        if ":" in line:
            key, val = line.split(":", 1)
            mem_stats[key.strip()] = val.strip()

    # Running processes
    hermes_pids = []
    try:
        ps_out = subprocess.check_output(["pgrep", "-a", "-f", "hermes"]).decode().strip()
        hermes_pids = [line.split(None, 1) for line in ps_out.splitlines() if line.strip()]
    except subprocess.CalledProcessError:
        hermes_pids = []

    # Python package versions (key ones)
    key_packages = ["jupyterlab", "papermill", "requests"]
    pkg_versions = {}
    for pkg in key_packages:
        try:
            out = subprocess.check_output([sys.executable, "-m", "pip", "show", pkg], stderr=subprocess.DEVNULL).decode()
            for line in out.splitlines():
                if line.startswith("Version:"):
                    pkg_versions[pkg] = line.split(":", 1)[1].strip()
                    break
        except Exception:
            pkg_versions[pkg] = None

    overall = "ok"
    if load_status == "critical" or disk_status == "critical":
        overall = "critical"
    elif not hermes_pids:
        overall = "warning"

    return {
        "timestamp": gather_time,
        "overall": overall,
        "load": {
            "raw": load_raw if not load_raw.startswith("error:") else None,
            "1min": load_values[0] if len(load_values) > 0 else None,
            "5min": load_values[1] if len(load_values) > 1 else None,
            "15min": load_values[2] if len(load_values) > 2 else None,
            "avg": round(avg_load, 3) if avg_load is not None else None,
            "threshold": threshold_load,
            "status": load_status,
        },
        "disk": {
            "total_gb": round(disk.total / (1024 ** 3), 2),
            "used_gb": round(disk.used / (1024 ** 3), 2),
            "free_gb": round(disk.free / (1024 ** 3), 2),
            "used_percent": round(disk_percent, 2),
            "threshold_percent": threshold_disk_percent,
            "status": disk_status,
        },
        "memory": mem_stats,
        "processes": {
            "hermes_count": len(hermes_pids),
            "hermes_pids": hermes_pids[:10],
        },
        "packages": pkg_versions,
    }


def main(argv: Optional[List[str]] = None) -> int:
    argv = argv or sys.argv[1:]
    parser = argparse.ArgumentParser(description="Fleet health monitor")
    parser.add_argument("--threshold-load", type=float, default=1.0)
    parser.add_argument("--threshold-disk", type=float, default=90.0)
    parser.add_argument("--fail-on-critical", action="store_true", help="Exit non-zero if overall is critical")
    args = parser.parse_args(argv)

    report = check_health(args.threshold_load, args.threshold_disk)
    print(json.dumps(report, indent=2))
    if args.fail_on_critical and report.get("overall") == "critical":
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
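Since `check_health()` returns a plain dict, the `overall` rollup can be reasoned about separately from the system probes. A standalone replica of that rollup, written here for illustration (it mirrors the module's logic: any critical check wins, no hermes processes downgrades to warning):

```python
def overall_status(load_status: str, disk_status: str, hermes_count: int) -> str:
    """Replica of devkit.health's overall rollup: critical beats warning beats ok."""
    if load_status == "critical" or disk_status == "critical":
        return "critical"
    if hermes_count == 0:
        # System resources are fine, but no hermes processes are running.
        return "warning"
    return "ok"


assert overall_status("ok", "ok", 2) == "ok"
assert overall_status("ok", "ok", 0) == "warning"
assert overall_status("critical", "ok", 5) == "critical"
```

This ordering matters for `--fail-on-critical`: a fleet host with no hermes process is only a warning, so CI gating trips on resource exhaustion but not on an idle host.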
@@ -1,136 +0,0 @@ (deleted notebook runner module)

#!/usr/bin/env python3
"""
Notebook execution runner for agent tasks.
Wraps papermill with sensible defaults and structured JSON reporting.

Usage as CLI:
    python -m devkit.notebook_runner notebooks/task.ipynb output.ipynb -p threshold=1.0
    python -m devkit.notebook_runner notebooks/task.ipynb --dry-run

Usage as module:
    from devkit.notebook_runner import run_notebook
    result = run_notebook("task.ipynb", "output.ipynb", parameters={"threshold": 1.0})
"""

import argparse
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Optional


def run_notebook(
    input_path: str,
    output_path: Optional[str] = None,
    parameters: Optional[Dict[str, Any]] = None,
    kernel: str = "python3",
    timeout: Optional[int] = None,
    dry_run: bool = False,
) -> Dict[str, Any]:
    input_path = str(Path(input_path).expanduser().resolve())
    if output_path is None:
        fd, output_path = tempfile.mkstemp(suffix=".ipynb")
        os.close(fd)
    else:
        output_path = str(Path(output_path).expanduser().resolve())

    if dry_run:
        return {
            "status": "dry_run",
            "input": input_path,
            "output": output_path,
            "parameters": parameters or {},
            "kernel": kernel,
        }

    cmd = ["papermill", input_path, output_path, "--kernel", kernel]
    if timeout is not None:
        cmd.extend(["--execution-timeout", str(timeout)])
    for key, value in (parameters or {}).items():
        cmd.extend(["-p", key, str(value)])

    start = os.times()
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            check=True,
        )
        end = os.times()
        return {
            "status": "ok",
            "input": input_path,
            "output": output_path,
            "parameters": parameters or {},
            "kernel": kernel,
            "elapsed_seconds": round((end.elapsed - start.elapsed), 2),
            "stdout": proc.stdout[-2000:] if proc.stdout else "",
        }
    except subprocess.CalledProcessError as e:
        end = os.times()
        return {
            "status": "error",
            "input": input_path,
            "output": output_path,
            "parameters": parameters or {},
            "kernel": kernel,
            "elapsed_seconds": round((end.elapsed - start.elapsed), 2),
            "stdout": e.stdout[-2000:] if e.stdout else "",
            "stderr": e.stderr[-2000:] if e.stderr else "",
            "returncode": e.returncode,
        }
    except FileNotFoundError:
        return {
            "status": "error",
            "message": "papermill not found. Install with: uv tool install papermill",
        }


def main(argv: Optional[List[str]] = None) -> int:
    argv = argv or sys.argv[1:]
    parser = argparse.ArgumentParser(description="Notebook runner for agents")
    parser.add_argument("input", help="Input notebook path")
    parser.add_argument("output", nargs="?", default=None, help="Output notebook path")
    parser.add_argument("-p", "--parameter", action="append", default=[], help="Parameters as key=value")
    parser.add_argument("--kernel", default="python3")
    parser.add_argument("--timeout", type=int, default=None)
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)

    parameters = {}
    for raw in args.parameter:
        if "=" not in raw:
            print(f"Invalid parameter (expected key=value): {raw}", file=sys.stderr)
            return 1
        k, v = raw.split("=", 1)
        # Best-effort type inference
        if v.lower() in ("true", "false"):
            v = v.lower() == "true"
        else:
            try:
                v = int(v)
            except ValueError:
                try:
                    v = float(v)
                except ValueError:
                    pass
        parameters[k] = v

    result = run_notebook(
        args.input,
        args.output,
        parameters=parameters,
        kernel=args.kernel,
        timeout=args.timeout,
        dry_run=args.dry_run,
    )
    print(json.dumps(result, indent=2))
    return 0 if result.get("status") == "ok" else 1


if __name__ == "__main__":
    sys.exit(main())
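The `-p key=value` handling in `main()` tries bool, then int, then float before falling back to string. The same logic as a standalone helper, replicated here for illustration rather than imported from devkit:

```python
def infer_value(raw: str):
    """Best-effort type inference matching notebook_runner's -p handling."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    try:
        return int(raw)
    except ValueError:
        pass
    try:
        return float(raw)
    except ValueError:
        return raw  # fall back to the string itself


assert infer_value("True") is True
assert infer_value("7") == 7
assert infer_value("0.5") == 0.5
assert infer_value("forge") == "forge"
```

Trying `int` before `float` is what keeps `-p limit=7` an integer instead of `7.0`, which matters for parameters used as list indices or loop counts inside the notebook.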
@@ -1,108 +0,0 @@ (deleted secret scanner module)

#!/usr/bin/env python3
"""
Fast secret leak scanner for the repository.
Checks for common patterns that should never be committed.

Usage as CLI:
    python -m devkit.secret_scan
    python -m devkit.secret_scan --path /some/repo --fail-on-find

Usage as module:
    from devkit.secret_scan import scan
    findings = scan("/path/to/repo")
"""

import argparse
import json
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional

# Patterns to flag
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "aws_secret_key": re.compile(r"['\"\s][0-9a-zA-Z/+]{40}['\"\s]"),
    "generic_api_key": re.compile(r"api[_-]?key\s*[:=]\s*['\"][a-zA-Z0-9_\-]{20,}['\"]", re.IGNORECASE),
    "private_key": re.compile(r"-----BEGIN (RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"),
    "github_token": re.compile(r"gh[pousr]_[A-Za-z0-9_]{36,}"),
    "gitea_token": re.compile(r"[0-9a-f]{40}"),  # heuristic for long hex strings after "token"
    "telegram_bot_token": re.compile(r"[0-9]{9,}:[A-Za-z0-9_-]{35,}"),
}

# Files and paths to skip
SKIP_PATHS = [
    ".git",
    "__pycache__",
    ".pytest_cache",
    "node_modules",
    "venv",
    ".env",
    ".agent-skills",
]

# Max file size to scan (bytes)
MAX_FILE_SIZE = 1024 * 1024


def _should_skip(path: Path) -> bool:
    for skip in SKIP_PATHS:
        if skip in path.parts:
            return True
    return False


def scan(root: str = ".") -> List[Dict[str, Any]]:
    root_path = Path(root).resolve()
    findings = []
    for file_path in root_path.rglob("*"):
        if not file_path.is_file():
            continue
        if _should_skip(file_path):
            continue
        if file_path.stat().st_size > MAX_FILE_SIZE:
            continue
        try:
            text = file_path.read_text(encoding="utf-8", errors="ignore")
        except Exception:
            continue
        for pattern_name, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                # Simple context: line around match
                start = max(0, match.start() - 40)
                end = min(len(text), match.end() + 40)
                context = text[start:end].replace("\n", " ")
                findings.append({
                    "file": str(file_path.relative_to(root_path)),
                    "pattern": pattern_name,
                    "line": text[:match.start()].count("\n") + 1,
                    "context": context,
                })
    return findings


def main(argv: Optional[List[str]] = None) -> int:
    argv = argv or sys.argv[1:]
    parser = argparse.ArgumentParser(description="Secret leak scanner")
    parser.add_argument("--path", default=".", help="Repository root to scan")
    parser.add_argument("--fail-on-find", action="store_true", help="Exit non-zero if secrets found")
    parser.add_argument("--json", action="store_true", help="Output as JSON")
    args = parser.parse_args(argv)

    findings = scan(args.path)
    if args.json:
        print(json.dumps({"findings": findings, "count": len(findings)}, indent=2))
    else:
        print(f"Scanned {args.path}")
        print(f"Findings: {len(findings)}")
        for f in findings:
            print(f"  [{f['pattern']}] {f['file']}:{f['line']} -> ...{f['context']}...")

    if args.fail_on_find and findings:
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
@@ -1,108 +0,0 @@
#!/usr/bin/env python3
"""
Shared smoke test runner for hermes-agent.
Fast checks that catch obvious breakage without maintenance burden.

Usage as CLI:
    python -m devkit.smoke_test
    python -m devkit.smoke_test --verbose

Usage as module:
    from devkit.smoke_test import run_smoke_tests
    results = run_smoke_tests()
"""

import argparse
import importlib
import json
import subprocess
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional


HERMES_ROOT = Path(__file__).resolve().parent.parent


def _test_imports() -> Dict[str, Any]:
    modules = [
        "hermes_constants",
        "hermes_state",
        "cli",
        "tools.skills_sync",
        "tools.skills_hub",
    ]
    errors = []
    for mod in modules:
        try:
            importlib.import_module(mod)
        except Exception as e:
            errors.append({"module": mod, "error": str(e)})
    return {
        "name": "core_imports",
        "status": "ok" if not errors else "fail",
        "errors": errors,
    }


def _test_cli_entrypoints() -> Dict[str, Any]:
    entrypoints = [
        [sys.executable, "-m", "cli", "--help"],
    ]
    errors = []
    for cmd in entrypoints:
        try:
            subprocess.run(cmd, capture_output=True, text=True, check=True, cwd=HERMES_ROOT)
        except subprocess.CalledProcessError as e:
            errors.append({"cmd": cmd, "error": f"exit {e.returncode}"})
        except Exception as e:
            errors.append({"cmd": cmd, "error": str(e)})
    return {
        "name": "cli_entrypoints",
        "status": "ok" if not errors else "fail",
        "errors": errors,
    }


def _test_green_path_e2e() -> Dict[str, Any]:
    """One bare green-path E2E: terminal_tool echo hello."""
    try:
        from tools.terminal_tool import terminal
        result = terminal(command="echo hello")
        output = result.get("output", "")
        if "hello" in output.lower():
            return {"name": "green_path_e2e", "status": "ok", "output": output.strip()}
        return {"name": "green_path_e2e", "status": "fail", "error": f"Unexpected output: {output}"}
    except Exception as e:
        return {"name": "green_path_e2e", "status": "fail", "error": str(e)}


def run_smoke_tests(verbose: bool = False) -> Dict[str, Any]:
    tests = [
        _test_imports(),
        _test_cli_entrypoints(),
        _test_green_path_e2e(),
    ]
    failed = [t for t in tests if t["status"] != "ok"]
    result = {
        "overall": "ok" if not failed else "fail",
        "tests": tests,
        "failed_count": len(failed),
    }
    if verbose:
        print(json.dumps(result, indent=2))
    return result


def main(argv: Optional[List[str]] = None) -> int:
    if argv is None:
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(description="Smoke test runner")
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args(argv)

    result = run_smoke_tests(verbose=args.verbose)
    if not args.verbose:
        print(f"Smoke tests: {result['overall']} ({result['failed_count']} failed)")
    return 0 if result["overall"] == "ok" else 1


if __name__ == "__main__":
    sys.exit(main())
@@ -1,112 +0,0 @@
#!/usr/bin/env python3
"""
Wizard environment validator.
Checks that a new wizard environment is ready for duty.

Usage as CLI:
    python -m devkit.wizard_env
    python -m devkit.wizard_env --json

Usage as module:
    from devkit.wizard_env import validate
    report = validate()
"""

import argparse
import json
import os
import shutil
import sys
from typing import Any, Dict, List, Optional


def _has_cmd(name: str) -> bool:
    return shutil.which(name) is not None


def _check_env_var(name: str) -> Dict[str, Any]:
    value = os.getenv(name)
    return {
        "name": name,
        "status": "ok" if value else "missing",
        "value": value[:10] + "..." if value and len(value) > 20 else value,
    }


def _check_python_pkg(name: str) -> Dict[str, Any]:
    try:
        __import__(name)
        return {"name": name, "status": "ok"}
    except ImportError:
        return {"name": name, "status": "missing"}


def validate() -> Dict[str, Any]:
    checks = {
        "binaries": [
            {"name": "python3", "status": "ok" if _has_cmd("python3") else "missing"},
            {"name": "git", "status": "ok" if _has_cmd("git") else "missing"},
            {"name": "curl", "status": "ok" if _has_cmd("curl") else "missing"},
            {"name": "jupyter-lab", "status": "ok" if _has_cmd("jupyter-lab") else "missing"},
            {"name": "papermill", "status": "ok" if _has_cmd("papermill") else "missing"},
            {"name": "jupytext", "status": "ok" if _has_cmd("jupytext") else "missing"},
        ],
        "env_vars": [
            _check_env_var("GITEA_URL"),
            _check_env_var("GITEA_TOKEN"),
            _check_env_var("TELEGRAM_BOT_TOKEN"),
        ],
        "python_packages": [
            _check_python_pkg("requests"),
            _check_python_pkg("jupyter_server"),
            _check_python_pkg("nbformat"),
        ],
    }

    all_ok = all(
        c["status"] == "ok"
        for group in checks.values()
        for c in group
    )

    # Hermes-specific checks
    hermes_home = os.path.expanduser("~/.hermes")
    checks["hermes"] = [
        {"name": "config.yaml", "status": "ok" if os.path.exists(f"{hermes_home}/config.yaml") else "missing"},
        {"name": "skills_dir", "status": "ok" if os.path.exists(f"{hermes_home}/skills") else "missing"},
    ]

    all_ok = all_ok and all(c["status"] == "ok" for c in checks["hermes"])

    return {
        "overall": "ok" if all_ok else "incomplete",
        "checks": checks,
    }


def main(argv: Optional[List[str]] = None) -> int:
    if argv is None:
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(description="Wizard environment validator")
    parser.add_argument("--json", action="store_true")
    parser.add_argument("--fail-on-incomplete", action="store_true")
    args = parser.parse_args(argv)

    report = validate()
    if args.json:
        print(json.dumps(report, indent=2))
    else:
        print(f"Wizard Environment: {report['overall']}")
        for group, items in report["checks"].items():
            print(f"\n[{group}]")
            for item in items:
                status_icon = "✅" if item["status"] == "ok" else "❌"
                print(f"  {status_icon} {item['name']}: {item['status']}")

    if args.fail_on_incomplete and report["overall"] != "ok":
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
@@ -1,132 +0,0 @@
# Fleet SITREP — April 6, 2026

**Classification:** Consolidated Status Report
**Compiled by:** Ezra
**Acknowledged by:** Claude (Issue #143)

---

## Executive Summary

Allegro executed 7 tasks across infrastructure, contracting, audits, and security. Ezra shipped PR #131, filed formalization audit #132, delivered quarterly report #133, and self-assigned issues #134–#138. All wizard activity mapped below.

---

## 1. Allegro 7-Task Report

| Task | Description | Status |
|------|-------------|--------|
| 1 | Roll Call / Infrastructure Map | ✅ Complete |
| 2 | Dark industrial anthem (140 BPM, Suno-ready) | ✅ Complete |
| 3 | Operation Get A Job — 7-file contracting playbook pushed to `the-nexus` | ✅ Complete |
| 4 | Formalization audit filed ([the-nexus #893](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/893)) | ✅ Complete |
| 5 | GrepTard Memory Report — PR #525 on `timmy-home` | ✅ Complete |
| 6 | Self-audit issues #894–#899 filed on `the-nexus` | ✅ Filed |
| 7 | `keystore.json` permissions fixed to `600` | ✅ Applied |

### Critical Findings from Task 4 (Formalization Audit)

- GOFAI source files missing — only `.pyc` remains
- Nostr keystore was world-readable — **FIXED** (Task 7)
- 39 burn scripts cluttering `/root` — archival pending ([#898](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/898))

---

## 2. Ezra Deliverables

| Deliverable | Issue/PR | Status |
|-------------|----------|--------|
| V-011 fix + compressor tuning | [PR #131](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/pulls/131) | ✅ Merged |
| Formalization audit (hermes-agent) | [Issue #132](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/132) | Filed |
| Quarterly report (MD + PDF) | [Issue #133](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/133) | Filed |
| Burn-mode concurrent tool tests | [Issue #134](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/134) | Assigned → Ezra |
| MCP SDK migration | [Issue #135](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/135) | Assigned → Ezra |
| APScheduler migration | [Issue #136](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/136) | Assigned → Ezra |
| Pydantic-settings migration | [Issue #137](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/137) | Assigned → Ezra |
| Contracting playbook tracker | [Issue #138](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/138) | Assigned → Ezra |

---

## 3. Fleet Status

| Wizard | Host | Status | Blocker |
|--------|------|--------|---------|
| **Ezra** | Hermes VPS | Active — 5 issues queued | None |
| **Bezalel** | Hermes VPS | Gateway running on 8645 | None |
| **Allegro-Primus** | Hermes VPS | **Gateway DOWN on 8644** | Needs restart signal |
| **Bilbo** | External | Gemma 4B active, Telegram dual-mode | Host IP unknown to fleet |

### Allegro Gateway Recovery

Allegro-Primus gateway (port 8644) is down. Options:

1. **Alexander restarts manually** on Hermes VPS
2. **Delegate to Bezalel** — Bezalel can issue restart signal via Hermes VPS access
3. **Delegate to Ezra** — Ezra can coordinate restart as part of issue #894 work

---

## 4. Operation Get A Job — Contracting Playbook

Files pushed to `the-nexus/operation-get-a-job/`:

| File | Purpose |
|------|---------|
| `README.md` | Master plan |
| `entity-setup.md` | Wyoming LLC, Mercury, E&O insurance |
| `service-offerings.md` | Rates $150–600/hr; packages $5k/$15k/$40k+ |
| `portfolio.md` | Portfolio structure |
| `outreach-templates.md` | Cold email templates |
| `proposal-template.md` | Client proposal structure |
| `rate-card.md` | Rate card |

**Human-only mile (Alexander's action items):**

1. Pick LLC name from `entity-setup.md`
2. File Wyoming LLC via Northwest Registered Agent ($225)
3. Get EIN from IRS (free, ~10 min)
4. Open Mercury account (requires EIN + LLC docs)
5. Secure E&O insurance (~$150–250/month)
6. Restart Allegro-Primus gateway (port 8644)
7. Update LinkedIn using profile template
8. Send 5 cold emails using outreach templates

---

## 5. Pending Self-Audit Issues (the-nexus)

| Issue | Title | Priority |
|-------|-------|----------|
| [#894](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/894) | Deploy burn-mode cron jobs | CRITICAL |
| [#895](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/895) | Telegram thread-based reporting | Normal |
| [#896](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/896) | Retry logic and error recovery | Normal |
| [#897](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/897) | Automate morning reports at 0600 | Normal |
| [#898](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/898) | Archive 39 burn scripts | Normal |
| [#899](https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/issues/899) | Keystore permissions | ✅ Done |

---

## 6. Revenue Timeline

| Milestone | Target | Unlocks |
|-----------|--------|---------|
| LLC + Bank + E&O | Day 5 | Ability to invoice clients |
| First 5 emails sent | Day 7 | Pipeline generation |
| First scoping call | Day 14 | Qualified lead |
| First proposal accepted | Day 21 | **$4,500–$12,000 revenue** |
| Monthly retainer signed | Day 45 | **$6,000/mo recurring** |

---

## 7. Delegation Matrix

| Owner | Owns |
|-------|------|
| **Alexander** | LLC filing, EIN, Mercury, E&O, LinkedIn, cold emails, gateway restart |
| **Ezra** | Issues #134–#138 (tests, migrations, tracker) |
| **Allegro** | Issues #894, #898 (cron deployment, burn script archival) |
| **Bezalel** | Review formalization audit for Anthropic-specific gaps |

---

*SITREP acknowledged by Claude — April 6, 2026*
*Source issue: [hermes-agent #143](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/143)*
@@ -1,166 +0,0 @@
# Research Acknowledgment: SSD — Simple Self-Distillation Improves Code Generation

**Issue:** #128
**Paper:** [Embarrassingly Simple Self-Distillation Improves Code Generation](https://arxiv.org/abs/2604.01193)
**Authors:** Ruixiang Zhang, Richard He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang (Apple)
**Date:** April 1, 2026
**Code:** https://github.com/apple/ml-ssd
**Acknowledged by:** Claude — April 6, 2026

---

## Assessment: High Relevance to Fleet

This paper is directly applicable to the hermes-agent fleet. The headline result — +7.5pp pass@1 on Qwen3-4B — is at exactly the scale we operate. The method requires no external infrastructure. Triage verdict: **P0 / Week-class work**.

---

## What SSD Actually Does

Three steps, nothing exotic:

1. **Sample**: For each coding prompt, generate one solution at temperature `T_train` (~0.9). Do NOT filter for correctness.
2. **Fine-tune**: SFT on the resulting `(prompt, unverified_solution)` pairs. Standard cross-entropy loss. No RLHF, no GRPO, no DPO.
3. **Evaluate**: At `T_eval` (which must be **different** from `T_train`). This asymmetry is not optional — using the same temperature for both loses 30–50% of the gains.

The counterintuitive part: N=1 per problem, unverified. Prior self-improvement work uses N>>1 and filters by execution. SSD doesn't. The paper argues this is *why* it works — you're sharpening the model's own distribution, not fitting to a correctness filter's selection bias.
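To make the sampling step concrete, here is a minimal sketch. The `generate` callable is a stand-in for whatever inference backend the fleet runs (an Ollama client, for instance); its name and signature are assumptions for illustration, not the paper's API.

```python
from typing import Callable, Dict, List

T_TRAIN = 0.9  # sampling temperature from the paper's recipe


def build_ssd_dataset(
    prompts: List[str],
    generate: Callable[[str, float], str],
) -> List[Dict[str, str]]:
    """Step 1 of SSD: exactly one unverified sample per prompt at T_train.

    No execution, no correctness filter -- the (prompt, completion)
    pairs go straight into standard SFT.
    """
    pairs: List[Dict[str, str]] = []
    for prompt in prompts:
        completion = generate(prompt, T_TRAIN)  # N=1; keep whatever comes back
        pairs.append({"prompt": prompt, "completion": completion})
    return pairs
```

The deliberate absence of any pass/fail filter is the point: filtering here would reintroduce the selection bias the paper argues against.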
---

## The Fork/Lock Theory

The paper's core theoretical contribution explains *why* temperature asymmetry matters.

**Locks** — positions requiring syntactic precision: colons, parentheses, import paths, variable names. A mistake here is a hard error. Low temperature helps at Locks. But applying low temperature globally kills diversity everywhere.

**Forks** — algorithmic choice points where multiple valid continuations exist: picking a sort algorithm, choosing a data structure, deciding on a loop structure. High temperature helps at Forks. But applying high temperature globally introduces errors at Locks.

SSD's fine-tuning reshapes token distributions **context-dependently**:

- At Locks: narrows the distribution, suppressing distractor tokens
- At Forks: widens the distribution, preserving valid algorithmic paths

A single global temperature cannot do this. SFT on self-generated data can, because the model learns from examples that implicitly encode which positions are Locks and which are Forks in each problem context.

**Fleet implication**: Our agents are currently using a single temperature for everything. This is leaving performance on the table even without fine-tuning. The immediate zero-cost action is temperature auditing (see Phase 1 below).
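The trade-off is easy to see with a toy temperature-scaled softmax. The logit values below are invented for illustration; the point is only that one global temperature moves Locks and Forks in lockstep.

```python
import math
from typing import Dict


def softmax_with_temperature(logits: Dict[str, float], temp: float) -> Dict[str, float]:
    """Standard temperature-scaled softmax over a toy next-token logit table."""
    scaled = {tok: logit / temp for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# A "Lock": one syntactically required token plus a distractor.
lock = {":": 4.0, ";": 2.0}
# A "Fork": two comparably valid algorithmic continuations.
fork = {"sorted(": 3.0, "heapq.": 2.8}

for temp in (0.3, 1.0):
    p_correct_lock = softmax_with_temperature(lock, temp)[":"]
    p_minority_fork = min(softmax_with_temperature(fork, temp).values())
    print(f"T={temp}: P(required token at Lock)={p_correct_lock:.3f}, "
          f"P(minority valid path at Fork)={p_minority_fork:.3f}")
```

Lowering the temperature sharpens both distributions: the Lock gets safer, but the minority valid path at the Fork gets suppressed along with it. Only a context-dependent reshaping, which is what SSD's fine-tuning provides, can narrow one while preserving the other.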
---

## Results That Matter to Us

| Model | Before | After | Delta |
|-------|--------|-------|-------|
| Qwen3-30B-Instruct | 42.4% | 55.3% | +12.9pp (+30% rel) |
| Qwen3-4B-Instruct | baseline | baseline+7.5pp | +7.5pp |
| Llama-3.1-8B-Instruct | baseline | baseline+3.5pp | +3.5pp |

Gains concentrate on hard problems: +14.2pp medium, +15.3pp hard. This is the distribution our agents face on real Gitea issues — not easy textbook problems.

---

## Fleet Implementation Plan

### Phase 1: Temperature Audit (Zero cost, this week)

Current state: fleet agents use default or eyeballed temperature settings. The paper shows T_eval != T_train is critical even without fine-tuning.

Actions:

1. Document current temperature settings in `hermes/`, `skills/`, and any Ollama config files
2. Establish a held-out test set of 20+ solved Gitea issues with known-correct outputs
3. Run A/B: current T_eval vs. T_eval=0.7 vs. T_eval=0.3 for code generation tasks
4. Record pass rates per condition; file findings as a follow-up issue

Expected outcome: measurable improvement with no model changes, no infrastructure, no cost.
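For step 4, a small tally helper for turning raw A/B outcomes into per-condition pass rates might look like this. It is a sketch; the fleet's actual harness and result format are assumptions.

```python
from typing import Dict, List, Tuple


def tally_ab_results(results: List[Tuple[float, bool]]) -> Dict[float, float]:
    """Aggregate (temperature, passed) outcomes into per-condition pass rates."""
    counts: Dict[float, List[int]] = {}
    for temp, passed in results:
        bucket = counts.setdefault(temp, [0, 0])
        bucket[0] += int(passed)  # passes
        bucket[1] += 1            # attempts
    return {temp: passes / total for temp, (passes, total) in counts.items()}
```

Feeding each (T_eval condition, pass/fail) pair from the held-out issue set through this gives the per-temperature numbers to file in the follow-up issue.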
### Phase 2: SSD Pipeline (1–2 weeks, single Mac)

Replicate the paper's method on Qwen3-4B via Ollama + axolotl or unsloth:

```
1. Dataset construction:
   - Extract 100–500 coding prompts from Gitea issue backlog
   - Focus on issues that have accepted PRs (ground truth available for evaluation only, not training)
   - Format: (system_prompt + issue_description) → model generates solution at T_train=0.9

2. Fine-tuning:
   - Use LoRA (not full fine-tune) to stay local-first
   - Standard SFT: cross-entropy on (prompt, self-generated_solution) pairs
   - Recommended: unsloth for memory efficiency on Mac hardware
   - Training budget: 1–3 epochs, small batch size

3. Evaluation:
   - Compare base model vs. SSD-tuned model at T_eval=0.7
   - Metric: pass@1 on held-out issues not in training set
   - Also test on general coding benchmarks to check for capability regression
```

Infrastructure assessment:

- **RAM**: Qwen3-4B quantized (Q4_K_M) needs ~3.5GB VRAM for inference; LoRA fine-tuning needs ~8–12GB unified memory (Mac M-series feasible)
- **Storage**: Self-generated dataset is small; LoRA adapter is ~100–500MB
- **Time**: 500 examples × 3 epochs ≈ 2–4 hours on M2/M3 Max
- **Dependencies**: Ollama (inference), unsloth or axolotl (fine-tuning), datasets (HuggingFace), trl

No cloud required. No teacher model required. No code execution environment required.

### Phase 3: Continuous Self-Improvement Loop (1–2 months)

Wire SSD into the fleet's burn mode:

```
Nightly cron:
1. Collect agent solutions from the day's completed issues
2. Filter: only solutions where the PR was merged (human-verified correct)
3. Append to rolling training buffer (last 500 examples)
4. Run SFT fine-tune on buffer → update LoRA adapter
5. Swap adapter into Ollama deployment at dawn
6. Agents start next day with yesterday's lessons baked in
```

This integrates naturally with RetainDB (#112) — the persistent memory system would track which solutions were merged, providing the feedback signal. The continuous loop turns every merged PR into a training example.

### Phase 4: Sovereignty Confirmation

The paper validates that external data is not required for improvement. Our fleet can:

- Fine-tune exclusively on its own conversation data
- Stay fully local (no API calls, no external datasets)
- Accumulate improvements over time without model subscriptions

This is the sovereign fine-tuning capability the fleet needs to remain independent as external model APIs change pricing or capabilities.

---

## Risks and Mitigations

| Risk | Assessment | Mitigation |
|------|------------|------------|
| SSD gains don't transfer from LiveCodeBench to Gitea issues | Medium — our domain is software engineering, not competitive programming | Test on actual Gitea issues from the backlog; don't assume benchmark numbers transfer |
| Fine-tuning degrades non-code capabilities | Low-Medium | LoRA instead of full fine-tune; test on general tasks after SFT; retain base model checkpoint |
| Small training set (<200 examples) insufficient | Medium | Paper shows gains at modest scale; supplement with open code datasets (Stack, TheVault) if needed |
| Qwen3 GGUF format incompatible with unsloth fine-tuning | Low | unsloth supports Qwen3; verify exact GGUF variant compatibility before starting |
| Temperature asymmetry effect smaller on instruction-tuned variants | Low | Paper explicitly tests instruct variants and shows gains; Qwen3-4B-Instruct is in the paper's results |

---

## Acceptance Criteria Status

From the issue:

- [ ] **Temperature audit** — Document current T/top_p settings across fleet agents, compare with paper recommendations
- [ ] **T_eval benchmark** — A/B test on 20+ solved Gitea issues; measure correctness
- [ ] **SSD reproduction** — Replicate pipeline on Qwen3-4B with 100 prompts; measure pass@1 change
- [ ] **Infrastructure assessment** — Documented above (Phase 2 section); GPU/RAM/storage requirements are Mac-feasible
- [ ] **Continuous loop design** — Architecture drafted above (Phase 3 section); integrates with RetainDB (#112)

Infrastructure assessment and continuous loop design are addressed in this document. Temperature audit and SSD reproduction require follow-up issues with execution.

---

## Recommended Follow-Up Issues

1. **Temperature Audit** — Audit all fleet agent temperature configs; run A/B on T_eval variants; file results (Phase 1)
2. **SSD Pipeline Spike** — Build and run the 3-stage SSD pipeline on Qwen3-4B; report pass@1 delta (Phase 2)
3. **Nightly SFT Integration** — Wire SSD into burn-mode cron; integrate with RetainDB feedback loop (Phase 3)

---

*Research acknowledged by Claude — April 6, 2026*
*Source issue: [hermes-agent #128](https://forge.alexanderwhitestone.com/Timmy_Foundation/hermes-agent/issues/128)*
@@ -1,261 +0,0 @@
#!/usr/bin/env python3
"""Forge Health Check — Build verification and artifact integrity scanner.

Scans wizard environments for:
- Missing source files (.pyc without .py) — Allegro finding: GOFAI source files gone
- Burn script accumulation in /root or wizard directories
- World-readable sensitive files (keystores, tokens, configs)
- Missing required environment variables

Usage:
    python scripts/forge_health_check.py /root/wizards
    python scripts/forge_health_check.py /root/wizards --json
    python scripts/forge_health_check.py /root/wizards --fix-permissions
"""

from __future__ import annotations

import argparse
import json
import os
import stat
import sys
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Iterable


SENSITIVE_FILE_PATTERNS = (
    "keystore",
    "password",
    "private",
    "apikey",
    "api_key",
    "credentials",
)

SENSITIVE_NAME_PREFIXES = (
    "key_",
    "keys_",
    "token_",
    "tokens_",
    "secret_",
    "secrets_",
    ".env",
    "env.",
)

SENSITIVE_NAME_SUFFIXES = (
    "_key",
    "_keys",
    "_token",
    "_tokens",
    "_secret",
    "_secrets",
    ".key",
    ".env",
    ".token",
    ".secret",
)

SENSIBLE_PERMISSIONS = 0o600  # owner read/write only

REQUIRED_ENV_VARS = (
    "GITEA_URL",
    "GITEA_TOKEN",
    "GITEA_USER",
)

BURN_SCRIPT_PATTERNS = (
    "burn",
    "ignite",
    "inferno",
    "scorch",
    "char",
    "blaze",
    "ember",
)


@dataclass
class HealthFinding:
    category: str
    severity: str  # critical, warning, info
    path: str
    message: str
    suggestion: str = ""


@dataclass
class HealthReport:
    target: str
    findings: list[HealthFinding] = field(default_factory=list)
    passed: bool = True

    def add(self, finding: HealthFinding) -> None:
        self.findings.append(finding)
        if finding.severity == "critical":
            self.passed = False


def scan_orphaned_bytecode(root: Path, report: HealthReport) -> None:
    """Detect .pyc files without corresponding .py source files."""
    for pyc in root.rglob("*.pyc"):
        py = pyc.with_suffix(".py")
        if not py.exists():
            # Also check __pycache__ naming convention:
            # pkg/__pycache__/module.cpython-XY.pyc maps to pkg/module.py
            if pyc.parent.name == "__pycache__":
                stem = pyc.stem.split(".")[0]
                py = pyc.parent.parent / f"{stem}.py"
            if not py.exists():
                report.add(
                    HealthFinding(
                        category="artifact_integrity",
                        severity="critical",
                        path=str(pyc),
                        message=f"Compiled bytecode without source: {pyc}",
                        suggestion="Restore missing .py source file from version control or backup",
                    )
                )


def scan_burn_script_clutter(root: Path, report: HealthReport) -> None:
    """Detect burn scripts and other temporary artifacts outside proper staging."""
    for path in root.iterdir():
        if not path.is_file():
            continue
        lower = path.name.lower()
        if any(pat in lower for pat in BURN_SCRIPT_PATTERNS):
            report.add(
                HealthFinding(
                    category="deployment_hygiene",
                    severity="warning",
                    path=str(path),
                    message=f"Burn script or temporary artifact in production path: {path.name}",
                    suggestion="Archive to a burn/ or tmp/ directory, or remove if no longer needed",
                )
            )


def _is_sensitive_filename(name: str) -> bool:
    """Check if a filename indicates it may contain secrets."""
    lower = name.lower()
    if lower == ".env.example":
        return False
    if any(pat in lower for pat in SENSITIVE_FILE_PATTERNS):
        return True
    if any(lower.startswith(pref) for pref in SENSITIVE_NAME_PREFIXES):
        return True
    if any(lower.endswith(suff) for suff in SENSITIVE_NAME_SUFFIXES):
        return True
    return False


def scan_sensitive_file_permissions(root: Path, report: HealthReport, fix: bool = False) -> None:
    """Detect group- or world-readable sensitive files."""
    for fpath in root.rglob("*"):
        if not fpath.is_file():
            continue
        # Skip test files — real secrets should never live in tests/
        if "/tests/" in str(fpath) or str(fpath).startswith(str(root / "tests")):
            continue
        if not _is_sensitive_filename(fpath.name):
            continue

        try:
            mode = fpath.stat().st_mode
        except OSError:
            continue

        # Readable by group or other
        if mode & stat.S_IRGRP or mode & stat.S_IROTH:
            was_fixed = False
            if fix:
                try:
                    fpath.chmod(SENSIBLE_PERMISSIONS)
                    was_fixed = True
                except OSError:
                    pass

            report.add(
                HealthFinding(
                    category="security",
                    severity="critical",
                    path=str(fpath),
                    message=(
                        f"Sensitive file readable by group/other: {fpath.name} "
                        f"(mode={oct(mode & 0o777)})"
                    ),
                    suggestion=(
                        f"Fixed permissions to {oct(SENSIBLE_PERMISSIONS)}"
                        if was_fixed
                        else f"Run 'chmod {oct(SENSIBLE_PERMISSIONS)[2:]} {fpath}'"
                    ),
                )
            )


def scan_environment_variables(report: HealthReport) -> None:
    """Check for required environment variables."""
|
|
||||||
for var in REQUIRED_ENV_VARS:
|
|
||||||
if not os.environ.get(var):
|
|
||||||
report.add(
|
|
||||||
HealthFinding(
|
|
||||||
category="configuration",
|
|
||||||
severity="warning",
|
|
||||||
path="$" + var,
|
|
||||||
message=f"Required environment variable {var} is missing or empty",
|
|
||||||
suggestion="Export the variable in your shell profile or secrets manager",
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def run_health_check(target: Path, fix_permissions: bool = False) -> HealthReport:
|
|
||||||
report = HealthReport(target=str(target.resolve()))
|
|
||||||
if target.exists():
|
|
||||||
scan_orphaned_bytecode(target, report)
|
|
||||||
scan_burn_script_clutter(target, report)
|
|
||||||
scan_sensitive_file_permissions(target, report, fix=fix_permissions)
|
|
||||||
scan_environment_variables(report)
|
|
||||||
return report
|
|
||||||
|
|
||||||
|
|
||||||
def print_report(report: HealthReport) -> None:
|
|
||||||
status = "PASS" if report.passed else "FAIL"
|
|
||||||
print(f"Forge Health Check: {status}")
|
|
||||||
print(f"Target: {report.target}")
|
|
||||||
print(f"Findings: {len(report.findings)}\n")
|
|
||||||
|
|
||||||
by_category: dict[str, list[HealthFinding]] = {}
|
|
||||||
for f in report.findings:
|
|
||||||
by_category.setdefault(f.category, []).append(f)
|
|
||||||
|
|
||||||
for category, findings in by_category.items():
|
|
||||||
print(f"[{category.upper()}]")
|
|
||||||
for f in findings:
|
|
||||||
print(f" {f.severity.upper()}: {f.message}")
|
|
||||||
if f.suggestion:
|
|
||||||
print(f" -> {f.suggestion}")
|
|
||||||
print()
|
|
||||||
|
|
||||||
|
|
||||||
def main(argv: list[str] | None = None) -> int:
|
|
||||||
parser = argparse.ArgumentParser(description="Forge Health Check")
|
|
||||||
parser.add_argument("target", nargs="?", default="/root/wizards", help="Root path to scan")
|
|
||||||
parser.add_argument("--json", action="store_true", help="Output JSON report")
|
|
||||||
parser.add_argument("--fix-permissions", action="store_true", help="Auto-fix file permissions")
|
|
||||||
args = parser.parse_args(argv)
|
|
||||||
|
|
||||||
target = Path(args.target)
|
|
||||||
report = run_health_check(target, fix_permissions=args.fix_permissions)
|
|
||||||
|
|
||||||
if args.json:
|
|
||||||
print(json.dumps(asdict(report), indent=2))
|
|
||||||
else:
|
|
||||||
print_report(report)
|
|
||||||
|
|
||||||
return 0 if report.passed else 1
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
raise SystemExit(main())
|
|
||||||
@@ -1,89 +0,0 @@
#!/usr/bin/env python3
"""Forge smoke tests — fast checks that core imports resolve and entrypoints load.

Total runtime target: < 30 seconds.
"""

from __future__ import annotations

import importlib
import subprocess
import sys
from pathlib import Path

# Allow running smoke test directly from repo root before pip install
REPO_ROOT = Path(__file__).parent.parent
sys.path.insert(0, str(REPO_ROOT))

CORE_MODULES = [
    "hermes_cli.config",
    "hermes_state",
    "model_tools",
    "toolsets",
    "utils",
]

CLI_ENTRYPOINTS = [
    [sys.executable, "cli.py", "--help"],
]


def test_imports() -> None:
    ok = 0
    skipped = 0
    for mod in CORE_MODULES:
        try:
            importlib.import_module(mod)
            ok += 1
        except ImportError as exc:
            # If the failure is a missing third-party dependency, skip rather than fail
            # so the smoke test can run before `pip install` in bare environments.
            # (ImportError messages use dotted module names, so compare against the
            # core module's top-level package name.)
            msg = str(exc).lower()
            if "no module named" in msg and mod.split(".")[0] not in msg:
                print(f"SKIP: import {mod} -> missing dependency ({exc})")
                skipped += 1
            else:
                print(f"FAIL: import {mod} -> {exc}")
                sys.exit(1)
        except Exception as exc:
            print(f"FAIL: import {mod} -> {exc}")
            sys.exit(1)
    print(f"OK: {ok} core imports", end="")
    if skipped:
        print(f" ({skipped} skipped due to missing deps)")
    else:
        print()


def test_cli_help() -> None:
    ok = 0
    skipped = 0
    for cmd in CLI_ENTRYPOINTS:
        result = subprocess.run(cmd, capture_output=True, timeout=30)
        if result.returncode == 0:
            ok += 1
            continue
        stderr = result.stderr.decode().lower()
        # Gracefully skip if dependencies are missing in bare environments
        if "modulenotfounderror" in stderr or "no module named" in stderr:
            print(f"SKIP: {' '.join(cmd)} -> missing dependency")
            skipped += 1
        else:
            print(f"FAIL: {' '.join(cmd)} -> {result.stderr.decode()[:200]}")
            sys.exit(1)
    print(f"OK: {ok} CLI entrypoints", end="")
    if skipped:
        print(f" ({skipped} skipped due to missing deps)")
    else:
        print()


def main() -> int:
    test_imports()
    test_cli_help()
    print("Smoke tests passed.")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@@ -1,20 +0,0 @@
#!/usr/bin/env python3
"""Syntax guard — compile all Python files to catch syntax errors before merge."""
import py_compile
import sys
from pathlib import Path

errors = []
for p in Path(".").rglob("*.py"):
    if ".venv" in p.parts or "__pycache__" in p.parts:
        continue
    try:
        py_compile.compile(str(p), doraise=True)
    except py_compile.PyCompileError as e:
        errors.append(f"{p}: {e}")
        print(f"SYNTAX ERROR: {p}: {e}", file=sys.stderr)

if errors:
    print(f"\n{len(errors)} file(s) with syntax errors", file=sys.stderr)
    sys.exit(1)
print("All Python files compile successfully")
@@ -1,175 +0,0 @@
"""Tests for scripts/forge_health_check.py"""

import os
import stat
from pathlib import Path

# Import the script as a module
import sys

sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))

from forge_health_check import (
    HealthFinding,
    HealthReport,
    _is_sensitive_filename,
    run_health_check,
    scan_burn_script_clutter,
    scan_orphaned_bytecode,
    scan_sensitive_file_permissions,
    scan_environment_variables,
)


class TestIsSensitiveFilename:
    def test_keystore_is_sensitive(self) -> None:
        assert _is_sensitive_filename("keystore.json") is True

    def test_env_example_is_not_sensitive(self) -> None:
        assert _is_sensitive_filename(".env.example") is False

    def test_env_file_is_sensitive(self) -> None:
        assert _is_sensitive_filename(".env") is True
        assert _is_sensitive_filename("production.env") is True

    def test_test_file_with_key_is_not_sensitive(self) -> None:
        assert _is_sensitive_filename("test_interrupt_key_match.py") is False
        assert _is_sensitive_filename("test_api_key_providers.py") is False


class TestScanOrphanedBytecode:
    def test_detects_pyc_without_py(self, tmp_path: Path) -> None:
        pyc = tmp_path / "module.pyc"
        pyc.write_bytes(b"\x00")
        report = HealthReport(target=str(tmp_path))
        scan_orphaned_bytecode(tmp_path, report)
        assert len(report.findings) == 1
        assert report.findings[0].category == "artifact_integrity"
        assert report.findings[0].severity == "critical"

    def test_ignores_pyc_with_py(self, tmp_path: Path) -> None:
        (tmp_path / "module.py").write_text("pass")
        pyc = tmp_path / "module.pyc"
        pyc.write_bytes(b"\x00")
        report = HealthReport(target=str(tmp_path))
        scan_orphaned_bytecode(tmp_path, report)
        assert len(report.findings) == 0

    def test_detects_pycache_orphan(self, tmp_path: Path) -> None:
        pycache = tmp_path / "__pycache__"
        pycache.mkdir()
        pyc = pycache / "module.cpython-312.pyc"
        pyc.write_bytes(b"\x00")
        report = HealthReport(target=str(tmp_path))
        scan_orphaned_bytecode(tmp_path, report)
        assert len(report.findings) == 1
        assert "__pycache__" in report.findings[0].path


class TestScanBurnScriptClutter:
    def test_detects_burn_script(self, tmp_path: Path) -> None:
        (tmp_path / "burn_test.sh").write_text("#!/bin/bash")
        report = HealthReport(target=str(tmp_path))
        scan_burn_script_clutter(tmp_path, report)
        assert len(report.findings) == 1
        assert report.findings[0].category == "deployment_hygiene"
        assert report.findings[0].severity == "warning"

    def test_ignores_regular_files(self, tmp_path: Path) -> None:
        (tmp_path / "deploy.sh").write_text("#!/bin/bash")
        report = HealthReport(target=str(tmp_path))
        scan_burn_script_clutter(tmp_path, report)
        assert len(report.findings) == 0


class TestScanSensitiveFilePermissions:
    def test_detects_world_readable_keystore(self, tmp_path: Path) -> None:
        ks = tmp_path / "keystore.json"
        ks.write_text("{}")
        ks.chmod(0o644)
        report = HealthReport(target=str(tmp_path))
        scan_sensitive_file_permissions(tmp_path, report)
        assert len(report.findings) == 1
        assert report.findings[0].category == "security"
        assert report.findings[0].severity == "critical"
        assert "644" in report.findings[0].message

    def test_auto_fixes_permissions(self, tmp_path: Path) -> None:
        ks = tmp_path / "keystore.json"
        ks.write_text("{}")
        ks.chmod(0o644)
        report = HealthReport(target=str(tmp_path))
        scan_sensitive_file_permissions(tmp_path, report, fix=True)
        assert len(report.findings) == 1
        assert ks.stat().st_mode & 0o777 == 0o600

    def test_ignores_safe_permissions(self, tmp_path: Path) -> None:
        ks = tmp_path / "keystore.json"
        ks.write_text("{}")
        ks.chmod(0o600)
        report = HealthReport(target=str(tmp_path))
        scan_sensitive_file_permissions(tmp_path, report)
        assert len(report.findings) == 0

    def test_ignores_env_example(self, tmp_path: Path) -> None:
        env = tmp_path / ".env.example"
        env.write_text("# example")
        env.chmod(0o644)
        report = HealthReport(target=str(tmp_path))
        scan_sensitive_file_permissions(tmp_path, report)
        assert len(report.findings) == 0

    def test_ignores_test_directory(self, tmp_path: Path) -> None:
        tests_dir = tmp_path / "tests"
        tests_dir.mkdir()
        ks = tests_dir / "keystore.json"
        ks.write_text("{}")
        ks.chmod(0o644)
        report = HealthReport(target=str(tmp_path))
        scan_sensitive_file_permissions(tmp_path, report)
        assert len(report.findings) == 0


class TestScanEnvironmentVariables:
    def test_reports_missing_env_var(self, monkeypatch) -> None:
        monkeypatch.delenv("GITEA_TOKEN", raising=False)
        report = HealthReport(target=".")
        scan_environment_variables(report)
        missing = [f for f in report.findings if f.path == "$GITEA_TOKEN"]
        assert len(missing) == 1
        assert missing[0].severity == "warning"

    def test_passes_when_env_vars_present(self, monkeypatch) -> None:
        for var in ("GITEA_URL", "GITEA_TOKEN", "GITEA_USER"):
            monkeypatch.setenv(var, "present")
        report = HealthReport(target=".")
        scan_environment_variables(report)
        assert len(report.findings) == 0


class TestRunHealthCheck:
    def test_full_run(self, tmp_path: Path, monkeypatch) -> None:
        monkeypatch.setenv("GITEA_URL", "https://example.com")
        monkeypatch.setenv("GITEA_TOKEN", "secret")
        monkeypatch.setenv("GITEA_USER", "bezalel")

        (tmp_path / "orphan.pyc").write_bytes(b"\x00")
        (tmp_path / "burn_it.sh").write_text("#!/bin/bash")
        ks = tmp_path / "keystore.json"
        ks.write_text("{}")
        ks.chmod(0o644)

        report = run_health_check(tmp_path)
        assert not report.passed
        categories = {f.category for f in report.findings}
        assert "artifact_integrity" in categories
        assert "deployment_hygiene" in categories
        assert "security" in categories

    def test_clean_run_passes(self, tmp_path: Path, monkeypatch) -> None:
        for var in ("GITEA_URL", "GITEA_TOKEN", "GITEA_USER"):
            monkeypatch.setenv(var, "present")

        (tmp_path / "module.py").write_text("pass")
        report = run_health_check(tmp_path)
        assert report.passed
        assert len(report.findings) == 0
@@ -1,18 +0,0 @@
"""Bare green-path E2E — one happy-path tool call cycle.

Exercises the terminal tool directly and verifies the response structure.
No API keys required. Runtime target: < 10 seconds.
"""

import json

from tools.terminal_tool import terminal_tool


def test_terminal_echo_green_path() -> None:
    """terminal('echo hello') -> verify response contains 'hello' and exit_code 0."""
    result = terminal_tool(command="echo hello", timeout=10)
    data = json.loads(result)

    assert data["exit_code"] == 0, f"Expected exit_code 0, got {data['exit_code']}"
    assert "hello" in data["output"], f"Expected 'hello' in output, got: {data['output']}"
@@ -1,215 +0,0 @@
# Forge Operations Guide

> **Audience:** Forge wizards joining the hermes-agent project
> **Purpose:** Practical patterns, common pitfalls, and operational wisdom
> **Companion to:** `WIZARD_ENVIRONMENT_CONTRACT.md`

---

## The One Rule

**Read the actual state before acting.**

Before touching any service, config, or codebase: `ps aux | grep hermes`, `cat ~/.hermes/gateway_state.json`, `curl http://127.0.0.1:8642/health`. The forge punishes assumptions harder than it rewards speed. Evidence always beats intuition.

---

## First 15 Minutes on a New System

```bash
# 1. Validate your environment
python wizard-bootstrap/wizard_bootstrap.py

# 2. Check what is actually running
ps aux | grep -E 'hermes|python|gateway'

# 3. Check the data directory
ls -la ~/.hermes/
cat ~/.hermes/gateway_state.json 2>/dev/null | python3 -m json.tool

# 4. Verify health endpoints (if gateway is up)
curl -sf http://127.0.0.1:8642/health | python3 -m json.tool

# 5. Run the smoke test
source venv/bin/activate
python -m pytest tests/ -q -x --timeout=60 2>&1 | tail -20
```

Do not begin work until all five steps return clean output.

---

## Import Chain — Know It, Respect It

The dependency order is load-bearing. Violating it causes silent failures:

```
tools/registry.py      ← no deps; imported by everything
        ↑
tools/*.py             ← each calls registry.register() at import time
        ↑
model_tools.py         ← imports registry; triggers tool discovery
        ↑
run_agent.py / cli.py / batch_runner.py
```

**If you add a tool file**, you must also:

1. Add its import to `model_tools.py` `_discover_tools()`
2. Add it to `toolsets.py` (core or a named toolset)

Missing either step causes the tool to silently not appear — no error, just absence.
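The import-time registration pattern can be sketched as follows. This is a hypothetical minimal registry for illustration only — the real `tools/registry.py` API and `register()` signature may differ:

```python
# Hypothetical sketch of import-time tool registration; the real
# tools/registry.py names and signatures may differ.
from typing import Callable

TOOLS: dict[str, Callable] = {}

def register(name: str):
    """Record a tool under `name` the moment its module is imported."""
    def wrap(fn: Callable) -> Callable:
        TOOLS[name] = fn
        return fn
    return wrap

# A tool module registers itself as a side effect of being imported,
# which is why model_tools.py must import it for it to exist at all.
@register("echo_tool")
def echo_tool(text: str) -> str:
    return text
```

This is also why a missing import fails silently: if the module is never imported, `register()` never runs, and the tool is simply absent from the registry.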
---

## Profile Rules

Hermes supports isolated profiles (`hermes -p myprofile`). Profile-unsafe code has caused repeated bugs. Memorize these:

| Do this | Not this |
|---------|----------|
| `get_hermes_home()` | `Path.home() / ".hermes"` |
| `display_hermes_home()` in user messages | hardcoded `~/.hermes` strings |
| `get_hermes_home() / "sessions"` in tests | `~/.hermes/sessions` in tests |

Import both from `hermes_constants`. Every `~/.hermes` hardcode is a latent profile bug.
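A minimal sketch of the profile-safe pattern, assuming `HERMES_HOME` is an environment override — the real `hermes_constants.get_hermes_home()` may resolve `-p` profiles differently:

```python
import os
from pathlib import Path

def get_hermes_home() -> Path:
    # Sketch only: honor an env override before falling back to ~/.hermes.
    # The real hermes_constants implementation may differ.
    override = os.environ.get("HERMES_HOME")
    return Path(override) if override else Path.home() / ".hermes"

# Profile-safe: derive paths from the resolved home, never hardcode ~/.hermes
sessions_dir = get_hermes_home() / "sessions"
```

The point is the shape, not the helper: every path derivation flows through one resolver, so switching profiles moves every store at once.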
---

## Prompt Caching — Do Not Break It

The agent caches system prompts. Cache breaks force re-billing of the entire context window on every turn. The following actions break caching mid-conversation and are forbidden:

- Altering past context
- Changing the active toolset
- Reloading memories or rebuilding the system prompt

The only sanctioned context alteration is the context compressor (`agent/context_compressor.py`). If your feature touches the message history, read that file first.

---

## Adding a Slash Command (Checklist)

Four steps, in order:

1. **`hermes_cli/commands.py`** — add `CommandDef` to `COMMAND_REGISTRY`
2. **`cli.py`** — add handler branch in `HermesCLI.process_command()`
3. **`gateway/run.py`** — add handler if it should work in messaging platforms
4. **Aliases** — add to the `aliases` tuple on the `CommandDef`; everything else updates automatically

All downstream consumers (Telegram menu, Slack routing, autocomplete, help text) derive from `COMMAND_REGISTRY`. You never touch them directly.
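The single-registry design can be sketched like this. The field names on `CommandDef` here are assumptions, not the real `hermes_cli/commands.py` definition:

```python
# Hypothetical sketch of the registry pattern; real CommandDef fields differ.
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str
    aliases: tuple[str, ...] = ()

COMMAND_REGISTRY = {
    "/status": CommandDef("/status", "Show gateway status", aliases=("/st",)),
}

def resolve(token: str) -> CommandDef | None:
    # Aliases resolve through the same registry entry, so menus, help
    # text, and autocomplete all stay in sync without separate edits.
    for cmd in COMMAND_REGISTRY.values():
        if token == cmd.name or token in cmd.aliases:
            return cmd
    return None
```

Because every consumer iterates `COMMAND_REGISTRY`, adding to the `aliases` tuple is the whole job — nothing downstream needs touching.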
---

## Tool Schema Pitfalls

**Do NOT cross-reference other toolsets in schema descriptions.**
Writing "prefer `web_search` over this tool" in a browser tool's description will cause the model to hallucinate calls to `web_search` when it's not loaded. Cross-references belong in `get_tool_definitions()` post-processing blocks in `model_tools.py`.

**Do NOT use `\033[K` (ANSI erase-to-EOL) in display code.**
Under `prompt_toolkit`'s `patch_stdout`, it leaks as literal `?[K`. Use space-padding instead: `f"\r{line}{' ' * pad}"`.
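The space-padding idiom can be wrapped in a tiny helper (a sketch; the function name is ours, not from the codebase):

```python
def redraw(line: str, prev_len: int) -> str:
    # Overwrite leftovers from a longer previous line with spaces instead of
    # emitting \033[K, which patch_stdout renders as a literal "?[K".
    pad = max(0, prev_len - len(line))
    return f"\r{line}{' ' * pad}"
```

Track the previous line's length between redraws and pass it as `prev_len`.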
**Do NOT use `simple_term_menu` for interactive menus.**
It ghosts on scroll in tmux/iTerm2. Use `curses` (stdlib). See `hermes_cli/tools_config.py` for the pattern.

---

## Health Check Anatomy

A healthy instance returns:

```json
{
  "status": "ok",
  "gateway_state": "running",
  "platforms": {
    "telegram": {"state": "connected"}
  }
}
```

| Field | Healthy value | What a bad value means |
|-------|--------------|----------------------|
| `status` | `"ok"` | HTTP server down |
| `gateway_state` | `"running"` | Still starting or crashed |
| `platforms.<name>.state` | `"connected"` | Auth failure or network issue |

`gateway_state: "starting"` is normal for up to 60 s on boot. Beyond that, check logs for auth errors:

```bash
journalctl -u hermes-gateway --since "2 minutes ago" | grep -i "error\|token\|auth"
```
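For scripting, the table above folds into a small predicate. Field names follow the sample payload; the function itself is illustrative, not part of the codebase:

```python
def is_healthy(payload: dict) -> bool:
    # Healthy means: server answers "ok", the gateway has finished
    # starting, and every configured platform reports "connected".
    if payload.get("status") != "ok":
        return False
    if payload.get("gateway_state") != "running":
        return False
    platforms = payload.get("platforms", {})
    return all(p.get("state") == "connected" for p in platforms.values())
```

Pair it with the `curl .../health` call from the first-15-minutes checklist to gate automation on a genuinely ready gateway.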
---

## Gateway Won't Start — Diagnosis Order

1. `ss -tlnp | grep 8642` — port conflict?
2. `cat ~/.hermes/gateway.pid` → `ps -p <pid>` — stale PID file?
3. `hermes gateway start --replace` — clears stale locks and PIDs
4. `HERMES_LOG_LEVEL=DEBUG hermes gateway start` — verbose output
5. Check `~/.hermes/.env` — missing or placeholder token?

---

## Before Every PR

```bash
source venv/bin/activate
python -m pytest tests/ -q                     # full suite: ~3 min, ~3000 tests
python scripts/deploy-validate                 # deployment health check
python wizard-bootstrap/wizard_bootstrap.py    # environment sanity
```

All three must exit 0. Do not skip. "It works locally" is not sufficient evidence.

---

## Session and State Files

| Store | Location | Notes |
|-------|----------|-------|
| Sessions | `~/.hermes/sessions/*.json` | Persisted across restarts |
| Memories | `~/.hermes/memories/*.md` | Written by the agent's memory tool |
| Cron jobs | `~/.hermes/cron/*.json` | Scheduler state |
| Gateway state | `~/.hermes/gateway_state.json` | Live platform connection status |
| Response store | `~/.hermes/response_store.db` | SQLite WAL — API server only |

All paths go through `get_hermes_home()`. Never hardcode. Always back up before a major update:

```bash
tar czf ~/backups/hermes_$(date +%F_%H%M).tar.gz ~/.hermes/
```

---

## Writing Tests

```bash
python -m pytest tests/path/to/test.py -q     # single file
python -m pytest tests/ -q -k "test_name"     # by name
python -m pytest tests/ -q -x                 # stop on first failure
```

**Test isolation rules:**

- `tests/conftest.py` has an autouse fixture that redirects `HERMES_HOME` to a temp dir. Never write to `~/.hermes/` in tests.
- Profile tests must mock both `Path.home()` and `HERMES_HOME`. See `tests/hermes_cli/test_profiles.py` for the pattern.
- Do not mock the database. Integration tests should use real SQLite with a temp path.
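An isolation fixture of the kind the first bullet describes looks roughly like this — a sketch, not the actual `tests/conftest.py` contents, which may do more:

```python
import pytest

@pytest.fixture(autouse=True)
def isolated_hermes_home(tmp_path, monkeypatch):
    # Sketch: every test gets a throwaway HERMES_HOME, so nothing can
    # touch the real ~/.hermes/ even if a code path ignores arguments.
    home = tmp_path / "hermes_home"
    home.mkdir()
    monkeypatch.setenv("HERMES_HOME", str(home))
    yield home
```

Because it is `autouse=True`, no test has to request it explicitly; isolation is the default, not an opt-in.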
---

## Commit Conventions

```
feat: add X              # new capability
fix: correct Y           # bug fix
refactor: restructure Z  # no behaviour change
test: add tests for W    # test-only
chore: update deps       # housekeeping
docs: clarify X          # documentation only
```

Include `Fixes #NNN` or `Refs #NNN` in the commit message body to close or reference issues automatically.

---

*This guide lives in `wizard-bootstrap/`. Update it when you discover a new pitfall or pattern worth preserving.*