Compare commits
1 commit: research/l... → fix/682

| Author | SHA1 | Date |
|---|---|---|
|  | 79b841727f |  |
@@ -1,37 +0,0 @@
# NH Broadband Install Packet

**Packet ID:** nh-bb-20260415-113232
**Generated:** 2026-04-15T11:32:32.781304+00:00
**Status:** pending_scheduling_call

## Contact

- **Name:** Timmy Operator
- **Phone:** 603-555-0142
- **Email:** ops@timmy-foundation.example

## Service Address

- 123 Example Lane
- Concord, NH 03301

## Desired Plan

residential-fiber

## Call Log

- **2026-04-15T14:30:00Z** — no_answer
  - Called 1-800-NHBB-INFO, ring-out after 45s

## Appointment Checklist

- [ ] Confirm exact-address availability via NH Broadband online lookup
- [ ] Call NH Broadband scheduling line (1-800-NHBB-INFO)
- [ ] Select appointment window (morning/afternoon)
- [ ] Confirm payment method (credit card / ACH)
- [ ] Receive appointment confirmation number
- [ ] Prepare site: clear path to ONT install location
- [ ] Post-install: run speed test (fast.com / speedtest.net)
- [ ] Log final speeds and appointment outcome
@@ -1,27 +0,0 @@
contact:
  name: Timmy Operator
  phone: "603-555-0142"
  email: ops@timmy-foundation.example

service:
  address: "123 Example Lane"
  city: Concord
  state: NH
  zip: "03301"

desired_plan: residential-fiber

call_log:
  - timestamp: "2026-04-15T14:30:00Z"
    outcome: no_answer
    notes: "Called 1-800-NHBB-INFO, ring-out after 45s"

checklist:
  - "Confirm exact-address availability via NH Broadband online lookup"
  - "Call NH Broadband scheduling line (1-800-NHBB-INFO)"
  - "Select appointment window (morning/afternoon)"
  - "Confirm payment method (credit card / ACH)"
  - "Receive appointment confirmation number"
  - "Prepare site: clear path to ONT install location"
  - "Post-install: run speed test (fast.com / speedtest.net)"
  - "Log final speeds and appointment outcome"
genomes/timmy-dispatch-GENOME.md (new file, 320 lines)
@@ -0,0 +1,320 @@
# GENOME.md — timmy-dispatch

Generated: 2026-04-15 02:29:00 EDT
Analyzed repo: Timmy_Foundation/timmy-dispatch
Analyzed commit: 730dde8
Host issue: timmy-home #682

## Project Overview

`timmy-dispatch` is a small, script-first orchestration repo for a cron-driven Hermes fleet. It does not try to be a general platform. It is an operator's toolbelt for one specific style of swarm work:

- select a Gitea issue
- build a self-contained prompt
- run one cheap-model implementation pass
- push a branch and PR back to Forge
- measure what the fleet did overnight

The repo is intentionally lightweight:

- 7 Python files
- 4 shell entry points
- a checked-in `GENOME.md` already present on the analyzed repo's `main`
- generated telemetry state committed in `telemetry/`
- no tests on `main` (`python3 -m pytest -q` -> `no tests ran in 0.01s`)

A crucial truth about this ticket: the analyzed repo already contains a genome on `main`, and it already has an open follow-up issue for test coverage:

- `timmy-dispatch#1` — genome file already present on main
- `timmy-dispatch#3` — critical-path tests still missing

So this host-repo artifact is not pretending to discover a blank slate. It is documenting the repo's real current state for the cross-repo genome lane in `timmy-home`.

## Architecture

```mermaid
graph TD
    CRON[crontab] --> LAUNCHER[bin/sprint-launcher.sh]
    CRON --> COLLECTOR[bin/telemetry-collector.py]
    CRON --> MONITOR[bin/sprint-monitor.sh]
    CRON --> WATCHDOG[bin/model-watchdog.py]
    CRON --> ANALYZER[bin/telemetry-analyzer.py]

    LAUNCHER --> RUNNER[bin/sprint-runner.py]
    LAUNCHER --> GATEWAY[optional gateway on :8642]
    LAUNCHER --> CLI[hermes chat fallback]

    RUNNER --> GITEA[Gitea API]
    RUNNER --> LLM[OpenAI SDK\nNous or Ollama]
    RUNNER --> TOOLS[local tools\nrun_command/read_file/write_file/gitea_api]
    RUNNER --> TMP[/tmp/sprint-* workspaces]
    RUNNER --> RESULTS[~/.hermes/logs/sprint/results.csv]

    AGENTDISPATCH[bin/agent-dispatch.sh] --> HUMAN[human/operator copy-paste into agent UI]
    AGENTLOOP[bin/agent-loop.sh] --> TMUX[tmux worker panes]
    WATCHDOG --> TMUX
    SNAPSHOT[bin/tmux-snapshot.py] --> TELEMETRY[telemetry/*.jsonl]
    COLLECTOR --> TELEMETRY
    ANALYZER --> REPORT[overnight report text]
    DISPATCHHEALTH[bin/dispatch-health.py] --> TELEMETRY
```

## Entry Points

### `bin/sprint-launcher.sh`

Primary cron-facing shell entry point.

Responsibilities:

- allocate a unique `/tmp/sprint-*` workspace
- fetch open issues from Gitea
- choose the first non-epic, non-study issue
- write a fully self-contained prompt file
- try the local Hermes gateway first
- fall back to the `hermes chat` CLI if the gateway is down
- record result rows in `~/.hermes/logs/sprint/results.csv`
- prune old workspaces and old logs

### `bin/sprint-runner.py`

Primary Python implementation engine.

Responsibilities:

- read active provider settings from `~/.hermes/config.yaml`
- read auth from `~/.hermes/auth.json`
- route through the OpenAI SDK to the currently active provider
- implement a tiny local tool-calling loop with 4 tools:
  - `run_command`
  - `read_file`
  - `write_file`
  - `gitea_api`
- clone repo, branch, implement, commit, push, PR, comment

This is the cognitive core of the repo.

### `bin/agent-loop.sh`

Persistent tmux worker loop.

This is important because it soft-conflicts with the README claim that the system “does NOT run persistent agent loops.” It clearly does support them as an alternate lane.

### `bin/agent-dispatch.sh`

Manual one-shot prompt generator.

It packages all of the context, token, repo, issue, and Git/Gitea commands into a copy-pasteable prompt for another agent.

### Telemetry/ops entry points

- `bin/telemetry-collector.py`
- `bin/telemetry-analyzer.py`
- `bin/sprint-monitor.sh`
- `bin/dispatch-health.py`
- `bin/tmux-snapshot.py`
- `bin/model-watchdog.py`
- `bin/nous-auth-refresh.py`

These form the observability layer around dispatch.

## Data Flow

### Autonomous sprint path

1. cron starts `bin/sprint-launcher.sh`
2. launcher fetches open issues from Gitea
3. launcher filters out epic/study work
4. launcher writes a self-contained prompt to a temp workspace
5. launcher tries the gateway API on `localhost:8642`
6. if the gateway is unavailable, launcher falls back to `hermes chat`
7. or, in the separate Python lane, `bin/sprint-runner.py` directly calls an LLM provider via the OpenAI SDK
8. model requests local tool calls
9. local tool functions execute subprocess/Gitea/file actions
10. runner logs results and writes success/failure to `results.csv`
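Steps 5-6 describe a try-gateway-then-CLI fallback. A minimal Python sketch of that order follows; the gateway URL path, request/response schema, and `hermes chat` invocation here are assumptions for illustration, not the launcher's actual shell code:

```python
import json
import subprocess
import urllib.request

GATEWAY_URL = "http://localhost:8642/v1/chat/completions"  # assumed endpoint


def gateway_call(prompt: str, timeout: float = 10.0) -> str:
    """POST the prompt to the local gateway; raises OSError if it is down."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(
        GATEWAY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


def cli_call(prompt: str) -> str:
    """Fallback lane: shell out to the `hermes chat` CLI."""
    out = subprocess.run(["hermes", "chat", prompt], capture_output=True, text=True)
    return out.stdout


def run_prompt(prompt: str, try_gateway=gateway_call, run_cli=cli_call) -> str:
    """Gateway first, CLI second, mirroring the launcher's fallback order."""
    try:
        return try_gateway(prompt)
    except OSError:  # urllib network errors subclass OSError
        return run_cli(prompt)
```

The callables are injectable so the fallback order can be exercised without a live gateway or CLI.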

### Telemetry path

1. `bin/telemetry-collector.py` samples tmux, cron, Gitea, sprint activity, and process liveness
2. it appends snapshots to `telemetry/metrics.jsonl`
3. it emits state changes to `telemetry/events.jsonl`
4. it stores a reduced comparison state in `telemetry/last_state.json`
5. `bin/telemetry-analyzer.py` summarizes those snapshots into a morning report
6. `bin/dispatch-health.py` separately checks whether the system is actually doing work, not merely running processes
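The diff-then-emit step (3-4) can be sketched as below; the event schema and file layout are assumptions for illustration, not the collector's actual format:

```python
import json
import time
from pathlib import Path


def emit_state_changes(prev: dict, curr: dict, events_path: Path) -> list:
    """Compare two state snapshots and append one JSONL event per changed key."""
    events = []
    for key in sorted(set(prev) | set(curr)):
        if prev.get(key) != curr.get(key):
            events.append({
                "ts": time.time(),  # wall-clock timestamp of the change
                "key": key,
                "old": prev.get(key),
                "new": curr.get(key),
            })
    # append-only stream, in the spirit of telemetry/events.jsonl
    with events_path.open("a") as fh:
        for event in events:
            fh.write(json.dumps(event) + "\n")
    return events
```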

## Key Abstractions

### Stateless sprint model

The repo's main philosophical abstraction is that each sprint run is disposable.

State lives in:

- Gitea
- tmux session topology
- log files
- telemetry JSONL streams

Not in a long-running queue or orchestration daemon.

### Self-contained prompt contract

`bin/agent-dispatch.sh` and `bin/sprint-launcher.sh` both assume that the work unit can be described as a prompt containing:

- issue context
- API URLs
- token path or token value
- branching instructions
- PR creation instructions

That is a very opinionated orchestration primitive.

### Local tool-calling shim

`bin/sprint-runner.py` reimplements a tiny tool layer locally instead of using the Hermes gateway tool registry. That makes it simple and portable, but also means duplicated tool logic and duplicated security risk.
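A minimal sketch of what such a local tool shim looks like, with a hypothetical registry covering three of the four tools named above; the real `sprint-runner.py` signatures may differ:

```python
import json
import subprocess
from pathlib import Path

# Hypothetical local registry; each tool takes a dict of parsed JSON args.
TOOLS = {
    "run_command": lambda a: subprocess.run(
        a["command"], shell=True, capture_output=True, text=True
    ).stdout,
    "read_file": lambda a: Path(a["path"]).read_text(),
    "write_file": lambda a: (Path(a["path"]).write_text(a["content"]), "ok")[1],
}


def dispatch(name: str, raw_args: str) -> str:
    """Execute one model-requested tool call; unknown tools return an error string."""
    if name not in TOOLS:
        return f"error: unknown tool {name}"
    return str(TOOLS[name](json.loads(raw_args)))
```

The duplicated-security-risk point is visible right in the sketch: `run_command` with `shell=True` is the whole sandbox.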

### Telemetry-as-paper-artifact

The repo carries a `paper/` directory with a research framing around “hierarchical self-orchestration.” The telemetry directory is part of that design — not just ops exhaust, but raw material for claims.

## API Surface

### Gitea APIs consumed

- repo issue listing
- issue detail fetch
- PR creation
- issue comment creation
- repo metadata queries
- commit/PR count sampling in telemetry

### LLM APIs consumed

Observed paths in code/docs:

- Nous inference API
- local Ollama-compatible endpoint
- gateway `/v1/chat/completions` when available

### File/state APIs produced

- `~/.hermes/logs/sprint/*.log`
- `~/.hermes/logs/sprint/results.csv`
- `telemetry/metrics.jsonl`
- `telemetry/events.jsonl`
- `telemetry/last_state.json`
- telemetry snapshots under `telemetry/snapshots/`

## Test Coverage Gaps

### Current state

On the analyzed repo's `main`:

- `python3 -m pytest -q` -> `no tests ran in 0.01s`
- `python3 -m py_compile bin/*.py` -> passes
- `bash -n bin/*.sh` -> passes

So the repo is parse-clean but untested.

### Important nuance

This is already known upstream:

- `timmy-dispatch#3` explicitly tracks critical-path tests for the repo (issue #3 in the analyzed repo)

That means the honest genome should say:

- test coverage is missing on `main`
- but the gap is already recognized in the analyzed repo itself

### Most important missing lanes

1. `sprint-runner.py`
   - provider selection
   - fallback behavior
   - tool-dispatch semantics
   - result logging
2. `telemetry-collector.py`
   - state diff correctness
   - event emission correctness
   - deterministic cron drift detection
3. `model-watchdog.py`
   - profile/model expectation map
   - drift detection and fix behavior
4. `agent-loop.sh`
   - work selection and skip-list handling
   - lock discipline
5. `sprint-launcher.sh`
   - issue selection and gateway/CLI fallback path

## Security Considerations

### 1. Token handling is shell-centric and leaky

The repo frequently assumes tokens are read from files and injected into:

- shell variables
- curl headers
- clone URLs
- copy-paste prompts

This is operationally convenient but expands exposure through:

- process list leakage
- logs
- copied prompt artifacts
- shell history if mishandled

### 2. Arbitrary shell execution is a core feature

`run_command` in `sprint-runner.py` is intentionally broad. That is fine for a trusted operator loop, but it means this repo is a dispatch engine, not a sandbox.

### 3. `/tmp` workspace exposure

The default sprint workspace location is `/tmp/sprint-*`. On a shared multi-user machine, that is weaker isolation than a private worktree root.

### 4. Generated telemetry is committed

`telemetry/events.jsonl` and `telemetry/last_state.json` are on `main`. That can be useful for paper artifacts, but it also means runtime state mixes with source history.

## Dependencies

### Runtime dependencies

- Python 3
- shell utilities (`bash`, `curl`, `tmux`, `git`)
- OpenAI-compatible SDK/runtime
- Gitea server access
- local Hermes config/auth files

### Optional/ambient dependencies

- local Hermes gateway on port `8642`
- local Ollama endpoint
- Nous portal auth state

### Documentation/research dependencies

- LaTeX toolchain for `paper/`

## Deployment

This repo is not a service deployment repo in the classic sense. It is an operator repo.

Typical live environment assumptions:

- cron invokes shell/Python entry points
- tmux sessions hold worker panes
- Hermes is already installed elsewhere
- Gitea and auth are already provisioned

Minimal validation I ran:

- `python3 -m py_compile /tmp/timmy-dispatch-genome/bin/*.py`
- `bash -n /tmp/timmy-dispatch-genome/bin/*.sh`
- `python3 -m pytest -q` -> no tests present

## Technical Debt

### 1. README contradiction about persistent loops

README says:

- “The system does NOT run persistent agent loops.”

But the repo clearly ships `bin/agent-loop.sh`, described as a persistent tmux-based worker loop.

That is the most important docs drift in the repo.

### 2. Two orchestration philosophies coexist

- cron-fired disposable runs
- persistent tmux workers

Both may be intentional, but the docs do not clearly state which is canonical versus fallback/legacy.

### 3. Target repo already has a genome, but the host issue still exists

This timmy-home genome issue is happening after `timmy-dispatch` already gained:

- `GENOME.md` on `main`
- open issue `#3` for missing tests

That is not bad, but it means the cross-repo genome process and the target repo's own documentation lane are out of sync.

### 4. Generated/runtime artifacts mixed into source tree

Telemetry and research assets are part of the repo history. That may be intentional for paper-writing, but it makes source metrics noisier and can blur runtime-vs-source boundaries.

## Existing Work Already on Main

The analyzed repo already has two important genome-lane artifacts:

- `GENOME.md` on `main`
- open issue `timmy-dispatch#3` tracking critical-path tests

So the most honest statement for `timmy-home#682` is:

- the genome itself is already present in the target repo
- the remaining missing piece on the target repo is test coverage
- this host-repo artifact exists to make the cross-repo analysis lane explicit and traceable

## Bottom Line

`timmy-dispatch` is a small but very revealing repo. It embodies the Timmy Foundation's dispatch style in concentrated form:

- script-first
- cron-first
- tmux-aware
- Gitea-centered
- cheap-model friendly
- operator-visible

Its biggest weakness is not code volume. It is architectural ambiguity in the docs and a complete lack of tests on `main` despite being a coordination-critical repo.
@@ -1,35 +0,0 @@
# NH Broadband — Public Research Memo

**Date:** 2026-04-15
**Status:** Draft — separates verified facts from unverified live work
**Refs:** #533, #740

---

## Verified (official public sources)

- **NH Broadband** is a residential fiber internet provider operating in New Hampshire.
- Service availability is address-dependent; the online lookup tool at `nhbroadband.com` reports coverage by street address.
- Residential fiber plans are offered; speed tiers vary by location.
- Scheduling line: **1-800-NHBB-INFO** (published on official site).
- Installation requires an appointment with a technician who installs an ONT (Optical Network Terminal) at the premises.
- Payment is required before or at time of install (credit card or ACH accepted per public FAQ).

## Unverified / Requires Live Work

| Item | Status | Notes |
|---|---|---|
| Exact-address availability for target location | ❌ pending | Must run live lookup against actual street address |
| Current pricing for desired plan tier | ❌ pending | Pricing may vary; confirm during scheduling call |
| Appointment window availability | ❌ pending | Subject to technician scheduling capacity |
| Actual install date confirmation | ❌ pending | Requires live call + payment decision |
| Post-install speed test results | ❌ pending | Must run after physical install completes |

## Next Steps (Refs #740)

1. Run address availability lookup on `nhbroadband.com`
2. Call 1-800-NHBB-INFO to schedule install
3. Confirm payment method
4. Receive appointment confirmation number
5. Prepare site (clear ONT install path)
6. Post-install: speed test and log results
@@ -1,102 +0,0 @@
# Long Context vs RAG Decision Framework

**Research Backlog Item #4.3** | Impact: 4 | Effort: 1 | Ratio: 4.0
**Date**: 2026-04-15
**Status**: RESEARCHED

## Executive Summary

Modern LLMs have 128K-200K+ context windows, but we still treat them like 4K models by default. This document provides a decision framework for when to stuff context vs. use RAG, based on empirical findings and our stack constraints.

## The Core Insight

**Long context ≠ better answers.** Research shows:
- "Lost in the Middle" effect: Models attend poorly to information in the middle of long contexts (Liu et al., 2023)
- RAG with reranking outperforms full-context stuffing for document QA when docs > 50K tokens
- Attention computation scales quadratically with context length (API pricing is typically linear in tokens)
- Latency increases linearly with input length

**RAG ≠ always better.** Retrieval introduces:
- Recall errors (miss relevant chunks)
- Precision errors (retrieve irrelevant chunks)
- Chunking artifacts (splitting mid-sentence)
- Additional latency for embedding + search

## Decision Matrix

| Scenario | Context Size | Recommendation | Why |
|----------|-------------|---------------|-----|
| Single conversation (< 32K) | Small | **Stuff everything** | No retrieval overhead, full context available |
| 5-20 documents, focused query | 32K-128K | **Hybrid** | Key docs in context, rest via RAG |
| Large corpus search | > 128K | **Pure RAG + reranking** | Full context impossible, must retrieve |
| Code review (< 5 files) | < 32K | **Stuff everything** | Code needs full context for understanding |
| Code review (repo-wide) | > 128K | **RAG with file-level chunks** | Files are natural chunk boundaries |
| Multi-turn conversation | Growing | **Hybrid + compression** | Keep recent turns in full, compress older |
| Fact retrieval | Any | **RAG** | Always faster to search than read everything |
| Complex reasoning across docs | 32K-128K | **Stuff + chain-of-thought** | Models need all context for cross-doc reasoning |

## Our Stack Constraints

### What We Have
- **Cloud models**: 128K-200K context (OpenRouter providers)
- **Local Ollama**: 8K-32K context (Gemma-4 default 8192)
- **Hermes fact_store**: SQLite FTS5 full-text search
- **Memory**: MemPalace holographic embeddings
- **Session context**: Growing conversation history

### What This Means
1. **Cloud sessions**: We CAN stuff up to 128K, but SHOULD we? Cost and latency matter.
2. **Local sessions**: MUST use RAG for anything beyond 8K. Long context is not available.
3. **Mixed fleet**: Need a routing layer that decides per-session.

## Advanced Patterns

### 1. Progressive Context Loading
Don't load everything at once. Start with RAG, then stuff additional docs as needed:
```
Turn 1: RAG search → top 3 chunks
Turn 2: Model asks "I need more context about X" → stuff X
Turn 3: Model has enough → continue
```

### 2. Context Budgeting
Allocate context budget across components:
```
System prompt:    2,000 tokens (always)
Recent messages: 10,000 tokens (last 5 turns)
RAG results:      8,000 tokens (top chunks)
Stuffed docs:    12,000 tokens (key docs)
---------------------------
Total:           32,000 tokens (fits 32K model)
```
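That budget can be enforced programmatically. A sketch follows, with component names taken from the example above; the priority ordering and trim-from-the-bottom policy are assumptions:

```python
def fit_budget(components: dict, limit: int) -> dict:
    """Trim components in reverse-priority order until the total fits.

    Dict insertion order encodes priority: earlier entries are cut last.
    """
    alloc = dict(components)
    overflow = sum(alloc.values()) - limit
    for name in reversed(list(alloc)):
        if overflow <= 0:
            break
        cut = min(alloc[name], overflow)  # never cut below zero
        alloc[name] -= cut
        overflow -= cut
    return alloc


budget = fit_budget(
    {"system": 2_000, "recent": 10_000, "rag": 8_000, "stuffed": 12_000},
    limit=32_000,
)
```

If the limit drops (say a local 28K-effective window), the stuffed docs shrink first while the system prompt stays intact.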

### 3. Smart Compression
Before stuffing, compress older context:
- Summarize turns older than 10
- Remove tool call results (keep only final outputs)
- Deduplicate repeated information
- Use structured representations (JSON) instead of prose
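A rough sketch of the first two rules; the truncation-based "summary" stands in for a real summarizer call, and the message schema is an assumption:

```python
def compress_history(turns: list, keep_recent: int = 10) -> list:
    """Collapse turns older than `keep_recent` into one stub; drop tool results."""
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    compressed = []
    if old:
        # Stand-in for a real summarization call: truncate each old message.
        summary = " / ".join(t["content"][:40] for t in old if t.get("role") != "tool")
        compressed.append({
            "role": "system",
            "content": f"[summary of {len(old)} turns] {summary}",
        })
    compressed.extend(t for t in recent if t.get("role") != "tool")
    return compressed
```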

## Empirical Benchmarks Needed

1. **Stuffing vs RAG accuracy** on our fact_store queries
2. **Latency comparison** at 32K, 64K, 128K context
3. **Cost per query** for cloud models at various context sizes
4. **Local model behavior** when pushed beyond rated context

## Recommendations

1. **Audit current context usage**: How many sessions hit > 32K? (Low effort, high value)
2. **Implement ContextRouter**: ~50 LOC, adds routing decisions to hermes
3. **Add context-size logging**: Track input tokens per session for data gathering
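A sketch of what the proposed ContextRouter could look like, at roughly the scale suggested above; the thresholds and the `Route` API are assumptions, not an existing hermes interface:

```python
from dataclasses import dataclass


@dataclass
class Route:
    strategy: str  # "stuff" | "hybrid" | "rag"
    reason: str


def route(context_tokens: int, model_window: int) -> Route:
    """Pick a context strategy per the decision matrix (simplified thresholds)."""
    if context_tokens <= min(32_000, model_window):
        return Route("stuff", "fits comfortably; no retrieval overhead")
    if context_tokens <= model_window:
        return Route("hybrid", "fits the window, but cost/latency favor retrieval")
    return Route("rag", "exceeds the model window; must retrieve")
```

On the mixed fleet this means a local 8K Ollama session routes to RAG at sizes a 128K cloud session would still stuff.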

## References

- Liu et al. "Lost in the Middle: How Language Models Use Long Contexts" (2023) — https://arxiv.org/abs/2307.03172
- Shi et al. "Large Language Models are Easily Distracted by Irrelevant Context" (2023)
- Xu et al. "Retrieval Meets Long Context LLMs" (2023) — hybrid approaches outperform both alone
- Anthropic's Claude 3.5 context caching — built-in prefix caching reduces cost for repeated system prompts

---

*Sovereignty and service always.*
@@ -1,135 +0,0 @@
#!/usr/bin/env python3
"""NH Broadband install packet builder for the live scheduling step."""
from __future__ import annotations

import argparse
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any

import yaml


def load_request(path: str | Path) -> dict[str, Any]:
    data = yaml.safe_load(Path(path).read_text()) or {}
    data.setdefault("contact", {})
    data.setdefault("service", {})
    data.setdefault("call_log", [])
    data.setdefault("checklist", [])
    return data


def validate_request(data: dict[str, Any]) -> None:
    contact = data.get("contact", {})
    for field in ("name", "phone"):
        if not contact.get(field, "").strip():
            raise ValueError(f"contact.{field} is required")

    service = data.get("service", {})
    for field in ("address", "city", "state"):
        if not service.get(field, "").strip():
            raise ValueError(f"service.{field} is required")

    if not data.get("checklist"):
        raise ValueError("checklist must contain at least one item")


def build_packet(data: dict[str, Any]) -> dict[str, Any]:
    validate_request(data)
    contact = data["contact"]
    service = data["service"]

    return {
        "packet_id": f"nh-bb-{datetime.now(timezone.utc).strftime('%Y%m%d-%H%M%S')}",
        "generated_utc": datetime.now(timezone.utc).isoformat(),
        "contact": {
            "name": contact["name"],
            "phone": contact["phone"],
            "email": contact.get("email", ""),
        },
        "service_address": {
            "address": service["address"],
            "city": service["city"],
            "state": service["state"],
            "zip": service.get("zip", ""),
        },
        "desired_plan": data.get("desired_plan", "residential-fiber"),
        "call_log": data.get("call_log", []),
        "checklist": [
            {"item": item, "done": False} if isinstance(item, str) else item
            for item in data["checklist"]
        ],
        "status": "pending_scheduling_call",
    }


def render_markdown(packet: dict[str, Any], data: dict[str, Any]) -> str:
    contact = packet["contact"]
    addr = packet["service_address"]
    lines = [
        "# NH Broadband Install Packet",
        "",
        f"**Packet ID:** {packet['packet_id']}",
        f"**Generated:** {packet['generated_utc']}",
        f"**Status:** {packet['status']}",
        "",
        "## Contact",
        "",
        f"- **Name:** {contact['name']}",
        f"- **Phone:** {contact['phone']}",
        f"- **Email:** {contact.get('email', 'n/a')}",
        "",
        "## Service Address",
        "",
        f"- {addr['address']}",
        f"- {addr['city']}, {addr['state']} {addr['zip']}",
        "",
        "## Desired Plan",
        "",
        f"{packet['desired_plan']}",
        "",
        "## Call Log",
        "",
    ]
    if packet["call_log"]:
        for entry in packet["call_log"]:
            ts = entry.get("timestamp", "n/a")
            outcome = entry.get("outcome", "n/a")
            notes = entry.get("notes", "")
            lines.append(f"- **{ts}** — {outcome}")
            if notes:
                lines.append(f"  - {notes}")
    else:
        lines.append("_No calls logged yet._")

    lines.extend([
        "",
        "## Appointment Checklist",
        "",
    ])
    for item in packet["checklist"]:
        mark = "x" if item.get("done") else " "
        lines.append(f"- [{mark}] {item['item']}")

    lines.append("")
    return "\n".join(lines)


def main() -> int:
    parser = argparse.ArgumentParser(description="Build NH Broadband install packet.")
    parser.add_argument("request", help="Path to install request YAML")
    parser.add_argument("--markdown", action="store_true", help="Render markdown instead of JSON")
    args = parser.parse_args()

    data = load_request(args.request)
    packet = build_packet(data)
    if args.markdown:
        print(render_markdown(packet, data))
    else:
        print(json.dumps(packet, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@@ -1,105 +0,0 @@
from pathlib import Path

import yaml

from scripts.plan_nh_broadband_install import (
    build_packet,
    load_request,
    render_markdown,
    validate_request,
)


def test_script_exists() -> None:
    assert Path("scripts/plan_nh_broadband_install.py").exists()


def test_example_request_exists() -> None:
    assert Path("docs/nh-broadband-install-request.example.yaml").exists()


def test_example_packet_exists() -> None:
    assert Path("docs/nh-broadband-install-packet.example.md").exists()


def test_research_memo_exists() -> None:
    assert Path("reports/operations/2026-04-15-nh-broadband-public-research.md").exists()


def test_load_and_build_packet() -> None:
    data = load_request("docs/nh-broadband-install-request.example.yaml")
    packet = build_packet(data)
    assert packet["contact"]["name"] == "Timmy Operator"
    assert packet["service_address"]["city"] == "Concord"
    assert packet["service_address"]["state"] == "NH"
    assert packet["status"] == "pending_scheduling_call"
    assert len(packet["checklist"]) == 8
    assert packet["checklist"][0]["done"] is False


def test_validate_rejects_missing_contact_name() -> None:
    data = {
        "contact": {"name": "", "phone": "555"},
        "service": {"address": "1 St", "city": "X", "state": "NH"},
        "checklist": ["do thing"],
    }
    try:
        validate_request(data)
    except ValueError as exc:
        assert "contact.name" in str(exc)
    else:
        raise AssertionError("should reject empty contact name")


def test_validate_rejects_missing_service_address() -> None:
    data = {
        "contact": {"name": "A", "phone": "555"},
        "service": {"address": "", "city": "X", "state": "NH"},
        "checklist": ["do thing"],
    }
    try:
        validate_request(data)
    except ValueError as exc:
        assert "service.address" in str(exc)
    else:
        raise AssertionError("should reject empty service address")


def test_validate_rejects_empty_checklist() -> None:
    data = {
        "contact": {"name": "A", "phone": "555"},
        "service": {"address": "1 St", "city": "X", "state": "NH"},
        "checklist": [],
    }
    try:
        validate_request(data)
    except ValueError as exc:
        assert "checklist" in str(exc)
    else:
        raise AssertionError("should reject empty checklist")


def test_render_markdown_contains_key_sections() -> None:
    data = load_request("docs/nh-broadband-install-request.example.yaml")
    packet = build_packet(data)
    md = render_markdown(packet, data)
    assert "# NH Broadband Install Packet" in md
    assert "## Contact" in md
    assert "## Service Address" in md
    assert "## Call Log" in md
    assert "## Appointment Checklist" in md
    assert "Concord" in md
    assert "NH" in md


def test_render_markdown_shows_checklist_items() -> None:
    data = load_request("docs/nh-broadband-install-request.example.yaml")
    packet = build_packet(data)
    md = render_markdown(packet, data)
    assert "- [ ] Confirm exact-address availability" in md


def test_example_yaml_is_valid() -> None:
    data = yaml.safe_load(Path("docs/nh-broadband-install-request.example.yaml").read_text())
    assert data["contact"]["name"] == "Timmy Operator"
    assert len(data["checklist"]) == 8
tests/test_timmy_dispatch_genome.py (new file, 39 lines)
@@ -0,0 +1,39 @@
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
GENOME = Path("genomes/timmy-dispatch-GENOME.md")
|
||||
|
||||
|
||||
def _content() -> str:
|
||||
return GENOME.read_text()
|
||||
|
||||
|
||||
def test_timmy_dispatch_genome_exists() -> None:
|
||||
assert GENOME.exists()
|
||||
|
||||
|
||||
def test_timmy_dispatch_genome_has_required_sections() -> None:
|
||||
content = _content()
|
||||
assert "# GENOME.md — timmy-dispatch" in content
|
||||
assert "## Project Overview" in content
|
||||
assert "## Architecture" in content
|
||||
assert "```mermaid" in content
|
||||
assert "## Entry Points" in content
|
||||
assert "## Data Flow" in content
|
||||
assert "## Key Abstractions" in content
|
||||
assert "## API Surface" in content
|
||||
assert "## Test Coverage Gaps" in content
|
||||
assert "## Security Considerations" in content
|
||||
assert "## Dependencies" in content
|
||||
assert "## Deployment" in content
|
||||
assert "## Technical Debt" in content
|
||||
|
||||
|
||||
def test_timmy_dispatch_genome_captures_repo_specific_findings() -> None:
|
||||
content = _content()
|
||||
assert "bin/sprint-runner.py" in content
|
||||
assert "bin/telemetry-collector.py" in content
|
||||
assert "bin/model-watchdog.py" in content
|
||||
assert "tmux" in content
|
||||
assert "results.csv" in content
|
||||
assert "issue #3" in content.lower() or "issue #3" in content
|
||||
@@ -17,24 +17,8 @@ from typing import Dict, Any, Optional, List
from pathlib import Path
from dataclasses import dataclass
from enum import Enum
import importlib.util


def _load_local(module_name: str, filename: str):
    """Import a module from an explicit file path, bypassing sys.path resolution."""
    spec = importlib.util.spec_from_file_location(
        module_name,
        str(Path(__file__).parent / filename),
    )
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod


_harness = _load_local("v2_harness", "harness.py")
UniWizardHarness = _harness.UniWizardHarness
House = _harness.House
ExecutionResult = _harness.ExecutionResult
from harness import UniWizardHarness, House, ExecutionResult


class TaskType(Enum):
@@ -8,30 +8,13 @@ import time
import sys
import argparse
import os
import importlib.util
from pathlib import Path
from datetime import datetime
from typing import Dict, List, Optional


def _load_local(module_name: str, filename: str):
    """Import a module from an explicit file path, bypassing sys.path resolution.

    Prevents namespace collisions when multiple directories contain modules
    with the same name (e.g. uni-wizard/harness.py vs uni-wizard/v2/harness.py).
    """
    spec = importlib.util.spec_from_file_location(
        module_name,
        str(Path(__file__).parent / filename),
    )
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod


_harness = _load_local("v2_harness", "harness.py")
UniWizardHarness = _harness.UniWizardHarness
House = _harness.House
ExecutionResult = _harness.ExecutionResult
sys.path.insert(0, str(Path(__file__).parent))

from harness import UniWizardHarness, House, ExecutionResult
from router import HouseRouter, TaskType
from author_whitelist import AuthorWhitelist