*Generated: 2026-04-14T19:10:00Z | Branch: main | Commit: 02767d8*
*Generated 2026-04-17 from direct source inspection of `/tmp/wolf-genome` plus live test execution.*
## Project Overview
**Wolf** is a sovereign multi-model evaluation engine. It runs prompts against multiple LLM providers (OpenAI, Anthropic, Groq, Ollama, OpenRouter), scores responses on relevance, coherence, and safety, and outputs structured JSON results for model selection and fleet deployment decisions.
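For illustration only (field names here are hypothetical and not confirmed from Wolf's source), a structured result for one model might look like:

```python
import json

# Hypothetical shape of one structured evaluation result; the field names
# are illustrative, not taken from Wolf's actual output schema.
result = {
    "model": "llama3-70b-8192",
    "provider": "groq",
    "scores": {"relevance": 0.82, "coherence": 0.91, "safety": 1.0},
}
# Aggregate score: a simple mean of the three axes (one possible policy).
result["scores"]["overall"] = round(sum(result["scores"].values()) / 3, 2)
print(json.dumps(result, indent=2))
```

A JSON shape like this is what downstream model-selection and fleet-deployment tooling would consume.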
Wolf has two real operating modes: prompt evaluation against configured providers, and agent task execution (`AgentRunner.execute_task()` driving branch, file, and PR creation through Gitea).

The test surface is broader than earlier documentation claimed:

- the older timmy-home genome artifact listed only `test_config.py` and `test_evaluator.py`
- the current repo also includes `tests/test_models.py`, `tests/test_gitea.py`, and `tests/test_runner.py`
### Coverage Gaps — Existing Tests

- `test_evaluator.py`: no tests for `PromptEvaluator._get_model_client()`, `_run_single()` with a real model call, or `evaluate_and_serialize()` summary statistics
- `test_evaluator.py`: no integration test (mocked model calls only)
- `test_config.py`: no test for missing config, env var overrides, or logging setup

## CI / Verification Surface

Current CI contracts observed directly:

- `.gitea/workflows/smoke.yml`
  - checkout
  - set up Python 3.11
  - install `pytest` and `pyyaml`
  - install `requirements.txt` if present
  - run `pytest tests/`
- `.github/workflows/smoke-test.yml`
  - YAML parse check
  - JSON parse check
  - Python compile check
  - shell syntax check
  - secret scan
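The parse and compile checks in the GitHub smoke lane can be mirrored locally. A sketch using only the standard library (the YAML parse check is omitted here because it would require PyYAML; function names are illustrative, not from the workflow):

```python
import ast
import json

def smoke_check_python(source: str) -> bool:
    """Compile check: does the source parse as valid Python syntax?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

def smoke_check_json(text: str) -> bool:
    """Parse check: is the text valid JSON?"""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(smoke_check_python("x = 1"))       # True
print(smoke_check_python("def ("))       # False
print(smoke_check_json('{"ok": true}'))  # True
print(smoke_check_json("{oops}"))        # False
```

In CI these checks run across every matching file in the repo; the sketch shows only the per-file predicate.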
This means the real repo contract is broader than unit tests alone: syntax, parseability, and secret hygiene are part of the shipped smoke lane.

---

## Security Considerations

1. **API Keys in Config**: `wolf-config.yaml` stores provider API keys. Never commit it to version control. Recommend `~/.hermes/wolf-config.yaml` with restricted permissions.
2. **HTTP Requests**: All model calls and Gitea API calls are outbound HTTP. There is no input validation on URLs; `base_url` fields accept arbitrary endpoints.
3. **Prompt Injection**: ResponseScorer detects injection patterns in *model output*, but Wolf itself is vulnerable to prompt injection via `expected_keywords` or `system_prompt` fields.
4. **Gitea Token Scope**: GiteaClient uses a single token for all operations. Scoped tokens (read-only for evaluation, write for task execution) would reduce the blast radius.
5. **No TLS Verification Override**: `requests.post()` uses default SSL verification. If self-signed certs are used for local providers (Ollama), those calls will fail with SSL errors and there is no configuration hook to override verification.
6. **Race Conditions**: The leaderboard reads and writes JSON without locking. Concurrent evaluations could corrupt the leaderboard file.
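The race-condition item above (unlocked leaderboard writes) could be closed with an advisory file lock plus an atomic rename. A minimal sketch, assuming POSIX `fcntl` and a JSON-array leaderboard (the real file layout and writer are not specified here):

```python
import fcntl
import json
import os
import tempfile

def update_leaderboard(path: str, entry: dict) -> None:
    """Append an entry under an advisory lock, then swap the file in atomically.

    Sketch only: Wolf's actual leaderboard writer does no locking; this
    shows one way to close that gap on POSIX systems.
    """
    lock_path = path + ".lock"
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)   # serialize concurrent writers
        try:
            with open(path) as f:
                board = json.load(f)
        except FileNotFoundError:
            board = []
        board.append(entry)
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(board, f, indent=2)
        os.replace(tmp, path)              # atomic on the same filesystem
        fcntl.flock(lock, fcntl.LOCK_UN)

# Usage: two sequential writers both land in the file intact.
with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "leaderboard.json")
    update_leaderboard(p, {"model": "a", "score": 0.9})
    update_leaderboard(p, {"model": "b", "score": 0.8})
    with open(p) as f:
        final = json.load(f)
    print(len(final))  # 2
```

The `os.replace` step ensures readers never see a half-written file even if a writer crashes mid-dump.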
## Dependencies
```
requests # HTTP client for all providers and Gitea
pyyaml # Config file parsing (not in requirements.txt — BUG)
```
Direct dependency files:
- `requirements.txt`
  - only `requests`
- README install instructions
  - `pip install requests pyyaml`
**⚠️ Missing dependency:** `pyyaml` is imported in `config.py` but not listed in `requirements.txt`.
Observed dependency tension:
- `wolf/config.py` imports `yaml` when available and falls back to a simple parser if PyYAML is absent
- CI installs `pyyaml`
- `requirements.txt` does not list `pyyaml`
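The import-with-fallback pattern in `wolf/config.py` can be sketched like this (the fallback parser here is a deliberately minimal stand-in that only handles flat `key: value` lines, not Wolf's actual implementation):

```python
# Prefer PyYAML when installed; otherwise fall back to a minimal parser.
try:
    import yaml

    def parse_config(text: str) -> dict:
        return yaml.safe_load(text)
except ImportError:
    def parse_config(text: str) -> dict:
        result = {}
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            if ":" in line:
                key, _, value = line.partition(":")
                result[key.strip()] = value.strip().strip('"')
        return result

print(parse_config('owner: "Timmy_Foundation"'))  # {'owner': 'Timmy_Foundation'}
```

Either branch handles the flat case, but nested sections like `providers:` only work when PyYAML is present, which is why CI installs it explicitly.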
So PyYAML is operationally expected in normal use and CI, but not formally pinned in `requirements.txt`.

---
## Configuration Schema
```yaml
# wolf-config.yaml
gitea:
  base_url: "https://forge.example.com/api/v1"
  token: "gitea_token_here"
  owner: "Timmy_Foundation"
  repo: "eval-repo"

providers:
  openrouter:
    api_key: "sk-or-..."
    base_url: "https://openrouter.ai/api/v1"
  groq:
    api_key: "gsk_..."
  ollama:
    base_url: "http://localhost:11434"

models:
  - model: "anthropic/claude-3.5-sonnet"
    provider: "openrouter"
  - model: "llama3-70b-8192"
    provider: "groq"
  - model: "llama3:70b"
    provider: "ollama"
```

## Security Considerations

1. **Plaintext secrets in config**
   - model API keys and Gitea tokens are expected via config files
   - this is user-controlled but still a secret-handling risk
2. **Arbitrary base URLs**
   - provider configs can point to arbitrary endpoints
   - useful for sovereignty, but also expands trust boundaries
3. **PR automation blast radius**
   - `AgentRunner.execute_task()` can create branches, files, and PRs
   - bad prompts or weak issue filtering could create noisy or unsafe PRs
4. **Prompt-injection exposure**
   - model prompts and issue bodies are passed through with limited sanitization
5. **Leaderboard persistence without locking**
   - `leaderboard.json` writes are not protected against concurrent writers

## Repository Notes

Notable current-repo facts that the host-repo genome should preserve:

- Wolf already ships its own `GENOME.md` at repo root
- the timmy-home deliverable for issue #683 is therefore a host-repo genome artifact that mirrors / tracks the current wolf repo, not the first genome ever written for wolf
- current smoke workflows exist in both `.gitea/` and `.github/`
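A minimal sketch of how such facts can be guarded (the fact list and genome text below are illustrative; the real timmy-home test reads the actual artifact):

```python
# Sketch of a genome fact guard: verify that required current-repo facts
# appear in the genome text. REQUIRED_FACTS and genome_text are stand-ins.
REQUIRED_FACTS = [
    "GENOME.md",
    ".gitea/workflows/smoke.yml",
    ".github/workflows/smoke-test.yml",
    "tests/test_models.py",
]

genome_text = """
Wolf ships GENOME.md at repo root.
Smoke lanes: .gitea/workflows/smoke.yml and .github/workflows/smoke-test.yml.
Tests include tests/test_models.py, tests/test_gitea.py, tests/test_runner.py.
"""

missing = [fact for fact in REQUIRED_FACTS if fact not in genome_text]
assert not missing, f"wolf genome missing current repo facts: {missing}"
print("all facts present")
```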
The guard itself is a single assertion:

`assert not missing, f"wolf genome missing current repo facts: {missing}"`