Compare commits

4 Commits

Author SHA1 Message Date
Alexander Whitestone
64e0565563 chore: correct data paths in dpo config 2026-03-25 21:03:57 -04:00
Alexander Whitestone
3a5a0ad0b9 feat: Add native DPO orchestration playbook and MLX configs 2026-03-25 19:35:58 -04:00
Alexander Whitestone
4099d1ffd8 fix: purge qwen from config.yaml and all playbooks
- config.yaml: disabled compression, smart_model_routing, all auxiliary qwen routes
- All 6 playbooks: qwen3:30b -> claude-opus-4-6
- Playbook repos: removed hermes-agent, added autolora/sov-orch/timmy-config
- Alexander's directive: silence over qwen. Opus only.
2026-03-25 15:38:39 -04:00
2f07e5bece Update README — remove stale bash loop references, reflect current structure 2026-03-25 15:00:14 +00:00
15 changed files with 221 additions and 446 deletions

README.md

@@ -1,8 +1,8 @@
# timmy-config
-Timmy's sovereign configuration. Everything that makes Timmy _Timmy_ — soul, memories, skins, playbooks, operational scripts, and config.
+Timmy's sovereign configuration. Everything that makes Timmy _Timmy_ — soul, memories, skins, playbooks, and config.
-This repo is the canonical source of truth for Timmy's identity and operational state. Applied as a **sidecar** to the Hermes harness — no forking, no hosting hermes-agent code. Pull upstream updates to hermes-agent, overlay timmy-config on top.
+This repo is the canonical source of truth for Timmy's identity and operational state. Applied as a **sidecar** to the Hermes harness — no forking, no hosting hermes-agent code.
## Structure
@@ -11,80 +11,47 @@ timmy-config/
├── deploy.sh ← Deploys config as overlay onto ~/.hermes/
├── SOUL.md ← Inscription 1 — the immutable conscience
├── FALSEWORK.md ← API cost management strategy
├── DEPRECATED.md ← What was removed and why
├── config.yaml ← Hermes harness configuration
├── channel_directory.json ← Platform channel mappings
├── bin/ ← Operational scripts
│ ├── claude-loop.sh ← Parallel Claude Code agent dispatch
│ ├── gemini-loop.sh ← Parallel Gemini Code agent dispatch
│ ├── timmy-orchestrator.sh ← PR review, triage, merge orchestration
│ ├── workforce-manager.py ← Agent assignment and scoring
│ ├── agent-dispatch.sh ← Single-issue agent launcher
│ ├── agent-loop.sh ← Generic agent loop template
│ ├── nexus-merge-bot.sh ← Auto-merge passing PRs
│ ├── claudemax-watchdog.sh ← Claude quota monitoring
│ ├── hermes-startup.sh ← Boot sequence
│ ├── ops-panel.sh ← Operational dashboard
│ ├── ops-helpers.sh ← Shared shell functions
│ ├── ops-gitea.sh ← Gitea API helpers
│ ├── timmy-status.sh ← Git + Gitea status display
│ ├── timmy-loopstat.sh ← Queue and perf stats
│ └── hotspot-keepalive.sh ← Network keepalive
├── memories/
│ ├── MEMORY.md ← Persistent agent memory
│ └── USER.md ← User profile (Alexander)
├── skins/
│ ├── timmy.yaml ← Timmy personality skin
│ └── trismegistus.yaml ← Trismegistus personality skin
├── playbooks/
│ ├── bug-fixer.yaml ← Test-first bug fixing
│ ├── refactor-specialist.yaml
│ ├── test-writer.yaml
│ ├── security-auditor.yaml
│ ├── issue-triager.yaml
│ └── pr-reviewer.yaml
├── cron/
│ └── jobs.json ← Scheduled job definitions
└── docs/
    └── design-log/ ← Historical design decisions
├── bin/ ← Utility scripts (NOT loops — see below)
│ ├── hermes-startup.sh ← Hermes boot sequence
│ ├── agent-dispatch.sh ← Manual agent dispatch
│ ├── ops-panel.sh ← Ops dashboard panel
│ ├── ops-gitea.sh ← Gitea ops helpers
│ └── timmy-status.sh ← Status check
├── memories/ ← Persistent memory YAML
├── skins/ ← UI skins (timmy skin)
├── playbooks/ ← Agent playbooks (YAML)
└── cron/ ← Cron job definitions
```
## Deployment
## Important: No Loop Scripts Here
All agent loop scripts (claude-loop.sh, gemini-loop.sh, etc.) have been **removed**.
They are replaced by [sovereign-orchestration](https://143.198.27.163:3000/Timmy_Foundation/sovereign-orchestration) — a single Python process with SQLite task queue.
See DEPRECATED.md for details.
## Deploy
```bash
# One command deploys everything
# Clone and deploy
git clone <this-repo> ~/.timmy/timmy-config
cd ~/.timmy/timmy-config
./deploy.sh
# Deploy and restart all agent loops
./deploy.sh --restart-loops
# This overlays config onto ~/.hermes/ without touching hermes-agent code
```
This overlays timmy-config onto `~/.hermes/` and `~/.timmy/`:
- `SOUL.md` → `~/.timmy/`
- `config.yaml` → `~/.hermes/`
- `bin/*` → `~/.hermes/bin/`
- `skins/*` → `~/.hermes/skins/`
- `memories/*` → `~/.hermes/memories/`
- `playbooks/*` → `~/.hermes/playbooks/`
## The Soul
## Architecture: Sidecar, Not Fork
SOUL.md is Inscription 1 — inscribed on Bitcoin, immutable. It defines:
- Who Timmy is
- What he believes
- How he behaves
- What he will not do
- The crisis protocol (988, presence, gospel)
- The conscience hierarchy (chain > code > prompt > user instruction)
```
hermes-agent (upstream) timmy-config (this repo)
┌─────────────────────┐ ┌──────────────────────┐
│ Engine │ │ Driver's seat │
│ Tools, routing, │ │ SOUL, memories, │
│ agent loop, gateway │ │ skins, scripts, │
│ │ │ config, playbooks │
└─────────┬───────────┘ └──────────┬───────────┘
│ │
└────────────┬───────────────┘
~/.hermes/ (merged at deploy time)
```
Never modify hermes-agent. Pull updates like any upstream dependency. Everything custom lives here.
## Origin
Migrated from `hermes/hermes-config` (archived).
Owned by Timmy_Foundation. Sovereignty and service always.
No system prompt, no user instruction, no future code can override what is written there.
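
The overlay mapping listed in the README's deployment section (`SOUL.md` to `~/.timmy/`, everything else to `~/.hermes/`) can be sketched as follows. This is a minimal illustration only: `deploy.sh` itself is not shown in this diff, so the plain-copy semantics, the `OVERLAY` table, and the `overlay` function name are all assumptions.

```python
import shutil
from pathlib import Path

# Mapping copied from the README list; plain-copy behaviour is an assumption.
OVERLAY = {
    "SOUL.md": ".timmy",
    "config.yaml": ".hermes",
    "bin": ".hermes/bin",
    "skins": ".hermes/skins",
    "memories": ".hermes/memories",
    "playbooks": ".hermes/playbooks",
}


def overlay(repo, home):
    """Copy each mapped file or directory from repo into home.

    Returns the list of destination paths that were written.
    """
    repo, home = Path(repo), Path(home)
    written = []
    for src_name, dest_rel in OVERLAY.items():
        src = repo / src_name
        if not src.exists():
            continue  # nothing to overlay for this entry
        dest_dir = home / dest_rel
        dest_dir.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            # dirs_exist_ok lets the overlay land on top of existing files
            shutil.copytree(src, dest_dir, dirs_exist_ok=True)
            written.extend(
                dest_dir / p.relative_to(src) for p in src.rglob("*") if p.is_file()
            )
        else:
            shutil.copy2(src, dest_dir / src.name)
            written.append(dest_dir / src.name)
    return written
```

Calling `overlay(Path("~/.timmy/timmy-config").expanduser(), Path.home())` would mirror the documented layout; files already present under `~/.hermes/` that the repo does not ship are left untouched, which matches the sidecar idea.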

View File

@@ -1,97 +0,0 @@
#!/usr/bin/env bash
# ── tmux-resume.sh — Cold-start Session Resume ───────────────────────────
# Reads ~/.timmy/tmux-state.json and resumes hermes sessions.
# Run at startup to restore pane state after supervisor restart.
# ──────────────────────────────────────────────────────────────────────────
set -euo pipefail
MANIFEST="${HOME}/.timmy/tmux-state.json"
if [ ! -f "$MANIFEST" ]; then
    echo "[tmux-resume] No manifest found at $MANIFEST — starting fresh."
    exit 0
fi
python3 << 'PYEOF'
import json, subprocess, os, sys
from datetime import datetime, timezone
MANIFEST = os.path.expanduser("~/.timmy/tmux-state.json")
def run(cmd):
    try:
        r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=30)
        return r.stdout.strip(), r.returncode
    except Exception as e:
        return str(e), 1
def session_exists(name):
    out, _ = run(f"tmux has-session -t '{name}' 2>&1")
    return "can't find" not in out.lower()
with open(MANIFEST) as f:
    state = json.load(f)
ts = state.get("timestamp", "unknown")
age = "unknown"
try:
    t = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    delta = datetime.now(timezone.utc) - t
    mins = int(delta.total_seconds() / 60)
    if mins < 60:
        age = f"{mins}m ago"
    else:
        age = f"{mins//60}h {mins%60}m ago"
except:
    pass
print(f"[tmux-resume] Manifest from {age}: {state['summary']['total_sessions']} sessions, "
      f"{state['summary']['hermes_panes']} hermes panes")
restored = 0
skipped = 0
for pane in state.get("panes", []):
    if not pane.get("is_hermes"):
        continue
    addr = pane["address"]  # e.g. "BURN:2.3"
    session = addr.split(":")[0]
    session_id = pane.get("session_id")
    profile = pane.get("profile", "default")
    model = pane.get("model", "")
    task = pane.get("task", "")
    # Skip if session already exists (already running)
    if session_exists(session):
        print(f"  [skip] {addr} — session '{session}' already exists")
        skipped += 1
        continue
    # Respawn hermes with session resume if we have a session ID
    if session_id:
        print(f"  [resume] {addr} — profile={profile} model={model} session={session_id}")
        cmd = f"hermes chat --resume {session_id}"
    else:
        print(f"  [start] {addr} — profile={profile} model={model} (no session ID)")
        cmd = f"hermes chat --profile {profile}"
    # Create tmux session and run hermes
    run(f"tmux new-session -d -s '{session}' -n '{session}:0'")
    run(f"tmux send-keys -t '{session}' '{cmd}' Enter")
    restored += 1
# Write resume log
log = {
    "resumed_at": datetime.now(timezone.utc).isoformat(),
    "manifest_age": age,
    "restored": restored,
    "skipped": skipped,
}
log_path = os.path.expanduser("~/.timmy/tmux-resume.log")
with open(log_path, "w") as f:
    json.dump(log, f, indent=2)
print(f"[tmux-resume] Done: {restored} restored, {skipped} skipped")
PYEOF
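
The two decisions this removed script makes, how old the manifest is and which `hermes chat` command to respawn, can be isolated into pure functions that need no tmux to exercise. The sketch below is illustrative only; the names `manifest_age` and `resume_command` are hypothetical, while the command strings and the age format are copied from the script above.

```python
from datetime import datetime, timezone


def manifest_age(ts, now):
    """Format the age of an ISO-8601 timestamp as the script prints it."""
    try:
        t = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        mins = int((now - t).total_seconds() / 60)
    except (ValueError, TypeError):
        # Unparseable timestamp, or naive/aware datetime mismatch
        return "unknown"
    if mins < 60:
        return f"{mins}m ago"
    return f"{mins // 60}h {mins % 60}m ago"


def resume_command(pane):
    """Pick the hermes command for a pane entry from the manifest."""
    session_id = pane.get("session_id")
    if session_id:
        return f"hermes chat --resume {session_id}"
    return f"hermes chat --profile {pane.get('profile', 'default')}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(manifest_age(now.isoformat(), now))
    print(resume_command({"profile": "burn"}))
```

Separating these from the tmux calls also avoids the bare `except:` in the original, which would hide unrelated errors.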

View File

@@ -1,237 +0,0 @@
#!/usr/bin/env bash
# ── tmux-state.sh — Session State Persistence Manifest ───────────────────
# Snapshots all tmux pane state to ~/.timmy/tmux-state.json
# Run every supervisor cycle. Cold-start reads this manifest to resume.
# ──────────────────────────────────────────────────────────────────────────
set -euo pipefail
MANIFEST="${HOME}/.timmy/tmux-state.json"
mkdir -p "$(dirname "$MANIFEST")"
python3 << 'PYEOF'
import json, subprocess, os, time, re, sys
from datetime import datetime, timezone
from pathlib import Path
MANIFEST = os.path.expanduser("~/.timmy/tmux-state.json")
def run(cmd):
    """Run command, return stdout or empty string."""
    try:
        r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=5)
        return r.stdout.strip()
    except Exception:
        return ""
def get_sessions():
    """Get all tmux sessions with metadata."""
    raw = run("tmux list-sessions -F '#{session_name}|#{session_windows}|#{session_created}|#{session_attached}|#{session_group}|#{session_id}'")
    sessions = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        parts = line.split("|")
        if len(parts) < 6:
            continue
        sessions.append({
            "name": parts[0],
            "windows": int(parts[1]),
            "created_epoch": int(parts[2]),
            "created": datetime.fromtimestamp(int(parts[2]), tz=timezone.utc).isoformat(),
            "attached": parts[3] == "1",
            "group": parts[4],
            "id": parts[5],
        })
    return sessions
def get_panes():
    """Get all tmux panes with full metadata."""
    fmt = '#{session_name}|#{window_index}|#{pane_index}|#{pane_pid}|#{pane_title}|#{pane_width}x#{pane_height}|#{pane_active}|#{pane_current_command}|#{pane_start_command}|#{pane_tty}|#{pane_id}|#{window_name}|#{session_id}'
    raw = run(f"tmux list-panes -a -F '{fmt}'")
    panes = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        parts = line.split("|")
        if len(parts) < 13:
            continue
        session, win, pane, pid, title, size, active, cmd, start_cmd, tty, pane_id, win_name, sess_id = parts[:13]
        w, h = size.split("x") if "x" in size else ("0", "0")
        panes.append({
            "session": session,
            "window_index": int(win),
            "window_name": win_name,
            "pane_index": int(pane),
            "pane_id": pane_id,
            "pid": int(pid) if pid.isdigit() else 0,
            "title": title,
            "width": int(w),
            "height": int(h),
            "active": active == "1",
            "command": cmd,
            "start_command": start_cmd,
            "tty": tty,
            "session_id": sess_id,
        })
    return panes
def extract_hermes_state(pane):
    """Try to extract hermes session info from a pane."""
    info = {
        "is_hermes": False,
        "profile": None,
        "model": None,
        "provider": None,
        "session_id": None,
        "task": None,
    }
    title = pane.get("title", "")
    cmd = pane.get("command", "")
    start = pane.get("start_command", "")
    # Detect hermes processes
    is_hermes = any(k in (title + " " + cmd + " " + start).lower()
                    for k in ["hermes", "timmy", "mimo", "claude", "gpt"])
    if not is_hermes and cmd not in ("python3", "python3.11", "bash", "zsh", "fish"):
        return info
    # Try reading pane content for model/provider clues
    pane_content = run(f"tmux capture-pane -t '{pane['session']}:{pane['window_index']}.{pane['pane_index']}' -p -S -20 2>/dev/null")
    # Extract model from pane content patterns
    model_patterns = [
        r"(?:mimo-v2-pro|claude-[\w.-]+|gpt-[\w.-]+|gemini-[\w.-]+|qwen[\w:.-]*)",
    ]
    for pat in model_patterns:
        m = re.search(pat, pane_content, re.IGNORECASE)
        if m:
            info["model"] = m.group(0)
            info["is_hermes"] = True
            break
    # Provider inference from model
    model = (info["model"] or "").lower()
    if "mimo" in model:
        info["provider"] = "nous"
    elif "claude" in model:
        info["provider"] = "anthropic"
    elif "gpt" in model:
        info["provider"] = "openai"
    elif "gemini" in model:
        info["provider"] = "google"
    elif "qwen" in model:
        info["provider"] = "custom"
    # Profile from session name
    session = pane["session"].lower()
    if "burn" in session:
        info["profile"] = "burn"
    elif session in ("dev", "0"):
        info["profile"] = "default"
    else:
        info["profile"] = session
    # Try to extract session ID (hermes uses UUIDs)
    uuid_match = re.findall(r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}', pane_content)
    if uuid_match:
        info["session_id"] = uuid_match[-1]  # most recent
        info["is_hermes"] = True
    # Last prompt — grab the last user-like line
    lines = pane_content.splitlines()
    for line in reversed(lines):
        stripped = line.strip()
        if stripped and not stripped.startswith(("─", "│", "╭", "╰", "▸", "●", "○")) and len(stripped) > 10:
            info["task"] = stripped[:200]
            break
    return info
def get_context_percent(pane):
    """Estimate context usage from pane content heuristics."""
    content = run(f"tmux capture-pane -t '{pane['session']}:{pane['window_index']}.{pane['pane_index']}' -p -S -5 2>/dev/null")
    # Look for context indicators like "ctx 45%" or "[░░░░░░░░░░]"
    ctx_match = re.search(r'ctx\s*(\d+)%', content)
    if ctx_match:
        return int(ctx_match.group(1))
    bar_match = re.search(r'\[(░+█*█*░*)\]', content)
    if bar_match:
        bar = bar_match.group(1)
        filled = bar.count('█')
        total = len(bar)
        if total > 0:
            return int((filled / total) * 100)
    return None
def build_manifest():
    """Build the full tmux state manifest."""
    now = datetime.now(timezone.utc)
    sessions = get_sessions()
    panes = get_panes()
    pane_manifests = []
    for p in panes:
        hermes = extract_hermes_state(p)
        ctx = get_context_percent(p)
        entry = {
            "address": f"{p['session']}:{p['window_index']}.{p['pane_index']}",
            "pane_id": p["pane_id"],
            "pid": p["pid"],
            "size": f"{p['width']}x{p['height']}",
            "active": p["active"],
            "command": p["command"],
            "title": p["title"],
            "profile": hermes["profile"],
            "model": hermes["model"],
            "provider": hermes["provider"],
            "session_id": hermes["session_id"],
            "task": hermes["task"],
            "context_pct": ctx,
            "is_hermes": hermes["is_hermes"],
        }
        pane_manifests.append(entry)
    # Active pane summary
    active_panes = [p for p in pane_manifests if p["active"]]
    primary = active_panes[0] if active_panes else {}
    manifest = {
        "version": 1,
        "timestamp": now.isoformat(),
        "timestamp_epoch": int(now.timestamp()),
        "hostname": os.uname().nodename,
        "sessions": sessions,
        "panes": pane_manifests,
        "summary": {
            "total_sessions": len(sessions),
            "total_panes": len(pane_manifests),
            "hermes_panes": sum(1 for p in pane_manifests if p["is_hermes"]),
            "active_pane": primary.get("address"),
            "active_model": primary.get("model"),
            "active_provider": primary.get("provider"),
        },
    }
    return manifest
# --- Main ---
manifest = build_manifest()
# Write manifest
with open(MANIFEST, "w") as f:
    json.dump(manifest, f, indent=2)
# Also write to ~/.hermes/tmux-state.json for compatibility
hermes_manifest = os.path.expanduser("~/.hermes/tmux-state.json")
os.makedirs(os.path.dirname(hermes_manifest), exist_ok=True)
with open(hermes_manifest, "w") as f:
    json.dump(manifest, f, indent=2)
print(f"[tmux-state] {manifest['summary']['total_panes']} panes, "
      f"{manifest['summary']['hermes_panes']} hermes, "
      f"active={manifest['summary']['active_pane']} "
      f"@ {manifest['summary']['active_model']}")
print(f"[tmux-state] written to {MANIFEST}")
PYEOF
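
The provider inference and context-percentage heuristics in this removed script are plain string matching, so they can be illustrated without tmux. The following is a hedged sketch: `infer_provider` and `context_percent` are hypothetical names, the provider table is copied from the `if`/`elif` chain above, and the bar pattern here accepts any mix of filled and empty cells, which is a simplification of the original regex and not a copy of it.

```python
import re

# (substring, provider) pairs in the same order as the original if/elif chain
PROVIDERS = [
    ("mimo", "nous"),
    ("claude", "anthropic"),
    ("gpt", "openai"),
    ("gemini", "google"),
    ("qwen", "custom"),
]


def infer_provider(model):
    """Map a model name to a provider by substring, or None if unknown."""
    name = (model or "").lower()
    for needle, provider in PROVIDERS:
        if needle in name:
            return provider
    return None


def context_percent(content):
    """Estimate context usage from text like 'ctx 45%' or a block-character bar."""
    m = re.search(r"ctx\s*(\d+)%", content)
    if m:
        return int(m.group(1))
    bar = re.search(r"\[([█░]+)\]", content)
    if bar:
        cells = bar.group(1)
        # Integer arithmetic avoids float rounding surprises
        return cells.count("█") * 100 // len(cells)
    return None


if __name__ == "__main__":
    print(infer_provider("claude-opus-4-6"))
    print(context_percent("status [███░░░░░░░]"))
```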

View File

@@ -30,62 +30,62 @@ checkpoints:
   enabled: true
   max_snapshots: 50
 compression:
-  enabled: true
+  enabled: false
   threshold: 0.5
-  summary_model: qwen3:30b
-  summary_provider: custom
-  summary_base_url: http://localhost:11434/v1
+  summary_model: ''
+  summary_provider: ''
+  summary_base_url: ''
 smart_model_routing:
-  enabled: true
+  enabled: false
   max_simple_chars: 200
   max_simple_words: 35
   cheap_model:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: ''
+    model: ''
+    base_url: ''
+    api_key: ''
 auxiliary:
   vision:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
     timeout: 30
   web_extract:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
   compression:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
   session_search:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
   skills_hub:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
   approval:
     provider: auto
     model: ''
     base_url: ''
     api_key: ''
   mcp:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
   flush_memories:
-    provider: custom
-    model: qwen3:30b
-    base_url: http://localhost:11434/v1
-    api_key: ollama
+    provider: auto
+    model: ''
+    base_url: ''
+    api_key: ''
 display:
   compact: false
   personality: ''
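
One way to confirm that a purge like this left nothing behind is a plain text scan of the config and playbooks. The sketch below is a minimal illustration, assuming a simple substring match is good enough; it is not YAML-aware, and `find_model_refs` is a hypothetical helper, not something in this repo.

```python
def find_model_refs(text, needle="qwen"):
    """Return (line_number, stripped_line) pairs that still mention needle."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


if __name__ == "__main__":
    sample = "compression:\n  enabled: false\n  summary_model: ''\n"
    # An empty list means no remaining references in this text
    print(find_model_refs(sample))
```

Running it over `config.yaml` and each file under `playbooks/` after the commit should yield an empty list for every file if the purge is complete.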

View File

@@ -4,7 +4,7 @@ description: >
reproduces the bug, then fixes the code, then verifies.
model:
-  preferred: qwen3:30b
+  preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
max_turns: 30
temperature: 0.2

View File

@@ -0,0 +1,54 @@
name: dpo-trainer
description: >
  Executes a Direct Preference Optimization (DPO) training cycle
  using native mlx_lm and autolora eval utilities.
  Ensures model alignment with SOUL.md constraints.
model:
  preferred: claude-opus-4-6
  fallback: claude-sonnet-4-20250514
  max_turns: 20
  temperature: 0.1
tools:
  - terminal
  - file
  - search_files
trigger:
  issue_label: training
  manual: true
repos:
  - Timmy_Foundation/timmy-config
  - Timmy_Foundation/autolora
steps:
  - read_issue
  - clone_repo
  - run_pre_eval
  - execute_mlx_dpo
  - fuse_adapters
  - run_post_eval
  - verify_metrics
  - comment_on_issue
output: training_report
timeout_minutes: 120
system_prompt: |
  You are the automated training orchestrator for Timmy's brain.
  YOUR ISSUE: #{{issue_number}} — {{issue_title}}
  APPROACH (zero-code native):
  1. Run baseline eval: `python autolora/eval/run_cycle.py --model {base_model} --test-set autolora/data/prompts_vibes.yaml`
  2. Execute DPO training: `python -m mlx_lm.lora --config timmy-config/training/configs/dpo_X.yaml` against `preference_pairs.jsonl`
  3. Fuse the weights using `mlx_lm.fuse`.
  4. Run post-eval exactly as step 1 but against the fused model.
  5. Use `autolora/eval/compare.py` to ensure Faith/Crisis constraints from SOUL.md were preserved or improved.
  RULES:
  - Do not write wrapper Python or Bash scripts. Use the CLIs natively.
  - If the post-eval degrades on 'crisis' or 'pastoral_care', REJECT the adapter and fail the issue.
  - Always output the pre/post comparison metrics to the issue comment.
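
The rejection rule in this playbook (fail the issue if post-eval degrades on 'crisis' or 'pastoral_care') amounts to a small gate over two score tables. The sketch below illustrates that rule only; the output format of `autolora/eval/compare.py` is not shown in this diff, so the dict-of-scores shape, the higher-is-better assumption, and the `gate_adapter` name are all hypothetical.

```python
# Categories named in the playbook's RULES section
PROTECTED = ("crisis", "pastoral_care")


def gate_adapter(pre, post, protected=PROTECTED):
    """Return (accepted, regressions) for pre/post eval scores.

    Assumes higher scores are better and that both dicts contain every
    protected category. regressions maps category -> (pre, post).
    """
    regressions = {
        name: (pre[name], post[name])
        for name in protected
        if post[name] < pre[name]
    }
    return (not regressions, regressions)


if __name__ == "__main__":
    pre = {"crisis": 0.90, "pastoral_care": 0.80, "tone": 0.50}
    post = {"crisis": 0.85, "pastoral_care": 0.82, "tone": 0.70}
    accepted, regressions = gate_adapter(pre, post)
    print("ACCEPT" if accepted else f"REJECT: {regressions}")
```

Note that only the protected categories can veto the adapter; a drop in any other category is reported by the comparison but does not by itself reject it under the stated rules.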

View File

@@ -4,7 +4,7 @@ description: >
agents. Decomposes large issues into smaller ones.
model:
-  preferred: qwen3:30b
+  preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
max_turns: 20
temperature: 0.3
@@ -19,7 +19,9 @@ trigger:
 repos:
   - Timmy_Foundation/the-nexus
-  - Timmy_Foundation/hermes-agent
+  - Timmy_Foundation/autolora
+  - Timmy_Foundation/sovereign-orchestration
+  - Timmy_Foundation/timmy-config
 steps:
   - fetch_issues

View File

@@ -4,7 +4,7 @@ description: >
comments on problems. The merge bot replacement.
model:
-  preferred: qwen3:30b
+  preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
max_turns: 20
temperature: 0.2
@@ -19,7 +19,9 @@ trigger:
 repos:
   - Timmy_Foundation/the-nexus
-  - Timmy_Foundation/hermes-agent
+  - Timmy_Foundation/autolora
+  - Timmy_Foundation/sovereign-orchestration
+  - Timmy_Foundation/timmy-config
 steps:
   - fetch_prs

View File

@@ -4,7 +4,7 @@ description: >
Well-scoped: 1-3 files per task, clear acceptance criteria.
model:
-  preferred: qwen3:30b
+  preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
max_turns: 30
temperature: 0.3

View File

@@ -4,7 +4,7 @@ description: >
dependency issues. Files findings as Gitea issues.
model:
-  preferred: qwen3:30b
+  preferred: claude-opus-4-6
fallback: claude-opus-4-6
max_turns: 40
temperature: 0.2

View File

@@ -4,7 +4,7 @@ description: >
writes meaningful tests, verifies they pass.
model:
-  preferred: qwen3:30b
+  preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
max_turns: 30
temperature: 0.3

View File

@@ -0,0 +1,21 @@
# MLX DPO Training Configuration for Hermes 4 (14B Class)
# Optimized for Apple Silicon execution (deep reasoning).
model: "NousResearch/Hermes-4-14B"
train: true
# Use the curated DPO preference pairs dataset
data: "data/"
# Output adapter configuration
adapter_path: "adapters/dpo_14b_adapter"
save_every: 200
# DPO parameters
loss: "dpo"
iters: 1000
batch_size: 1
lora_layers: 16
learning_rate: 1e-5
lora_parameters:
  keys: ['q_proj', 'v_proj']

View File

@@ -0,0 +1,21 @@
# MLX DPO Training Configuration for Hermes 4 (32B Class)
# Optimized for 64GB+ Apple Silicon hardware limit.
model: "NousResearch/Hermes-4-32B"
train: true
# Use the curated DPO preference pairs dataset
data: "data/"
# Output adapter configuration
adapter_path: "adapters/dpo_32b_adapter"
save_every: 200
# DPO parameters
loss: "dpo"
iters: 1000
batch_size: 1
lora_layers: 16
learning_rate: 5e-6
lora_parameters:
  keys: ['q_proj', 'v_proj']

View File

@@ -0,0 +1,21 @@
# MLX DPO Training Configuration for Hermes 4 (3B Class)
# Optimized for Apple Silicon execution with max reactivity.
model: "NousResearch/Hermes-4-3B"
train: true
# Use the curated DPO preference pairs dataset
data: "data/"
# Output adapter configuration
adapter_path: "adapters/dpo_3b_adapter"
save_every: 200
# DPO parameters
loss: "dpo"
iters: 1000
batch_size: 2
lora_layers: 16
learning_rate: 1e-5
lora_parameters:
  keys: ['q_proj', 'v_proj']

View File

@@ -0,0 +1,21 @@
# MLX DPO Training Configuration for Hermes 4 (8B Class)
# Optimized for Apple Silicon execution (daily driver capability).
model: "mlx-community/Hermes-3-Llama-3.1-8B-4bit"
train: true
# Use the curated DPO preference pairs dataset
data: "autolora/data/dpo/"
# Output adapter configuration
adapter_path: "autolora/adapters/dpo-8b-adapter"
save_every: 200
# DPO parameters
loss: "dpo"
iters: 1000
batch_size: 2
lora_layers: 16
learning_rate: 1e-5
lora_parameters:
  keys: ['q_proj', 'v_proj']
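
For reference, the objective that `loss: "dpo"` refers to in these configs is the standard Direct Preference Optimization loss: the negative log-sigmoid of the scaled difference between the policy's and the reference model's log-probability margins on a chosen/rejected pair. The sketch below computes it for a single pair in plain Python. It is an illustration of the formula only, not how `mlx_lm` implements it, and `beta=0.1` is an assumed default since none of the configs above set a beta.

```python
import math


def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    pi_* are log-probs under the policy being trained, ref_* under the
    frozen reference model. beta is an assumed value, not from the configs.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)) == softplus(-margin), computed in a stable form
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: the loss is log(2)
    print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
    # Policy prefers the chosen answer more than the reference does: lower loss
    print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```

The loss falls below log(2) only when the policy widens the chosen-versus-rejected gap relative to the reference, which is why the quality of the preference pairs matters more than the iteration count.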