forked from Rockachopa/Timmy-time-dashboard
Adds the `bigbrain` optional dependency group (airllm>=2.9.0) and a
complete second inference path that runs 8B / 70B / 405B Llama models
locally via layer-by-layer loading — no GPU required, no cloud, fully
sovereign.
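For context, the upstream airllm API that this second inference path wraps looks roughly like the snippet below (adapted from the AirLLM README; the model id, prompt, and token limits are illustrative and not taken from this fork):

```python
# Rough shape of the upstream airllm usage pattern (per the AirLLM README).
# The model id and limits here are illustrative, not values from this PR.
from airllm import AutoModel

# Layers are fetched and loaded one at a time, so the full model never has
# to fit in RAM/VRAM at once.
model = AutoModel.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

input_tokens = model.tokenizer(
    ["Summarise my day, Timmy."],
    return_tensors="pt",
    truncation=True,
    max_length=128,
)
output = model.generate(
    input_tokens["input_ids"],
    max_new_tokens=64,
    use_cache=True,
    return_dict_in_generate=True,
)
print(model.tokenizer.decode(output.sequences[0]))
```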
Key changes:
- src/timmy/backends.py — TimmyAirLLMAgent (same print_response interface
as Agno Agent); auto-selects AirLLMMLX on Apple
Silicon, AutoModel (PyTorch) everywhere else
- src/timmy/agent.py — _resolve_backend() routing with explicit override,
  env-config, and 'auto' Apple-Silicon detection (see the sketch after this list)
- src/timmy/cli.py — --backend / --model-size flags on all commands
- src/config.py — timmy_model_backend + airllm_model_size settings
- src/timmy/prompts.py — mentions AirLLM "even bigger brains, still fully
sovereign"
- pyproject.toml — bigbrain optional dep; wheel include list updated
- .env.example — TIMMY_MODEL_BACKEND + AIRLLM_MODEL_SIZE docs
- tests/conftest.py — stubs the 'airllm' module so tests run without a GPU (sketched below)
- tests/test_backends.py — 13 new tests covering helpers + TimmyAirLLMAgent
- tests/test_agent.py — 7 new tests for backend routing
- README.md — Big Brain section with one-line install
- activate_self_tdd.sh — bootstrap script (venv + install + tests +
watchdog + dashboard); --big-brain flag
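A minimal sketch of how the 'auto' routing could be wired. Only _resolve_backend, timmy_model_backend, and airllm_model_size appear in this change; the import probe, the platform check, and the config import path are assumptions:

```python
# Hypothetical sketch of the _resolve_backend() routing in src/timmy/agent.py.
import importlib.util
import platform
import sys

from config import settings  # assumed import path for src/config.py


def _is_apple_silicon() -> bool:
    # macOS on arm64 => Apple Silicon (assumed detection strategy)
    return sys.platform == "darwin" and platform.machine() == "arm64"


def _airllm_available() -> bool:
    # True when the optional ".[bigbrain]" extra has been installed
    return importlib.util.find_spec("airllm") is not None


def _resolve_backend(override: str | None = None) -> str:
    # Explicit CLI override wins, then the env-configured setting
    choice = override or settings.timmy_model_backend
    if choice == "auto":
        # "auto": prefer AirLLM only on Apple Silicon with airllm installed
        return "airllm" if _is_apple_silicon() and _airllm_available() else "ollama"
    return choice
```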
All 61 tests pass. Self-TDD watchdog unaffected.
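The airllm test stub mentioned in the list above can be as small as registering a fake module before anything imports it; a sketch under that assumption, not the fork's exact fixture:

```python
# tests/conftest.py (sketch): register a fake "airllm" module so the backend
# code imports cleanly on machines without the bigbrain extra or a GPU.
import sys
import types
from unittest.mock import MagicMock

if "airllm" not in sys.modules:
    fake_airllm = types.ModuleType("airllm")
    fake_airllm.AutoModel = MagicMock(name="airllm.AutoModel")
    sys.modules["airllm"] = fake_airllm
```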
https://claude.ai/code/session_01DMjQ5qMZ8iHeyix1j3GS7c
src/config.py (36 lines, 1.3 KiB, Python):
from typing import Literal

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Ollama host — override with OLLAMA_URL env var or .env file
    ollama_url: str = "http://localhost:11434"

    # LLM model passed to Agno/Ollama — override with OLLAMA_MODEL
    ollama_model: str = "llama3.2"

    # Set DEBUG=true to enable /docs and /redoc (disabled by default)
    debug: bool = False

    # ── AirLLM / backend selection ──────────────────────────────────────────
    # "ollama" — always use Ollama (default, safe everywhere)
    # "airllm" — always use AirLLM (requires pip install ".[bigbrain]")
    # "auto"   — use AirLLM on Apple Silicon if airllm is installed,
    #            fall back to Ollama otherwise
    timmy_model_backend: Literal["ollama", "airllm", "auto"] = "ollama"

    # AirLLM model size when backend is airllm or auto.
    # Larger = smarter, but needs more RAM / disk.
    # 8b ~16 GB | 70b ~140 GB | 405b ~810 GB
    airllm_model_size: Literal["8b", "70b", "405b"] = "70b"

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )


settings = Settings()
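Because these are ordinary pydantic-settings fields, the new knobs can be exercised straight from the environment. A quick check, assuming src/ is on sys.path so the module imports as config:

```python
# Env overrides are picked up when Settings() is constructed;
# pydantic-settings matches env var names case-insensitively by default.
import os

os.environ["TIMMY_MODEL_BACKEND"] = "auto"
os.environ["AIRLLM_MODEL_SIZE"] = "8b"

from config import Settings

settings = Settings()
assert settings.timmy_model_backend == "auto"
assert settings.airllm_model_size == "8b"
```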