[claude] Autoresearch H1: Apple Silicon support + M3 Max baseline doc (#905) #1252

Merged
claude merged 1 commits from claude/issue-905 into main 2026-03-23 23:38:40 +00:00
Collaborator

Fixes #905

What this does

Establishes the M3 Max autoresearch baseline by adding Apple Silicon detection and configuration support to src/timmy/autoresearch.py, plus a docs/research/autoresearch-h1-baseline.md reference doc.

Changes

src/timmy/autoresearch.py

  • New is_apple_silicon() helper — detects arm64/Darwin
  • New _build_experiment_env(dataset, backend) — builds subprocess env vars; backend="auto" resolves to mlx on Apple Silicon, cuda elsewhere
  • prepare_experiment() now accepts dataset and backend kwargs and forwards them as AUTORESEARCH_DATASET / AUTORESEARCH_BACKEND env vars so karpathy’s prepare.py can adapt without CLI changes
  • run_experiment() same treatment — env vars forwarded to train.py

src/config.py

  • Added autoresearch_dataset: str = "tinystories" (recommended for Mac: lower entropy, faster iteration)
  • Added autoresearch_backend: str = "auto" (auto-resolves to MLX on Apple Silicon)

docs/research/autoresearch-h1-baseline.md

  • M3 Max hardware profile (40 GPU cores, 36 GB unified RAM, 400 GB/s bandwidth)
  • Setup instructions (MLX preferred, llama.cpp fallback)
  • Community reference data: Mac Mini M4 baseline — 7/35 experiments succeeded, model improved by simplifying
  • Results table template for recording actual M3 Max runs
  • Known issues (MPS watermark, TimeoutExpired as normal pruning path)

Tests

  • 9 new unit tests in TestAppleSiliconHelpers and TestPrepareExperiment covering platform detection, env var resolution, and env forwarding

Test plan

  • tox -e unit — 433 passed (2 pre-existing failures in unrelated test_three_strike_routes.py)
  • All new tests pass: TestAppleSiliconHelpers (5 tests) + env-forwarding test

🤖 Generated with Claude Code

Fixes #905 ## What this does Establishes the M3 Max autoresearch baseline by adding Apple Silicon detection and configuration support to `src/timmy/autoresearch.py`, plus a `docs/research/autoresearch-h1-baseline.md` reference doc. ## Changes ### `src/timmy/autoresearch.py` - New `is_apple_silicon()` helper — detects arm64/Darwin - New `_build_experiment_env(dataset, backend)` — builds subprocess env vars; `backend="auto"` resolves to `mlx` on Apple Silicon, `cuda` elsewhere - `prepare_experiment()` now accepts `dataset` and `backend` kwargs and forwards them as `AUTORESEARCH_DATASET` / `AUTORESEARCH_BACKEND` env vars so karpathy’s `prepare.py` can adapt without CLI changes - `run_experiment()` same treatment — env vars forwarded to `train.py` ### `src/config.py` - Added `autoresearch_dataset: str = "tinystories"` (recommended for Mac: lower entropy, faster iteration) - Added `autoresearch_backend: str = "auto"` (auto-resolves to MLX on Apple Silicon) ### `docs/research/autoresearch-h1-baseline.md` - M3 Max hardware profile (40 GPU cores, 36 GB unified RAM, 400 GB/s bandwidth) - Setup instructions (MLX preferred, llama.cpp fallback) - Community reference data: Mac Mini M4 baseline — 7/35 experiments succeeded, model improved by simplifying - Results table template for recording actual M3 Max runs - Known issues (MPS watermark, TimeoutExpired as normal pruning path) ### Tests - 9 new unit tests in `TestAppleSiliconHelpers` and `TestPrepareExperiment` covering platform detection, env var resolution, and env forwarding ## Test plan - [x] `tox -e unit` — 433 passed (2 pre-existing failures in unrelated `test_three_strike_routes.py`) - [x] All new tests pass: `TestAppleSiliconHelpers` (5 tests) + env-forwarding test 🤖 Generated with [Claude Code](https://claude.com/claude-code)
claude added 1 commit 2026-03-23 23:38:20 +00:00
feat: add Apple Silicon support to autoresearch + M3 Max baseline doc (Refs #905)
Some checks failed
Tests / lint (pull_request) Failing after 26s
Tests / test (pull_request) Has been skipped
2c1ebd41f2
- Add `is_apple_silicon()` and `_build_experiment_env()` helpers that detect
  arm64/Darwin and resolve `backend="auto"` to MLX on Apple Silicon or CUDA
  elsewhere
- Update `prepare_experiment()` and `run_experiment()` to accept `dataset` and
  `backend` kwargs; env vars `AUTORESEARCH_DATASET` / `AUTORESEARCH_BACKEND`
  are forwarded to all subprocess calls so karpathy/autoresearch scripts can
  adapt without CLI changes
- Add `autoresearch_dataset` and `autoresearch_backend` settings to `config.py`
  with M3 Max defaults (tinystories / auto)
- Add `docs/research/autoresearch-h1-baseline.md`: M3 Max hardware profile,
  Apple Silicon setup instructions (MLX vs llama.cpp), community reference data
  (Mac Mini M4: 7/35 succeeded), and a results table template for recording
  actual baseline runs
- Add 9 new unit tests for the Apple Silicon helpers and env-forwarding path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
claude merged commit f62220eb61 into main 2026-03-23 23:38:40 +00:00
claude deleted branch claude/issue-905 2026-03-23 23:38:40 +00:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Rockachopa/Timmy-time-dashboard#1252