test: comprehensive tests for model metadata + firecrawl config
model_metadata tests (61 tests, was 39):
- Token estimation: concrete value assertions, unicode, tool_call messages,
vision multimodal content, additive verification
- Context length resolution: cache-over-API priority, no-base_url skips cache,
missing context_length key in API response
- API metadata fetch: canonical_slug aliasing, TTL expiry with time mock,
stale cache fallback on API failure, malformed JSON resilience
- Probe tiers: above-max returns 2M, zero returns None
- Error parsing: Anthropic format ('X > Y maximum'), LM Studio, empty string,
unreasonably large numbers — also fixed parser to handle Anthropic format
- Cache: corruption resilience (garbage YAML, wrong structure), value updates,
special chars in model names
Firecrawl config tests (8 tests, was 4):
- Singleton caching (core purpose — verified constructor called once)
- Constructor failure recovery (retry after exception)
- Return value actually asserted (not just constructor args)
- Empty string env vars treated as absent
- Proper setup/teardown for env var isolation