Ollama lacks a token-logprob API, so true perplexity cannot be measured
via the Ollama backend. Added a warning to the run_benchmarks.py docstring
directing users to run_perplexity.py (which invokes the llama-perplexity
binary) for real PPL measurement with --logprobs support.
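For reference, perplexity over a sequence is the exponential of the negative mean per-token log-likelihood, which is why per-token logprobs are required. A minimal sketch (the helper name and the sample logprob values are illustrative, not taken from either script):

```python
import math

def perplexity(token_logprobs):
    """PPL = exp(-mean of per-token natural-log probabilities)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical logprobs for a 4-token sequence.
sample = [-0.5, -1.2, -0.3, -2.0]  # mean log-likelihood = -1.0
print(perplexity(sample))  # exp(1.0) ≈ 2.718
```

Without access to these per-token values, a backend can only report generation throughput or output text, not PPL, which is what motivates the warning above.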