ElevenLabs v3 remains the voice every other vendor benchmarks against. The expressive range (anger, intimacy, doubt, joy) is unmatched, and the new "Director" mode lets you steer per-line emotion with inline tags.
At ~$0.018 per character it's the most expensive option in this category, but the price reflects what you get: instant voice cloning from 30 seconds of audio, alignment timestamps for captioning, and an enormous shared voice library.
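To get a feel for what per-character billing means for a real script, here is a rough budgeting sketch. It assumes the ~$0.018 per character figure quoted above (not checked against the vendor's pricing page) and assumes every character is billed, spaces and punctuation included. The `estimate_cost` helper and `RATE_PER_CHAR` constant are hypothetical names for illustration, not part of any ElevenLabs SDK.

```python
from decimal import Decimal

# Assumed rate: the ~$0.018 per character quoted in this review,
# not a figure verified against the vendor's pricing page.
RATE_PER_CHAR = Decimal("0.018")


def estimate_cost(script: str, rate_per_char: Decimal = RATE_PER_CHAR) -> Decimal:
    """Return the estimated synthesis cost in dollars for `script`."""
    # Assumption: every character is billed, including spaces and punctuation.
    return len(script) * rate_per_char


if __name__ == "__main__":
    sample = "Welcome back to the show."
    print(f"{len(sample)} characters -> ${estimate_cost(sample):.2f}")
```

`Decimal` is used instead of `float` so that per-character amounts add up without binary rounding drift. Swap in your own rate if your plan bills differently.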
The v3 release narrowed the gap with real-time TTS engines like Cartesia (now sub-300ms TTFT for streaming) without sacrificing quality.
Verdict: still the gold standard. 9.5/10.