In an independent blind comparison, Phantom Z is the only Hebrew TTS model that is both the highest-quality and the fastest in real-time streaming.

Hebrew is one of the hardest languages in the world to synthesize naturally, and one of the least represented in the global TTS race. To see how Phantom Z performs against the field, we entered it into the Hugging Face Hebrew TTS Arena — an open, community-run benchmark where listeners pick the better-sounding voice in blind, head-to-head pairwise comparisons.
Each vote came from a real user comparing two unlabeled samples side by side. Phantom Z finished #1 with an ELO of 1569, 27 ELO points clear of the next model (Soniox v1, ELO 1542).

| Rank | Model | Streaming | Win rate | ELO |
|---|---|---|---|---|
| #1 | Deepdub Phantom Z | STR | 83% | 1569 |
| #2 | Soniox v1 | STR | 69% | 1542 |
| #3 | Gemini 3.1 Flash TTS | — | 69% | 1542 |
| #4 | ElevenLabs v3 | — | 69% | 1541 |
| #5 | OpenAI Realtime 1.5 Legacy | STR | 48% | 1499 |
| #6 | Inworld TTS-2 | STR | 30% | 1492 |
| #7 | Inworld TTS v1.5 MAX | STR | 45% | 1487 |
| #8 | Inworld TTS v1 Legacy | STR | 18% | 1486 |
| #9 | Blue V2 | — | 37% | 1483 |
| #10 | OpenAI GPT-4o Mini TTS | — | 32% | 1461 |
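
To put the ELO gaps in the table in perspective, here is a minimal sketch of the standard Elo expected-score formula. It assumes the Arena uses the conventional base-10 logistic curve with a 400-point scale, which the leaderboard does not state here; the function name `elo_expected_score` is ours, for illustration only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise votes for A against B under standard Elo.

    Assumption: base-10 logistic curve with a 400-point scale (the usual
    Elo convention); the Arena's exact parameters may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings taken from the table above.
phantom_z = 1569
soniox_v1 = 1542

# Expected share of head-to-head votes for Phantom Z against Soniox v1.
p = elo_expected_score(phantom_z, soniox_v1)
print(f"Expected head-to-head preference: {p:.1%}")
```

Under these assumptions, a 27-point gap corresponds to listeners preferring the higher-rated model in a little over half of direct matchups. The win-rate column is a different measure: it aggregates votes against every opponent, not just the runner-up.
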

Of the top models on the Hebrew TTS Arena, only half can actually stream; the other half can't be deployed in real-time agentic experiences at all. Among the streaming models, Phantom Z is the fastest.
Most TTS models fail in Hebrew because the script is missing the information needed for correct pronunciation. Phantom Z was built specifically to solve this. It doesn't just read text; it understands context.
Hebrew typically drops vowels, leaving the same letters with multiple valid readings. Phantom Z picks the right one from context — no manual diacritics required.
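
The ambiguity is easy to see in code. The sketch below is only an illustration of the problem, not of how Phantom Z resolves it: it strips the vowel points (niqqud) from two different words and shows that they collapse into the same letter sequence. The helper name `strip_niqqud` and the two example words are our own choices.

```python
import unicodedata


def strip_niqqud(text: str) -> str:
    """Remove Hebrew vowel points and cantillation marks, keeping the letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch
        for ch in decomposed
        # Hebrew points and accents are nonspacing marks (category Mn)
        # inside the U+0591..U+05C7 range of the Hebrew block.
        if not ("\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn")
    )


# Two different words, written with full vowel points.
SEFER = "\u05E1\u05B5\u05E4\u05B6\u05E8"        # sefer, "book"
SAPAR = "\u05E1\u05B7\u05E4\u05BC\u05B8\u05E8"  # sapar, "barber"

# Everyday Hebrew text omits the points, so both are written the same way.
print(strip_niqqud(SEFER) == strip_niqqud(SAPAR))
```

Once the points are gone, only the surrounding sentence tells a reader, or a TTS model, which word was meant.
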
Hebrew is deeply gendered. Phantom Z produces accurate masculine and feminine forms for verbs and adjectives instead of relying on defaults.
Hebrew abbreviations have to be read as whole words rather than letter-by-letter. Phantom Z expands Hebrew acronyms: צה"ל as Tzáhal, ארה"ב as Artzót ha-Brit.
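
As a rough picture of what "expanding" means, here is a toy lookup-table sketch. It is a hypothetical illustration, not Phantom Z's method, which this page does not describe. The table `READINGS` and the function `read_abbreviations` are invented for the example; the two readings are the ones quoted above.

```python
# Hypothetical reading table; the two entries come from the examples above.
READINGS = {
    'צה"ל': "Tzáhal",
    'ארה"ב': "Artzót ha-Brit",
}

# Hebrew abbreviations are marked with gershayim (U+05F4), which is often
# typed as a plain ASCII double quote. Normalize to one form before lookup.
GERSHAYIM = "\u05F4"


def normalize(token: str) -> str:
    return token.replace(GERSHAYIM, '"')


def read_abbreviations(text: str) -> str:
    """Replace known abbreviations with their spoken reading, token by token."""
    return " ".join(READINGS.get(normalize(tok), tok) for tok in text.split())


print(read_abbreviations('צה"ל'))
```

A static table like this breaks down quickly: it misses tokens with attached punctuation or prefixes, and it cannot choose between readings that depend on the sentence, which is where context-aware modeling matters.
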
From tech jargon to English brand names, our model handles acronyms and foreign words inside Hebrew sentences with native-sounding flow.
Phantom Z was modeled, trained, and tuned by native speakers, which is why it sounds like one.
Native listeners identify AI not by mispronunciation, but by unnatural rhythm. Phantom Z was trained on native breath, pitch shifts, and pausing.
One written text, a distinct pronunciation for each gender — Phantom Z picks the right one automatically and generates speech that sounds native.
שלום! אנחנו מתקשרים לעדכן אותך שהבקשה שלך לבטל את ההזמנה אושרה. הזיכוי יישלח אליך תוך 5 ימי עסקים. תודה לך על הסבלנות.
"Hello! We're calling to let you know that your request to cancel the order has been approved. The refund will be issued within 5 business days. Thank you for your patience."
Built by an Israeli team for global scale, Phantom Z brings Hollywood-grade quality to the agentic world. Whether you are a local broadcaster or a global developer building for the Israeli market, you can now deploy conversational AI with the #1 expressive AI voice in Hebrew.

© Deepdub.