Phantom Z · Top-ranked expressive TTS among real-time models

Phantom Z Benchmark — Hebrew

In an independent blind comparison, Phantom Z is the only Hebrew TTS model that is both the highest-quality and the fastest in real-time streaming.

Methodology

Crowd-judged against every leading model

Hebrew is one of the hardest languages in the world to synthesize naturally, and one of the least represented in the global TTS race. To see how Phantom Z performs against the field, we entered it into the Hugging Face Hebrew TTS Arena — an open, community-run benchmark where listeners pick the better-sounding voice in blind, head-to-head pairwise comparisons.

Result

Phantom Z ranked #1 in Hebrew

Each vote came from a real user comparing two unlabeled samples side by side. Phantom Z finished #1 with an ELO of 1569, 27 points clear of the next model (Soniox v1, ELO 1542).

Hugging Face Hebrew TTS Arena — full ranking

Rank  Model                        Streaming  Win rate  ELO
#1    Deepdub Phantom Z            STR        83%       1569
#2    Soniox v1                    STR        69%       1542
#3    Gemini 3.1 Flash TTS         -          69%       1542
#4    ElevenLabs v3                -          69%       1541
#5    OpenAI Realtime 1.5 Legacy   STR        48%       1499
#6    Inworld TTS-2                STR        30%       1492
#7    Inworld TTS v1.5 MAX         STR        45%       1487
#8    Inworld TTS v1 Legacy        STR        18%       1486
#9    Blue V2                      -          37%       1483
#10   OpenAI GPT-4o Mini TTS       -          32%       1461

STR = real-time streaming capable; - = not marked as streaming
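
To put the rating gap in perspective, here is a minimal sketch of the standard Elo model, which converts a rating difference into an expected head-to-head win probability. The 400-point scale and the K-factor of 32 are conventional defaults assumed for illustration; the arena's actual parameters are not stated on this page.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected win probability of A over B under the standard Elo logistic model.

    The 400-point scale is the conventional default, assumed here.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one pairwise vote (K-factor is an assumption)."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Ratings taken from the ranking table above.
phantom_z, soniox_v1 = 1569, 1542
p = expected_score(phantom_z, soniox_v1)
print(f"Expected win probability vs. Soniox v1: {p:.2f}")
```

Under these assumptions, a 27-point gap corresponds to an expected win probability of roughly 0.54 in a direct matchup between the top two models.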
Real-time latency

The fastest real-time TTS in Hebrew

Of the ten models ranked on the Hebrew TTS Arena, only six are marked as streaming-capable; the other four can't be deployed in real-time agentic experiences at all. Among the streaming models compared here, Phantom Z is the fastest.

Latency · ms

Rank  Model                 Latency
1     Deepdub Phantom Z     125 ms
2     Inworld TTS-2         <200 ms
3     Inworld TTS 1.5-Max   <250 ms
4     OpenAI Realtime 1.5   500 ms
5     Inworld TTS-1         500 ms

Figures as published by each vendor, so the metrics differ: Phantom Z and Inworld TTS-2 = TTFA (time to first audio); Inworld 1.5 Max = TTFA P90; Inworld TTS-1 = P90 for the first 2-second chunk; OpenAI Realtime 1.5 = full voice-to-voice round trip, including the LLM.
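
For readers who want to measure these figures themselves, the sketch below shows one way to time TTFA and summarize it as a P90. It is not any vendor's methodology: `fake_stream` is a hypothetical stand-in for a streaming TTS client, and TTFA is assumed to mean the time from request start to the first non-empty audio chunk.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator


def time_to_first_audio(start_stream: Callable[[], Iterable[bytes]]) -> float:
    """Seconds from request start until the first non-empty audio chunk arrives."""
    t0 = time.perf_counter()
    for chunk in start_stream():
        if chunk:
            return time.perf_counter() - t0
    raise RuntimeError("stream ended without producing audio")


def p90(samples: list) -> float:
    """90th percentile: quantiles(n=10) returns 9 cut points, the last is P90."""
    return statistics.quantiles(samples, n=10)[8]


def fake_stream(delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical streaming client: waits, then yields two audio chunks."""
    time.sleep(delay_s)
    yield b"\x00" * 320
    yield b"\x00" * 320


# Repeat the measurement, then report the median and the tail.
latencies = [time_to_first_audio(fake_stream) for _ in range(20)]
print(f"median TTFA: {statistics.median(latencies) * 1000:.1f} ms")
print(f"P90 TTFA:    {p90(latencies) * 1000:.1f} ms")
```

Reporting a percentile alongside the median matters when comparing vendors, because a single best-case number and a P90 describe different parts of the latency distribution.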
LANGUAGE ENGINEERING

Built for Hebrew, not bolted on.

Most TTS models fail in Hebrew because written Hebrew usually omits the vowel information needed for correct pronunciation. Phantom Z was built specifically to solve this. It doesn't just read text; it understands context.

Smart diacritics (Nikud)

Hebrew typically drops vowels, leaving the same letters with multiple valid readings. Phantom Z picks the right one from context — no manual diacritics required.

Gendered grammar

Hebrew is deeply gendered. Phantom Z produces accurate masculine and feminine forms for verbs and adjectives instead of relying on defaults.

Hebrew acronyms

Hebrew abbreviations have to be read as whole words rather than letter-by-letter. Phantom Z expands Hebrew acronyms: צה"ל as Tzáhal, ארה"ב as Artzót ha-Brit.

Modern code-switching

From tech jargon to English brand names, our model handles acronyms and foreign words inside Hebrew sentences with native-sounding flow.

Trained on Hebrew, from the ground up

Phantom Z was modeled, trained, and tuned by native speakers, which is why it sounds like one.

Native-speaker micro-prosody

Native listeners identify AI not by mispronunciation, but by unnatural rhythm. Phantom Z was trained on native breath, pitch shifts, and pausing.

NATURAL AI VOICE

Hear it for yourself

One written text, a distinct pronunciation for each gender — Phantom Z picks the right one automatically and generates speech that sounds native.

Original text · he-IL

שלום! אנחנו מתקשרים לעדכן אותך שהבקשה שלך לבטל את ההזמנה אושרה. הזיכוי יישלח אליך תוך 5 ימי עסקים. תודה לך על הסבלנות.

"Hello! We're calling to let you know that your request to cancel the order has been approved. The refund will be issued within 5 business days. Thank you for your patience."

Said to a female listener · Voice: Emma (HE)
Said to a male listener · Voice: Emma (HE)
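
To make the ambiguity concrete, the toy sketch below lists the words in the sample text that are spelled identically for both genders but pronounced differently. It is a hand-written lookup for illustration only, not how Phantom Z works; the table name, function names, and approximate romanizations are assumptions introduced here.

```python
import re

# Second-person forms that look the same in unvocalized Hebrew but are
# pronounced differently depending on the listener's gender.
# Meanings, in order: "you" (object), "your", "to you", "to you" (short form).
GENDERED_FORMS = {
    "אותך": {"m": "otkha", "f": "otakh"},
    "שלך": {"m": "shelkha", "f": "shelakh"},
    "אליך": {"m": "elekha", "f": "elayikh"},
    "לך": {"m": "lekha", "f": "lakh"},
}

SAMPLE = (
    "שלום! אנחנו מתקשרים לעדכן אותך שהבקשה שלך לבטל את ההזמנה אושרה. "
    "הזיכוי יישלח אליך תוך 5 ימי עסקים. תודה לך על הסבלנות."
)


def ambiguous_words(text: str) -> list:
    """Return the gender-ambiguous words in order of appearance."""
    # \w matches Hebrew letters in Python 3 string patterns.
    return [w for w in re.findall(r"\w+", text) if w in GENDERED_FORMS]


def readings(text: str, listener_gender: str) -> list:
    """Romanized reading of each ambiguous word for 'm' or 'f' listeners."""
    return [GENDERED_FORMS[w][listener_gender] for w in ambiguous_words(text)]


print(readings(SAMPLE, "f"))
print(readings(SAMPLE, "m"))
```

Four words in this short message change pronunciation with the listener's gender, which is why a single default reading sounds wrong to roughly half of all listeners.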
Expressivity
Hebrew is where we have to win first. We built Phantom Z with Hebrew morphology in the training loop, not bolted on at the end. Today it's the highest-rated Hebrew voice on the public leaderboard — and that matters more to us than any other benchmark.
Ofir Krakowski
CEO · Deepdub

The New Standard for Hebrew AI Voice

Built by an Israeli team for global scale, Phantom Z brings Hollywood-grade quality to the agentic world. Whether you are a local broadcaster or a global developer building for the Israeli market, you can now deploy conversational AI with the #1 expressive AI voice in Hebrew.

#1 expressive voice
80+ emotion styles supported
125 ms real-time latency

Ready when you are

Ready to hear Phantom Z in Hebrew?

Drop into the playground: type a line, pick a voice, and hear the #1 expressive Hebrew AI voice for yourself.

© Deepdub.