StepFun's StepAudio 2.5 Realtime tops voice AI benchmarks in April 2026

A Shanghai-based AI lab just quietly embarrassed some of the biggest names in tech. StepFun’s StepAudio 2.5 Realtime, released around May 24, swept all five major voice AI benchmarks from April 2026 testing, beating out both GPT Realtime 1.5 and Gemini Live in the process.

The model doesn’t just understand what you say. It understands how you say it, interpreting tone, emotion, and speech rate in ways that make most competing voice assistants sound like they’re reading a script in a monotone.

The numbers behind the noise

StepAudio 2.5 Realtime posted top scores across every benchmark category tested. In human evaluation, it scored 80.41. General dialogue performance hit 86.36. Automotive scenario testing, which measures how well the model handles voice interaction in driving contexts, landed at 84.80.

The spoken question-and-answer benchmark, spanning 11 separate tasks, came in at 79.80. And the paralinguistic comprehension score, arguably the most interesting metric here, reached 82.18.

The numbers behind the noise

The spoken question-and-answer benchmark, spanning 11 separate tasks, came in at 79.80. And the paralinguistic comprehension score, arguably the most interesting metric here, reached 82.18.

StepFun's StepAudio 2.5 Realtime tops voice AI benchmarks in April 2026

StepFun's StepAudio 2.5 Realtime tops voice AI benchmarks in April 2026

Other newsrooms on this story

Related reading

StepFun's Voice AI Topped Every Benchmark. It Also Hears Your Sighs - Decrypt

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

OpenAI unveils real-time voice models, eyes AI interface dominance

OpenAI unveils GPT-Live, enhancing ChatGPT with real-time voice capabilities

OpenAI bets on voice as AI's primary interface with new models, and crypto…

OpenAI's new voice model brings GPT-5-level reasoning to real-time conversations

Other newsrooms on this story

Related reading

StepFun's Voice AI Topped Every Benchmark. It Also Hears Your Sighs - Decrypt

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

OpenAI unveils real-time voice models, eyes AI interface dominance

OpenAI unveils GPT-Live, enhancing ChatGPT with real-time voice capabilities

OpenAI bets on voice as AI's primary interface with new models, and crypto…

OpenAI's new voice model brings GPT-5-level reasoning to real-time conversations