Google just announced Gemini 3.5 Live Translate, its latest audio model for live speech-to-speech translation: spoken audio goes in, and translated spoken audio comes out. The model automatically detects over 70 languages and generates translated speech that preserves the speaker’s intonation, pacing, and pitch. Turn-by-turn systems wait for a speaker to finish before responding; Gemini 3.5 Live Translate generates speech continuously instead. To do so, it balances a trade-off between waiting for more context, which improves quality, and translating immediately, which keeps the output in sync with the speaker. As a result, the translation stays a few seconds behind the speaker throughout a session.
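To make the context-versus-latency trade-off concrete, here is a toy sketch of a "wait-k" policy, a simple scheduling idea from the simultaneous-translation literature. It is an illustration only: the announcement does not say how Gemini 3.5 Live Translate decides when to speak, and the function names, the word-level granularity, and the one-output-word-per-input-word simplification are all assumptions made for this example.

```python
from typing import Iterator, List, Tuple


def wait_k_schedule(source_words: List[str], k: int) -> Iterator[Tuple[int, int]]:
    """Yield (words_heard, words_emitted) pairs for a toy wait-k policy.

    The translator stays silent until it has heard k source words, then
    emits one output word for each further source word, and flushes the
    remaining words once the source ends. Assumes (unrealistically) one
    output word per source word.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    total = len(source_words)
    emitted = 0
    for heard in range(1, total + 1):
        if heard >= k:
            emitted += 1
            yield heard, emitted
    # Source finished: flush whatever is still pending.
    while emitted < total:
        emitted += 1
        yield total, emitted


def average_lag(schedule: List[Tuple[int, int]]) -> float:
    """Mean number of source words by which the output trails the input."""
    if not schedule:
        return 0.0
    return sum(heard - emitted for heard, emitted in schedule) / len(schedule)


if __name__ == "__main__":
    words = "the meeting will start in five minutes".split()
    for k in (1, 3, 5):
        lag = average_lag(list(wait_k_schedule(words, k)))
        print(f"k={k}: average lag of {lag:.2f} words")
```

A larger `k` gives the hypothetical translator more context before it commits to output, at the cost of trailing further behind the speaker; a smaller `k` does the reverse.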

Gemini 3.5 Live Translate is a single audio model (gemini-3.5-live-translate-preview), not a chat assistant. It processes speech as the audio streams in, rather than waiting for a full sentence. It handles multilingual input without any manual configuration. Its noise robustness lets applications run in loud, unpredictable environments.
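Processing speech "as the audio streams in" means a client would typically send small audio frames rather than whole utterances. The sketch below shows only that client-side chunking step. The 16 kHz, 16-bit mono PCM format and the 100 ms frame size are assumptions for illustration, not details from the announcement, and the actual network call is deliberately left out; check the Gemini Live API documentation for the real input format and client methods.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # assumed input sample rate (not from the announcement)
BYTES_PER_SAMPLE = 2     # assumed 16-bit mono PCM


def chunk_pcm(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Split raw PCM audio into fixed-duration frames for streaming.

    The final frame may be shorter than chunk_ms if the audio length is
    not an exact multiple of the frame size.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    # Bytes per frame; always even here, so no 16-bit sample is split.
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


if __name__ == "__main__":
    one_second_of_silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    frames = list(chunk_pcm(one_second_of_silence))
    print(f"{len(frames)} frames of {len(frames[0])} bytes each")
```

Smaller frames reduce the delay before the model can start on new audio but increase per-message overhead, so the frame size is itself a small latency trade-off.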

The model is rolling out across three surfaces. Developers get it in public preview through the Gemini Live API and Google AI Studio. Enterprises get a private preview in Google Meet starting this month. Everyone else gets it through the Google Translate app on Android and iOS.