Google Releases Gemini 3.5 Live Translate, a Streaming Speech-to-Speech Audio Model Covering 70+ Languages Across Meet, Translate, and the Live API

Google has announced Gemini 3.5 Live Translate, its latest audio model for direct speech-to-speech translation. Speech-to-speech means that spoken audio comes in and translated spoken audio comes out. The model automatically detects more than 70 languages and generates translated speech that preserves the speaker’s tone, rhythm, and pitch. Turn-based systems wait for the speaker to finish before responding. Gemini 3.5 Live Translate generates speech continuously instead. It balances waiting for more context against starting to interpret: more context improves quality, while faster output keeps the translation in sync with the speaker. The result stays a few seconds behind the speaker throughout the session.

Gemini 3.5 Live Translate

Gemini 3.5 Live Translate is a dedicated audio model (`gemini-3.5-live-translate-preview`), not a chat assistant. It processes speech as the audio streams in, partway through a sentence rather than after it ends. It handles multilingual input without manual configuration. Its noise robustness lets applications run in loud and unpredictable environments.

The model ships across three surfaces. Developers get it in public preview through the Gemini Live API and Google AI Studio. Organizations get a private preview in Google Meet starting this month. Everyone else can use it through the Google Translate app on Android and iOS.

How does continuous streaming work?

This design difference matters for teams building real-time features. A live conversational agent uses turn-based interaction. It relies on detecting pauses, detecting intent, and handling interruptions. Live translation uses continuous stream processing instead. It translates while the speaker is talking, without waiting for turns to end.

To meet strict real-time latency limits, the translation path accepts only audio input. Text input is not supported in translation mode. The model also drops tool use and system instructions in this mode. This keeps it a focused interpreter pipeline rather than a general-purpose agent.
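The constraints above can be checked on the client before a session is opened. The following is a minimal sketch, not part of any Google SDK: `SessionRequest` and `translation_mode_problems` are hypothetical names introduced here for illustration, and the only rules encoded are the three stated in the text (audio-only input, no tools, no system instructions).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SessionRequest:
    """Hypothetical description of what a client intends to use in a session."""
    input_modalities: Set[str] = field(default_factory=lambda: {"audio"})
    tools: List[str] = field(default_factory=list)
    system_instruction: Optional[str] = None


def translation_mode_problems(req: SessionRequest) -> List[str]:
    """Return the reasons a request does not fit translation mode (empty if it fits)."""
    problems = []
    # Translation mode accepts audio input only.
    extra = req.input_modalities - {"audio"}
    if extra:
        problems.append("unsupported input: " + ", ".join(sorted(extra)))
    # Tool use is dropped in translation mode.
    if req.tools:
        problems.append("tools are not available in translation mode")
    # System instructions are dropped in translation mode.
    if req.system_instruction:
        problems.append("system instructions are not available in translation mode")
    return problems


ok = SessionRequest()
bad = SessionRequest(
    input_modalities={"audio", "text"},
    tools=["google_search"],
    system_instruction="Be formal.",
)
print(translation_mode_problems(ok))
print(translation_mode_problems(bad))
```

Failing fast like this keeps agent-style settings from being sent to a model that, according to the article, does not act on them.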

Building with the Live API

Developers configure translation in the Live API session setup. You set a `translationConfig` block inside `generationConfig`. The `targetLanguageCode` field takes a BCP-47 code, e.g. "pl" or "es". BCP-47 is the standard format for language tags such as en or pt-BR. The field defaults to "en". The `echoTargetLanguage` boolean controls what happens to input that is already in the target language. When `true`, the model echoes that speech. When `false`, it stays silent. You can also enable `inputAudioTranscription` and `outputAudioTranscription` to receive text transcripts.
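As a rough illustration of the fields just described, the sketch below assembles a setup payload as a plain Python dictionary. Only the field names and the model ID come from the article. The outer `setup` envelope, the `models/` prefix, the placement of the two transcription fields next to `generationConfig`, and the use of empty objects to enable them are assumptions, as is the `False` default chosen here for `echo_target_language`. Check the Live API reference before relying on any of them.

```python
import json

MODEL_ID = "gemini-3.5-live-translate-preview"  # model ID quoted in the article


def build_setup_message(target_language_code="en",
                        echo_target_language=False,
                        transcripts=False):
    """Build a session setup payload for translation mode (structure partly assumed)."""
    setup = {
        "model": f"models/{MODEL_ID}",  # assumption: "models/" resource prefix
        "generationConfig": {
            "translationConfig": {
                # BCP-47 tag such as "pl", "es" or "pt-BR"; "en" is the stated default.
                "targetLanguageCode": target_language_code,
                # True: echo speech already in the target language. False: stay silent.
                "echoTargetLanguage": echo_target_language,
            }
        },
    }
    if transcripts:
        # Assumption: transcription is switched on by sending empty config objects.
        setup["inputAudioTranscription"] = {}
        setup["outputAudioTranscription"] = {}
    return {"setup": setup}  # assumption: first message is wrapped in "setup"


message = build_setup_message("es", echo_target_language=True, transcripts=True)
print(json.dumps(message, indent=2))
```

The payload would be serialized and sent as the first message of the session, before any audio.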

Audio formats are fixed. Input is raw 16-bit PCM at 16 kHz, mono, little-endian. Output is raw 16-bit PCM at 24 kHz, mono, little-endian. PCM is raw, uncompressed audio. You can send audio in chunks of 100 milliseconds. For client-side applications, ephemeral tokens on the `v1alpha` endpoint avoid exposing your API key.
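The chunk sizes follow directly from these formats: 16-bit mono PCM uses 2 bytes per sample, so 100 ms of 16 kHz input is 1,600 samples or 3,200 bytes, and 100 ms of 24 kHz output is 4,800 bytes. The sketch below works this out and splits a buffer into 100 ms chunks. The helper names and the synthetic test tone are illustrative only, and no network call is made.

```python
import math
import struct

INPUT_RATE_HZ = 16_000   # input sample rate from the article
OUTPUT_RATE_HZ = 24_000  # output sample rate from the article
BYTES_PER_SAMPLE = 2     # 16-bit mono PCM
CHUNK_MS = 100           # chunk duration mentioned in the article


def chunk_size_bytes(rate_hz, chunk_ms=CHUNK_MS):
    """Bytes occupied by chunk_ms of 16-bit mono PCM at rate_hz."""
    return rate_hz * chunk_ms // 1000 * BYTES_PER_SAMPLE


def iter_chunks(pcm, rate_hz=INPUT_RATE_HZ, chunk_ms=CHUNK_MS):
    """Yield consecutive chunks of pcm; the final chunk may be shorter."""
    size = chunk_size_bytes(rate_hz, chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


def sine_pcm(freq_hz, seconds, rate_hz=INPUT_RATE_HZ):
    """Generate a quiet sine tone as little-endian signed 16-bit samples."""
    n = int(seconds * rate_hz)
    samples = (
        int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * i / rate_hz))
        for i in range(n)
    )
    return b"".join(struct.pack("<h", s) for s in samples)  # "<h" = little-endian int16


tone = sine_pcm(440, 0.25)  # 250 ms of audio stands in for microphone input
chunks = list(iter_chunks(tone))
print(chunk_size_bytes(INPUT_RATE_HZ), chunk_size_bytes(OUTPUT_RATE_HZ))  # 3200 4800
print([len(c) for c in chunks])  # [3200, 3200, 1600]
```

In a real client each chunk would be sent as it is captured rather than collected into a list first.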

Dimension	Live agent	Live translation
Typical role	Assistant that listens, reasons, and acts	Interpreter / real-time translation pipeline
Interaction	Turn-based, with interruption handling	Continuous stream processing, no turns
Tools	Function calling, Google Search, system instructions	Translation only, no tools or instructions
Input	Text, audio, video, and image	Audio only, for strict latency
Settings	Generation, speech, tools, instructions	`targetLanguageCode` and `echoTargetLanguage`

Use cases

The model targets live interpretation across several settings. Google lists multilingual calls, meetings, classes, and broadcasts. Developer platforms reduce the integration work for real-time media. Agora, Fishjam, LiveKit, Pipecat, and Vision Agents already integrate the Live API. These platforms handle the complex infrastructure for real-time media streaming, which lets developers focus on the user experience.

Google’s example app demonstrates multilingual dubbing and interpretation. Grab is testing the model for driver-passenger communication in minivans. Grab users make more than 10 million voice calls per month. CJ ENM, LiveKit, and others have reported positive feedback on quality, accuracy, and low latency.

How Google Meet and Translate change

According to Google’s official announcement, Google Meet will soon use Gemini 3.5 Live Translate for speech translation. The table compares Meet before and after the change, as described by Google.

Capability	Previous Meet	With Gemini 3.5 Live Translate
Languages	5	70+
Language combinations per meeting	Only to and from English	2,000+ combinations
Access	Existing interface	Updated interface for quicker access
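Google does not say how the 2,000+ figure is counted. One plausible reading, offered here as an assumption, is that it counts unordered pairs among the supported languages. A short check with exactly 70 languages:

```python
from math import comb


def unordered_pairs(n_languages):
    """Number of language pairs when A-B and B-A count as one pair."""
    return comb(n_languages, 2)


def directed_pairs(n_languages):
    """Number of language pairs when A-to-B and B-to-A count separately."""
    return n_languages * (n_languages - 1)


print(unordered_pairs(70))  # 2415
print(directed_pairs(70))   # 4830
```

Under this reading, 70 languages give 2,415 unordered pairs, which is consistent with "2,000+". Counting each direction separately would give 4,830.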

The Meet update is available in private preview to Workspace business customers this month and will roll out more widely later this year. In the Translate app, live translation works with any connected headphones and preserves the speaker’s tone across more than 70 languages. Android also gains a listening mode: you hold the phone to your ear as on a normal call, and the translated audio plays through the earpiece without being heard by others.

Key takeaways

Gemini 3.5 Live Translate is Google’s latest audio model for live speech-to-speech translation across more than 70 languages.
It streams continuously rather than turn by turn, staying a few seconds behind the speaker.
Developers configure it through the Live API using `targetLanguageCode` and `echoTargetLanguage`; it is audio only, with 16 kHz input and 24 kHz output.
It is rolling out in the Gemini Live API, Google Meet (5 → 70+ languages), and the Translate app.
All generated audio carries an imperceptible SynthID watermark so that it can be detected.
