Gemini API Adds Streaming for 3.1 Flash TTS

Creative & Media | Jun 17, 2026

Google now supports streaming speech generation for gemini-3.1-flash-tts-preview, making Gemini TTS more practical for low-latency narration, app voiceovers, and responsive audio workflows.

Google has added streaming speech generation for gemini-3.1-flash-tts-preview in the Gemini API. The June 17 release note says developers can stream TTS through streamGenerateContent, with stream: true support in the Interactions API. The change matters most for apps where waiting for a full audio file makes the experience feel slow: narration tools, learning apps, voice previews, accessibility readers, and media-production assistants.
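As a rough illustration of what a streamGenerateContent call could look like at the REST level, the sketch below builds a request and decodes audio from server-sent event lines. The URL pattern, the `alt=sse` query parameter, the voice name "Kore", and the `inlineData` response fields are assumptions carried over from the general generateContent REST shape; the release note does not spell them out, so check the current API reference before relying on them. The helper names `build_request` and `iter_audio_chunks` are hypothetical.

```python
import base64
import json
from typing import Iterable, Iterator, Tuple

MODEL = "gemini-3.1-flash-tts-preview"


def build_request(text: str, voice_name: str = "Kore") -> Tuple[str, dict]:
    """Return (url, json_body) for a streamed TTS request.

    The endpoint shape and body fields are assumptions, not confirmed
    by the release note.
    """
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{MODEL}:streamGenerateContent?alt=sse"
    )
    body = {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice_name}}
            },
        },
    }
    return url, body


def iter_audio_chunks(sse_lines: Iterable[str]) -> Iterator[bytes]:
    """Yield decoded audio bytes from server-sent event lines.

    Lines that are not `data:` events, and events without inline audio,
    are skipped.
    """
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        for candidate in event.get("candidates", []):
            parts = candidate.get("content", {}).get("parts", [])
            for part in parts:
                inline = part.get("inlineData")
                if inline and inline.get("data"):
                    yield base64.b64decode(inline["data"])
```

In practice the lines would come from an HTTP client that iterates over the response body, and each decoded chunk could be handed to a player or appended to a buffer as it arrives.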

Key takeaways

Streaming TTS is now supported for gemini-3.1-flash-tts-preview and newer Gemini TTS models.
Google lists streamGenerateContent as the API route for streamed speech generation.
The earlier Gemini 3.1 Flash TTS launch positioned the model around controllable speech, audio tags, multi-speaker dialogue, and broad language support.
Teams should still treat the model as a preview dependency and test latency, chunk handling, and retry behavior before putting it into production.
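One way to put numbers on the latency and chunk-handling checks in the last takeaway is to time any iterable of audio chunks. This is a minimal sketch: `measure_stream` and `StreamStats` are hypothetical names, and the chunk source is whatever streaming client you already use.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class StreamStats:
    first_chunk_s: Optional[float]  # None if the stream produced nothing
    total_s: float
    chunks: int
    total_bytes: int


def measure_stream(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Consume a chunk stream and record time-to-first-chunk and totals."""
    start = clock()
    first: Optional[float] = None
    count = 0
    size = 0
    for chunk in chunks:
        if first is None:
            first = clock() - start
        count += 1
        size += len(chunk)
    return StreamStats(first, clock() - start, count, size)
```

Time to first chunk is the figure that best reflects how responsive playback feels; total time is closer to what a non-streaming call would cost. The injectable `clock` only exists to make the helper testable.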

Practical LinkLoot angle

For creators and tool builders, this moves Gemini TTS from "generate a file, then play it" toward more responsive voice workflows. A useful setup is to generate short script segments with your writing model, pass approved text into Gemini TTS, and stream audio into a preview player while the rest of the script is still being prepared.
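The segment-then-stream setup described above can be sketched as follows. `split_script` and `stream_script` are hypothetical helpers, and `synthesize` stands in for whatever function wraps your streaming TTS call; none of these names come from the Gemini API.

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple


def split_script(script: str, max_chars: int = 200) -> List[str]:
    """Group sentences into segments of at most max_chars where possible.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    segments: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


def stream_script(
    segments: Iterable[str],
    synthesize: Callable[[str], Iterable[bytes]],
) -> Iterator[Tuple[int, bytes]]:
    """Lazily yield (segment_index, audio_chunk) pairs in script order.

    A segment is only requested once the previous one has been consumed,
    so later segments can still be edited while earlier audio plays.
    """
    for index, text in enumerate(segments):
        for chunk in synthesize(text):
            yield index, chunk
```

Because `stream_script` accepts any iterable, `segments` can itself be a generator fed by the writing step, which is what lets the preview start before the full script exists.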

Option	Best use	Limitation	Source
Gemini API TTS streaming	Low-latency narration and voice previews	Preview model; validate chunk reliability	Gemini API release notes
Non-streaming TTS output	Final export where complete audio matters more than speed	Higher perceived wait time	Gemini TTS docs and launch context
Live API audio	Interactive voice conversations	Different workflow from exact text-to-speech rendering	Google TTS product positioning

The strongest workflow is editorial: draft the script, lock the exact transcript, then stream a preview for timing and tone checks. For final delivery, keep a non-streaming render path as a fallback until your own tests show the streamed path is stable enough for your audience.
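A fallback path like the one described above might look like the sketch below. `render_final`, `stream_fn`, and `full_fn` are hypothetical names for your own wrappers around the streamed and non-streaming calls, and the retry count is an arbitrary starting point rather than a recommendation from Google.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class RenderResult:
    audio: bytes
    path: str  # "streamed" or "non-streaming"
    errors: List[str] = field(default_factory=list)


def render_final(
    text: str,
    stream_fn: Callable[[str], Iterable[bytes]],
    full_fn: Callable[[str], bytes],
    stream_attempts: int = 2,
) -> RenderResult:
    """Try the streamed path a few times, then fall back to a full render."""
    errors: List[str] = []
    for attempt in range(1, stream_attempts + 1):
        try:
            # Buffer the whole stream: final delivery needs complete audio.
            audio = b"".join(stream_fn(text))
        except Exception as exc:  # broad on purpose; narrow it in real code
            errors.append(f"attempt {attempt}: {exc}")
            continue
        if audio:
            return RenderResult(audio, "streamed", errors)
        errors.append(f"attempt {attempt}: empty stream")
    return RenderResult(full_fn(text), "non-streaming", errors)
```

Keeping the recorded errors alongside the audio makes it easier to judge, over time, whether the streamed path is stable enough to drop the fallback.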

What to verify before you act

Check whether your SDK version exposes streaming speech generation cleanly, because release notes can land before SDK wrappers make the feature ergonomic. Test long passages, multilingual scripts, and multi-speaker prompts separately; voice quality and chunk timing can fail on these in ways that a short demo sentence never reveals. Also verify your disclosure and watermarking requirements if the audio is public-facing or used in ads, training, or customer support.

FAQ

Which Gemini TTS model supports streaming?

Google lists streaming support for gemini-3.1-flash-tts-preview and newer Gemini TTS models.

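What API method should developers check first?

Start with streamGenerateContent in the Gemini API, or stream: true when using the Interactions API.

Is this a replacement for the Gemini Live API?

No. TTS is for controlled text-to-audio rendering, while Live API workflows target interactive audio experiences.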

For more AI audio and automation ideas, keep an eye on LinkLoot's workflow hub: /guides/ai-workflow-automation.

Sources & links

References, demos, and supporting links.

Gemini API release notes (ai.google.dev, primary source)
Google Gemini 3.1 Flash TTS announcement (blog.google)