Bug Report: gemini-3.1-flash-tts-preview returns 500 INTERNAL on every request (TTS unusable)


`gemini-3.1-flash-tts-preview` returns `500 INTERNAL "Internal error encountered."` for **100% of single-speaker text-to-speech requests** as of 2026-05-28 ~18:00 UTC. The request passes argument validation (the server spends ~2.7s before failing), so this looks like a server-side failure inside the audio-generation path, not a malformed request.

## Environment

- Model: `models/gemini-3.1-flash-tts-preview` (confirmed present in ListModels, `supportedGenerationMethods=[generateContent, countTokens, batchGenerateContent]`)

- Endpoint: `POST https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-tts-preview:generateContent` (also reproduced on `v1alpha`)

- Auth: AI Studio API keys (free tier), multiple independent keys/projects

- Clients: reproduced with both `@google/genai` 1.52.0 (Node) **and** raw `fetch`/cURL (so not an SDK issue)

- Observed: 2026-05-28T18:04:46 GMT (server `date` header); `server-timing: gfet4t7; dur=2771`

## Minimal reproduction

```bash
curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-tts-preview:generateContent?key=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{ "parts": [{ "text": "This is a minimal reproduction for a bug report." }] }],
    "generationConfig": {
      "responseModalities": ["AUDIO"],
      "speechConfig": { "voiceConfig": { "prebuiltVoiceConfig": { "voiceName": "Aoede" } } }
    }
  }'
```

## Expected

HTTP 200 with `candidates[0].content.parts[0].inlineData` containing base64 audio (`audio/L16;codec=pcm;rate=24000`).
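
For reference, this is roughly how I would turn a successful response part into a playable file once the endpoint recovers. It is only a sketch: the helper names `rate_from_mime` and `inline_data_to_wav` are my own, and it assumes the payload is mono, 16-bit PCM in the byte order a WAV container expects, which I have not been able to verify while every request fails.

```python
import base64
import io
import wave


def rate_from_mime(mime_type: str, default: int = 24000) -> int:
    """Read the rate=... parameter from a MIME type like audio/L16;codec=pcm;rate=24000."""
    for param in mime_type.split(";")[1:]:
        key, _, value = param.strip().partition("=")
        if key.lower() == "rate" and value.isdigit():
            return int(value)
    return default  # fall back to the rate named in the expected MIME type


def inline_data_to_wav(inline_data: dict) -> bytes:
    """Wrap the base64 PCM from an inlineData part in a WAV container."""
    pcm = base64.b64decode(inline_data["data"])
    rate = rate_from_mime(inline_data.get("mimeType", ""))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono
        wav.setsampwidth(2)   # assumption: 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

Given a parsed response, `inline_data_to_wav(response["candidates"][0]["content"]["parts"][0]["inlineData"])` would return bytes that can be written straight to a `.wav` file.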

## Actual

```
HTTP 500
{ "error": { "code": 500, "message": "Internal error encountered.", "status": "INTERNAL" } }
```

## Frequency / scope

- **12+ consecutive failures across 12 distinct API keys (separate projects)** — 0 successes.

- Reproduced with multiple voices (`Aoede`, `Kore`, `Zephyr`), varied neutral English text, and with/without `languageCode`.

- Server spends ~2.77s (`server-timing dur=2771`) before returning 500 → consistent with the model attempting generation and failing internally.
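
In case it helps anyone reproduce the counts above, here is a small Python sketch of the kind of probe loop involved. The function names (`build_body`, `server_timing_ms`, `probe`, `tally`) are hypothetical, it uses only the standard library, and it sends the same request shape and `?key=` query parameter as the cURL command. It is not the exact script behind the numbers in this report.

```python
import json
import re
import urllib.error
import urllib.parse
import urllib.request
from collections import Counter
from typing import Optional

URL = ("https://generativelanguage.googleapis.com/v1beta/models/"
       "gemini-3.1-flash-tts-preview:generateContent")


def build_body(text: str, voice: str) -> bytes:
    """Same request shape as the cURL reproduction, with a configurable voice."""
    payload = {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}}
            },
        },
    }
    return json.dumps(payload).encode("utf-8")


def server_timing_ms(header: Optional[str]) -> Optional[float]:
    """Pull the dur= value (milliseconds) out of a server-timing header."""
    match = re.search(r"\bdur=([0-9]+(?:\.[0-9]+)?)", header or "")
    return float(match.group(1)) if match else None


def probe(api_key: str, voice: str, text: str) -> tuple:
    """Send one request; return (HTTP status, server-side duration in ms or None)."""
    url = URL + "?" + urllib.parse.urlencode({"key": api_key})
    request = urllib.request.Request(
        url,
        data=build_body(text, voice),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return response.status, server_timing_ms(response.headers.get("server-timing"))
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions but still carry status and headers.
        return err.code, server_timing_ms(err.headers.get("server-timing"))


def tally(results) -> Counter:
    """Count how many attempts ended in each HTTP status."""
    return Counter(status for status, _ in results)
```

Calling `probe` once per key and voice and passing the collected tuples to `tally` gives a per-status count. Keep the request rate low, since the free-tier RPM limit on this preview model is tight.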

## What I ruled out

- **Not an SDK bug** — raw REST reproduces identically.

- **Not a key/project issue** — fails on every key tested.

- **Not a malformed payload** — the request reaches model-side validation:

  - adding `thinkingConfig` → `400 INVALID_ARGUMENT "Thinking is not enabled for this model"`

  - removing `speechConfig` → `400 INVALID_ARGUMENT "Request contains an invalid argument."`

  - i.e. the documented/required TTS shape is accepted, then 500s.

- **Not quota** — distinct from the `429 RESOURCE_EXHAUSTED` responses also seen on this preview model’s tight free-tier RPM.
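
To keep these cases apart when scanning many responses, something like the following Python helper can bucket each one. This is a sketch only: `classify` and its labels are names I made up, and the mapping covers just the three error shapes quoted in this report (`INVALID_ARGUMENT`, `RESOURCE_EXHAUSTED`, `INTERNAL`).

```python
import json


def classify(http_status: int, body_text: str) -> str:
    """Bucket a response by HTTP status and the error.status field, if present."""
    try:
        parsed = json.loads(body_text)
    except json.JSONDecodeError:
        parsed = {}
    error = parsed.get("error", {}) if isinstance(parsed, dict) else {}
    status = error.get("status", "") if isinstance(error, dict) else ""

    if http_status == 200:
        return "ok"
    if http_status == 400 or status == "INVALID_ARGUMENT":
        return "rejected_by_validation"  # the payload itself was refused
    if http_status == 429 or status == "RESOURCE_EXHAUSTED":
        return "quota"                   # rate limit, unrelated to this bug
    if http_status >= 500:
        return "server_error"            # the failure this report is about
    return "other"
```

With this bucketing, the reproduction request lands in `server_error`, while the `thinkingConfig` and missing `speechConfig` variants land in `rejected_by_validation`.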

Same error on every request.

Same issue. Please let us know when the bug is resolved.

I am having the same issue too.