`gemini-3.1-flash-tts-preview` returns `500 INTERNAL "Internal error encountered."` for **100% of single-speaker text-to-speech requests** as of 2026-05-28 ~18:00 UTC. The request passes argument validation (the server spends ~2.77s before failing), so this looks like a server-side failure inside the audio-generation path, not a malformed request.
## Environment
- Model: `models/gemini-3.1-flash-tts-preview` (confirmed present in ListModels, `supportedGenerationMethods=[generateContent, countTokens, batchGenerateContent]`)
- Endpoint: `POST https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-tts-preview:generateContent` (also reproduced on `v1alpha`)
- Auth: AI Studio API keys (free tier), multiple independent keys/projects
- Clients: reproduced with both `@google/genai` 1.52.0 (Node) **and** raw `fetch`/cURL (so not an SDK issue)
- Observed: 2026-05-28T18:04:46 GMT (server `date` header); `server-timing: gfet4t7; dur=2771`
## Minimal reproduction
```bash
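# GEMINI_API_KEY is a placeholder for an AI Studio API key.
curl -sS -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-tts-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{ "parts": [{ "text": "This is a minimal reproduction for a bug report." }] }],
    "generationConfig": {
      "responseModalities": ["AUDIO"],
      "speechConfig": { "voiceConfig": { "prebuiltVoiceConfig": { "voiceName": "Aoede" } } }
    }
  }'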
```
## Expected
HTTP 200 with `candidates[0].content.parts[0].inlineData` containing base64 audio (`audio/L16;codec=pcm;rate=24000`).
## Actual
```json
HTTP 500
{ "error": { "code": 500, "message": "Internal error encountered.", "status": "INTERNAL" } }
```
## Frequency / scope
- **12+ consecutive failures across 12 distinct API keys (separate projects)** — 0 successes.
- Reproduced with multiple voices (`Aoede`, `Kore`, `Zephyr`), varied neutral English text, and with/without `languageCode`.
- Server spends ~2.77s (`server-timing dur=2771`) before returning 500 → consistent with the model attempting generation and failing internally.
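
The scope check above (same payload, several voices) can be replayed with a short script. This is a rough sketch, not the exact script used for this report: `GEMINI_API_KEY` is an assumed environment variable name, and `build_payload` / `probe_voice` are hypothetical helper names. It prints one line per voice with the HTTP status and total request time.

```shell
# Sketch: replay the TTS request once per voice and log status plus timing.
# GEMINI_API_KEY, build_payload and probe_voice are illustrative names.
ENDPOINT="https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-tts-preview:generateContent"

# Build the request body for one voice. The text is fixed, so printf is
# enough here; switch to a JSON tool if the text may contain quotes.
build_payload() {
  printf '{"contents":[{"parts":[{"text":"This is a minimal reproduction for a bug report."}]}],"generationConfig":{"responseModalities":["AUDIO"],"speechConfig":{"voiceConfig":{"prebuiltVoiceConfig":{"voiceName":"%s"}}}}}' "$1"
}

# Send one request, discard the body, print "<voice> <http_status> <seconds>".
probe_voice() {
  curl -sS -o /dev/null \
    -w "$1 %{http_code} %{time_total}\n" \
    -X POST "$ENDPOINT" \
    -H "x-goog-api-key: ${GEMINI_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1")"
}

# Only touch the network when a key is set.
if [ -n "${GEMINI_API_KEY:-}" ]; then
  for voice in Aoede Kore Zephyr; do
    probe_voice "$voice"
  done
fi
```

Running it once per API key gives a status/timing table that can be compared against the `server-timing` value quoted above.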
## What I ruled out
- **Not an SDK bug** — raw REST reproduces identically.
- **Not a key/project issue** — fails on every key tested.
- **Not a malformed payload** — the request reaches model-side validation:
  - adding `thinkingConfig` → `400 INVALID_ARGUMENT "Thinking is not enabled for this model"`
  - removing `speechConfig` → `400 INVALID_ARGUMENT "Request contains an invalid argument."`
  - i.e. the documented/required TTS shape is accepted, then 500s.
- **Not quota** — distinct from the `429 RESOURCE_EXHAUSTED` responses also seen on this preview model’s tight free-tier RPM.
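
For anyone triaging their own runs, the distinctions above can be written down as a small lookup. This is only a sketch: `classify_status` is a hypothetical helper name, and the mapping reflects the responses observed in this report, not an official error taxonomy.

```shell
# classify_status is an illustrative helper; the mapping mirrors the
# status codes observed in this report only.
classify_status() {
  case "$1" in
    200) echo "ok: audio returned" ;;
    400) echo "rejected by validation (INVALID_ARGUMENT)" ;;
    429) echo "quota or rate limit (RESOURCE_EXHAUSTED)" ;;
    500) echo "server-side failure (INTERNAL)" ;;
    *)   echo "unclassified status $1" ;;
  esac
}
```

For example, `classify_status 429` separates the free-tier rate-limit responses from the `500 INTERNAL` failure this report is about.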