503 - with gemini Priority inference

daniel_lublinsky · May 22, 2026, 6:46am

Hey,

A few weeks ago Google released a solution for the 503 issue: if you pay twice the price, they put you in a fast lane, and at peak times you should still have access to the service.

We enabled priority because 503s are a big problem for us, but it just doesn't work.

It has never resolved the 503 issues, and the API is set up properly.

GitHub issue here:

github.com/googleapis/python-genai

GenAI service_tier - is it even working??

opened 05:02PM - 15 May 26 UTC

danielLublinsky

type: bug priority: p2

### Hey,

Gemini has `Priority inference` - https://ai.google.dev/gemini-api/docs…/priority-inference#how-to-use

We recently enabled `priority` for all of our AI uses. Sadly, I can't get it to perform; we still get:

```
503 UNAVAILABLE. {'error': {'code': 503, 'message': 'This model is currently experiencing high demand. Spikes in demand are usually temporary. Please try again later.', 'status': 'UNAVAILABLE'}}
```

The service_tier flag is set and the response acknowledges the setting. We are running `Tier 2` on the API key:

> Priority inference is available to [Tier 2 & Tier 3](https://ai.google.dev/gemini-api/docs/billing#about-billing) users across the GenerateContent API and Interactions API endpoints.

And still nothing. I know it's not 100% guaranteed to give a response at peak times, but in the few weeks it has been out it has been completely useless: it never helps avoid the 503 or speed up response times like they promise.

Am I possibly missing something? Did anyone else use this and actually get something out of it?

<details><summary>Environment details</summary>
<p>

- Programming language: python 3.14.4
- Package version: 2.1.0

</p>
</details>

Thanks!
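
Since the 503 message itself says to "try again later", one common stopgap while priority is not helping is to retry with exponential backoff and jitter. Below is a minimal sketch of that idea, not the SDK's own behavior. `ServiceUnavailable` and `call_with_backoff` are hypothetical names invented for illustration; in real code you would catch whatever exception your client library raises for a 503 and pass in a function that makes the actual request.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for the error a client raises on 503 UNAVAILABLE."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Run call(), retrying on ServiceUnavailable with exponential backoff.

    The wait before each retry is a random value between 0 and a cap that
    doubles on every failed attempt (full jitter), limited to max_delay.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except ServiceUnavailable:
            # Out of attempts: let the caller see the last error.
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Usage would look like `call_with_backoff(lambda: make_request(prompt))`, where `make_request` is your own (here hypothetical) wrapper around the API call. This does not fix the underlying capacity problem, it only smooths over short spikes.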

tuapuikia99 · May 22, 2026, 6:59am

Google should separate the hardware pools for pay-as-you-go API usage and subscription plans. API users pay much more and need an uninterrupted workflow.

Mahesh_Sutar · June 1, 2026, 6:26am

Hello @All,
Can you share your project ID via DM?
