503 “gemini-2.0-flash-thinking-exp-01-21” need support

Dear Google Team,

We recently conducted continuous request testing on the gemini-2.0-flash-thinking-exp-01-21 model. Until today, the model maintained a success rate of over 90%. However, today’s test showed a dramatic drop: the success rate fell to roughly 50% (97 of about 200 requests).

Test conditions:

  • Frequency: 1 request every 6.5 seconds
  • Total requests: ~200
  • Successful: 97
  • Failed: 103
    • 503 errors: 96
    • 443 errors: 7
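
For reference, the tally above can be checked with a short calculation. This is only a sketch of the arithmetic; the variable names are illustrative, and the run length is an estimate that assumes requests were sent at a steady 6.5-second interval.

```python
# Counts taken from the test run listed above.
successful = 97
failed_503 = 96
failed_443 = 7

failed = failed_503 + failed_443
total = successful + failed
success_rate = successful / total

# Assumption: requests were evenly spaced at one every 6.5 seconds.
interval_seconds = 6.5
approx_duration_minutes = total * interval_seconds / 60

print(f"total={total}, failed={failed}")
print(f"success rate: {success_rate:.1%}")
print(f"approx. run length: {approx_duration_minutes:.1f} min")
```

The 503 responses account for 96 of the 103 failures, so they are by far the dominant failure mode in this run.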

We understand the growing preference for the Gemini 2.5 model due to its advanced capabilities. However, many commercial projects — especially in the education sector — must carefully balance cost and performance. The Gemini 2.5 Pro model is unfortunately too expensive for us to sustain, while the performance of the standard 2.0 Flash models is significantly weaker.

Among the 2.0 Flash models, gemini-2.0-flash-thinking-exp-01-21 offers the best results. However, it is currently only available as a free-tier model with limited reliability, and no paid tier has been made available.

Cost comparison from our research project:
Each data point requires 4 interactions with the model. Using Gemini 2.5 Pro would result in a cost of approximately $0.20 per data point. A full report typically involves 600–800 data points, leading to a total cost of $120–160 per report. This is simply not feasible for an educational project.
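
The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch using only the approximate figures quoted in this post; the constant and function names are made up for illustration, and the per-interaction figure is simply the per-data-point cost divided by four.

```python
# Approximate figures quoted in the post (not official pricing).
COST_PER_DATA_POINT_USD = 0.20
INTERACTIONS_PER_DATA_POINT = 4


def report_cost(data_points: int) -> float:
    """Estimated cost in USD of a report with the given number of data points."""
    return data_points * COST_PER_DATA_POINT_USD


low = report_cost(600)
high = report_cost(800)
cost_per_interaction = COST_PER_DATA_POINT_USD / INTERACTIONS_PER_DATA_POINT

print(f"per interaction: ${cost_per_interaction:.2f}")
print(f"per report: ${low:.2f} to ${high:.2f}")
```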

Our development was based on the assumption that gemini-2.0-flash-thinking-exp-01-21 would soon become commercially available. Unfortunately, it now appears that Google has moved forward with 2.0 Pro instead, leaving us unable to proceed.

We sincerely urge Google to consider releasing a paid, stable version of gemini-2.0-flash-thinking-exp.
This would enable educational and cost-sensitive projects like ours to continue development and deliver meaningful outcomes.

Thank you very much for your attention and support.

Hi @hong_jackey,

Thanks for the detailed analysis. “gemini-2.0-flash-thinking-exp-01-21” is still in the experimental phase, which is why we sometimes see 503 and 529 errors. There has also been an issue since Friday that many people have reported. It has already been escalated to the team, and they are working on it. The success rate should end up better than before.

Yes, cost-wise Flash is more cost-effective than Pro because of its smaller model size.

The team is planning to release a stable version, but I am not sure whether it will be 2.0 Flash Thinking or 2.5 Pro. You should hear about a stable release soon.

It’s a good use case. I will definitely raise your concern with the team.

Thanks
