Any alternative to gemini-1.5-flash-8b now that it’s deprecated?

I’ve been using gemini-1.5-flash-8b since its price point was perfect for my use case. Now that it’s deprecated, I’m not sure what the best move is.

  • Is there any other active Gemini model with a similar price/performance ratio?

  • Or is there still a way to run gemini-1.5-flash-8b (maybe through Vertex AI or a different endpoint)?

Would appreciate any pointers.

Hi @lafif,
I believe the Gemini model that offers the best balance of price and performance is Gemini 2.5 Flash-Lite, which is built for low-latency, cost-effective, high-throughput tasks.
For a slightly more capable model that is still very cost-efficient, Gemini 2.5 Flash is the ideal choice for large-scale, low-latency, high-volume tasks that require some thinking and agentic capabilities.
Thank you

@Pannaga_J Thanks for the suggestion. I checked the pricing, and it looks like Gemini 1.5 Flash-8B was still significantly cheaper for my use case. Its input was around $0.0375 and output $0.15 per 1M tokens, while Gemini 2.5 Flash-Lite is $0.10 input and $0.40 output per 1M tokens.

The newer models seem faster and more capable, but for pure cost-efficiency, 1.5 Flash-8B still had a big advantage. Do you know if there’s any plan for a cheaper tier similar to 1.5 Flash-8B?

I am not sure whether Gemini 2.0 Flash-Lite works for your use case. It is our smallest and most cost-effective model, built for at-scale usage. Its pricing is:

Input price: $0.075 per 1M tokens
Output price: $0.30 per 1M tokens

Please take a look and see if it fits.
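
For anyone comparing these options, here is a rough sketch of the arithmetic using only the prices quoted in this thread. It assumes all figures are USD per 1M tokens and may be out of date. The token volumes are hypothetical placeholders, so swap in your own numbers.

```python
# Prices in USD per 1M tokens (input, output), as quoted in this thread.
# Assumption: these may be out of date; check the current pricing page.
PRICES = {
    "gemini-1.5-flash-8b": (0.0375, 0.15),
    "gemini-2.0-flash-lite": (0.075, 0.30),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token volumes."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical monthly workload, for illustration only.
INPUT_TOKENS = 500_000_000
OUTPUT_TOKENS = 50_000_000

for model in PRICES:
    cost = monthly_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f}")
```

Because both the input and output prices scale by the same factor, the ratio does not depend on the input/output mix: at the quoted prices, 2.0 Flash-Lite costs 2x what 1.5 Flash-8B did, and 2.5 Flash-Lite about 2.7x.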