When will gemini-flash-lite-preview be supported for batching? There shouldn't be a disconnect between the batching and non-batching options for that model.
400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'Do not support publisher model gemini-2.5-flash-lite-preview-09-2025', 'status': 'INVALID_ARGUMENT'}}
Thanks
Hi @OnQuestionAtaTime, welcome to the community!
I can see you are using a preview model. gemini-2.5-flash and gemini-2.5-flash-lite are available globally; for production usage, it's recommended to use the stable models.
Please try using those instead.
Thank you!
Thank you
So you're confirming that flash-lite and flash-lite-preview-09-2025 are now the same? That wasn't the case before.
Thanks
According to the web, there is a significant speed and cost improvement from using the preview. Can you please confirm whether the previews are the same as the globally available models, or when they will be made available?
Thank you
Hey,
They are not the same! What I meant is, you are using preview models, and gemini-2.5-flash and gemini-2.5-flash-lite are available globally as well.
Though preview models can be used in production as well, they come with more restricted rate limits and will be deprecated with at least two weeks' notice. Hence, I suggest using the stable models.
Please refer to the attached hyperlinks for complete details.
Thank you!
Thank you for your reply
But they can’t be used for batching and the previews offer significant improvements.
Hi,
Any updates on this, please? I am running batches, but the quality dropped considerably compared to using the API without batches, which makes the cost saving pointless.
You can easily compare the models' intelligence here:
https://artificialanalysis.ai/models/gemini-2-5-flash-lite-preview-09-2025
https://artificialanalysis.ai/models/gemini-2-5-flash-lite
It's a bit unfair that batching is unavailable for the preview version. Or maybe it's only available in some specific geographic regions.
Thanks
Trying to debug: generally the preview models are supported, but the error message doesn't seem to align with what we produce in the Gemini API; it might be that you are trying to call Vertex AI model serving. Vertex has a separate batch offering from the Gemini API:
Gemini API Batch: https://ai.google.dev/gemini-api/docs/batch-api
Vertex AI Batch: Batch inference with Gemini | Generative AI on Vertex AI | Google Cloud Documentation.
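As a side note on the Gemini API Batch docs linked above: batch input there is prepared as a JSONL file, one request per line. Below is a minimal, hedged sketch of building such a file offline (no API call is made); the `key`/`request` field names follow the linked Gemini API batch documentation, and the prompts are purely illustrative:

```python
import json

# Illustrative prompts; each becomes one line (one request) in the JSONL file.
prompts = ["Summarize the Gemini batch docs.", "List three uses of batching."]

lines = []
for i, text in enumerate(prompts):
    # "key" identifies the request in the batch results;
    # "request" mirrors a generateContent request body.
    lines.append(json.dumps({
        "key": f"request-{i}",
        "request": {"contents": [{"parts": [{"text": text}]}]},
    }))

with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines))
```

The resulting file would then be uploaded and referenced when creating the batch job, per the Gemini API Batch docs; Vertex AI's batch inference expects its own input format, which is one reason the two offerings are not interchangeable.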
Thanks for that. It’s clear now that Vertex AI for the model “gemini-2.5-flash-lite-preview-09-2025” does not support batching “yet”:
https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-lite#2.5-flash-lite-preview
While the Gemini API has batching available for the gemini-2.5-flash-lite-preview-09-2025 version:
https://ai.google.dev/gemini-api/docs/models#gemini-2.5-flash-lite-preview
I am not sure what the reason is, but the time to completion of a simple batch request is an order of magnitude faster on Vertex AI compared to the Gemini API, even though Vertex AI is part of the Google Cloud infrastructure.
Is Vertex AI more appropriate for scaling applications? If so, when do you plan to make the model choice for batching in Vertex AI less restrictive compared to the Gemini API, please?
Thank you