I’m experiencing a critical issue with the Gemini 2.5 Pro model via the Batch API. My jobs have been stuck in JOB_STATE_PENDING (and some in JOB_STATE_RUNNING) for more than 24 hours, with no output and no error messages.
Details of the issue:
Model: Gemini 2.5 Pro
Current Behavior: Our logs show successful batch_response_get requests (polling itself works), but the job status remains JOB_STATE_PENDING indefinitely (see the status-check sketch after these details).
Duration: 24+ hours (and counting).
Impact: This is stalling our production pipeline and data processing.
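For reference, this is roughly how we are checking status (a minimal sketch assuming the google-genai Python SDK; the job name below is a placeholder for one of our stuck jobs):

```python
from google import genai

client = genai.Client()  # reads the API key from the environment

# Placeholder name; real ones look like "batches/abc123..."
job = client.batches.get(name="batches/YOUR_JOB_ID")

# The get call itself succeeds, but the state never advances.
print(job.state.name)  # -> JOB_STATE_PENDING, even after 24+ hours
```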
I’ve noticed several other developers reporting similar issues recently (some mentioning delays of up to 4 days, with 2.5 Flash as well). This looks like a broader infrastructure bottleneck rather than an isolated request error.
Questions for the community/Google team:
Is there a known outage or a massive backlog for Batch processing in specific regions?
Should we keep these jobs running, or is it better to cancel and resubmit? (Resubmitting seems to lead to the same result; see the sketch after these questions for what we tried.)
Are there any internal timeout limits we should be aware of for Gemini 2.5 Pro batch jobs?
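On question 2, this is the cancel-and-resubmit flow we tried (a sketch assuming the google-genai Python SDK; the job and file names are placeholders for our own):

```python
from google import genai

client = genai.Client()

# Cancel the stuck job (placeholder name).
client.batches.cancel(name="batches/STUCK_JOB_ID")

# Resubmit the same requests as a fresh job.
new_job = client.batches.create(
    model="gemini-2.5-pro",
    src="files/OUR_REQUESTS_FILE",  # previously uploaded JSONL of requests
)
print(new_job.name, new_job.state.name)  # the new job also sits in JOB_STATE_PENDING
```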
I am seeing the same issue here. Gemini 2.5 Pro Batch API jobs have been stuck in JOB_STATE_PENDING for a very long time and are not progressing at all, with no output or clear error. Tasks that would normally finish in under 10 minutes are not completing. It’s affecting our workflow as well. Resubmitting has not helped so far. Any update from the Google team would be appreciated.
I even submitted a job to Gemini 2.5 Pro via the Batch API on Friday, and it is still stuck in the pending state; any new batch job I run now against the same model also stays pending.
I’m experiencing the same issue. I had multiple Batch API jobs running for over 5 days; one reached RUNNING but stayed there for more than 3 days. I canceled them all and resubmitted.
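For anyone doing the same cleanup, this is roughly what I ran to find and cancel the stalled jobs (a sketch assuming the google-genai Python SDK; adjust the states to taste):

```python
from google import genai

client = genai.Client()

# Cancel every batch job still stuck in PENDING or RUNNING.
for job in client.batches.list():
    if job.state.name in ("JOB_STATE_PENDING", "JOB_STATE_RUNNING"):
        print(f"cancelling {job.name} ({job.state.name})")
        client.batches.cancel(name=job.name)
```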
Has anyone tried switching models? Are there any that don’t have this problem?