Hi everyone,
We’re seeing an unexpected cost spike with Gemini Batch API image generation (via Vertex AI).
After investigating, it seems that in the batch output predictions.jsonl we sometimes get many responses for the same request key (20+ lines with an identical key in some cases), each containing a different generated image.
Example request (JSONL line)
We are not using any special generation config besides response_modalities=["IMAGE"].
This is a single line from the JSONL requests file:
{
"key": "…",
"request": {
"contents": [
{
"role": "user",
"parts": [
{
"file_data": {
"file_uri": "gs://.../image.png",
"mime_type": "image/png"
}
},
{
"text": "…(image editing prompt)…"
}
]
}
],
"generation_config": {
"response_modalities": ["IMAGE"]
}
}
}
Note: we do not set candidateCount. According to the docs, when unset it should default to 1:
candidateCount (Optional): Number of generated responses to return. If unset, defaults to 1.
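For reference, if pinning it explicitly were expected to help, the generation_config in each request line would presumably look like this (candidate_count added by hand; we have not verified whether the batch backend honors it):

```json
"generation_config": {
  "response_modalities": ["IMAGE"],
  "candidate_count": 1
}
```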
Batch Job creation (client is a google.genai Client configured for Vertex AI):

from pathlib import Path
from google.genai.types import CreateBatchJobConfig

batch_job = client.batches.create(
    model=model_name,
    src=jsonl_uri,
    config=CreateBatchJobConfig(dest=f"{bucket}/path/to/results/{Path(jsonl_uri).stem}"),
)
Observed Issue
In the batch output written to dest, we often get multiple output lines with exactly the same request key, each containing a different generated image response:
...
This results in extra generations and extra billing, even though we requested only one output per key.
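For anyone who wants to reproduce the count: this is roughly the check we run against a downloaded copy of the output file (the local file name is hypothetical; the top-level "key" field matches the batch output format shown above):

```python
import json
from collections import Counter

def count_keys(path):
    """Count how many output lines share each request key in a batch predictions JSONL file."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            counts[json.loads(line)["key"]] += 1
    return counts

# "predictions.jsonl" here is a local copy of the batch output (hypothetical path):
# duplicates = {k: n for k, n in count_keys("predictions.jsonl").items() if n > 1}
```

Any key with a count above 1 in that dict is a duplicated generation we were billed for.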
Is this a known issue / incident?
(We can share job IDs + sample input/output JSONL if needed.)