Hi. I have been testing the gemini-2.5-flash model for text generation through the REST API, but it frequently returns a strange response that contains no candidates and no finishReason.
The response has only the token-count fields, so I can't figure out why the candidates are missing.
I checked my API rate limits and I still have plenty of quota, so that isn't the cause. Is this a bug in the current gemini-2.5-flash model?
Hi @KichangKim, welcome to the forum.
Is it happening with the new gemini-2.5-flash-preview-05-20 or with gemini-2.5-flash-preview-04-17? If possible, could you share the prompt and the hyperparameter configuration?
It happens with gemini-2.5-flash-preview-05-20. Unfortunately the prompt is closed data, so I can't share it. The generation request options set only include_thoughts=false and thinkingBudget=0; everything else is left at its default (a rough sketch of the request appears after the prompt example below).
The input prompt is a combination of several JSON object representations followed by an instruction sentence, like this (the real prompt is in Japanese):
{
  "timestamp": "yyyy-MM-dd HH:mm:ss",
  "Type": "EventType",
  "Description": "Event description ...",
  and other properties ...
}
{
  "timestamp": "yyyy-MM-dd HH:mm:ss",
  "Type": "EventType",
  "Description": "Event description ...",
  and other properties ...
}
{
  "timestamp": "yyyy-MM-dd HH:mm:ss",
  "Type": "EventType",
  "Description": "Event description ...",
  and other properties ...
}
other JSON objects ...
Consider these events and perform the next actions using provided tools.
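Roughly, the request looks like the sketch below (field names follow the public generateContent REST schema; the API key, model name suffix, and prompt text here are placeholders rather than my real data, and the tool declarations are omitted):

import requests

# Placeholders: the real key and event data are not shown.
API_KEY = "YOUR_API_KEY"
URL = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/gemini-2.5-flash-preview-05-20:generateContent")

events_text = '{"timestamp": "...", "Type": "EventType", "Description": "..."}'
instruction = "Consider these events and perform the next actions using provided tools."

body = {
    "contents": [
        {"role": "user", "parts": [{"text": events_text + "\n" + instruction}]}
    ],
    "generationConfig": {
        # Thinking disabled, as described above.
        "thinkingConfig": {"includeThoughts": False, "thinkingBudget": 0}
    },
}

resp = requests.post(URL, params={"key": API_KEY}, json=body, timeout=60)
data = resp.json()

# The problematic responses contain only usageMetadata: no "candidates", no finishReason.
print(data.get("candidates"))
print(data.get("usageMetadata"))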
I also issue a text generation request and an image analysis request simultaneously (using the same API key and the same model).
My application flow is like this (a rough sketch follows the list):
1. Request image analysis in the background (it takes 5-10 s).
2. Issue text generation calls until the image analysis is complete.
3. Request text generation using the response from step 1 together with other events.
4. Repeat steps 1-3. After a while I randomly get empty responses.
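In code, the flow corresponds to something like this (a simplified sketch: the helper function, prompts, and image payload are placeholders, and error handling is omitted):

import concurrent.futures
import requests

API_KEY = "YOUR_API_KEY"
URL = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/gemini-2.5-flash-preview-05-20:generateContent")

def call_generate_content(parts):
    # One generateContent request; the same endpoint serves both the text and image work.
    body = {"contents": [{"role": "user", "parts": parts}]}
    resp = requests.post(URL, params={"key": API_KEY}, json=body, timeout=60)
    return resp.json()

with concurrent.futures.ThreadPoolExecutor() as pool:
    # Step 1: start the image analysis in the background (takes roughly 5-10 s).
    image_parts = [
        {"text": "Describe this image."},
        {"inlineData": {"mimeType": "image/png", "data": "<BASE64_IMAGE_DATA>"}},
    ]
    image_future = pool.submit(call_generate_content, image_parts)

    # Step 2: keep issuing text generation calls while the analysis runs.
    while not image_future.done():
        data = call_generate_content([{"text": "event JSON objects + instruction ..."}])
        if "candidates" not in data:
            # The failure mode: only usageMetadata, no candidates and no finishReason.
            print("empty response:", data.get("usageMetadata"))

    # Step 3: use the image analysis result in the next text generation call.
    follow_up = call_generate_content(
        [{"text": "event JSON objects + " + str(image_future.result())}])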
I am also running into a similar issue, where the Vertex AI API call returns successfully (HTTP 200), but the returned Python object (types.GenerateContentResponse) is malformed. I have implemented this type checking, where response is the object returned directly by the SDK client's generate_content() function:
# Guard against a missing response object.
if response is None:
    raise InvalidApiResponseStructureError(
        response, expected_type_name="types.GenerateContentResponse")

# Guard against a response that came back without a candidates list.
candidates = response.candidates
if not isinstance(candidates, list):
    raise InvalidApiResponseStructureError(
        candidates, expected_type_name="list[types.Candidate]")
The first type check always passes, but the second one does not. The candidates list that should be contained in the response as an attribute is sometimes missing (it is a NoneType object instead). Therefore this cannot be caused by any model call parameters (temperature, max_output_tokens limiting, safety settings, which I have set as loose as possible by deactivating as many options as possible, etc.).
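When the second check trips, the response object still carries a couple of fields that can hint at why no candidates were produced; a small diagnostic sketch, assuming the google-genai types (the print statements are illustrative, not part of my actual checks):

if response.candidates is None:
    # prompt_feedback carries a block_reason when the prompt itself was blocked.
    print("prompt_feedback:", response.prompt_feedback)
    # usage_metadata is usually still populated even for empty responses.
    print("usage_metadata:", response.usage_metadata)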
I have also verified that my google-genai SDK/package is the latest version; currently it is google-genai==1.18.0.
For further context, I am calling the gemini-2.0-flash-preview-image-generation model for image generation through the Vertex AI API, on a billable gcloud project/account. Here are all the model hyperparameters I explicitly modify in the API call:
max_output_tokens: 600
temperature: 0.7
top_p: 0.7
response_modalities: ['TEXT', 'IMAGE']
candidate_count: 1
http_options:
  timeout: 15000  # in milliseconds
safety_settings:
  categories: ['HARM_CATEGORY_HATE_SPEECH', 'HARM_CATEGORY_DANGEROUS_CONTENT', 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'HARM_CATEGORY_HARASSMENT']
  thresholds: ['OFF', 'OFF', 'OFF', 'OFF']
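For completeness, those values map onto the google-genai config object roughly as follows (a sketch; the project, location, contents, and client setup are placeholders, not my exact code):

from google import genai
from google.genai import types

# Placeholders: substitute your own project and location.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

config = types.GenerateContentConfig(
    max_output_tokens=600,
    temperature=0.7,
    top_p=0.7,
    response_modalities=["TEXT", "IMAGE"],
    candidate_count=1,
    http_options=types.HttpOptions(timeout=15000),  # milliseconds
    safety_settings=[
        types.SafetySetting(category=category, threshold="OFF")
        for category in [
            "HARM_CATEGORY_HATE_SPEECH",
            "HARM_CATEGORY_DANGEROUS_CONTENT",
            "HARM_CATEGORY_SEXUALLY_EXPLICIT",
            "HARM_CATEGORY_HARASSMENT",
        ]
    ],
)

response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents="Generate an image of ...",
    config=config,
)
# response.candidates is sometimes None here, despite the 200 status.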