Model Time Response HEAVY variation [3 FLASH]

This morning I made several API calls that took around 30 seconds.
Two hours later, the same call with the same configuration took about 5 minutes.
Is this normal?


Hi @David_Del_Vado_Cruz, welcome to the forum!

Thanks for reaching out to us. Could you please help us reproduce this issue by sharing the detailed steps you followed?


Hi,

I believe I’ve identified and resolved the issue.

The API call inputs were:

  • Two images (bytes)

  • A prompt

  • Structured output

To reduce the API response time, I changed the following:

Before:

cfg["response_schema"] = response_schema

where response_schema was a Pydantic model.

After:

cfg["response_schema"] = response_schema.model_json_schema()

Passing the JSON schema directly instead of the Pydantic object significantly improved the response time: it dropped from about 300 seconds to about 40 seconds, which is in line with the response time I saw at first.
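For anyone who wants to check whether the same change matters on their own setup, here is a minimal sketch of a timing harness that runs each variant a few times and compares the median wall-clock duration. The two `fake_call_*` functions are hypothetical placeholders that only sleep; they are not the real API request, and their sleep values are arbitrary. Replace them with your own request code, one using the Pydantic model and one using `response_schema.model_json_schema()`.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_call(fn: Callable[[], object], repeats: int = 3) -> List[float]:
    """Run fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def compare_latency(
    variants: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Return the median duration in seconds for each labelled variant."""
    return {
        label: statistics.median(time_call(fn, repeats))
        for label, fn in variants.items()
    }


# Hypothetical placeholders: swap in the real API request for each config.
# The sleep values are arbitrary and do not reflect real latencies.
def fake_call_with_pydantic_model() -> None:
    time.sleep(0.1)


def fake_call_with_json_schema() -> None:
    time.sleep(0.01)


if __name__ == "__main__":
    results = compare_latency(
        {
            "pydantic model": fake_call_with_pydantic_model,
            "json schema dict": fake_call_with_json_schema,
        },
        repeats=3,
    )
    for label, seconds in results.items():
        print(f"{label}: {seconds:.3f} s")
```

Using the median over several runs helps separate a consistent difference between the two configurations from one-off slow responses, which is useful given how much the response time varied between the morning and two hours later.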