Hi all!
I use the Gemini API to classify some texts with the model gemini-2.5-flash (chosen for its generous free tier and large context window). The instruction is approximately 189k tokens, and a decision usually adds another 5-20k tokens.
Approximately two weeks ago I started receiving HTTP error 429, and it is unclear to me why. As I wrote above, my prompts are shorter than the tokens-per-minute limit (250k tokens). Shorter prompts produce no error, though, so my guess is that these long prompts somehow violate the tokens-per-minute limit, but the reasons are unclear. Is it some kind of shadow ban (I received no e-mails about my limits being cut)?
I would genuinely appreciate any advice on how to deal with this. Should you need more information, please let me know.
If it helps, here is an extract of the R code I use (httr2 and readtext):
# Set model and create link to connect with
model <- "gemini-2.5-flash"
model_query <- paste0(model, ":generateContent")
url <- paste0("https://generativelanguage.googleapis.com/v1beta/models/", model_query)

# Create instruction
instruction_text <- readtext("instruction.docx")$text

# Extract text to be classified
decision <- readtext("decisionXXX.docx")$text
send_request <- function(link, prompt) {
  response <- request(link) |>
    req_url_query(key = api_key) |>
    req_headers("Content-Type" = "application/json") |>
    req_body_json(list(
      contents = list(
        parts = list(
          list(text = prompt)  # use the function argument, not the global `decision`
        )
      ),
      generationConfig = list(
        temperature = 0
      ),
      system_instruction = list(
        parts = list(
          list(text = instruction_text)
        )
      )
    )) |>
    req_perform()

  output <- response |>
    resp_body_json()
  output$candidates[[1]]$content$parts[[1]]$text
}
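To check whether my guess about token counts is right, a useful first step is to ask the API itself how many tokens a request consumes. This is a rough sketch using the v1beta `:countTokens` endpoint; it assumes the same `api_key`, `instruction_text`, and `decision` objects as above, and (as an approximation) simply pastes the instruction and the decision together rather than sending the system instruction separately:

```r
library(httr2)

# Hypothetical diagnostic: count tokens for one request before sending it.
count_url <- "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:countTokens"

token_count <- request(count_url) |>
  req_url_query(key = api_key) |>
  req_body_json(list(
    contents = list(
      parts = list(
        list(text = paste(instruction_text, decision))  # crude approximation of the full prompt
      )
    )
  )) |>
  req_perform() |>
  resp_body_json()

token_count$totalTokens  # compare this against the 250k tokens-per-minute limit
```

If `totalTokens` is close to 250k, even a single request per minute sits near the limit, which could explain intermittent 429s.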