Always getting a 500 internal error when the prompt is longer than 8k tokens

Hello there,
Is it possible to work with prompts longer than 8k tokens? I made a prompt that includes an audio transcription and some additional text information.
It can be anywhere between 30k and 100k tokens.
I saw a similar problem in I encounter frequently error 500 from "An internal error has occurred" with 8k input tokens - #3 by hAlhawasi.
So what's the point of having a 2 million token maximum if you can't even use 30k tokens? Or am I missing something?

Welcome to the forum. The 2 million refers to input tokens. The output token limit (the amount of content the large language model generates per request) is model-dependent; for Gemini-1.5-flash it is 8,192 (see, for example, Gemini API | Google AI for Developers).
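If you want to check those limits yourself, here is a minimal sketch, assuming the google-generativeai Python SDK and a placeholder API key:

```python
# Minimal sketch: query the per-model token limits with the
# google-generativeai SDK (the API key is a placeholder).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

info = genai.get_model("models/gemini-1.5-flash")
print("input token limit: ", info.input_token_limit)
print("output token limit:", info.output_token_limit)
```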

To generate longer output, you have to structure the requests as an outline/detail tree, where each leaf node is restricted to the output token limit. A cookbook sample that illustrates the concept is posted here - Google Colab.
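For illustration only, a rough sketch of that outline/detail idea, assuming the google-generativeai SDK and a hypothetical `source_text` variable holding your long input:

```python
# Rough sketch of the outline/detail approach: one request for the
# outline, then one request per outline item, so each response stays
# under the per-request output token limit.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

source_text = "..."  # hypothetical: your transcript plus extra context

outline = model.generate_content(
    "Produce a short numbered outline of the main sections in:\n" + source_text
).text

sections = []
for item in outline.splitlines():
    if not item.strip():
        continue
    detail = model.generate_content(
        f"Using this source:\n{source_text}\n\nWrite the full section for: {item}"
    ).text
    sections.append(detail)

long_document = "\n\n".join(sections)
```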

Hope that helps.

Okay, but I always get a 500 when I try to send ~40k tokens; when I sent 6-8k tokens, it worked fine.

To clarify: the tokens you send are counted as input tokens for the model; the responses you receive are counted as output tokens.
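As a quick check, assuming the same Python SDK, the usage_metadata on a response reports both counts separately:

```python
# Sketch: usage_metadata separates the prompt (input) tokens from the
# generated (output) tokens for a single request.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Summarize the plot of Hamlet in two sentences.")
print("input tokens: ", response.usage_metadata.prompt_token_count)
print("output tokens:", response.usage_metadata.candidates_token_count)
```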


Hi @juramoshkov, have you experimented with the Google AI Python SDK? I tested it with around 700k tokens as input, and it works perfectly fine. Please check out this LINK.
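Something along these lines, in case it helps (a sketch, not the exact script; the file name and prompt are placeholders):

```python
# Sketch of a long-input test with the Google AI Python SDK;
# the file name and prompt are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

with open("long_transcript.txt", encoding="utf-8") as f:
    long_text = f.read()

# Verify the size of the input before sending it.
print(model.count_tokens(long_text).total_tokens)

response = model.generate_content("Summarize this transcript:\n" + long_text)
print(response.text)
```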

Yes, I understand it

I've already tried the Python SDK in my project and it gives the same result, and I've also tested your link and got the same 500 error. Maybe it depends on the age of the account or the API key?

Could you please try with an alternate API key, if you have one?

Okay, I tried, thanks!

I tried with my friend's API key and got the same result. But he also created his API key today, so maybe there are some limits.

If possible, could you provide us with similar prompts?