Can I increase max_output_tokens

Can I change this:

generation_config = {
    "temperature": 2,
    "top_p": 0.95,
    "top_k": 40,
    "max_output_tokens": 8192,  # <-- can this be higher?
    "response_mime_type": "text/plain",
}


8192 appears to still be the maximum output length; setting it above that has no effect. A way to 'bypass' this limitation is prompt chaining: tell the model to output [CONTINUE] at the bottom of its response if the complete response requires more tokens than a single reply permits. Then send "continue" as your next message and the model should pick up where it left off (assuming the context window is large enough to include the previous chat history). A sketch of that loop follows.
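Here is a minimal sketch of the [CONTINUE] loop, assuming the google-generativeai Python SDK; the model name, prompt wording, and sentinel handling are illustrative assumptions, not something specified in the posts above.

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel(
    "gemini-1.5-flash",  # hypothetical model choice
    generation_config={
        "temperature": 2,
        "top_p": 0.95,
        "top_k": 40,
        "max_output_tokens": 8192,
        "response_mime_type": "text/plain",
    },
)

chat = model.start_chat()
reply = chat.send_message(
    "Write the full document. If you run out of room, end your reply "
    "with [CONTINUE] and wait for me to say continue."
)
parts = [reply.text]

# Keep requesting more output until the sentinel disappears.
while parts[-1].rstrip().endswith("[CONTINUE]"):
    reply = chat.send_message("continue")
    parts.append(reply.text)

full_text = "".join(p.replace("[CONTINUE]", "") for p in parts)

Each turn is still capped at 8192 output tokens, but the chat history carries the earlier chunks forward, so the concatenated result can be much longer than a single reply.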


Hi,

Welcome to the forum.

The limit is model-dependent; inspect the value of the outputTokenLimit property for the model you are using. Some earlier models have lower limits such as 1024, 2048, or 4096. Use the listModels functionality of an SDK, or call the REST API directly:

# Get a list of available models.
GET https://generativelanguage.googleapis.com/v1beta/models
x-goog-api-key: {{apiKey}}
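For the SDK route, here is a short sketch using the google-generativeai Python package; the attribute names below follow that package's Model type.

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Print each model's name next to its maximum output length.
for m in genai.list_models():
    print(m.name, m.output_token_limit)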