The documentation states that Gemini 2.5 Flash-Lite has a 1M-token context window.
In AI Studio, however, the limit shown is ~65K.
The API method (client.aio.models.list()) states the same thing:
Model Name: models/gemini-2.5-flash-lite-preview-06-17
Display Name: Gemini 2.5 Flash Lite Preview 06-17
Labels: None
Version: 2.5-preview-06-17
Model Config: {'alias_generator': <function to_camel at 0x00000161D4742340>, 'populate_by_name': True, 'from_attributes': True, 'protected_namespaces': (), 'extra': 'forbid', 'arbitrary_types_allowed': True, 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64', 'ignored_types': (<class 'typing.TypeVar'>,), 'validate_by_alias': True, 'validate_by_name': True}
Description: Preview release (June 11th, 2025) of Gemini 2.5 Flash Lite
Input Token Limit: 65536
Output Token Limit: 65536
Supported Actions: ['generateContent', 'countTokens', 'createCachedContent', 'batchGenerateContent']
Checkpoints: None
Endpoints: None
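For anyone who wants to check this themselves, here is a minimal sketch of how a listing like the one above can be produced. It assumes the google-genai Python SDK is installed and that the API key is stored in a GEMINI_API_KEY environment variable; the helper name describe_limits and the "flash-lite" name filter are illustrative choices of mine, not part of the SDK.

```python
import asyncio
import os


def describe_limits(model) -> str:
    """Format the token limits reported for one model entry."""
    return (
        f"Model Name: {model.name}\n"
        f"Input Token Limit: {model.input_token_limit}\n"
        f"Output Token Limit: {model.output_token_limit}"
    )


async def main() -> None:
    # Assumption: google-genai is installed (pip install google-genai).
    # Imported here so the helper above stays usable without the SDK.
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    # The async list call returns a pager that is iterated with "async for".
    pager = await client.aio.models.list()
    async for model in pager:
        # Illustrative filter: only show Flash-Lite variants.
        if "flash-lite" in (model.name or ""):
            print(describe_limits(model))
            print()


# Only call the API when a key is actually configured.
if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    asyncio.run(main())
```

This only prints what the models endpoint reports, so it can confirm the 65536 value but cannot tell whether the documentation or the endpoint is the one that is wrong.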
What’s the actual limit? Is that a bug?