Input token limits after Finetuning

Hello all,

I fine-tuned Gemini-1.5-Flash. When I tried the model afterwards, one test case failed with an error message saying that the maximum number of input tokens is 32k (instead of the 1,000,000-token limit of the base Gemini-1.5-Flash).
Is this normal behavior? And if so, can I do anything to increase the input token limit of the fine-tuned model?
Thank you in advance for your advice.

Dear Johannes Eilinghoff,

Thank you for raising this question. It’s a great point to clarify.

Yes, it is normal to see a reduced input token limit on a fine-tuned model. The lower limit appears to be a restriction that applies to tuned models in general, rather than something caused by your particular training setup.

While it’s not always possible to increase the input token limit back to the original value, there are a few things you can try:

  • Contact Google AI Support: They may be able to provide specific guidance or solutions for your fine-tuned model.
  • Experiment with Model Parameters: In some cases, adjusting parameters during the fine-tuning process might help mitigate the reduction in input size.

I hope this information is helpful. Please let me know if you have any other questions.

Best regards,
Mohd Ramlan

Hi @Johannes_Eilinghoff, welcome to the forum.

You are right, there is a limitation for fine-tuned models. You can go through the link below.

Currently there is no option to increase the input token limit for fine-tuned models.
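
Since the limit cannot be raised, one possible workaround is to split long inputs so that each request stays under it. Below is a minimal sketch of that idea, not an official recipe. The 32,000 figure is taken from the error message quoted above, `split_to_fit` is a made-up helper name, and `approx_token_count` is a hypothetical stand-in that just counts words. In practice you would pass in a counter backed by the API's own token counting, since real token counts differ from word counts.

```python
from typing import Callable, List

# Assumption: limit taken from the error message reported in this thread.
TUNED_INPUT_LIMIT = 32_000


def approx_token_count(text: str) -> int:
    """Hypothetical stand-in counter: one token per whitespace-separated word."""
    return len(text.split())


def split_to_fit(
    text: str,
    max_tokens: int = TUNED_INPUT_LIMIT,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Greedily pack words into chunks whose estimated size is <= max_tokens.

    Per-word counts are summed, which is exact for the word-based stand-in
    but only an approximation for a real tokenizer, so leave some headroom.
    A single word larger than max_tokens is emitted as its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for word in text.split():
        word_tokens = count_tokens(word)
        if current and current_tokens + word_tokens > max_tokens:
            # Adding this word would exceed the limit: close the chunk.
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(word)
        current_tokens += word_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be sent to the tuned model as a separate request. Whether splitting is acceptable depends on the task, because the model will not see the other chunks as context.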

Thanks.