I’m facing an issue while fine-tuning the gemini-1.5-flash-001-tuning model in Google AI Studio. The Loss/Epochs graph shows the model training for the full number of epochs with the loss decreasing as expected, yet the final status always shows “Failed”.
Here are the steps I’ve tried:
Adjusted the epoch count, learning rate, and batch size - still failed.
Ended fine-tuning early - this succeeded, and the partially trained model worked.
Used the same data and parameters with the gemini-1.0-pro-001 model - succeeded.
Could anyone help identify what might be causing the failure with the gemini-1.5-flash-001-tuning model? Any insights or suggestions would be greatly appreciated.
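If it helps anyone reproduce or dig into this, here is a rough sketch of checking the tuning job’s state outside the Studio UI (assuming the google.generativeai Python SDK; untested against my exact setup, and as far as I know it only reports the state without a more detailed failure reason):

```python
# Sketch: list tuned models and print their final state, since the
# Studio UI only shows "Failed" without details. Assumes the
# google.generativeai SDK and an API key in GOOGLE_API_KEY.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

for tuned in genai.list_tuned_models():
    # state is an enum such as CREATING, ACTIVE, or FAILED
    print(tuned.name, tuned.base_model, tuned.state)
```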
I am facing the same issue. When I reduced the number of examples to 500, it worked. When I increased it back to 2000 examples, it still worked yesterday, but today it’s failing again. The only change is that I added some text to each of the examples I used yesterday. Is there any token limit I don’t know about?
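If it is a size limit, the tuning docs list per-example limits in characters rather than tokens (as far as I can tell), so I’m going to check whether the text I added pushed some examples over. A rough sketch of that check (assuming the data is exported as JSONL with text_input/output fields; the file name and field names are placeholders for whatever your export looks like):

```python
# Sketch: report the longest input and output in the training data,
# to see whether any example grew past the per-example size limits.
# "training_data.jsonl" and the field names are placeholders.
import json

max_input_chars = 0
max_output_chars = 0

with open("training_data.jsonl", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        example = json.loads(line)
        in_len = len(example["text_input"])
        out_len = len(example["output"])
        if in_len > max_input_chars:
            max_input_chars = in_len
            print(f"line {line_no}: longest input so far ({in_len} chars)")
        if out_len > max_output_chars:
            max_output_chars = out_len
            print(f"line {line_no}: longest output so far ({out_len} chars)")

print("max input chars:", max_input_chars)
print("max output chars:", max_output_chars)
```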
I’ve had the same experience. With 1.5k examples in the Studio, the job trains to completion but then reports a failure. I’ve had one successful run through the API, which used 3 epochs instead of 5 or more; otherwise it just keeps failing.
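For reference, this is roughly what the API call that worked for me looks like (a sketch using the google.generativeai Python SDK; the training examples, display name, batch size, and learning rate below are placeholders, not my real values):

```python
# Sketch: launch the tuning job through the API instead of the Studio UI,
# with the lower epoch count that succeeded for me.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

operation = genai.create_tuned_model(
    source_model="models/gemini-1.5-flash-001-tuning",
    training_data=[
        {"text_input": "example input 1", "output": "example output 1"},
        {"text_input": "example input 2", "output": "example output 2"},
    ],
    display_name="flash-tuning-test",   # placeholder name
    epoch_count=3,                      # 3 epochs succeeded; 5+ kept failing
    batch_size=4,
    learning_rate=0.001,
)

# create_tuned_model returns a long-running operation; result() blocks
# until the job finishes and returns the tuned model (or raises on failure).
tuned_model = operation.result()
print(tuned_model.name, tuned_model.state)
```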