I have been using the free tier of the Gemini 2.0 Flash model until now, but I need to upgrade and tune the model for a few specific tasks. However, as I understand it, since free tuning of Gemini 1.5 Flash via the API was deprecated, only a few models can now be tuned, and only on Vertex AI. I still need to call these models through the API, and a few questions about pricing and some other issues come to mind. Any help will be appreciated:
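For context, my current setup is just the Gemini Developer API with an API key, roughly along these lines (a minimal sketch using the google-genai Python SDK; the prompt and key handling are placeholders):

```python
from google import genai

# Current free-tier usage: the Gemini Developer API, authenticated with an API key.
client = genai.Client(api_key="MY_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Summarize the following support ticket: ...",
)
print(response.text)
```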
- Model usage prices differ between official sources. For instance, here Gemini 2.0 Flash is listed at $0.10/1M input tokens and $0.40/1M output tokens, whereas on the Vertex AI side the same model is listed at $0.15/1M input and $0.60/1M output. Does the latter apply only when the model is used through the Vertex AI platform?
- Are these inference prices (whichever the correct ones turn out to be) applied equally to tuned and base models?
- If I tune a model using Vertex AI, will I be able to generate an API key for that model? (I have sketched the kind of call I have in mind after this list.)
- Following the previous logic, where Vertex AI services and pricing may be separate from those of the Gemini API, will the rate limits associated with each tier be the same as those indicated here?
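Regarding the API key question above: from what I can tell, a tuned Gemini model on Vertex AI is reached through its endpoint resource name and Application Default Credentials rather than a plain API key, so I am assuming the call would look roughly like the sketch below (project ID, region, and endpoint ID are placeholders; please correct me if this assumption is wrong):

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Vertex AI authenticates via Application Default Credentials
# (gcloud auth / a service account), not a Gemini API key.
vertexai.init(project="my-project-id", location="us-central1")

# Assumption: a tuned Gemini model is addressed by the endpoint
# resource name produced by the tuning job, not a public model alias.
tuned_model = GenerativeModel(
    "projects/my-project-id/locations/us-central1/endpoints/1234567890"
)

response = tuned_model.generate_content("Classify the following ticket: ...")
print(response.text)
```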
Thanks in advance!