Friends, I need urgent help. I cannot connect the model to my own website

Hello there. I fine-tuned a model in AI Studio and I can’t connect it to my own webpage. The documentation says OAuth2 validation is required for this, and I set that up, but the information is poorly organized. There is no information on how to use the access token in an API request, or if there is, I can’t find it. I have been dealing with this for a week. I would be very grateful if anyone can point me to information; a video or an article would both work.

Do we only make the request with the token, or do we also need an API key?

There have been recent changes, so if you’re calling the model, you can use just the API key. You use the key the same way as if you were calling the base model.
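For the Python folks following along: a minimal sketch of what that call looks like at the REST level, assuming the v1beta generativelanguage endpoint. The tuned-model ID and API key are placeholders, and the request is only built here, not sent, so the shape is easy to inspect:

```python
import json

# Hypothetical tuned-model ID and API key -- substitute your own.
MODEL = "tunedModels/my-tuned-model-abc123"
API_KEY = "YOUR_API_KEY"

def build_generate_request(model: str, api_key: str, prompt: str):
    """Build the URL and JSON body for a generateContent call.

    A tuned model is addressed the same way as a base model, except
    the model name starts with 'tunedModels/' instead of 'models/'.
    """
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"{model}:generateContent?key={api_key}"
    )
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(body)

url, body = build_generate_request(MODEL, API_KEY, "Hello!")
# POST `body` to `url` with Content-Type: application/json.
```

From here you would send the request with any HTTP client; the point is that the only difference from calling a base model is the model name in the path.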

I tried, but it didn’t work. I can access the Flash and 1.0 models, but not my own model.

Without seeing the code you’re using and the error you’re getting, it is difficult to diagnose. Make sure you’re using the correct model name.
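One common slip here is the model-name prefix: base models are addressed as `models/...`, while tuned models live under `tunedModels/...`. A tiny sketch of a guard for that; the prefixes match the Gemini API’s naming convention, but treat the helper itself as an illustration, not an SDK function:

```python
def resolve_model_name(name: str) -> str:
    """Return a fully qualified model name for the Gemini API.

    Base models live under 'models/', tuned models under 'tunedModels/'.
    A bare name like 'gemini-1.5-flash' is assumed to be a base model.
    """
    if name.startswith(("models/", "tunedModels/")):
        return name
    return f"models/{name}"

print(resolve_model_name("gemini-1.5-flash"))      # -> models/gemini-1.5-flash
print(resolve_model_name("tunedModels/my-model"))  # -> tunedModels/my-model
```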

Hey, so I am exploring the Gemini API and how I can prompt-train or fine-tune it for my requirements.
Suppose I fine-tune a specific model on Google AI Studio. You are saying that all I need to do to use that trained model via the Gemini API is provide its name, i.e. “gpt-3.5-turbo”, and my API key?

Welcome to the forums!

If you use Google’s AI Studio to fine-tune a Gemini model, you’ll be able to access that model using the AI Studio Gemini API. See the page on Fine-tuning with the Gemini API.

You can’t use a fine-tuned model from another platform, such as OpenAI.

The Vertex AI platform does let you tune and run other models (including Gemini and Gemma, but also those available on Hugging Face and elsewhere). See the Introduction to tuning page on the Vertex AI Generative AI platform.

Finally, if you are just looking to bring the model you have elsewhere and run it on Google’s platform, Cloud Run with GPUs is a great option that will spin up an endpoint on demand.

Thanks a lot for your detailed answer! Much appreciated! :smiley:

Hello again!

Sorry for bothering you again.
I have explored the Gemini API documentation and tutorials, and “fine-tuned” a model using prompts on Google AI Studio, then exported the prompt history as Python code.
Now I can successfully start the chat with all the context/prompts I provided in Google AI Studio, but I run out of free-tier resources, specifically the “GenerateContent free tier input token count limit per model per minute”, even after waiting a few minutes.
My guess is that this happens because all the “context” I provide before starting the chat counts toward the “input token limit” of each new message.

I wanted to ask:

  • How can I keep track of the “input token size”? Is there a method like “chat.current_token_size”?
  • Is my approach efficient, or is there a workaround to this?

Thank you!

You don’t specify which language you’re using, but if you examine the object returned from every call to Gemini, it includes the total number of tokens used in that call.

There is also a countTokens method available that will count how many tokens a request uses.
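There is no `chat.current_token_size`, but you can do the bookkeeping yourself. A minimal sketch of trimming old turns so the history stays under a token budget; `count_tokens` here is a crude local stand-in for the API’s countTokens call, so the trimming logic runs without a network round trip:

```python
def count_tokens(text: str) -> int:
    """Rough stand-in for the API's countTokens: ~1 token per word.
    Swap in the real call for accurate numbers."""
    return len(text.split())

def trim_history(history: list[str], budget: int) -> list[str]:
    """Drop the oldest turns until the total token count fits the budget."""
    kept = list(history)
    while kept and sum(count_tokens(t) for t in kept) > budget:
        kept.pop(0)  # drop the oldest turn first
    return kept

history = ["a b c", "d e", "f g h i"]
print(trim_history(history, 6))  # -> ['d e', 'f g h i']
```

The trade-off is that dropped turns are forgotten by the model, so you may want to summarize old turns instead of discarding them outright.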

There is no specific way to check what your currently available quota is, however.

There is no great approach to this except to detect when you get a 429 error, indicating resources have been exhausted, and try again. The usual suggestion is “exponential backoff”: the first time you get the error, wait 15 seconds, then 30 seconds, then 60, and so on, doubling each time.
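A minimal sketch of that retry loop in Python; `ResourceExhausted` here is a stand-in for whatever 429 exception your client library actually raises:

```python
import random
import time

class ResourceExhausted(Exception):
    """Stand-in for the SDK's 429 exception."""

def call_with_backoff(fn, max_retries=5, base_delay=15.0):
    """Retry `fn` on quota errors, doubling the wait each time.

    Delays follow the 15s, 30s, 60s, ... pattern, with a little
    jitter so many clients don't all retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except ResourceExhausted:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            delay = base_delay * (2 ** attempt)
            time.sleep(delay + random.uniform(0, delay / 10))

# Demo with a tiny base delay: fails twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ResourceExhausted()
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01))  # -> ok
```

With the real API you would wrap your generate call in `fn` and keep `base_delay` at something like 15 seconds, per the schedule above.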