This may be the wrong forum, and this may be a dumb question - up front, I am NOT a dev or coder, I'm using Gemini to code through my Google Workspace account. I downloaded 11,000 DEF 14A filings from the SEC and wrote a snippet using the Gemini API to pull about a dozen data points from each filing. I have ZERO idea how to price this - the tokens/text/cash/image/etc. and 200 models to choose from are basically another language to me. The code I wrote uses Gemini Flash (latest model). How do I price this? Help?
Hi @Free_Float_LLC - Are you trying to figure out the approximate price of your type of request across different models?
I would recommend running a few requests and checking the usage dashboard to see how much they would have cost you.
Alternatively, if you know the approximate number of input and output tokens, you can multiply them by the published per-token rates.
Lastly, I built this app with AI Studio to help you visualize it, if you can approximate the requests/tokens: https://gemini-cost-explorer-263230109139.us-west1.run.app
Lmk thoughts!
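To make the "multiply input and output tokens by the rates" route concrete, here is a minimal sketch. The per-million-token rates are assumptions based on published Gemini 2.5 Flash list pricing, and the example token counts are made up - check the current pricing page before trusting the numbers:

```python
# Rough per-request cost estimator for Gemini API calls.
# Rates below are ASSUMED Gemini 2.5 Flash list prices (USD per 1M tokens);
# verify them against the current pricing page.
INPUT_RATE_PER_M = 0.30   # input tokens
OUTPUT_RATE_PER_M = 2.50  # output tokens (billed at a higher rate)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the approximate USD cost of one request."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Illustrative example: one filing of ~100k input tokens producing
# ~2k tokens of JSON output.
cost = estimate_cost(100_000, 2_000)
print(f"${cost:.4f} per filing")
```

You can get the real input-token count for a given prompt ahead of time with the SDK's `count_tokens` call; output tokens you have to estimate or read off the dashboard after a trial run.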
Thanks for the response! The code I'm running sends a single prompt with 16 asks and JSON output, cycled across 11,000 filings - so roughly 200k individual asks embedded in the prompts. I ran it on the free tier (no billing info) and got through about 6 filings; the dashboard showed something like 2.5k requests and 800k tokens "used," I think. It's all Gemini 2.5 Flash.

Based on the pricing (your link), it's ~$700 to process them all ($0.30 per 1M tokens, 6 files processed at ~805k tokens...). Is that roughly correct? This token system is confusing to normal people because there's no clear translation from request to token - it's like we need an FX trading platform to understand it. Or example prompts with their costs... but I think that math might be right? Even your calculator I can't quite figure out. I'm a noob and not a dev at all (I was an analog musician), so your guidance is super helpful!
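For what it's worth, the back-of-envelope extrapolation in the post above can be written out directly. The "805k tokens over 6 filings" figure comes from the post; the rates are assumed Gemini 2.5 Flash list prices, and since the dashboard figure doesn't say how the tokens split between input and output, this brackets the cost rather than pinning it down:

```python
# Extrapolate total cost from a 6-filing trial run (figures from the post above).
trial_tokens = 805_000   # tokens consumed processing the trial
trial_files = 6
total_files = 11_000

tokens_per_file = trial_tokens / trial_files   # ~134k tokens per filing
total_tokens = tokens_per_file * total_files   # ~1.48B tokens overall

# Assumed Gemini 2.5 Flash list rates (USD per 1M tokens) -- verify current pricing.
low = total_tokens / 1e6 * 0.30    # everything billed at the input rate
high = total_tokens / 1e6 * 2.50   # everything billed at the output rate

print(f"{total_tokens / 1e9:.2f}B tokens -> ${low:,.0f} to ${high:,.0f}")
```

Because this workload is mostly input (a long filing in, a short JSON record out), the real cost should sit near the low end of that bracket.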
Unfortunately there is no way to predict the model's output exactly. You could try to bound the output length as part of your prompt, and disable thinking tokens and tools, to get more predictable usage - but there is still a degree of variability. You might have success asking your question to Gemini as well! Lmk if that helps.
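A minimal sketch of the kind of request settings meant here, written as a plain config dict (which the google-genai Python SDK accepts in place of its typed config objects). The limit values are illustrative, not recommendations:

```python
# Request settings that make Gemini 2.5 Flash usage more predictable.
# Values are illustrative; tune the output cap to your JSON schema.
config = {
    "max_output_tokens": 2048,                  # hard cap on billed output tokens
    "response_mime_type": "application/json",   # force JSON, no prose padding
    "thinking_config": {"thinking_budget": 0},  # disable thinking tokens on 2.5 Flash
}

# With the google-genai SDK this would be passed as (sketch, not run here):
# from google import genai
# client = genai.Client()
# resp = client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents=prompt,
#     config=config,
# )
```

Capping `max_output_tokens` and zeroing the thinking budget bounds the expensive output side of the bill; the input side is fixed by the filing itself.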
Ha! The grand irony is I had to have ChatGPT write the Google API code, because Gemini couldn't do anything but reference old, dead versions of the API, and the code kept failing. ChatGPT fixed it instantly. Needless to say I'm not sure I trust Gemini much... but I'll ask!
You may get better answers about current API versions from Gemini if you enable search grounding. Anyway, glad you got it fixed!
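If it helps, turning on Google Search grounding is a one-line config change. Sketched here in the same dict form the google-genai SDK accepts; note that grounded requests are billed differently than plain ones, so check pricing before using it at scale:

```python
# Sketch: config that enables Google Search grounding, so the model can
# consult current web results (e.g. up-to-date API docs) instead of
# relying only on its training data.
grounded_config = {
    "tools": [{"google_search": {}}],
}

# With the google-genai SDK this would be passed as (sketch, not run here):
# client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents="What is the current google-genai Python SDK syntax for ...?",
#     config=grounded_config,
# )
```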