Gemini Live 2.5 token counting - what is the expected cost of a long-running video session?

Hi, I’m using Gemini Live 2.5 (gemini-live-2.5-flash-preview) through the Live API, and I’d like to understand how token counting works and what the expected cost per hour of an audio/video session is.

I have a long-running session (with context compression and session resumption) with 1 fps video. I should be getting 258 video tokens per second * 3600 seconds per hour = 0.93M video tokens per hour. The dashboard on aistudio shows about 1M tokens per hour, which is consistent - but I’m billed for almost 10x that amount! I see about 9M tokens per hour on the console.cloud.google.com billing page.
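To make the arithmetic explicit, here is a minimal sketch of the estimate. The 258 tokens per frame, the 1 fps rate, and the roughly 9M billed tokens per hour are the figures quoted in this post; the function and variable names are only illustrative.

```python
# Back-of-the-envelope check of the numbers quoted in this post.
VIDEO_TOKENS_PER_FRAME = 258   # figure quoted in the post
FRAMES_PER_SECOND = 1          # 1 fps video stream
SECONDS_PER_HOUR = 3600


def video_tokens_per_hour(tokens_per_frame: int = VIDEO_TOKENS_PER_FRAME,
                          fps: int = FRAMES_PER_SECOND) -> int:
    """Video tokens streamed in one hour at a constant frame rate."""
    return tokens_per_frame * fps * SECONDS_PER_HOUR


expected_tokens = video_tokens_per_hour()
billed_tokens = 9_000_000      # approximate value seen on the billing page
discrepancy = billed_tokens / expected_tokens

print(f"expected: {expected_tokens:,} tokens/h")
print(f"billed:   {billed_tokens:,} tokens/h")
print(f"ratio:    {discrepancy:.1f}x")
```

This only reproduces the estimate; it does not explain where the extra billed tokens come from.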

I double-checked against what my app records in usage_metadata: the video token count increases by 258 per second, which is consistent with my estimate and should result in about 1M tokens per hour.

Why could this discrepancy be happening? What is the expected cost of a single long-running audio+video session for gemini-live-2.5-flash-preview - should it be around $3/h or more like $30/h?
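For reference, this is how I get from token counts to the two dollar figures. It is a rough sketch: the rate of $3 per 1M tokens is my assumption, implied by "about 1M tokens per hour costs about $3", and is not taken from an official price list.

```python
# Rough hourly cost from a token rate. The price below is an ASSUMPTION
# implied by the "$3/h at ~1M tokens/h" figure in this post, not an
# official price.
ASSUMED_USD_PER_MILLION_TOKENS = 3.0


def hourly_cost_usd(tokens_per_hour: float,
                    usd_per_million_tokens: float = ASSUMED_USD_PER_MILLION_TOKENS) -> float:
    """Cost of one hour of streaming at a flat per-token price."""
    return tokens_per_hour / 1_000_000 * usd_per_million_tokens


estimated_cost = hourly_cost_usd(258 * 3600)   # what my own token count implies
billed_cost = hourly_cost_usd(9_000_000)       # what the billing page implies

print(f"estimated: ${estimated_cost:.2f}/h")
print(f"billed:    ${billed_cost:.2f}/h")
```

Audio tokens and model output are not included here, so the real figure for an audio+video session would be somewhat higher in both cases.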

I would appreciate any help with this!

Hi @ai-and-i ,

Welcome to the forum!
For any billing-related issues, please contact our dedicated support team here: Get Cloud Billing support | Support Documentation | Google Cloud

Hi, thanks for your quick reply! I did have a chat with the billing support team and they redirected me here.

I believe this question is more about token counting than billing. Specifically, how are tokens counted for Live API requests?

Hi @ai-and-i ,

My apologies for the delayed response.
Please have a look at these links: https://ai.google.dev/gemini-api/docs/tokens?lang=python and Vertex AI Pricing | Google Cloud.