How can I track token usage when streaming content with Gemini?

I’m building a Node.js application that uses Gemini’s streaming content API, and I’m trying to figure out how to track token usage for streamed responses.

Specifically, I’m using the generateContentStream method from the @google/genai SDK:

const { GoogleGenAI, ThinkingLevel } = require("@google/genai");

My use case requires knowing how many tokens are consumed so I can apply usage limits and enforce quotas for users. With OpenAI’s streaming APIs, this is straightforward because the SDK includes token usage and metadata as part of the streamed response.

However, I can’t find any way to access token usage information when using Gemini’s generateContentStream. This feels like a critical feature for production applications, especially when implementing billing, rate limiting, or usage caps.

Am I missing something, or is token usage simply not exposed for streaming responses in Gemini yet? If there’s a recommended workaround or best practice, I’d really appreciate any guidance.

I don’t think you are missing a feature; the implementation is just slightly different from OpenAI’s pattern. Gemini provides the metadata as a final ‘envelope’ once the stream is exhausted. With the older @google/generative-ai SDK, after iterating through result.stream you call await result.response. This doesn’t trigger a new API call; it simply returns the aggregated result of the stream, which includes the usageMetadata object containing your promptTokenCount, candidatesTokenCount, and totalTokenCount. With the newer @google/genai SDK you’re importing, generateContentStream returns an async iterable of chunks directly, and the usageMetadata object is populated on the final chunk of the stream.
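To make that concrete, here is a minimal sketch. It assumes the newer @google/genai SDK, where generateContentStream yields chunks and the final chunk carries the aggregated usageMetadata; the model name, API-key handling, and the latestUsage helper are illustrative assumptions, not part of the SDK.

```javascript
// Helper: scan chunks and keep the most recent usageMetadata seen.
// In practice the final chunk of the stream carries the aggregated totals.
function latestUsage(chunks) {
  let usage = null;
  for (const chunk of chunks) {
    if (chunk.usageMetadata) usage = chunk.usageMetadata;
  }
  return usage;
}

// Sketch of live usage against the API (requires a valid key, so it is
// shown commented out; "gemini-2.0-flash" is a placeholder model name):
//
// const { GoogleGenAI } = require("@google/genai");
// const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
//
// async function run() {
//   const stream = await ai.models.generateContentStream({
//     model: "gemini-2.0-flash",
//     contents: "Hello",
//   });
//   let usage = null;
//   for await (const chunk of stream) {
//     process.stdout.write(chunk.text ?? "");
//     if (chunk.usageMetadata) usage = chunk.usageMetadata; // final chunk wins
//   }
//   // usage.promptTokenCount, usage.candidatesTokenCount,
//   // usage.totalTokenCount are now available for quota enforcement.
//   console.log(usage.totalTokenCount);
// }

module.exports = { latestUsage };
```

Because the totals only arrive at the end, you can’t enforce a hard cap mid-stream from usageMetadata alone; you record the counts once the stream completes and apply the quota to subsequent requests.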


Thanks a lot! I just checked, and you’re right: the usageMetadata is provided at the end of the stream.