Token usage calculation with Google ADK and Gemini-2.5-flash-native-audio-dialog

Hi @fengdog,
prompt_token_count is the number of input tokens, and prompt_tokens_details breaks that count down by modality. I think the modality=TEXT entry is the text tokens of your instructions (the system prompt), and the modality=AUDIO entry is the user's audio input.
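
To make the field mapping concrete, here is a minimal sketch of how the counts could be pulled apart. It does not call the Live API: the `usage` object is a hypothetical stand-in built with `SimpleNamespace`, the numbers are made up, and `summarize_usage` is a helper name I invented for illustration. It only assumes the usage metadata exposes prompt_token_count, prompt_tokens_details (entries with `modality` and `token_count`), and response_token_count for the output side.

```python
from types import SimpleNamespace


def summarize_usage(usage):
    """Split a usage-metadata-like object into input (by modality) and output counts."""
    by_modality = {}
    for detail in getattr(usage, "prompt_tokens_details", None) or []:
        # The modality may be an enum member or a plain string; normalise to a name.
        name = getattr(detail.modality, "name", str(detail.modality))
        by_modality[name] = by_modality.get(name, 0) + (detail.token_count or 0)
    return {
        "input_total": getattr(usage, "prompt_token_count", None) or 0,
        "input_by_modality": by_modality,
        # Output tokens: response_token_count is the field to look for here.
        "output_total": getattr(usage, "response_token_count", None) or 0,
    }


# Hypothetical values, only to show the shape of the data.
usage = SimpleNamespace(
    prompt_token_count=150,
    prompt_tokens_details=[
        SimpleNamespace(modality="TEXT", token_count=100),   # instructions / system prompt
        SimpleNamespace(modality="AUDIO", token_count=50),   # user audio input
    ],
    response_token_count=80,
)

summary = summarize_usage(usage)
print(summary)
```

If output_total comes back as 0 on real data, that usually means the output field is not being read (or was not populated), which is worth checking before summing costs.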

It seems to me you are missing the output tokens. For the Native Audio model the field name is response_token_count. For token usage with the Native Audio model, refer to these docs:

and

Anyway, I think there are problems with the token count for the Native Audio model. It has been 2 months since I reported this issue: Gemini Live API Reports Triple Prompt Token Consumption

But I haven’t received any response. I hope Google can provide an answer as soon as possible.

I hope I was helpful.

Ciao