I’m trying out the new models through the Gemini API. I usually work with OpenAI, but I like the Flash model. Here is my question: I sent 100+ test requests today, and the token_count in the response object is always zero. Shouldn’t it be the number of tokens in the response?
Welcome to the forum.
The API specification for Candidate (see the Candidate reference page on Google AI for Developers) includes the field tokenCount, of type integer. This field is not marked optional, so the API as specified is supposed to tell the client how many tokens the model’s reply contains.
The current v1beta implementation does not populate this field. Since that’s a difference between specification and implementation, it’s a bug. According to at least one Google engineer, it’s a known bug. Tagging @Josh_Gordon_Google who can likely confirm.
Hopefully the bug will be addressed before GA. As it stands, to keep track of tokens properly, a client has to send the last response back to the server in a countTokens request to find out how many tokens were added to the history. Clients have started implementing wonderfully imaginative workarounds to keep that extra server traffic from eating into their rate quota.
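For what it’s worth, here is a rough Python sketch of that fallback: trust token_count when it is non-zero, otherwise count the reply text explicitly. The helper name reply_token_count is my own invention, not part of any SDK. The wiring in the trailing comments assumes the google-generativeai Python package and a model name of "gemini-1.5-flash"; adjust both to whatever you are actually using.

```python
from typing import Callable, Optional


def reply_token_count(
    reported: Optional[int],
    reply_text: str,
    count_tokens: Callable[[str], int],
) -> int:
    """Return the reply's token count, falling back to an explicit count.

    reported:     the candidate's token_count as returned by the API
                  (0 or None when the field is not populated).
    count_tokens: any callable mapping text to a token count, for example
                  a thin wrapper around the countTokens endpoint.
    """
    if reported:
        # A non-zero value means the server populated the field.
        return reported
    # Field is missing or zero: pay for one extra countTokens request.
    return count_tokens(reply_text)


# Assumed wiring with the google-generativeai SDK (not executed here):
#
#   import google.generativeai as genai
#   model = genai.GenerativeModel("gemini-1.5-flash")
#   response = model.generate_content("Hello")
#   n = reply_token_count(
#       response.candidates[0].token_count,
#       response.text,
#       lambda text: model.count_tokens(text).total_tokens,
#   )
```

Note that each fallback call is still a separate request and counts against your rate quota. Once the field is populated server-side, the helper stops making the extra call on its own.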