Usage not showing for OpenAI compatibility

I tried using the OpenAI compatibility layer for Gemini models with this example code:

from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()

client = OpenAI(
    api_key=os.getenv("GOOGLE_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# client = OpenAI(
#     base_url="https://api.groq.com/openai/v1",
#     api_key=os.getenv("GROQ_API_KEY"),
# )

response = client.chat.completions.create(
    model="gemini-1.5-flash",
    # model="llama-3.1-8b-instant",
    n=1,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Hello. How are you?"
        }
    ],
)

print(response)

For the Gemini models, I’m getting usage as None:

ChatCompletion(id=None, choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I'm doing well, thank you for asking! How are you today?\n", refusal=None, role='model', function_call=None, tool_calls=[]))], created=1732190641, model='gemini-1.5-flash', object='chat.completion', service_tier=None, system_fingerprint=None, usage=None)

But when I tried the same with Groq models, I got the usage:

ChatCompletion(id='chatcmpl-be6d3092-f0bb-4dc9-b07c-404e405971f4', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Hello. I'm doing well, thank you for asking. I'm a large language model, so I don't have feelings in the same way that humans do, but I'm functioning properly and ready to help with any questions or tasks you may have. What can I assist you with today?", refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1732190576, model='llama-3.1-8b-instant', object='chat.completion', service_tier=None, system_fingerprint='fp_9cb648b966', usage=CompletionUsage(completion_tokens=60, prompt_tokens=47, total_tokens=107, completion_tokens_details=None, prompt_tokens_details=None, queue_time=0.008496746000000001, prompt_time=0.005131413, completion_time=0.08, total_time=0.085131413), x_groq={'id': 'req_01jd79n7k9f02am9bj39e9ftv0'})

Why is the usage not showing for Gemini?

It probably hasn't been added to the response yet; the same goes for the id value.
Furthermore, the response shows that the returned role is still model instead of the standard assistant, which eventually breaks OpenAI client libraries that expect the standard role values.
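Until the endpoint fills in these fields, a defensive accessor can keep downstream code from crashing on the missing usage and the non-standard role. A minimal sketch, assuming the response objects have the shapes shown in the outputs above (the SimpleNamespace objects below only stand in for the OpenAI client's response types; summarize is a hypothetical helper, not part of any SDK):

```python
from types import SimpleNamespace

def summarize(response):
    """Tolerate usage=None and normalize the non-standard 'model' role."""
    message = response.choices[0].message
    # Gemini's compatibility endpoint currently returns role='model';
    # map it to the OpenAI-standard 'assistant' so downstream code works.
    role = "assistant" if message.role == "model" else message.role
    # usage may be None for Gemini, so guard before reading total_tokens.
    total_tokens = response.usage.total_tokens if response.usage is not None else None
    return {"role": role, "content": message.content, "total_tokens": total_tokens}

# Stand-in for a Gemini-style response (usage=None, role='model'):
gemini_like = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(role="model", content="Hi!"))],
    usage=None,
)
print(summarize(gemini_like))
```

This is just a stopgap so token accounting degrades to None instead of raising an AttributeError; once the endpoint returns the standard fields, the helper passes them through unchanged.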
