I am confused on pricing for Nano Banana Pro - I am getting ~2000 output tokens, instead of 1120, for a 1K image

I am using the gemini-3-pro-image-preview model with the aspect ratio set to 1:1 and the resolution set to 1K, and I get the 1K image back as expected. It's the output token count that is confusing me: I am getting 1850-2000 output tokens, while the Gemini pricing for this model says a 1K image should cost 1120 tokens. I am also getting some thought tokens, reported separately.

Why am I getting closer to 2000 tokens, as the pricing page suggests for a 4K image, on a 1K image that should be 1120 tokens?


Hi William!

Apologies for the confusion with the token count on your recent image generations. It definitely looks a bit strange to see a 1K image hitting the 2000 token mark. I would love to help get to the bottom of this.

The total output token count is the sum of two distinct parts of the generation process:

  1. Fixed Image Tokens: For a 1K resolution image, the model always uses 1120 tokens to render the final pixels.

  2. Dynamic Thinking Tokens: Gemini 3 Pro uses a reasoning step to “think” through the composition, lighting, and prompt adherence. This process typically generates between 700 and 900 extra tokens.

As for the 4K pricing: 2000 tokens is the base rate for a 4K image, but a 4K image would also have its own thinking tokens on top of that, pushing its total count higher. You should still only be charged the 1K image rate for the 1120 tokens. The additional thinking tokens are billed at the standard (and much cheaper) text output rate.

You can also check the usage_metadata field in your API response. It will list a thought_token_count separately. If you add that number to 1120, it should align perfectly with the total output you are seeing.
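As a rough way to run that check, here is a minimal Python sketch. It does not call the API: `Usage` is a hypothetical stand-in for the values you read out of usage_metadata, its field names are assumptions, and the numbers in the usage lines are made up for illustration rather than taken from this thread.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    # Hypothetical stand-in for values copied from usage_metadata.
    # The field names here are assumptions, not the API's own names.
    output_tokens: int
    thought_tokens: int


def unexplained_output_tokens(
    usage: Usage,
    image_tokens: int = 1120,
    thoughts_in_output: bool = True,
) -> int:
    """Return output tokens not explained by the fixed image cost.

    If thoughts_in_output is True, thought tokens are assumed to be
    counted inside the output total; otherwise they are assumed to be
    reported on top of it.
    """
    explained = image_tokens
    if thoughts_in_output:
        explained += usage.thought_tokens
    return usage.output_tokens - explained


# Illustrative numbers only (not from a real response).
sample = Usage(output_tokens=1920, thought_tokens=800)
print(unexplained_output_tokens(sample))                            # thoughts counted inside output
print(unexplained_output_tokens(sample, thoughts_in_output=False))  # thoughts counted separately
```

A result of 0 under the first assumption would match the explanation above. A large positive result under the second assumption would mean the extra output tokens are not accounted for by the separately reported thought tokens.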

Alisa,

Thanks for the reply! I am logging all usage_metadata for each image generation. Here are some examples for generations of 1K images at various aspect ratios (16:9 and 1:1):

image

image

image

I also did a 4K generation test and got this:

image

Thought tokens are recorded separately from output tokens, but as you can see, all of the 1K generations are ~800 output tokens above the quoted 1120 for a 1K image, and the 4K image is 700 output tokens above the quoted 2000 tokens for a 4K image. These numbers are all straight from the usage_metadata and not my own math errors.

Any help would be greatly appreciated!

William

Hi William! Thank you for the details on this. We are debugging on our end and will get back to you with some more info.

Thanks Alisa, I appreciate it. If you need any more details, let me know.

I haven’t seen anyone else report this issue, so I am still uncertain if it is an API issue or something on my end.

It’s very important for me because I am trying to price out an image generation service I am working on. The expected cost per image was $0.13, but it is now coming in at about $0.23.
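To make the gap concrete, here is a small Python sketch of the two billing interpretations discussed in this thread. Both rates are assumptions on my part (USD per 1M tokens, chosen to be roughly consistent with the $0.13 figure for 1120 tokens) and should be replaced with the values from the current pricing page. The 1920-token total is an illustrative number in the range I am seeing.

```python
# Assumed rates in USD per 1M output tokens; check the pricing page.
IMAGE_RATE_PER_M = 120.0  # assumption: image output tokens
TEXT_RATE_PER_M = 12.0    # assumption: text/thinking output tokens


def cost_all_at_image_rate(output_tokens: int) -> float:
    """Cost if every output token is billed at the image rate."""
    return output_tokens * IMAGE_RATE_PER_M / 1_000_000


def cost_split(output_tokens: int, image_tokens: int = 1120) -> float:
    """Cost if only the fixed image tokens are billed at the image rate
    and any remaining output tokens are billed at the text rate."""
    billed_image = min(output_tokens, image_tokens)
    billed_text = max(output_tokens - image_tokens, 0)
    return (
        billed_image * IMAGE_RATE_PER_M + billed_text * TEXT_RATE_PER_M
    ) / 1_000_000


observed = 1920  # illustrative total output tokens for one 1K image
print(f"quoted 1K image:       ${cost_all_at_image_rate(1120):.4f}")
print(f"all at image rate:     ${cost_all_at_image_rate(observed):.4f}")
print(f"split image/text rate: ${cost_split(observed):.4f}")
```

Under these assumed rates, billing the whole output total at the image rate lands near the $0.23 I am seeing, while the split described earlier in the thread would land much closer to the expected $0.13.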