This is a very important question for developers.
Hey, I am using context caching and it looks good so far, but I have a question about Gemini pricing. There is a limit of 4 million tokens processed per minute. Say each query carries 500k tokens and I have 500 users: in a single minute Gemini can then process only 8 requests and will block the rest, because the 4 million tokens for that minute are already used up. So here is my question:
“Will my context-caching tokens count toward this 4 million or not?”
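For reference, here is roughly my setup. This is a minimal sketch with the google-generativeai SDK; the model name, TTL, and file path are placeholders for my real values. As far as I can tell, the response's usage_metadata reports a cached_content_token_count alongside the prompt_token_count, which is exactly why I'm unsure how the rate limiter counts them:

```python
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")

# Cache the large shared context once (placeholder file, model, and TTL).
big_context = open("big_context.txt").read()
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    contents=[big_context],
    ttl=datetime.timedelta(minutes=60),
)

# Every per-user request reuses the cache and only sends a short prompt.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("short per-user question")

# usage_metadata reports how many prompt tokens were served from the cache.
# My question: do those cached tokens also draw from the 4M-per-minute budget?
print(response.usage_metadata.prompt_token_count)
print(response.usage_metadata.cached_content_token_count)
```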
Let's take two situations:
1st: I send 1 million tokens as fresh input, so a minute covers only 4 requests. If 100 requests come in, only the first 4 return and the rest are rejected.
2nd: the same 1 million tokens are context cached. If 100 requests come in, will it process all 100, or still only the first 4? (Rough arithmetic below.)
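Put as back-of-the-envelope arithmetic (my own sketch; it assumes the limit is enforced per calendar minute, and the ~1,000-token fresh prompt in situation 2 is just an assumed figure):

```python
TPM_LIMIT = 4_000_000  # tokens per minute
REQUESTS = 100

# Situation 1: the full 1M tokens arrive as fresh input on every request.
tokens_billed_per_request = 1_000_000
served = min(REQUESTS, TPM_LIMIT // tokens_billed_per_request)
print(served)  # 4, so the other 96 requests get rejected that minute

# Situation 2: the 1M tokens sit in the cache and only a short prompt
# (say ~1,000 tokens) is fresh. IF cached tokens are exempt from TPM:
tokens_billed_per_request = 1_000
served = min(REQUESTS, TPM_LIMIT // tokens_billed_per_request)
print(served)  # 100, every request fits within the minute's budget
```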
This directly affects my request handling.
For example: even if the pricing page says Gemini can handle 1,000 requests per minute per user, with user prompts of 50,000 tokens it will actually serve only 80 requests, not 1,000, because of the 4-million-token limit. A non-technical person would simply conclude it cannot handle the 1,000 requests the pricing page promises. That hits the user experience directly.
On the other side, if context-cached tokens are not part of the 4 million, then it can easily serve far more than 80 requests, which directly improves how many user requests the AI can handle.
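The same calculation, generalized (again just a sketch: the 1,000 RPM and 4M TPM figures are from the pricing page, while the ~1k-token fresh prompt is my assumption):

```python
def effective_rpm(rpm_limit: int, tpm_limit: int, billed_tokens_per_request: int) -> int:
    """Requests actually served per minute: the tighter of the two limits wins."""
    return min(rpm_limit, tpm_limit // billed_tokens_per_request)

# If cached tokens DO count toward TPM, 50k-token prompts cap me at 80 RPM:
print(effective_rpm(1_000, 4_000_000, 50_000))  # 80

# If cached tokens DON'T count and only ~1k fresh tokens are billed,
# the advertised 1,000 RPM actually becomes reachable:
print(effective_rpm(1_000, 4_000_000, 1_000))   # 1000
```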