The Problem: Currently, it is difficult to distinguish between a “hanging” request and a high-latency response from complex models like Gemini 3.1 Pro. Without specific timing data, developers cannot effectively design system architecture for “agentic” workflows.
Specific Example: We are consistently seeing Gemini 3.1 Pro take over 60 seconds to complete a single response. Because the logs don’t show the breakdown of that minute (e.g., Prompt Processing vs. Thinking vs. Generation), we are struggling to:
- Set accurate client-side timeouts to avoid killing valid but slow requests.
- Trigger automated fallbacks to Gemini 3.1 Flash when Pro exceeds a specific latency threshold.
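To illustrate the fallback pattern described above, here is a minimal sketch of a latency-budgeted call with an automatic downgrade. The `call_model` function and the model names are placeholders for a real SDK client, and the 60-second budget is an assumption taken from the latency we are observing; this is a sketch of the pattern, not a production implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

PRO_TIMEOUT_S = 60.0  # assumed budget, tuned from observed Pro latency


def call_model(model_name: str, prompt: str) -> str:
    # Hypothetical stand-in for a real SDK call; replace with your client.
    time.sleep(0.01)  # simulate network + model latency
    return f"{model_name}: ok"


def generate_with_fallback(prompt: str, timeout_s: float = PRO_TIMEOUT_S):
    """Run the Pro model under a wall-clock budget; fall back to Flash
    if the budget is exceeded. Returns (model_tier, response_text)."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(call_model, "pro-model", prompt)
    try:
        return "pro", future.result(timeout=timeout_s)
    except FutureTimeout:
        # Budget blown: issue the faster model instead.
        return "flash", call_model("flash-model", prompt)
    finally:
        pool.shutdown(wait=False)  # don't block on the abandoned Pro call
```

The catch is that without a TTFT breakdown, `timeout_s` has to be a blind guess: a request that would have streamed its first token at 55 s is indistinguishable from one that hung at 0 s.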
Proposed Metrics:
- Time to First Token (TTFT): To measure initial responsiveness.
- Total Wall-Clock Time: Total duration from request to completion.
- Thinking Latency: Time spent generating “thought” tokens versus output tokens.
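Of these three metrics, the first two can at least be approximated client-side from a streaming response; a rough sketch (with `fake_stream` standing in for a real streaming SDK iterator) might look like:

```python
import time


def fake_stream(chunks, delay_s=0.01):
    # Hypothetical stand-in for a streaming response iterator from an SDK.
    for chunk in chunks:
        time.sleep(delay_s)
        yield chunk


def measure_stream_latency(stream):
    """Return (ttft_s, total_s, text) measured over a streamed response."""
    start = time.monotonic()
    ttft = None
    parts = []
    for chunk in stream:
        if ttft is None:
            # Time to First Token: first chunk arrival minus request start.
            ttft = time.monotonic() - start
        parts.append(chunk)
    # Total wall-clock time: request start to stream completion.
    total = time.monotonic() - start
    return ttft, total, "".join(parts)
```

Thinking latency, by contrast, cannot be derived client-side at all: thought tokens are consumed before the first output token arrives, so only server-side instrumentation can separate them, which is why we are requesting it as a first-class metric.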
Providing this transparency in AI Studio would allow us to benchmark model performance accurately before migrating to Vertex AI or production environments.