Gemini 3.5 Flash Low thinking option

Luiz_Gabriel_dos_San · May 23, 2026, 4:31am

Google AI Studio gives developers granular control over reasoning levels, and we are starting to see basic reasoning toggles roll out in the Gemini App. However, the default behavior for Gemini 3.5 Flash is burning through users’ quotas by forcing high-token thinking modes, completely undermining the model’s core value proposition.

According to recent benchmarks, such as the one shared by Logan Kilpatrick, Gemini 3.5 Flash in “Low” thinking mode scores less than a percentage point below rival flagships while remaining far cheaper to run ($0.65 vs. $6.31 per task).

Right now, users are being forced into higher thinking levels that bloat token usage. This contradicts the entire efficiency narrative pushed at Google I/O. The base API pricing already shows a severe generation-over-generation cost increase:

Gemini 2.5 Flash: $0.30 / 1M input tokens

Gemini 2.5 Pro: $1.25 / 1M input tokens

Gemini 3.5 Flash: $1.50 / 1M input tokens

Gemini 3.5 Flash costs more than the previous generation’s Pro model ($1.50 vs. $1.25 per million input tokens) and five times as much as Gemini 2.5 Flash ($0.30). Paired with unoptimized thinking modes that inflate token counts, it consumes quota like a flagship-tier model.
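To put the prices listed above in numbers, here is a minimal Python sketch. It uses only the input prices quoted in this post; the function names are my own, and output or thinking-token pricing is left out because I have no figures for it here.

```python
# Input prices in USD per 1M tokens, exactly as quoted in this post.
INPUT_PRICE_PER_M = {
    "Gemini 2.5 Flash": 0.30,
    "Gemini 2.5 Pro": 1.25,
    "Gemini 3.5 Flash": 1.50,
}


def price_ratio(model: str, baseline: str) -> float:
    """How many times more expensive `model` is than `baseline` (input tokens only)."""
    return INPUT_PRICE_PER_M[model] / INPUT_PRICE_PER_M[baseline]


def input_cost(model: str, tokens: int) -> float:
    """Input cost in USD for `tokens` input tokens at the quoted price."""
    return INPUT_PRICE_PER_M[model] * tokens / 1_000_000


if __name__ == "__main__":
    for baseline in ("Gemini 2.5 Flash", "Gemini 2.5 Pro"):
        ratio = price_ratio("Gemini 3.5 Flash", baseline)
        print(f"Gemini 3.5 Flash vs {baseline}: {ratio:.2f}x")
    # Prints:
    # Gemini 3.5 Flash vs Gemini 2.5 Flash: 5.00x
    # Gemini 3.5 Flash vs Gemini 2.5 Pro: 1.20x
```

So on input tokens alone, the new Flash is 5x the old Flash and 1.2x the old Pro, before any extra thinking tokens are counted.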

If Flash is supposed to be the economical option, we need the option to lock it into Low or Minimal thinking modes so we can control consumption in Antigravity. Otherwise, Google needs to introduce a true “Flash Lite” tier to fill the gap.
