According to recent benchmarks, such as the one shared by Logan Kilpatrick, Gemini 3.5 Flash on “Low” thinking mode performs less than a percentage point below rival flagships while costing roughly a tenth as much to run ($0.65 vs. $6.31 per task).
Right now, users are forced into higher thinking tiers that bloat token usage. This contradicts the efficiency narrative pushed at Google I/O. The base API pricing shows a steep generation-over-generation increase in input cost:
Gemini 2.5 Flash: $0.30 / 1M input tokens
Gemini 2.5 Pro: $1.25 / 1M input tokens
Gemini 3.5 Flash: $1.50 / 1M input tokens
Gemini 3.5 Flash costs more than the previous generation’s Pro model ($1.50 vs. $1.25 per million input tokens) and five times as much as Gemini 2.5 Flash ($1.50 vs. $0.30). When paired with unoptimized thinking modes that inflate token counts, it consumes quota like a flagship tier.
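To make the comparison concrete, here is a minimal Python sketch of the arithmetic behind those numbers. It uses only the figures quoted in this post; the dictionary and function names are made up for illustration and are not part of any Google API.

```python
# Input prices in USD per 1M tokens, exactly as quoted in this post.
INPUT_PRICE_PER_M = {
    "Gemini 2.5 Flash": 0.30,
    "Gemini 2.5 Pro": 1.25,
    "Gemini 3.5 Flash": 1.50,
}

# Per-task benchmark costs quoted in this post (USD).
LOW_THINKING_COST_PER_TASK = 0.65
RIVAL_FLAGSHIP_COST_PER_TASK = 6.31


def price_ratio(newer: str, older: str) -> float:
    """Return how many times more `newer` costs than `older` per input token."""
    return INPUT_PRICE_PER_M[newer] / INPUT_PRICE_PER_M[older]


flash_vs_old_flash = price_ratio("Gemini 3.5 Flash", "Gemini 2.5 Flash")
flash_vs_old_pro = price_ratio("Gemini 3.5 Flash", "Gemini 2.5 Pro")
flagship_vs_low = RIVAL_FLAGSHIP_COST_PER_TASK / LOW_THINKING_COST_PER_TASK

print(f"3.5 Flash vs 2.5 Flash (input): {flash_vs_old_flash:.1f}x")
print(f"3.5 Flash vs 2.5 Pro (input):   {flash_vs_old_pro:.1f}x")
print(f"Rival flagship vs Flash Low (per task): {flagship_vs_low:.1f}x")
```

On the quoted figures this works out to roughly 5x the old Flash input price, 1.2x the old Pro input price, and a per-task gap of about 9.7x in favor of Flash on Low. This covers input pricing only; output and thinking-token costs are not quoted here, so they are left out.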
If Flash is supposed to be the economical option, we need the option to lock it into Low or Minimal thinking mode to control consumption in Antigravity. Otherwise, Google needs to introduce a true “Flash Lite” tier to fill the pricing gap.

