Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

TL;DR · AI Summary
Google released Gemini 3.5 Flash at six times the price of its predecessor, yet deployed it across Search, AI Assistant, and enterprise tools—revealing a strategic shift toward internal model saturation over API monetization.
Key Takeaways
- Gemini 3.5 Flash costs $1.50/million input tokens and $9/million output tokens—6
- Benchmark costs show Gemini 3.5 Flash ($1,551.60) is more expensive than Gemini
- Google deploys the expensive model in free consumer products like Search and Gem
Outline
Jump quickly between sections.
Google launched Gemini 3.5 Flash directly to general availability at I/O 2026, bypassing preview, for global users, developers, and enterprises.
Gemini 3.5 Flash is priced 6x higher than 3.1 Flash-Lite, yet its operational cost exceeds even the more capable 3.1 Pro, raising efficiency concerns.
Despite high cost, Google integrates the model into Search, Gemini App, and Android Studio, prioritizing product experience over API monetization.
OpenAI and Anthropic also raised prices—GPT-5.5 and Claude Opus 4.7 cost over $3,000 and $5,000 per benchmark run, signaling a new pricing ceiling.
Supports 1M input tokens and Interactions API; SVG generation shows creative but structurally flawed outputs, revealing model limitations.
Mindmap
See how the topics connect at a glance.
查看大纲文本(无障碍 / 无 JS 友好)
- Gemini 3.5 Flash发布与战略意义
- 技术规格
- 模型ID: gemini-3.5-flash
- 输入token: 1,048,576 | 输出token: 65,536
- 知识截止: 2025年1月
- 定价与成本
- 价格是3.1 Flash-Lite的6倍
- 运行成本高于3.1 Pro Preview
- 战略部署
- 部署于Google搜索、Gemini App
- 集成至Android Studio与企业平台
Highlights
Key sentences worth saving and sharing.
Gemini 3.5 Flash is priced at $1.50/million input tokens and $9/million output tokens—six times the cost of Gemini 3.1 Flash-Lite.
Benchmark costs reveal Gemini 3.5 Flash ($1,551.60) is more expensive than Gemini 3.1 Pro Preview ($892.28), contradicting the 'Flash = low-cost' expectation.
Google deploys this expensive model in free consumer products like Search and Gemini App, suggesting its goal is product enhancement, not API revenue.
GPT-5.5 (xhigh) and Claude Opus 4.7 cost $3,357 and $5,117 respectively, showing all three major AI labs are raising pricing ceilings in unison.
19th May 2026
Today at Google I/O, Google released Gemini 3.5 Flash. This one skipped the -preview modifier and went straight to general availability, and Google appear to be using it for a whole lot of their key products:
3.5 Flash is available today to billions of people globally:
* For everyone via the Gemini app and AI Mode in Google Search
* For developers in our agent-first development platform Google Antigravity and Gemini API in Google AI Studio and Android Studio
* For enterprises in Gemini Enterprise Agent Platform and Gemini Enterprise.
As usual with Gemini, the most interesting details are tucked away in the What’s new in Gemini 3.5 Flash developer documentation. It mostly has the same set of platform features as the previous Gemini 3.x series, albeit with no computer use. The model ID is gemini-3.5-flash. The knowledge cut-off is January 2025, and it supports 1,048,576 input tokens and 65,536 maximum output tokens.
Google are also pushing a new Interactions API, currently in beta, which looks to me like their version of the patterns introduced by OpenAI Responses—in particular server-side history management.
#### The price has gone up
Gemini 3.5 Flash is accompanied by a notable price bump. The previous models in the “Flash” family were Gemini 3 Flash Preview and Gemini 3.1 Flash-Lite. The new 3.5 Flash is 3x the price of 3 Flash Preview and 6x the price of 3.1 Flash-Lite (see price comparison here).
At $1.50/million input and $9/million output it’s getting close in price to Google’s Gemini 3.1 Pro, which is $2 and $12.
The Gemini team promise that 3.5 Pro will roll out “next month”—presumably at an even higher price.
This fits a trend: OpenAI’s GPT-5.5 was 2x the price of GPT-5.4, and Claude Opus 4.7 is around 1.46x the price of 4.6 when you take the new tokenizer into account.
Given the price increase it’s interesting to see Google roll it out for so many of their own free-to-consumer products. It feels like all three of the major AI labs are starting to probe the price tolerance of their API customers.
Artificial Analysis publish the cost to run their proprietary benchmark against models, which is a useful way to take things like tokenization and increased volume of reasoning tokens into account. Some numbers worth comparing:
- Gemini 3.5 Flash (high): $1,551.60
- Gemini 3.1 Pro Preview: $892.28
- Gemini 3 Flash Preview (Reasoning): $278.26
- Gemini 3.1 Flash-Lite Preview: $93.60
Running the benchmark for 3.5 Flash (high) cost significantly more than 3.1 Pro Preview!
Here are some numbers from other vendors:
- Claude Opus 4.7 (Adaptive Reasoning, Max Effort): $5,117.14
- Claude Opus 4.7 (Non-reasoning, High Effort): $1,217.23
- GPT-5.5 (xhigh): $3,357.00
- GPT-5.5 (medium): $1,199.14
#### A pelican on a bicycle
I ran “Generate an SVG of a pelican riding a bicycle” against the Gemini API and got back this pelican, which is a _lot_:

From the code comments: <!-- Pelican Eye / Sunglasses (Cool Retro Aviators) -->
That pelican looks like it’s in Miami for a crypto conference.
That one cost me 11 input tokens and 14,403 output tokens, for a total cost of just under 13 cents.