Google announces 'Gemini 3.1 Flash Lite,' a fast, affordable, cost-effective AI model

Google announced the low-cost, high-speed AI model ' Gemini 3.1 Flash Lite ' on March 3, 2026. Gemini 3.1 Flash Lite is touted as the most cost-effective model in the Gemini 3 series, offering high performance at a low price and high speed.
Gemini 3.1 Flash Lite: Our most cost-effective AI model yet
The bar graph below compares the output speeds of 'Gemini 3.1 Flash Lite,' 'Gemini 2.5 Flash,' 'Gemini 2.5 Flash Lite,' 'GPT-5 mini,' 'Claude 4.5 Haiku,' and 'Grok 4.1 Fast.' While Gemini 3.1 Flash Lite is slightly slower than Gemini 2.5 Flash Lite, it is significantly faster than competitors' high-speed models.

Below is a graph comparing input and output fees. The Gemini 2.5 Flash Lite fee per million tokens is $0.25 (approximately 39.4 yen) for input and $1.50 (approximately 236 yen) for output.

The various benchmark results are as follows: Gemini 2.5 Flash Lite recorded high scores in knowledge performance and coding performance.

The following post from Google DeepMind includes an embedded video comparing the processing speed and accuracy of Gemini 3.1 Flash-Lite and Gemini 2.5 Flash when solving the same problem. When you play the video, you can see that Gemini 3.1 Flash-Lite is faster and more accurate.
Gemini 3.1 Flash-Lite has landed.
pic.twitter.com/BzD2bdg3Dx — Google DeepMind (@GoogleDeepMind) March 3, 2026
It's our most cost-efficient Gemini 3 series model yet, built for intelligence at scale. Here's what's new 🧵
In the text generation test conducted by Arena.ai (formerly LMArena), an AI performance comparison service, the preview version of Gemini 3.1 Flash-Lite achieved a score higher than GPT-5.

The AI performance analysis service 'Artificial Analysis' has also published performance measurement results for Gemini 3.1 Flash-Lite , which rated Gemini 3.1 Flash-Lite's intelligence performance lower than Gemini 3 Flash, DeepSeek V3.2, and Grok 4.

The processing speed is impressive.

Although its price is higher than the open-source models gpt-oss-120B and DeepSeek V3.2, it is lower than the high-performance models Grok and Claude. These analysis results show that Gemini 3.1 Flash-Lite is more suitable for tasks that require high-speed response than complex inference tasks.

Gemini 3.1 Flash-Lite has been released as a developer preview and is available via API.
Related Posts:
in AI, Posted by log1o_hf







