Feb 13, 2026 13:00:00

Gemini 3 Deep Think Upgrade Released, Sets Remarkable Record on the ARC-AGI Intelligence Benchmark

On February 12, 2026, Google announced an upgrade to 'Gemini 3 Deep Think,' the enhanced-reasoning mode of Gemini 3. The update was developed in close collaboration with researchers to tackle difficult research problems, and it has achieved impressive scores on a range of benchmarks.

Gemini 3 Deep Think: AI model update designed for science
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-deep-think/

Gemini 3 Deep Think scored highly on ARC-AGI-2, a benchmark intended to measure AI intelligence. The image below shows that when Gemini 3 Deep Think was announced in December 2025, its ARC-AGI-2 score was 45%, significantly higher than other major models. As the graph on the right of the image shows, the upgraded version raises that score to 84.6%.

Below is the ARC-AGI-2 leaderboard that ARC Prize posted on X. Gemini 3 Deep Think, shown in green in the upper right corner, clearly stands out.

Gemini 3 Deep Think (2/26) Semi Private Eval

- ARC-AGI-1: 96.0%, $7.17/task
- ARC-AGI-2: 84.6% $13.62/task

New ARC-AGI SOTA model from @GoogleDeepMind pic.twitter.com/mN8PFAWk4A
— ARC Prize (@arcprize) February 12, 2026
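
To put the figures above in perspective, here is a minimal Python sketch of the arithmetic. It uses only numbers quoted in this article. The constant and function names are illustrative assumptions, and the comparison assumes the December 2025 and February 2026 ARC-AGI-2 scores were measured in a comparable way, which the article does not state explicitly.

```python
# Figures quoted in this article; all names below are illustrative only.
ARC_AGI_2_DEC_2025 = 45.0   # % at the December 2025 announcement
ARC_AGI_2_UPGRADED = 84.6   # % after the February 2026 upgrade
COST_ARC_AGI_1 = 7.17       # USD per task, from the ARC Prize post
COST_ARC_AGI_2 = 13.62      # USD per task, from the ARC Prize post


def point_gain(before: float, after: float) -> float:
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(after - before, 1)


def ratio(numerator: float, denominator: float) -> float:
    """Ratio of two values, rounded to two decimals."""
    return round(numerator / denominator, 2)


# Gain in ARC-AGI-2 score, in percentage points (assumes comparable evals).
print(point_gain(ARC_AGI_2_DEC_2025, ARC_AGI_2_UPGRADED))
# Upgraded score relative to the December 2025 score.
print(ratio(ARC_AGI_2_UPGRADED, ARC_AGI_2_DEC_2025))
# Per-task cost on ARC-AGI-2 relative to ARC-AGI-1.
print(ratio(COST_ARC_AGI_2, COST_ARC_AGI_1))
```

Under those assumptions, the upgrade adds roughly 40 percentage points on ARC-AGI-2, and each ARC-AGI-2 task costs a little under twice as much as an ARC-AGI-1 task.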

Other notable results include 48.4% on Humanity's Last Exam, a benchmark designed to test the limits of AI knowledge, and 81.5% on MMMU-Pro, a test for multimodal LLMs. On the competitive programming benchmark Codeforces, it also reached a striking score of 3455.

Gemini 3 Deep Think performs strongly across a wide range of scientific fields, including mathematics, competitive coding, chemistry, and physics. According to Google, the updated Deep Think mode achieved gold medal-level results on the written exams of the 2025 International Physics Olympiad and Chemistry Olympiad. It also scored 50.5% on the CMT benchmark, which measures proficiency in advanced theoretical physics.

Google is highlighting how Gemini 3 Deep Think is being used in actual scientific research. The following image shows Gemini 3 Deep Think in use at a Duke University lab that researches new semiconductor materials. In the search for candidate semiconductor materials, the lab used Gemini 3 Deep Think to optimize fabrication methods for complex crystal growth, hitting precise targets that would be difficult to reach with conventional methods.

We've upgraded our specialized reasoning mode Gemini 3 Deep Think to help solve modern science, research, and engineering challenges – pushing the frontier of intelligence. 🧠

Watch how the Wang Lab at Duke University is using it to design new semiconductor materials. 🧵 pic.twitter.com/BgSEmv00JP
— Google DeepMind (@GoogleDeepMind) February 12, 2026

Additionally, Rutgers University mathematician Lisa Carbone used Gemini 3 Deep Think to peer-review highly technical mathematics papers in a field with little existing training data, and reported that it accurately identified logical flaws that human peer reviewers would have overlooked.

At Rutgers, mathematician Lisa Carbone is using Gemini 3 Deep Think to explore the math bridging Einstein's gravity and quantum mechanics — a pioneering field of research with very little historical data or relevant papers to reference.

Despite having limited existing data to… pic.twitter.com/Ku4n8Et4hK
— Google (@Google) February 12, 2026

The updated Gemini 3 Deep Think is currently available to Google AI Ultra subscribers, and an early access program gives select researchers, engineers, and companies access to Gemini 3 Deep Think via the Gemini API.

Feb 13, 2026 13:00:00 in AI, Posted by log1e_dh