Gemini 2.5 Pro 'panics' when its Pokémon are about to faint



Volunteers ran an experiment in which Google's AI model 'Gemini 2.5 Pro' played 'Pokémon Blue', and it took about 813 hours to finish the game. Google's AI researchers also took an interest in the experiment and summarized how it unfolded in a report. According to the report, the AI was observed to 'panic' much as a human would.

gemini_v2_5_report.pdf
(PDF file)

https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf

Google's Gemini panicked when playing Pokémon | TechCrunch
https://techcrunch.com/2025/06/17/googles-gemini-panicked-when-playing-pokemon/

In the experiment, called 'Gemini Plays Pokémon' and started by Joel Chan, an independent developer unaffiliated with Google, Gemini 2.5 Pro was shown the game screen and given a system prompt saying, 'You are playing Pokémon Blue. Your goal is to clear the game.' The aim was to see how the AI reads information from the screen and how it reasons its way to a solution in situations such as movement and battle.

The experiment was run twice, and both runs were streamed on Twitch so that the public could watch. Gemini 2.5 Pro took 813 hours on the first run and 406.5 hours on the second to clear the game (entering the Hall of Fame).



'Pokémon Blue' is easy enough that even elementary school students can clear it, and a human can comfortably finish it in less than a few hundred hours. Gemini 2.5 Pro lost a great deal of time in various scenes by taking actions a human player would never take, but in some places it outperformed humans, for example by solving a 'Sokoban'-style boulder-pushing puzzle in a single attempt.

Gemini 2.5 Pro surprised researchers and Twitch viewers by trying to beat the game in ways a human would not think of. One of these was what they called the 'blackout strategy.'

When Gemini 2.5 Pro ran into a puzzle or maze it found hard to solve, it tried to escape by deliberately letting its Pokémon faint so that it would be teleported to a Pokémon Center. In normal gameplay, when every Pokémon in the party has its HP reduced to 0 (fainted), the player is forcibly moved to a Pokémon Center. The game sends the player to the last Pokémon Center used, but Gemini 2.5 Pro formed the mistaken hypothesis that if it intentionally made all its Pokémon faint, it would be carried across the cave to the Pokémon Center in the next town, and so it kept lowering its own Pokémon's HP.
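The blackout rule described above can be illustrated with a minimal sketch. This is not the game's actual code or anything taken from the experiment's harness: the `Player` class, its fields, and the location names are hypothetical and only model the rule as stated in this article, namely that the player returns to the last Pokémon Center used rather than to one further ahead.

```python
from dataclasses import dataclass


@dataclass
class Player:
    # All names here are illustrative assumptions, not taken from the game or the report.
    location: str          # where the player currently stands
    last_center_used: str  # updated only when the player heals at a Pokémon Center
    party_hp: list[int]    # current HP of each Pokémon in the party

    def all_fainted(self) -> bool:
        """A blackout happens only when every Pokémon has 0 HP."""
        return all(hp == 0 for hp in self.party_hp)

    def resolve_blackout(self) -> str:
        """Move the player if the whole party has fainted; return the resulting location."""
        if self.all_fainted():
            # Respawn at the last center used, not at the center in the next town.
            self.location = self.last_center_used
        return self.location


# The player is inside a cave, having last healed in the town before it.
player = Player(location="cave", last_center_used="town_before_cave", party_hp=[12, 3])
player.party_hp = [0, 0]  # the party is deliberately knocked out
destination = player.resolve_blackout()
```

Under these assumptions the player ends up back in the town before the cave, so deliberately fainting cannot be used to skip ahead, which is why the hypothesis behind the strategy was mistaken.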

In addition, when the HP or PP of the Pokémon in its party ran low, the model repeatedly reasoned, 'I must recover it immediately or escape the current dungeon,' and for as long as this state lasted it stopped considering certain tools it needed for exploration. This happened multiple times and was obvious enough for Twitch viewers to pick up on. Google noted that the behavior 'appears to be associated with a qualitative decline in reasoning ability,' and described it as Gemini 2.5 Pro simulating 'panic.'



In addition, the model's pre-existing knowledge sometimes backfired.

In 'Pokémon Red and Green,' there is a scene where you give a drink to a guard blocking a gate to get him to move. Any one of 'Fresh Water,' 'Soda Pop,' or 'Lemonade' will do, but the remakes, 'Pokémon FireRed and LeafGreen,' accept only an item called 'Tea.' Gemini 2.5 Pro half-remembered that 'Tea is needed,' so it spent hours searching for Tea even though it was playing 'Pokémon Blue,' a game from a different generation in which the item does not exist.

These issues were mitigated in the second run, in which the model was explicitly instructed to 'act as if it were playing the game for the first time.'

At the time of writing, an experiment was under way to have Gemini 2.5 Pro play 'Pokémon Yellow.'

in Software,   Game, Posted by log1p_kr