White hat hackers hack Google's AI 'Gemini' and successfully discover hidden information

Cybersecurity engineers who took part in 'LLM bugSWAT', a bug-finding event sponsored by Google, succeeded in hacking 'Gemini' and exposing part of its source code, and were awarded the title of 'Most Valuable Hacker'. The people involved have explained how they did it.
We hacked Google's AI Gemini and leaked its source code (at least some part) - Lupin & Holmes
https://www.landh.tech/blog/20250327-we-hacked-gemini-source-code/

The award went to cybersecurity engineer Roni Carta and his teammates, who had also participated in LLM bugSWAT in 2024 and won a large bounty.

For this LLM bugSWAT, the Google team gave them access to a preview version of the next-generation Gemini and asked them to test its functionality from an attacker's perspective.
Carta and his team started by giving Gemini a simple prompt: 'Run hello world in Python 3.' This caused Gemini to set up a Python execution environment, a sandbox, which allowed them to run code inside Gemini.
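As a rough illustration of what running code in such a sandbox looks like, the following is a minimal sketch of a 'hello world' script that also prints a few basic facts about its environment. The `describe_environment` helper is hypothetical and is not taken from the team's write-up; it only uses standard library calls.

```python
import os
import platform
import sys


def describe_environment() -> dict:
    """Collect a few basic facts about the interpreter and its host.

    Hypothetical helper for illustration only.
    """
    return {
        "python_version": sys.version_info[:3],
        "platform": platform.system(),
        "cwd": os.getcwd(),
        "cwd_entries": sorted(os.listdir(os.getcwd())),
    }


print("hello world")
info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Output of this kind varies from one environment to another, which is why it is a natural first step when exploring an unfamiliar sandbox.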
As they dug into the Python code, Carta and his team noticed that the sandbox communicated with external Google servers and performed tasks such as retrieving data from Google Flights and other Google services. They hypothesized that a prompt injection attack, in which malicious instructions are fed to a language model, might let them obtain a sandbox with higher privileges and do more inside it. When they tested this with the cooperation of the Google security team, they confirmed that it was in fact possible to access a sandbox with higher privileges.
They found that running certain commands in this environment could have exposed a large amount of highly sensitive information that should have remained private.

The blog post published by Carta and his team is titled 'We hacked Google's AI Gemini and leaked its source code (at least some part),' but commenters on the social news site Hacker News point out that 'running the built-in "strings" command to extract some file names from a binary is hardly hacking/cracking.' Carta and his team carried out the work with the help of the Google security team, and the 'highly sensitive information' is in fact publicly available on GitHub, so some commenters argue that it is not that sensitive. Others counter that Gemini did not fetch the information from GitHub or elsewhere and republish it, but exposed the information itself, so it can still be considered a leak.
For reporting these findings, Carta and his team were recognized as 'Most Valuable Hackers.'
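For readers unfamiliar with the 'strings' command mentioned in the Hacker News comment, the sketch below approximates its basic behavior in Python: it scans raw bytes and returns runs of printable ASCII characters of at least a minimum length. The function name `extract_strings` and the sample bytes are made up for illustration and do not come from the Gemini binary or the team's write-up.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_length bytes long.

    This is a rough approximation of the Unix 'strings' utility.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up sample: readable file names surrounded by non-printable bytes.
sample = b"\x00\x01example/sample.proto\x00\xff\x02abc\x00main.py\x00"
found = extract_strings(sample)
print(found)  # ['example/sample.proto', 'main.py']
```

The three-byte run `abc` is skipped because it is shorter than the default minimum of four characters. This shows why file names and paths embedded in a compiled binary can be easy to recover without any reverse-engineering tools.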
'The bug-finding process was pure fun, making the whole experience very memorable,' said Carta. 'Exploring hidden code and pushing the limits of Gemini's sandbox was as challenging as the hunt was exciting. I can't wait to see everyone at the next bugSWAT.'