Dec 12, 2024 11:15:00

Google announces 'Gemini 2.0', a high-speed, lightweight model with overwhelming performance superior to the previous generation high-end model

Google has announced Gemini 2.0 , the next generation model family of Gemini, and at the same time, has opened up access to the experimental version of Gemini 2.0 Flash, a model that prioritizes response speed, to all Gemini users. Gemini 2.0 is scheduled to be introduced to other Google products in the future.

Google introduces Gemini 2.0: A new AI model for the agentic era

https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/

Welcome to the world, Gemini 2.0 ✨ our most capable AI model yet.

We're first releasing an experimental version of 2.0 Flash ⚡ It has better performance, new multimodal output, @Google tool use - and paves the way for new agentic experiences. 🧵 https://t.co/ywY2oZv76p pic.twitter.com/1Wgcr3m2Ip
— Google DeepMind (@GoogleDeepMind) December 11, 2024

In December 2023, Google announced Gemini, a multimodal AI that can process text, voice, and images simultaneously.

A multimodal AI called 'Gemini' that can process text, voice, and images simultaneously and interact more naturally than humans, surpassing GPT-4, will be released - GIGAZINE

The next-generation Gemini model, Gemini 1.5, will be announced in February 2024. Google has also released other high-speed and lightweight Gemini models, such as Gemini Flash and Gemini 1.5 Flash-8B .

Google releases Gemini 1.5, which can handle up to 1 million tokens and handle 1 hour of movies and 700,000 words of text - GIGAZINE

Gemini 2.0, the next-generation model of Gemini, was announced on December 11, 2024. At the same time as the announcement, the experimental model of 'Gemini 2.0 Flash' will be made available to all Gemini users as the first model of the Gemini 2.0 model family.

The Flash model is a model that prioritizes response speed among the model family, and Gemini 1.5 Flash was the most popular among developers in the Gemini 1.5 model family. Gemini 2.0 Flash maintains its fast response speed while outperforming the previous generation heavy model 'Gemini 1.5 Pro' in benchmark tests.

Gemini 2.0 Flash also now supports multimodal inputs such as images, videos, and audio, as well as multimodal output such as text, images, and multilingual audio. It also includes the ability to call tools such as Google search, code execution, and third-party user-defined functions.

The actual conversation with Gemini 2.0 Flash using voice input is as follows. You can see that the response is returned immediately within 1 to 2 seconds of the question.

Google Gemini 2.0 Flash Multimodal feature is next-level!
I can't believe Google rolled this out before OpenAI.
It takes live feeds, answers in real-time with almost no latency, Pretty mind-blowing.
Thanks, Google, for this early Christmas gift!
Adding more details below 👇 pic.twitter.com/tk6Wh2bakd
— AshutoshShrivastava (@ai_for_success) December 11, 2024

With the release of Gemini 2.0 Flash, you can now switch to the experimental version of Gemini 2.0 Flash on the Gemini web version. The mobile app version will be able to switch to the experimental version in the future.

Google is also focusing on developing 'AI agents' to make it easier for users to use AI. For example, Project Astra, which appears in the video below, is an agent that utilizes multimodal understanding in the real world, and when using the Gemini 2.0 model, it can converse at near-human speed. In the video, when you ask 'Tell me the history of this monument,' it explains the history.

With new streaming capabilities and Gemini 2.0's native audio understanding, Project Astra can respond at about the latency of human conversation.

It can also converse in multiple languages and has a better understanding of accents and uncommon words. 🌎 https://t.co/ywY2oZvEVX pic.twitter.com/rqd64VAeb3
— Google DeepMind (@GoogleDeepMind) December 11, 2024

DeepMind founder Demis Hassabis said, 'Gemini 2.0 Flash is just the beginning,' and emphasized that more models will be released in the future. He also said, '2025 will be the year of AI agents.'

Thrilled to kick off the Gemini 2.0 era with Gemini 2.0 Flash, an update to our workhorse model that outperforms even 1.5 Pro twice at the speed. It has really great multilingual skills, and can natively call tools, like Google Search. It's the first release in the Gemini 2.0… pic.twitter.com/pjndTAf1PY
— Demis Hassabis (@demishassabis) December 11, 2024

Related Posts:

Dec 12, 2024 11:15:00 in Software, Web Service, Web Application, Posted by log1d_ts