Google launches public preview of Gemini 1.5 Pro with new features like speech understanding, system commands, JSON mode, and more



At

Google Cloud Next '24 , held in Las Vegas from April 9 to 11, 2024, Google announced that it has begun public preview of its high-performance AI model, Gemini 1.5 Pro .

Gemini 1.5 Pro Now Available in 180+ Countries; With Native Audio Understanding, System Instructions, JSON Mode and More - Google for Developers
https://developers.googleblog.com/2024/04/gemini-15-pro-in-public-preview-with-new-features.html



The Gemini 1.5 Pro is a model announced on February 15, 2024, and is said to be able to handle up to one hour of movies and 700,000 words of text. You can see the demonstration in the article below.

Google releases Gemini 1.5, which can handle up to 1 million tokens and handle 1 hour of movies and 700,000 words of text - GIGAZINE



Until now, Gemini 1.5 Pro has been in 'limited testing' and only available to a limited number of people, but on April 9, 2024, the launch of a public preview was announced in over 180 countries, making it available to anyone via the Gemini API.

The public preview includes all of the features in Gemini 1.5 Pro at launch, as well as new features such as speech understanding, a new file API for easier file processing, and system instructions and JSON mode that give developers more control over the model output.



In addition, the next generation of text embedding models will be released through the Gemini API at the same time as the public preview begins. The new text embedding model, 'gecko' in the figure below, achieves more powerful search performance than existing models with equivalent dimensions in the

MTEB benchmark.



Gemini is now available for enterprises to use in a variety of Google Cloud Platform services , and the company has begun previewing a new feature called 'Text-to-live image' in its image generation AI 'Imagen 2,' which allows users to create short movies of a few seconds in duration from a prompt.

Imagen 2.0 on Vertex AI: Text-to-live image - YouTube


At the time of writing, 'Text-to-live image' was 'available only to approved users' and had not been released to general users.

◆ Forum is currently open
A forum related to this article has been set up on the official GIGAZINE Discord server . Anyone can post freely, so please feel free to comment! If you do not have a Discord account, please refer to the account creation procedure explanation article to create an account!

• Discord | 'Are you using Google's AI 'Gemini'? When do you use it?' | GIGAZINE
https://discord.com/channels/1037961069903216680/1227554362579222528

in Software, Posted by log1d_ts