Dec 14, 2023 11:51:00

Google's multimodal AI 'Gemini Pro' now available to developers and companies via API, free of charge for up to 60 queries per minute

On December 6, 2023, Google announced Gemini, a multimodal AI that can simultaneously process text, audio, and images. On December 13, local time, Gemini's middle model, Gemini Pro, became accessible via API, allowing developers and enterprises to build for their own use cases.

Google Gemini API: New developer and enterprise AI products
https://blog.google/technology/ai/gemini-api-developers-cloud/

Google: Gemini API, Imagen 2, Duet AI and more updates
https://blog.google/technology/ai/google-gemini-pro-imagen-duet-ai-update/

Gemini Pro is the middle model of Gemini, Google's multimodal AI, and English-speaking users of the conversational AI Bard can already access Gemini Pro with text-based prompts. On December 13th, Google announced that Gemini Pro is now accessible via the Gemini API, allowing developers and businesses to customize Gemini Pro for their own use cases.

According to Google, Gemini Pro outperforms other models of the same size in research benchmarks, and the version available at the time of writing comes with a 32,000-token context window. It supports 38 languages and is available in over 180 countries and regions around the world. Its features include function calling, embeddings, semantic search, custom knowledge grounding, and chat.

Gemini Pro also has an SDK that is useful for building applications, and supports Python, Android (Kotlin), Node.js, Swift, and JavaScript.

Developers and companies can obtain Gemini Pro API keys from Google AI Studio, a web-based development tool for creating app prototypes using API keys. If you log in to Google AI Studio with your Google account, you will be able to use a free quota of 60 queries per minute, but anonymized input and output data may be used to improve service quality.
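To stay under a per-minute quota like the 60 queries per minute described above, a client can throttle itself before each request. The following is a minimal sketch of a sliding-window limiter using only the Python standard library. The class name `SlidingWindowLimiter` and its parameters are hypothetical and not part of any Google SDK, and how Google actually measures the quota on its side is not stated in the announcement, so treat this as a client-side illustration only.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls inside the current window

    def wait_time(self):
        """Return how many seconds to wait before the next call is allowed."""
        now = self._clock()
        # Drop timestamps that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._stamps.append(self._clock())
```

A caller would create one limiter and invoke `limiter.acquire()` immediately before each API request, so that bursts above the quota are delayed rather than rejected by the server.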

At the time of writing, Gemini Pro generates text in response to text input, and Google also provides a dedicated Gemini Pro Vision multimodal endpoint that accepts text and images as input and returns text.

In addition, Vertex AI, a machine learning platform that allows customization of large-scale language models, can now customize Gemini using company-specific data. Google doesn't train the model with inputs or outputs from Vertex AI users, so companies can customize Gemini while maintaining full control of their data.

At the time of writing, developers can access Gemini Pro and Gemini Pro Vision for free through Google AI Studio at up to 60 queries per minute. Free access under this rate limit continues until general availability in early 2024, after which Google AI Studio and Vertex AI will charge per 1,000 characters or per image.
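Billing per 1,000 characters or per image can be estimated with simple arithmetic. The sketch below is illustrative only: the function `estimate_cost` is hypothetical, the rates in the usage example are made-up placeholders rather than Google's prices, and it assumes that characters are billed pro rata (not rounded up to the next 1,000), which the announcement does not specify.

```python
from decimal import Decimal


def estimate_cost(chars, images, rate_per_1k_chars, rate_per_image):
    """Estimate a bill from character and image counts.

    All rates are caller-supplied placeholders; pro rata character
    billing is an assumption, not a documented rule.
    """
    if chars < 0 or images < 0:
        raise ValueError("counts must be non-negative")
    char_cost = Decimal(chars) / Decimal(1000) * Decimal(rate_per_1k_chars)
    image_cost = Decimal(images) * Decimal(rate_per_image)
    return char_cost + image_cost


# Example with placeholder rates: 2,500 characters and 2 images.
example = estimate_cost(
    chars=2500,
    images=2,
    rate_per_1k_chars=Decimal("0.001"),  # placeholder, not a real price
    rate_per_image=Decimal("0.01"),      # placeholder, not a real price
)
```

`Decimal` is used instead of floating point so that small per-unit rates add up without rounding drift.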

Imagen 2, a major upgrade to Google Cloud's image generation capabilities, was also announced and is now available to allowlisted Vertex AI users.

Imagen 2 on Vertex AI is now generally available | Google Cloud Blog
https://cloud.google.com/blog/products/ai-machine-learning/imagen-2-on-vertex-ai-is-now-generally-available

Imagen 2 understands images and text better than previous models, making it possible to generate higher-quality images from text.

The ability to correctly write text within images has also been improved, giving organizations more control over the branding and messaging within images.

It is also now possible to generate emblems and logos for brands and products.

In addition, MedLM, a large-scale language model fine-tuned for healthcare industry use cases, is now available to Google Cloud users in the US through Vertex AI.

Introducing MedLM for the healthcare industry | Google Cloud Blog
https://cloud.google.com/blog/topics/healthcare-life-sciences/introducing-medlm-for-the-healthcare-industry

Dec 14, 2023 11:51:00 in Software, Web Service, Posted by log1h_ik