Google releases PaliGemma 2, a visual language model that's easy to finetune



On December 5, 2024, Google announced 'PaliGemma 2,' a visual language model that adds visual capabilities to the open and lightweight language model 'Gemma 2.'

Introducing PaliGemma 2: Powerful Vision-Language Models, Simple Fine-Tuning - Google Developers Blog

https://developers.googleblog.com/en/introducing-paligemma-2-powerful-vision-language-models-simple-fine-tuning/

Welcome PaliGemma 2 – New vision language models by Google
https://huggingface.co/blog/paligemma2

PaliGemma is the first visual language model in the Gemma family and is widely available through GitHub, Hugging Face, and other platforms. It can recognize images, describe their contents in words, and understand text within images.

Read the article below to find out what happens when you actually use PaliGemma.

Google releases open source visual language model 'PaliGemma' and announces large-scale language model 'Gemma 2' with performance equivalent to Llama 3 - GIGAZINE



The newly released successor, PaliGemma 2, comes in multiple model sizes (3B, 10B, and 28B parameters) and input resolutions (224 x 224, 448 x 448, and 896 x 896 pixels), so users can pick the combination that best fits their task.
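To make the size and resolution grid concrete, here is a minimal Python sketch that enumerates the nine combinations. The repository naming pattern `google/paligemma2-{size}-pt-{resolution}` is an assumption based on how the pretrained checkpoints appear to be named on the Hugging Face Hub, and the helper `checkpoint_id` is hypothetical. Check the Hub listing before relying on any specific name.

```python
from itertools import product

# Model sizes and input resolutions listed in the announcement.
SIZES = ("3b", "10b", "28b")
RESOLUTIONS = (224, 448, 896)


def checkpoint_id(size: str, resolution: int, variant: str = "pt") -> str:
    """Build a Hub-style repository name for one size/resolution pair.

    Hypothetical helper: the "google/paligemma2-..." pattern and the
    "pt" (pretrained) variant tag are assumptions, not a documented API.
    """
    if size not in SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return f"google/paligemma2-{size}-{variant}-{resolution}"


# Every size paired with every resolution: 3 x 3 = 9 combinations.
ALL_CHECKPOINTS = [checkpoint_id(s, r) for s, r in product(SIZES, RESOLUTIONS)]

if __name__ == "__main__":
    for name in ALL_CHECKPOINTS:
        print(name)
```

In general, a smaller model at a lower resolution is cheaper to fine-tune, while tasks that depend on fine detail, such as reading small text, may benefit from a higher resolution.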

Another selling point is long-form captioning. Rather than simply identifying objects, the model generates detailed, context-aware captions that describe the actions, emotions, and overall narrative of a scene. It also performs well at recognizing chemical formulas and musical scores, spatial reasoning, and generating chest X-ray reports.

A demo site is also available.

Paligemma2 Vqav2 - a Hugging Face Space by merve
https://huggingface.co/spaces/merve/paligemma2-vqav2



As an example, we clicked the sample that supplies an image of a graph and asks what the graph shows.



The model responded, 'Accuracy after fine-tuning.'



'We're excited to see what you'll create with PaliGemma 2. Join the vibrant Gemma community, share your projects in the Gemmaverse, and let's continue to explore the infinite possibilities of AI together,' Google said.

in Software, Posted by log1l_ks