Alibaba's AI research team has released the visual language model 'Qwen2.5 VL' that can recognize and automatically operate the UI of PCs and smartphones, and can automatically perform airline ticket reservations and other tasks with performance exceeding GPT-4o

Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL! | Qwen
https://qwenlm.github.io/blog/qwen2.5-vl/
🎉 恭き发财🧧🐍 As we welcome the Chinese New Year, we're thrilled to announce the launch of Qwen2.5-VL , our latest flagship vision-language model! 🚀
— Qwen (@Alibaba_Qwen) January 27, 2025
💗 Qwen Chat: https://t.co/T0nMBnRVBB
📖 Blog: https://t.co/FU7qEgE46j
🤗 Hugging Face: https://t.co/N9XSslZX8d
🤖 ModelScope:… pic.twitter.com/KgjC2lHcvR
Below is an example showing the performance of Qwen2.5 VL. If you show a person an image of four cars and ask them to tell you the name of the car in English and Chinese, they will answer correctly.
It can also handle complex tasks such as labeling the names of two basketball players and the positions of their left and right hands when shown a photo of them.

It is also possible to transcribe vertically written text.

You can also summarize videos that are over an hour long.

In addition, Qwen2.5 VL can recognize the UI of a PC or smartphone and operate it automatically. In the video below, you can see how Qwen2.5 VL executes the task 'Install an extension to Visual Studio Code'.
How to automatically operate a PC with the AI model 'Qwen2.5 VL' - YouTube
You can also book your flight using a ticket booking app on your smartphone.
'Qwen2.5 VL' automatically operates smartphone apps to book airline tickets - YouTube
Qwen2.5 VL is available in three types: 3B, 7B, and 72B. Qwen2.5 VL 72B outperforms Gemini 2.0 Flash and GPT-4o in various benchmarks.
In addition, 'Qwen2.5 VL 7B' shows higher performance than 'GPT-4o mini'.

Qwen2.5 VL is already available for use with Qwen's chat AI,

In addition, three types, 'Qwen2.5-VL-3B-Instruct', 'Qwen2.5-VL-7B-Instruct', and 'Qwen2.5-VL-72B-Instruct', have been released on Hugging Face.
Qwen2.5-VL - a Qwen Collection
https://huggingface.co/collections/Qwen/qwen25-vl-6795ffac22b334a837c0f9a5

Related Posts: