Mistral AI releases 124B multimodal AI model 'Pixtral Large', outperforming GPT-4o and Gemini-1.5 Pro
Mistral AI, a French AI startup, has released a multimodal AI model called 'Pixtral Large,' which adds document, graph, and image recognition capabilities to the existing Mistral Large 2 model.
Pixtral Large | Mistral AI | Frontier AI in your hands
https://mistral.ai/news/pixtral-large/
The newly released 'Pixtral Large' is built on 'Mistral Large 2', the multilingual large language model announced on July 24, 2024, and has been adapted to understand images. It has 124 billion parameters and is the second multimodal AI model developed by Mistral, following Pixtral 12B.
Mistral compared Pixtral Large with 'Llama-3.2 90B', 'GPT-4o', 'Claude-3.5 Sonnet', and 'Gemini-1.5 Pro'. According to that comparison, Pixtral Large performs at the highest level on standard multimodal AI benchmarks.
Looking at the specific numbers, Pixtral Large received the highest score on 'MM-MT-Bench', which reflects actual use cases of multimodal models.
Mistral also provides examples of actual responses. In the first, the model is given a photo of a receipt and asked to calculate the bill: 'I bought coffee and sausages. How much do I need to pay? Add an 18% tip.'
Pixtral Large's response was as follows. Even though the receipt was written in German, Pixtral Large was able to recognize which items were the coffee and which was the sausage. However, it is not perfect: it charged for both cups of coffee without confirming whether the user had actually drunk both.
To calculate your total bill, including the 18% tip, please follow these steps:
1. Calculate the cost of the coffee and sausage.
   2x Latte Macchiato: 2 * 4.50 CHF = 9.00 CHF
   1x Schweinschnitzel: 22.00 CHF
   Total price for coffee and sausage: 9.00 CHF + 22.00 CHF = 31.00 CHF
2. Calculate the 18% tip.
   Tip: 31.00 CHF * 0.18 = 5.58 CHF
3. Add the tip to the total.
   Total payment: 31.00 CHF + 5.58 CHF = 36.58 CHF
This means you will have to pay a total of 36.58 CHF.
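The arithmetic in the response can be checked with a few lines of Python. This is only a minimal sketch for verification, not something Mistral provides: the function name `bill_with_tip` and its argument layout are hypothetical, and the prices are the ones quoted in the response above.

```python
from decimal import Decimal, ROUND_HALF_UP


def bill_with_tip(items, tip_rate):
    """Hypothetical helper: items is a list of (quantity, unit_price) pairs.

    Returns (subtotal, tip, total), with the tip rounded to the nearest 0.01.
    """
    cent = Decimal("0.01")
    # Sum quantity * unit price using Decimal to avoid float rounding noise.
    subtotal = sum((Decimal(q) * Decimal(p) for q, p in items), Decimal("0"))
    tip = (subtotal * Decimal(tip_rate)).quantize(cent, rounding=ROUND_HALF_UP)
    return subtotal, tip, subtotal + tip


# Prices as read by Pixtral Large: 2x Latte Macchiato, 1x Schweinschnitzel.
subtotal, tip, total = bill_with_tip([(2, "4.50"), (1, "22.00")], "0.18")
print(subtotal, tip, total)  # 31.00 5.58 36.58
```

The result matches the 36.58 CHF in the response. Changing the coffee quantity to 1 shows how much the unconfirmed second cup affects the bill.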
The second example is a graph comprehension task: the model is shown a graph of the training loss of multiple models and asked, 'When did things start to go wrong for the dark dragon?'
Pixtral Large's response is below. It correctly identified which curve was 'dark dragon' and read the graph successfully.
The training loss for the 'dark-dragon-50' model begins to show significant instability around 10,000 steps. There is a noticeable spike in the loss value at this point, and the instability continues with another large spike occurring around 20,000 steps. These spikes indicate that the training process for the 'dark-dragon-50' model begins to run into problems around 10,000 steps.
In the third example, the user gives the model an image and asks, 'Which companies use models created by Mistral AI?' Pixtral Large responded as follows.
As the image shows, companies using models created by Mistral AI include:
1. BNP Paribas
2. Brave
3. Cloudflare
4. CMA CGM
5. Front
Along with the release of Pixtral Large, the text model Mistral Large has also been upgraded. Both are released under the Mistral Research License and are free for research use.
Pixtral Large is also used in Mistral AI's chat AI 'Le Chat.'
Mistral AI significantly updates chat AI 'Le Chat' to enable web search function and image generation using 'FLUX1.1 [pro]' - GIGAZINE
in Software, Posted by log1d_ts