Introducing the new language model 'Qwen2-Math' that can solve mathematics, with mathematical performance surpassing that of GPT-4o



A Chinese AI research team has released a large-scale language model specialized in mathematics, ' Qwen2-Math '. Qwen2-Math has mathematical performance that surpasses closed-source large-scale language models such as GPT-4o and Gemini-1.5-Pro.

Introducing Qwen2-Math | Qwen

https://qwenlm.github.io/blog/qwen2-math/

QwenLM/Qwen2-Math: A series of math-specific large language models of our Qwen2 series.
https://github.com/QwenLM/Qwen2-Math

Qwen2-Math is a large-scale language model developed by the Alibaba Group's AI research team 'Qwen Team,' and is characterized by its high accuracy rate in math problems. Qwen2-Math is a derivative model of the 'Qwen2' series, which has become a hot topic for its extremely high performance, and uses 'mathematics-related information generated by Qwen2' as learning data.

China's Qwen model tops 'Open LLM Leaderboard v2' testing Hugging Face AI model - GIGAZINE



Qwen2-Math has the basic models 'Qwen2-Math-1.5B', 'Qwen2-Math-7B', and 'Qwen2-Math-72B', and the command adjustment models 'Qwen2-Math-1.5B-Instruct', 'Qwen2-Math-7B-Instruct', and 'Qwen2-Math-72B-Instruct'. Looking at the graph summarizing the release dates and accuracy rates of mathematical problems for various large-scale language models, we can see that 'Qwen2-Math-72B-Instruct' has higher performance in solving mathematics than models such as 'GPT-4o', 'Claude-3.5-Sonnet', 'Gemini-1.5-Pro', and 'Llama-3.1-405B'.



Below are the results of running a mathematical performance benchmark for large-scale language models. Even 'Qwen2-Math-1.5B-Instruct', which has the fewest parameters among the Qwen2-Math instruction adjustment models, outperforms 'Llama-3.1-70B-Instruct' in most benchmark tests.



At the time of writing, Qwen2-Math only supports English, and the Chinese version will be released soon. Each model is available at the following links.

Qwen (Qwen)
https://huggingface.co/Qwen



in Software, Posted by log1o_hf