China's Alibaba releases chat AI 'Qwen-72B' and 'Qwen-Audio', which also supports Japanese voice input, as open source



Alibaba, a major Chinese IT company, has announced that it has open-sourced 'Qwen-72B', a Transformer-based large language model with 72 billion parameters, and 'Qwen-Audio', a multimodal version of Qwen that also supports Japanese voice input.

Qwen/README_JA.md at main · QwenLM/Qwen · GitHub

https://github.com/QwenLM/Qwen/blob/main/README_JA.md

Qwen/Qwen-72B-Chat · Hugging Face
https://huggingface.co/Qwen/Qwen-72B-Chat

GitHub - QwenLM/Qwen-Audio: The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
https://github.com/QwenLM/Qwen-Audio

Binyuan Hui, who researches natural language processing (NLP) at Alibaba DAMO Academy, Alibaba Group's advanced research institute, posted on social media: 'I'm proud to introduce our sincere open-source works, Qwen-72B and Qwen-1.8B!'



Qwen-72B is a large language model trained on more than 3 trillion tokens. In a series of benchmarks covering natural language understanding, mathematical reasoning, coding, and more, including MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP, and BBH, it outperformed Meta's LLaMA2-70B, a model of the same size, across these tasks. It also outperformed OpenAI's GPT-3.5 in 7 out of 10 tasks.

Users who have tried it report that it handles Japanese without difficulty.



However, some users have pointed out that it is difficult to run locally because of the large amount of memory it requires.
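To illustrate why, the following is a minimal sketch of loading Qwen-72B-Chat with Hugging Face transformers, assuming the usage pattern described in the Qwen repository (the custom chat() helper is provided through trust_remote_code=True). In bfloat16 precision, the 72 billion parameters alone occupy roughly 144 GB (72 billion × 2 bytes), so multiple GPUs or a quantized variant would be needed in practice.

```python
# Minimal sketch, assuming the Qwen repository's transformers usage pattern.
# Full-precision weights need on the order of 144 GB of GPU memory, which is
# why running the 72B model locally is considered difficult.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-72B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",       # spread layers across all available GPUs
    trust_remote_code=True,  # Qwen ships its own modeling and chat code
).eval()

# The Qwen chat models expose a chat() helper that manages conversation history.
response, history = model.chat(tokenizer, "How much gold is there on Earth?", history=None)
print(response)
```

Swapping model_name for one of the smaller variants mentioned below, such as 'Qwen/Qwen-7B-Chat' or 'Qwen/Qwen-14B-Chat', keeps the same interface with far lower memory requirements.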



In addition to Qwen-72B, the Qwen family includes 'Qwen-1.8B,' 'Qwen-7B,' and 'Qwen-14B,' models with different parameter counts. A demo that lets you run Qwen-72B-Chat in a browser is also available, although registration on a Chinese-language site is required.

通义千问 (Tongyi Qianwen)-72B-Chat Demo · ModelScope Studio
https://modelscope.cn/studios/qwen/Qwen-72B-Chat-Demo/summary



Hui also announced that 'Qwen-Audio,' a model that supports voice input, has been open-sourced.



Qwen-Audio combines Qwen-7B with the audio encoder from OpenAI's Whisper-large-v2, and is said to outperform existing state-of-the-art (SoTA) models across its benchmark tasks.
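For reference, here is a minimal sketch of querying Qwen-Audio-Chat locally. It assumes the usage pattern shown in the QwenLM/Qwen-Audio repository, where the from_list_format() helper and chat() method are loaded via trust_remote_code=True; the audio file path is a hypothetical placeholder.

```python
# Sketch assuming the Qwen-Audio repository's usage pattern; 'question.wav'
# is a hypothetical local recording, not a file from this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-Audio-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="cuda",
    trust_remote_code=True,  # loads Qwen-Audio's custom modeling and chat code
).eval()

# Build a mixed audio + text prompt, then ask the model about the recording.
query = tokenizer.from_list_format([
    {"audio": "question.wav"},                       # placeholder path
    {"text": "What is the speaker asking about?"},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```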



You can try the Qwen-Audio demo on Hugging Face without registering.



First, I entered Japanese text and asked a question.



Then, the following answer was output.



Since it also supports audio, I tried uploading an audio file. I used the opening of the video created for a previous article.



Click 'Submit' to send it.



The audio file says, 'How much gold is there on Earth?' For some reason it was recognized as song lyrics, so I couldn't carry on a conversation about it, but the transcription itself seems highly accurate.



It was also able to pick up Elon Musk's controversial remark, 'Go fuck yourself.'



Qwen-Audio-Chat reportedly supports speech understanding in Chinese, English, Japanese, Korean, German, Spanish, and Italian.

in Review, Software, Posted by log1l_ks