The purely Chinese-made LLM 'MT-infini-3B', trained on Chinese-made GPUs, reportedly outperforms 'Llama3-8B', suggesting that China may be able to develop high-performance AI on its own



Moore Threads, a China-based GPU manufacturer, has announced that it has trained a large language model, MT-infini-3B, on its own GPUs. MT-infini-3B is said to perform as well as or better than large language models trained on NVIDIA GPUs.

Moore Threads and Wuwen Xinqiong: Training of the 'MT-infini-3B' large model on the KUAE thousand-GPU computing cluster has been completed | Moore Threads

https://www.mthreads.com/news/171

The US government regards China's AI research as a military threat and has imposed export restrictions on high-performance semiconductors, making it difficult for China to obtain AI chips developed by NVIDIA, AMD, Intel, and others.

Meanwhile, Moore Threads announced its own GPU for AI workloads, the 'MTT S4000', in December 2023. The MTT S4000 offers 25 TFLOPS of single-precision (FP32) performance and compatibility with CUDA, and it has been suggested that it could be used for large-scale AI development.

Chinese GPU manufacturer 'Moore Threads' announces 'MTT S4000' GPU for AI training and data centers - GIGAZINE



On May 27, 2024, Moore Threads announced that it had developed the large language model 'MT-infini-3B' using its own GPUs. MT-infini-3B has 3 billion parameters, and training took a total of 13.2 days. Moore Threads has not disclosed which GPU model was used for training.

The following table, published by Moore Threads, compares the benchmark results of MT-infini-3B with those of other large language models. According to the table, MT-infini-3B outperforms Llama3-8B, which has 8 billion parameters. The table also lists the manufacturer of the GPUs used to train each model, underscoring Moore Threads' claim that its GPUs can be used for the same purposes as NVIDIA GPUs.



The training of MT-infini-3B was carried out on an AI training platform provided by the Chinese company Wuwen Xinqiong. Wuwen Xinqiong CEO Xia Lixue said, 'MT-infini-3B is the first large language model trained from start to finish using Chinese GPUs,' emphasizing that MT-infini-3B is built with Chinese technology.


in Software, Hardware, Posted by log1o_hf