CyberAgent releases models based on 'DeepSeek-R1' distilled models, additionally trained on Japanese data



On January 27, 2025, CyberAgent, a major Japanese IT company, released large-scale language models that were additionally trained on Japanese data, based on the distilled models of 'DeepSeek-R1', an open-source inference model released by AI development company DeepSeek.



cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese · Hugging Face
https://huggingface.co/cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese

cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese · Hugging Face
https://huggingface.co/cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese

DeepSeek, a Chinese AI development company, attracted a lot of attention when it announced 'DeepSeek-R1-Lite-Preview', a large-scale language model specialized for inference, in November 2024. In December, it announced 'DeepSeek-V3', a large-scale language model comparable to OpenAI's GPT-4o, and in January 2025 it released the inference models 'DeepSeek-R1-Zero' and 'DeepSeek-R1', trained based on DeepSeek-V3, as open source.

DeepSeek's rapid succession of big announcements has made it the center of attention in the tech industry, and its app 'DeepSeek - AI' has reached number one in the free app rankings on the US App Store. At the time of writing, it remains at the top of the list.

Chinese AI development company 'DeepSeek' is rapidly emerging as a hot topic in the technology industry, and has also ranked first in the App Store's free app rankings - GIGAZINE



CyberAgent has now released the large-scale language models 'DeepSeek-R1-Distill-Qwen-32B-Japanese' and 'DeepSeek-R1-Distill-Qwen-14B-Japanese', which are based on 'DeepSeek-R1-Distill-Qwen-14B/32B' and have been additionally trained with Japanese data, on the AI development platform Hugging Face.
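As a reference, below is a minimal sketch (not taken from the article or the model cards) of how the released models can be loaded through the standard Hugging Face transformers chat-template API. The model ID comes from the links above; the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: loading one of CyberAgent's Japanese-tuned DeepSeek-R1
# distilled models via the standard Hugging Face transformers API.
# The model ID is from the links above; the prompt and generation settings
# are illustrative assumptions, not values from the article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place the weights on available GPUs automatically
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

# Chat-style prompt in Japanese ("How will AI change our lives?").
messages = [{"role": "user", "content": "AIによって私たちの暮らしはどのように変わりますか？"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
# Decode only the newly generated tokens (the model's reasoning and answer).
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```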



The graph below compares the performance of the original 'DeepSeek-R1-Distill-Qwen-32B' with 'DeepSeek-R1', 'OpenAI-o1-1217', 'OpenAI-o1-mini', and 'DeepSeek-V3'. It shows that 'DeepSeek-R1-Distill-Qwen-32B' outperforms 'OpenAI-o1-mini' on multiple benchmarks.



'DeepSeek-R1' is an inference model that reasons through a chain of thought: in response to user input, it presents a chain of 'thoughts' that shows the thought process by which it reached its answer.

Users who have actually tried 'DeepSeek-R1-Distill-Qwen-32B-Japanese' report that the model's thought process is output in Japanese.
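DeepSeek-R1 and its distilled models conventionally wrap this reasoning in <think>...</think> tags before the final answer. Below is a minimal sketch, assuming that output format, of separating the (Japanese) reasoning trace from the answer; the helper function and the example text are hypothetical, not taken from the article.

```python
# Minimal sketch: splitting a DeepSeek-R1-style response into the reasoning
# trace and the final answer. Assumes the <think>...</think> convention;
# the helper and the example string are hypothetical, not from the article.
def split_reasoning(generated_text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a model response."""
    if "</think>" in generated_text:
        reasoning, answer = generated_text.split("</think>", 1)
        return reasoning.replace("<think>", "").strip(), answer.strip()
    return "", generated_text.strip()  # no reasoning block found

reasoning, answer = split_reasoning(
    "<think>ユーザーは簡単な挨拶をしているので、丁寧に短く答える。</think>こんにちは！何かお手伝いできることはありますか？"
)
print("Reasoning (Japanese):", reasoning)
print("Answer:", answer)
```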



It has been reported that DeepSeek-R1 complies with Chinese government censorship and does not give clear answers on sensitive topics such as Tiananmen Square, Taiwan, and the treatment of the Uighurs.

DeepSeek's AI model 'DeepSeek-R1' complies with Chinese government restrictions on sensitive topics such as Tiananmen Square, Taiwan, and the treatment of the Uighurs - GIGAZINE



in Software, Web Service, Posted by log1h_ik