An interview with Liang Wenfeng, founder and CEO of the popular Chinese AI company DeepSeek, reveals the secret to innovation
![](https://i.gzn.jp/img/2025/01/28/deepseek-ceo-interview/00_m.jpg)
Deepseek: The Quiet Giant Leading China's AI Race
https://www.chinatalk.media/p/deepseek-ceo-interview-with-chinas
![](https://i.gzn.jp/img/2025/01/28/deepseek-ceo-interview/01.png)
DeepSeek is fully funded by High-Flyer, one of China's top four quantitative hedge funds, and is valued at $8 billion. As a result, it has no plans to raise funds, is focused on building foundational technology rather than commercial applications, and has committed to releasing all of its AI models as open source.
DeepSeek is a Chinese company, so it would be expected to be heavily affected by the US's restrictions on exports of AI chips to China. DeepSeek has publicly stated that it owns 10,000 NVIDIA A100s, but industry analyst Dylan Patel points out that 'DeepSeek has access to High-Flyer's computing cluster, so it can be said to own more than 50,000 Hopper GPUs.' 'More than 50,000 Hopper GPUs' is an order of magnitude higher in computing power than 10,000 NVIDIA A100s.
Deepseek has over 50k Hopper GPUs to be clear.
People need to stop acting like they only have that 10k A100 cluster.
They are omega cracked on ML research and infra management but they aren't doing it with that many fewer GPUs https://t.co/eaFzCoBrmU
— Dylan Patel (@dylan522p) November 20, 2024
DeepSeek's ambition is to 'build artificial general intelligence (AGI).' Its mission statement is simply 'Unravel the mysteries of AGI with curiosity,' and it focuses on researching architectures and algorithms that could be game changers.
ChinaTalk points out that DeepSeek started to attract attention in the AI industry after releasing the open source DeepSeek V2. DeepSeek V2 has an unprecedented price-performance ratio, with an inference cost of 1 RMB (about 21 yen) per million tokens, which is very cheap. The inference cost per million tokens is about 1/7 of Meta's Llama3 70B and about 1/70 of OpenAI's GPT-4 Turbo. With the advent of DeepSeek V2, Chinese technology companies such as ByteDance, Tencent, Baidu, and Alibaba have all lowered their prices, and it seems that the price competition for large-scale AI models is intensifying in China. Nevertheless, DeepSeek has succeeded in making a profit, unlike other companies.
One of the reasons for this success is the 'MLA architecture' announced by
Regarding the pricing of DeepSeek V2, CEO Liang said, 'I didn't expect everyone to be so sensitive to the pricing (of DeepSeek V2). We simply proceeded at our own pace and set the price accordingly. Our principle is neither to subsidize nor to make exorbitant profits. The price we set is just a little above cost, and we only make a small profit.' He added, 'It is not our main purpose to take users away from other companies. We lowered the price because costs came down as we explored the architecture of the next-generation model. We also believe that APIs and AI should be accessible and affordable to everyone.'
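To make the price ratios quoted for DeepSeek V2 concrete, here is a minimal Python sketch. It uses only the figures stated in the article (1 RMB per million tokens, about 1/7 of Llama3 70B, about 1/70 of GPT-4 Turbo). The per-million prices it derives for the other two models are implied by those ratios, not quoted list prices, and the 500-million-token workload is a hypothetical chosen purely for illustration.

```python
from fractions import Fraction

# Figures quoted in the article for DeepSeek V2.
DEEPSEEK_V2_RMB_PER_MILLION = Fraction(1)   # 1 RMB per million tokens
RATIO_TO_LLAMA3_70B = Fraction(1, 7)        # "about 1/7 of Llama3 70B"
RATIO_TO_GPT4_TURBO = Fraction(1, 70)       # "about 1/70 of GPT-4 Turbo"


def implied_price(base_price, ratio):
    """Price implied for a comparison model when base_price is `ratio` times it."""
    return base_price / ratio


def workload_cost(price_per_million, tokens):
    """Cost in RMB of processing `tokens` tokens at a per-million-token price."""
    return price_per_million * Fraction(tokens, 1_000_000)


# Implied prices (assumption: derived from the ratios, not official price lists).
implied = {
    "DeepSeek V2": DEEPSEEK_V2_RMB_PER_MILLION,
    "Llama3 70B (implied)": implied_price(DEEPSEEK_V2_RMB_PER_MILLION, RATIO_TO_LLAMA3_70B),
    "GPT-4 Turbo (implied)": implied_price(DEEPSEEK_V2_RMB_PER_MILLION, RATIO_TO_GPT4_TURBO),
}

# Hypothetical workload size, not a figure from the article.
EXAMPLE_TOKENS = 500_000_000

for name, price in implied.items():
    cost = workload_cost(price, EXAMPLE_TOKENS)
    print(f"{name}: {float(price):.0f} RMB per million tokens, "
          f"{float(cost):.0f} RMB for the example workload")
```

Because the article gives the ratios only as "about" 1/7 and 1/70, the results should be read as rough orders of magnitude rather than exact prices.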
![](https://i.gzn.jp/img/2025/01/28/deepseek-ceo-interview/mathieu.jpg)
ChinaTalk points out that in the Silicon Valley-driven AI industry, it is 'very rare' for Chinese AI companies to develop and release innovative models. In the AI industry, the reason for DeepSeek's excellent performance is said to be 'innovation at the architectural level, which is very rare for Chinese AI models and global open source AI models.'
Americans are good at innovating '0 to 1,' but Chinese people are said to excel at innovating '1 to 10.' Therefore, in China, efforts to upgrade existing models are mainstream, based on the idea that 'the technology of large models will always converge, so it is wise to follow that.' However, DeepSeek has taken on the extremely difficult and costly challenge of 'innovating the architecture of AI models,' and by succeeding in this endeavor, it is bringing a breath of fresh air to the AI industry. It seems that DeepSeek is the only major Chinese startup that is taking this approach.
While most Chinese AI companies simply copy and use Meta's Llama, DeepSeek starts by building the model architecture. Regarding the reason for this, CEO Liang said, 'If your goal is to create an application, it makes sense to quickly deploy your product using Llama's architecture. However, our goal is to develop AGI. Therefore, we need to research new model architectures to achieve more powerful models with limited resources. This is one of the fundamental research areas needed to scale up to larger models. In addition to model architecture, we also conduct extensive research in other areas, such as data construction and making models more human-like, all of which are reflected in the models we have released. In addition, it is estimated that Llama's architecture is about two generations behind the international state-of-the-art models in terms of training efficiency and inference costs.'
Furthermore, on why DeepSeek focuses on research and exploration, Liang said, 'We believe that the most important thing now is to ride the wave of global innovation. For many years, Chinese companies have been accustomed to others doing the innovating, and we have focused on monetizing applications. But this is not inevitable. In this wave, we felt that our starting point was not to take advantage of the opportunity to make quick profits, but rather to reach the cutting edge of technology and drive the development of the entire ecosystem.' 'We believe that as the economy develops, China should gradually become a contributor, not a free rider. In the IT boom over the past 30 years, we basically did not participate in real technological innovation. We have grown accustomed to Moore's Law falling from the sky, waiting 18 months for better hardware and software to appear,' he said, explaining the importance of being on the side of creating innovation.
![](https://i.gzn.jp/img/2025/01/28/deepseek-ceo-interview/mohamed.jpg)
SemiAnalysis's chief analyst has called the DeepSeek V2 paper 'perhaps the best paper of the year,' and former OpenAI employee Andrew Carr has also said it is 'full of amazing wisdom.' In fact, Carr is using the DeepSeek V2 training settings for his own AI models. In addition, Anthropic co-founder Jack Clark said that DeepSeek has 'hired a group of immense geniuses,' and judged that the company has the potential to continue to innovate among Chinese AI models.
Regarding DeepSeek V2's high evaluation in Silicon Valley, CEO Liang said, 'A lot of innovation happens every day in the United States, and this is quite normal. I think they were surprised because, with DeepSeek V2, a Chinese company was joining as a contributor of innovation,' pointing out that Chinese companies have not been known for producing innovation so far.
He added, 'The cost of innovation is by no means low, and the past tendency toward indiscriminate borrowing was also related to China's previous situation. But whether you look at the size of China's economy or at large companies like ByteDance and Tencent, none of them are small by global standards. What we lack in innovation is definitely not capital, but confidence and the knowledge of how to organize high-density talent for effective innovation.' In other words, he attributes the lack of innovation in China not to a 'lack of funds' but to a 'lack of know-how to effectively gather and organize talented people.'
In addition, there is a concern that if an innovation is released as open source, it will quickly be copied by other companies and the company will lose its advantage. In response to this, CEO Liang said, 'When faced with disruptive technology, the moat created by closed source is only temporary. Even OpenAI's closed source approach could not prevent other companies from catching up. That's why we embed value in our team. Colleagues grow through this process, accumulate know-how, and form an organization and culture that enables innovation. A team of talented people is our moat.'
In addition, CEO Liang said, 'It is our view that China's AI should not be in a position to follow forever. It is often said that there is a one- or two-year difference between Chinese AI and American AI, but the real difference is the difference between originality and imitation. If this does not change, China will always be a follower.' 'NVIDIA's leadership is not the result of one company's efforts, but the result of cooperation between the Western technical community and the entire industry. They have an eye on the next generation of technology trends and a roadmap. China's AI development needs such an ecosystem. Many domestic chip developments lack support from the technical community and struggle because they can only get indirect information. China inevitably needs talent to be at the forefront of technology,' he said, emphasizing the need for Chinese AI companies to become the ones that create innovation.
in Software, Posted by logu_ii