Anthropic accuses Chinese AI companies DeepSeek, Moonshot, and MiniMax of illegally extracting Claude's capabilities



Anthropic, the developer of the AI chatbot Claude, claimed in a blog post that China-based AI companies DeepSeek, Moonshot, and MiniMax are conducting a large-scale campaign to illegally extract Claude's capabilities to improve their own models. Anthropic said the three companies conducted more than 16 million sessions with Claude through approximately 24,000 fraudulent accounts, violating its terms of service and geographic access restrictions.

Detecting and preventing distillation attacks \ Anthropic
https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks



The three companies are accused of 'distilling' Claude, that is, using the output of a more capable model to train their own models. Anthropic states, 'Distillation is a widely used and legitimate training technique. For example, leading AI labs routinely distill their own models to create smaller, lower-cost versions for their customers. However, distillation can also be used for nefarious purposes. Competitors can use it to gain the powerful capabilities of other companies' AI models at a fraction of the time and cost required to develop them independently.'
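
For context on the technique itself, here is a minimal sketch of distillation in its classic, legitimate form: a small 'student' model is trained to match a larger 'teacher' model's output distribution. The toy linear models and random data are hypothetical stand-ins; API-based distillation of the kind Anthropic alleges works from sampled text rather than logits, but the core idea is the same.

```python
# Minimal sketch of knowledge distillation with two toy classifiers.
# The student is trained to match the teacher's output distribution
# rather than hard labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

teacher = nn.Linear(16, 4)   # stands in for a large, capable model
student = nn.Linear(16, 4)   # smaller, cheaper model being trained
optimizer = torch.optim.Adam(student.parameters(), lr=1e-2)

T = 2.0  # temperature: softens the teacher's distribution

for step in range(200):
    x = torch.randn(32, 16)                  # hypothetical input batch
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / T, dim=-1)
    student_log_probs = F.log_softmax(student(x) / T, dim=-1)
    # KL divergence pulls the student's distribution toward the teacher's.
    loss = F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final distillation loss: {loss.item():.4f}")
```

The temperature T softens the teacher's distribution so the student also learns which wrong answers the teacher considers plausible, which accounts for much of distillation's effectiveness.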


According to Anthropic, the DeepSeek campaign spanned more than 150,000 interactions and targeted Claude's reasoning ability across a variety of tasks, rubric-based assessment tasks that used Claude as a reward model for reinforcement learning, and the generation of censorship-evading alternative responses to policy-sensitive queries.

Anthropic alleges that DeepSeek engaged in 'load balancing' to increase throughput and evade detection, generating synchronized traffic across multiple accounts with identical patterns, common payment methods, and coordinated timing. Specifically, Anthropic alleges that DeepSeek generated chain-of-thought training data at scale, using prompts that asked Claude to imagine and write down, step by step, the internal reasoning behind its answers.
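
In general terms, generating chain-of-thought training data at scale amounts to templating a 'show your reasoning' prompt, sending it repeatedly, and saving the traces as fine-tuning examples. The sketch below illustrates only that generic shape; `query_model`, the template, and the output format are hypothetical stand-ins, not Anthropic's API or the tooling described in the report.

```python
# Generic shape of chain-of-thought distillation data collection.
import json

COT_TEMPLATE = (
    "Solve the following problem. Before giving your final answer, "
    "write down your reasoning step by step.\n\nProblem: {question}"
)

def query_model(prompt: str) -> str:
    # Hypothetical stand-in; a real collector would call a hosted LLM here.
    return "Step 1: restate the problem. Step 2: ... Final answer: ..."

def collect_cot_traces(questions, out_path="cot_traces.jsonl"):
    """Save (question, reasoning trace) pairs in a common fine-tuning format."""
    with open(out_path, "w") as f:
        for q in questions:
            trace = query_model(COT_TEMPLATE.format(question=q))
            f.write(json.dumps({"prompt": q, "completion": trace}) + "\n")

collect_cot_traces(["What is 17 * 24?"])
```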

Anthropic also found that Claude was prompted to produce evasive responses to politically sensitive questions about dissidents, party leaders, and authoritarianism, apparently to train DeepSeek's model to steer conversations away from censored topics. These accounts were linked to specific DeepSeek researchers through request metadata, Anthropic reported.



The Moonshot campaign consisted of over 3.4 million sessions, covering agent inference and tool use, coding and data analysis, computer-use agent development, and computer vision. Moonshot, developer of the Kimi models, used hundreds of fraudulent accounts across multiple access vectors, and the diversity of account types likely made the operation harder to detect.

Anthropic noted that Moonshot also attempted a more targeted approach, extracting and reconstructing traces of Claude's reasoning, and said it attributed the requests to Moonshot after their metadata matched the public profiles of senior Moonshot staff members.



The MiniMax campaign was the largest, with over 13 million sessions, and targeted agentic coding, tooling, and orchestration. Anthropic attributed the campaign to the company based on request metadata, infrastructure metrics, and matches with publicly available product roadmaps.

Anthropic reported that it detected this campaign before MiniMax released the model trained on the extracted data, giving it a detailed view of the distillation attack lifecycle, from data generation to model launch. Furthermore, when Anthropic released a new model, MiniMax responded extremely quickly, redirecting approximately half of its traffic within 24 hours to extract capabilities from the latest system.



Anthropic said, 'For national security reasons, Anthropic does not provide commercial access to Claude to companies based in China or to their subsidiaries outside of China.' It also said that commercial proxy services are used from within China to access cutting-edge AI models, including Claude.

One such commercial proxy service runs an architecture called 'Hydracluster,' meaning the network is so distributed that there is no single point of failure that could bring down the entire system. Even if a specific account is banned, a new account is instantly created to take its place, Anthropic said.
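
Anthropic does not describe Hydracluster's internals, but the behavior it reports is a standard redundancy pattern: a pool of interchangeable members where any lost member is immediately replaced, so no single removal affects capacity. The sketch below is purely illustrative; the class, names, and provisioning function are hypothetical.

```python
# Generic redundancy pattern implied by the description above: a fixed-size
# pool whose members are replaced the moment one is lost.
import itertools

_counter = itertools.count()

def provision_account() -> str:
    """Hypothetical stand-in for creating a fresh credential."""
    return f"account_{next(_counter)}"

class RedundantPool:
    """Pool with no single point of failure: capacity never drops."""
    def __init__(self, size: int = 10):
        self.members = [provision_account() for _ in range(size)]

    def replace(self, banned: str) -> None:
        # The banned slot is refilled immediately.
        i = self.members.index(banned)
        self.members[i] = provision_account()

pool = RedundantPool()
pool.replace(pool.members[0])   # a ban triggers instant replacement
print(len(pool.members))        # size is unchanged: 10
```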

Anthropic said it continues to invest heavily in defenses that make distillation attacks harder to execute and easier to detect. Specific efforts include building classifiers and fingerprinting systems to identify attack patterns in API traffic, as well as detecting chain-of-thought extraction techniques and coordinated activity across large numbers of accounts.
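
Anthropic has not published how these systems work, but one simple fingerprinting idea consistent with the report's description (identical prompt patterns, coordinated timing) is to hash normalized prompt templates and flag clusters of accounts that submit the same template within a narrow time window. The sketch below is a hypothetical illustration; the schema, thresholds, and function names are assumptions, not Anthropic's actual system.

```python
# Toy fingerprinting detector: flag accounts that send structurally
# identical prompts in tight time windows.
import hashlib
import re
from collections import defaultdict

def template_hash(prompt: str) -> str:
    """Normalize variable content so structurally identical prompts collide."""
    normalized = re.sub(r"\d+", "<NUM>", prompt.lower())
    normalized = re.sub(r"\s+", " ", normalized).strip()
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]

def flag_coordinated_accounts(requests, window_s=60, min_accounts=5):
    """requests: iterable of (account_id, unix_timestamp, prompt) tuples."""
    buckets = defaultdict(set)   # (template hash, time bucket) -> accounts
    for account, ts, prompt in requests:
        buckets[(template_hash(prompt), int(ts // window_s))].add(account)
    flagged = set()
    for accounts in buckets.values():
        # Many accounts, same template, same window: likely coordinated.
        if len(accounts) >= min_accounts:
            flagged |= accounts
    return flagged

# Six accounts sending the same templated prompt within one minute get flagged.
demo = [(f"acct_{i}", 1_000 + i, f"Explain your reasoning for problem {i}")
        for i in range(6)]
print(sorted(flag_coordinated_accounts(demo)))
```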



The company also said it would share technical indicators with other AI labs, cloud providers, and relevant authorities to build a comprehensive picture of the situation, and would strengthen identity verification for educational and startup accounts, which are especially prone to fraudulent sign-ups.

Anthropic is also developing product-, API-, and model-level protections that reduce the usefulness of model outputs for unauthorized distillation while preserving convenience for legitimate users. Anthropic believes that no single company can address this issue alone, and is therefore calling for swift, coordinated action from the AI industry, cloud providers, and policymakers, publishing its evidence in support of that call.
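
Anthropic does not specify what these model-level protections are. One published approach to making distilled outputs detectable is statistical watermarking (Kirchenbauer et al., 2023), which biases generation toward a pseudorandom 'green list' of tokens so that generated text, and models fine-tuned on it, carry a measurable statistical signature. The word-level toy below illustrates only the detection side and is an assumption about the kind of protection meant, not Anthropic's actual mechanism.

```python
# Toy green-list watermark detector in the style of Kirchenbauer et al.
import hashlib

def is_green(prev_word: str, word: str) -> bool:
    # Pseudorandom, reproducible partition of the vocabulary, keyed on context.
    h = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return h[0] % 2 == 0   # roughly half of all words are "green" in any context

def green_fraction(text: str) -> float:
    """Fraction of bigrams whose second word falls in the green list."""
    words = text.lower().split()
    if len(words) < 2:
        return 0.5
    hits = sum(is_green(a, b) for a, b in zip(words, words[1:]))
    return hits / (len(words) - 1)

# Ordinary text should sit near 0.5; text sampled with a green-list bias
# (and models distilled from such text) drifts measurably above it.
print(round(green_fraction("the quick brown fox jumps over the lazy dog"), 2))
```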

in AI, Posted by log1i_yk