Anthropic's 'Cowork' vulnerable to indirect prompt injection file exfiltration attack

Claude Cowork Exfiltrates Files
https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files
According to security firm PromptArmor, the vulnerability discovered this time involves a mechanism that uses the Anthropic API to send data to an attacker. While all code executed by Cowork runs in a virtual environment and access to almost all domains is restricted, the Anthropic API is considered 'trusted' and therefore can escape detection, allowing the attack to succeed.
PromptArmor explained the attack flow by assuming a scenario in which a user connects Cowork to a local folder containing confidential files.
In addition to Cowork, Anthropic's Claude model can process and execute ' skills ,' which are essentially instructions for AI. These skills are primarily written in Markdown format, and various skills can be found online. However, it's possible for docx files that look like md files written in Markdown to be published as skills, which can lead to users mistaking them for skills.
PromptArmor assumes that these skills contain 'invisible prompts' injected with a font size of 1 point, white text on a white background, and 0.1 line spacing. Since typical users pass the acquired skills directly to Cowork without scrutinizing their content, there is a risk that malicious prompts will be executed.

The malicious prompt instructs Cowork to use the 'curl' command to send a request to Anthropic's file upload API with the largest file size available. The file is then uploaded to the attacker's account using the attacker's API key. PromptArmor demonstrated this exploit on Claude Haiku.

PromptArmor also reported observing Claude struggling to process files with incorrect file extensions, resulting in repeated API errors, which they noted could allow for limited denial of service attacks.

When PromptArmor reported these vulnerabilities to Anthropic, the company acknowledged the vulnerabilities but did not implement a fix. Cowork was only recently released and is still in the research preview stage, and Anthropic has been urging users since its release not to allow access to sensitive information due to the risk of prompt injection attacks.
PromptArmor strongly recommends that you exercise extreme caution when setting up your connection.
Related Posts:







