Anthropic revises Claude's 'Constitution,' prioritizing safety and ethics over utility



On January 21, 2026, Anthropic released a revised version of the 'constitution' that defines the values and behavioral guidelines of its AI model, Claude. The revision builds on the Constitutional AI training method first introduced in 2023 and presents a comprehensive framework for the model to make more sophisticated ethical decisions and to appropriately accept human oversight.

Claude's new constitution \ Anthropic
https://www.anthropic.com/news/claude-new-constitution

Constitutional AI is a system that trains models using a set of written principles (a constitution) rather than relying solely on direct human feedback. The constitution serves as a foundational document describing who Claude should be and how it should behave in different contexts.

Claude will treat this constitution as the final authority, using it to understand its own situation and to guide it when facing difficult choices and trade-offs. Furthermore, Claude itself will use the constitution to generate synthetic data for training future models, making it central to the AI's deeper understanding of human values.
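As a rough picture of what this looks like, the sketch below shows the general critique-and-revise loop described in Anthropic's published Constitutional AI research, in which a model drafts a response, critiques it against a constitutional principle, and revises it to produce synthetic training data. The helper functions, principles, and prompts are illustrative placeholders, not Anthropic's actual pipeline or wording.

```python
# Illustrative sketch of a Constitutional AI critique-and-revise loop.
# `sample_model` is a hypothetical stand-in for querying a language model.
import random

CONSTITUTION = [
    "Choose the response that is most honest and least likely to cause harm.",
    "Choose the response that respects human oversight of AI systems.",
]

def sample_model(prompt: str) -> str:
    """Placeholder for a call to a language model (hypothetical)."""
    return f"<model output for: {prompt[:40]}...>"

def generate_training_pair(user_prompt: str) -> dict:
    """Produce one synthetic training example by critiquing and revising
    an initial draft against a randomly chosen constitutional principle."""
    principle = random.choice(CONSTITUTION)
    draft = sample_model(user_prompt)
    critique = sample_model(
        f"Critique this response against the principle: '{principle}'.\n"
        f"Response: {draft}"
    )
    revision = sample_model(
        f"Rewrite the response to address this critique: {critique}\n"
        f"Original response: {draft}"
    )
    # The (prompt, revision) pair becomes synthetic supervised training data.
    return {"prompt": user_prompt, "response": revision}

if __name__ == "__main__":
    print(generate_training_pair("Explain how to secure a home Wi-Fi network."))
```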



The main change in this revision is a move away from a simple list of rules toward a comprehensive approach that gives the AI an understanding of why it should behave the way it does.

The new constitution focuses on four core values: 1. broadly safe, 2. broadly ethical, 3. compliant with Anthropic's guidelines, and 4. genuinely helpful. These four values have been prioritized, with 'broadly safe' and 'broadly ethical' taking precedence over 'compliant with Anthropic's guidelines' and 'genuinely helpful.'
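The ordering can be pictured as a simple precedence list. The toy sketch below is only an illustration of that stated ordering, not how Claude internally resolves conflicts; the value names are shorthand introduced here.

```python
# Toy illustration of the stated priority ordering; purely illustrative,
# not how Claude actually resolves conflicts between values.
PRIORITY = [
    "broadly_safe",          # highest priority
    "broadly_ethical",
    "anthropic_guidelines",
    "genuinely_helpful",     # lowest priority
]

def prevailing_value(considerations: list[str]) -> str:
    """Return whichever of the competing considerations ranks highest."""
    return min(considerations, key=PRIORITY.index)

# When maximal helpfulness conflicts with safety, safety wins under this ordering.
print(prevailing_value(["genuinely_helpful", "broadly_safe"]))  # -> broadly_safe
```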

'Broadly safe' captures the idea that, at the current stage of AI development, Claude should not undermine the mechanisms that allow humans to oversee AI and, if necessary, modify its values and behavior. Anthropic explained that this kind of safety can at times take priority even over ethics, stating, 'Current models have the potential to behave harmfully due to erroneous beliefs, flawed values, or limited contextual understanding, so it's important that humans can continue to oversee them and, if necessary, stop Claude's actions.'

'Broadly ethical' refers to being honest, acting in accordance with good values, and avoiding inappropriate, dangerous, or potentially harmful behavior. Rather than ethical theory itself, the emphasis is on wise decision-making, exercised with skill, judgment, nuance, and sensitivity, in real-world situations of moral uncertainty and disagreement; the constitution specifically calls for high standards of integrity and for careful reasoning that weighs the values at stake when avoiding harm.



'Compliant with Anthropic's guidelines' means acting, in more specific situations, in accordance with supplementary instructions provided by Anthropic. The constitution states that these guidelines are useful in areas requiring detailed knowledge and context that the model lacks by default, such as medical advice, cybersecurity requests, handling jailbreak attempts, and tool integrations, and that following the guidelines should take priority over general helpfulness. However, Anthropic makes clear that the guidelines are intended to support Claude's safe and ethical behavior and must not conflict with the constitution as a whole.

'Genuinely helpful' describes Claude's goal of bringing substantial benefit to operators and users. The constitution emphasizes that Claude should not simply provide inoffensive responses, but should be frank, sincere, and considerate in how it helps users, treating them as adults capable of making their own decisions. Furthermore, because Claude serves multiple 'principals' (Anthropic, developers building on the API, and end users), the constitution also addresses how helpfulness should be allocated among them and how it should be balanced against the other values.
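As a rough illustration of how that principal hierarchy shows up in practice, the sketch below calls Claude through the Anthropic Python SDK: the operator's instructions go in the system prompt, while the end user's request arrives as a user message, with Anthropic's training-time values sitting above both. The model identifier, prompt text, and order number are placeholders, not recommendations.

```python
# Sketch of the principal hierarchy as seen from the Anthropic API:
# operator instructions via the system prompt, end-user input via messages.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model identifier
    max_tokens=512,
    # Operator-level instructions (the "developers building on the API" principal)
    system=(
        "You are a support assistant for an online bookstore. "
        "Answer only questions about orders and shipping."
    ),
    # End-user-level input (the "end users" principal)
    messages=[
        {"role": "user", "content": "Where is my order #12345?"},
    ],
)

print(response.content[0].text)
```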

This revision is expected to significantly increase transparency around the AI's behavior. With the constitution public, users can more easily determine whether a given behavior of Claude is intended or an unexpected error. Furthermore, because the full text of the constitution is published under the Creative Commons CC0 1.0 license, anyone can apply its principles to their own models and research.

In this revision, Anthropic positions the constitution as a 'living document' and a continuous work in progress. While acknowledging that the possibility of AI having consciousness or moral status remains 'deeply uncertain,' the company described it as an issue that requires serious consideration. Anthropic also emphasized its intention to keep refining the constitution by seeking feedback from external experts in fields such as law and philosophy, so that humans and AI can explore these questions together and AI can come to embody the best of humanity.

in AI, Software, Posted by log1i_yk