Anthropic rewrites Claude AI's constitution, shifting from rule-following to teaching reasoning; acknowledges uncertainty about AI consciousness, wellbeing

Anthropic rewrites Claude’s guiding principles—and reckons with the possibility of AI consciousness | Fortune

Anthropic is overhauling a foundational document that shapes how its popular Claude AI model behaves. The AI lab is moving away from training the model to follow a simple list of principles—such as choosing the response that is least racist and sexist—to instead teach the AI why it should act in certain ways.“We believe that in order to be good actors in the world, AI models like Claude need to understand why we want them to behave in certain ways rather than just specifying what we want them to...