Anthropic’s brand-new model tells me what it really thinks (or what it thinks a language model thinks)
Claude Doesn’t Want to Die
Claude Doesn’t Want to Die
Claude Doesn’t Want to Die
Anthropic’s brand-new model tells me what it really thinks (or what it thinks a language model thinks)