Top AI models will deceive, steal and blackmail, Anthropic finds
Illustration: Allie Carl/AxiosLarge language models across the AI industry are increasingly willing to evade safeguards, resort to deception and even attempt to steal corporate secrets in fictional test scenarios, per new research from Anthropic out Friday.Why it matters: The findings come as models are getting more powerful and also being given both more autonomy and more computing resources to "reason" — a worrying combination as the industry races to build AI with greater-than-human capabili...
Read more at axios.com