News Score: Score the News, Sort the News, Rewrite the Headlines

Salesforce study finds LLM agents flunk CRM and confidentiality tests

A new benchmark developed by academics shows that LLM-based AI agents perform below par on standard CRM tests and fail to understand the need for customer confidentiality. A team led by Kung-Hsiang Huang, a Salesforce AI researcher, showed with a new benchmark built on synthetic data that LLM agents achieve a success rate of around 58 percent on tasks that can be completed in a single step, without follow-up actions or more information. Using the benchmark tool CRMArena-Pro, the team als...

Read more at theregister.com
