Study of 19 LLMs across 52 professional domains finds top AI models corrupt 25% of documents during long delegated workflows; errors compound over time and worsen with document size.

LLMs Corrupt Your Documents When You Delegate

Abstract: Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust: the expectation that the LLM will faithfully execute the task without introducing errors into documents. We introduce DELEGATE-52 to study the readiness of AI systems in delegated workflows. DELEGATE-52 simulates long delegated workflows that require in-depth document editing ac...