New AI Reward System Slashes Language Model Hallucinations by 39.3%, Maintains Performance Across Tasks

Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations

Abstract: Language models often generate factually incorrect information unsupported by their training data, a phenomenon known as extrinsic hallucination. Existing mitigation approaches often degrade performance on open-ended generation and downstream tasks, limiting their practical utility. We propose an online reinforcement learning method using a novel binary retrieval-augmented reward (RAR) to address this tradeoff. Unlike continuous reward schemes, our approach ...