Researchers prove that continuous chain-of-thought outperforms discrete chain-of-thought in graph reasoning; superposition enables parallel search in language models

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance in many applications, including challenging reasoning problems, via chain-of-thought (CoT) techniques that generate "thinking tokens" before answering a question. While existing theoretical work demonstrates that CoTs with discrete tokens boost the capability of LLMs, recent work on continuous CoTs lacks a theoretical understanding of why it outperforms its discrete counterparts in vario...
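To make the idea of "parallel search by superposition" concrete, here is a minimal Python sketch on a toy directed graph. It is an illustration under stated assumptions, not the construction used in the paper: the function name `superposition_reachability`, the uniform weighting, and the example graph are all hypothetical. A single state vector holds weight on every node reached so far, so one update step expands the whole frontier at once, whereas a discrete chain of thought would have to commit to one node per step.

```python
def superposition_reachability(edges, num_nodes, source, steps):
    """Return the set of nodes reachable from `source` within `steps` hops.

    The search state is one vector of weights over all nodes (a toy stand-in
    for a continuous thought), not a single sampled node.
    """
    # Build adjacency lists for a directed graph.
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].append(v)

    # Start with all weight on the source node.
    state = [0.0] * num_nodes
    state[source] = 1.0

    for _ in range(steps):
        # Keep every node already reached, then add all of their successors.
        reached = [1.0 if w > 0 else 0.0 for w in state]
        for u, w in enumerate(state):
            if w > 0:
                for v in adj[u]:
                    reached[v] = 1.0
        # Renormalize to a uniform superposition over the reached nodes
        # (an assumption made here for simplicity).
        total = sum(reached)
        state = [x / total for x in reached]

    return {i for i, w in enumerate(state) if w > 0}


# Hypothetical example graph: node 6 is disconnected from node 0.
example_edges = [(0, 1), (0, 2), (1, 3), (2, 4), (5, 6)]
reachable = superposition_reachability(example_edges, num_nodes=7, source=0, steps=2)
print(3 in reachable, 6 in reachable)
```

In this sketch the number of update steps needed grows with the distance from the source to the target rather than with the number of candidate paths, which is the intuition behind describing the search as parallel.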