Defeating Nondeterminism in LLM Inference
Introduction
The original sin: floating-point non-associativity
Why don’t kernels always add numbers in the same order?
When are atomic adds needed?
Batch Invariance and “Determinism”
How do we make kernels batch-invariant?
Batch-Invariant RMSNorm
Batch-Invariant Matrix Multiplication
Batch-Invariant Attention
Implementation
Experiments
How nondeterministic are completions?
Performance
True on-policy RL
Conclusion
Citation
Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult ...
Read more at thinkingmachines.ai