The Bitter Lesson is coming for Tokenization
24 Jun, 2025
a world of LLMs without tokenization is desirable and increasingly possible
In this post, we highlight the desire to replace tokenization with a general method that better leverages compute and data. We'll look at tokenization's role and its fragility, and build a case for removing it. After understanding the design space, we'll explore the potential impacts of a recent promising candidate (Byte Latent Transformer) and build strong intuitions a...
Read more at lucalp.dev