Relace open-sources the dataset curation, training methods, and inference techniques behind Relace Apply 3, a code-editing model that runs at 10,000+ tokens per second; it uses fine-tuned small models to merge diffs faster and cheaper than Claude 4.5 Sonnet

A Year of Fast Apply — The Path to 10k Tokens per Second

A year ago today, we released our first Fast Apply model publicly. Since then, we’ve learned a lot about how to fine-tune small, specialized models for code-specific tasks. Today, we’re open-sourcing what we’ve learned in training this series of models: the dataset curation, training methods, and inference techniques that led to Relace Apply 3, our best model yet, capable of running at 10k+ tokens per second while maintaining state-of-the-art accuracy.

Error rate comparison on 500 randomly sampled ...