Speeding up PyTorch inference by 87% on Apple devices with AI-generated Metal kernels
Published on August 26, 2025
Authors: Taras Sereda, Natalie Serrino, Zain Asgar

tl;dr: Our lab investigated whether frontier models can write optimized GPU kernels for Apple devices to speed up inference. We found that they can: our AI-generated Metal kernels were 1.87x faster across 215 PyTorch modules, with some workloads running hundreds of times faster than baseline.

Why...
Read more at gimletlabs.ai