Zero-Copy GPU Inference from WebAssembly on Apple Silicon
tl;dr: on Apple Silicon, a WebAssembly module's linear memory can be shared directly with the GPU: no copies, no serialization, no intermediate buffers. The CPU and GPU read and write the same physical bytes. End-to-end, it works: a Wasm guest fills a matrix in its linear memory, the GPU reads it, computes, writes back, and the guest sees the result through the same pointer, same memory, zero copies.

Normally Wasm and GPUs are separated by an expensive serialization boundary: on most hardware, ge...
Read more at abacusnoir.com