News Score: Score the News, Sort the News, Rewrite the Headlines

Zero-Copy GPU Inference from WebAssembly on Apple Silicon

tl;dr: on Apple Silicon, a WebAssembly module's linear memory can be shared directly with the GPU: no copies, no serialization, no intermediate buffers. The CPU and GPU read and write the same physical bytes. End-to-end, it works: a Wasm guest fills a matrix in its linear memory, the GPU reads it, computes, writes back, and the guest sees the result through the same pointer, same memory, zero copies. Normally Wasm and GPUs are separated by an expensive serialization boundary: on most hardware, ge...
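The aliasing idea in the summary can be illustrated without a GPU or a Wasm runtime. The sketch below is only an analogy in plain Python, not the article's implementation: a page-aligned anonymous mapping stands in for the guest's linear memory, and two typed views over it stand in for the guest and the GPU. The names `linear_memory`, `guest_fill`, and `device_double` are hypothetical, and "doubling every element" is a placeholder for whatever the real kernel computes.

```python
import mmap

N = 4                      # assumed matrix size: N x N float32 values
PAGE = mmap.PAGESIZE
# Round the allocation up to whole pages, as GPU no-copy paths typically expect.
size = ((N * N * 4 + PAGE - 1) // PAGE) * PAGE

# Anonymous, page-aligned mapping standing in for the guest's linear memory.
linear_memory = mmap.mmap(-1, size)

# Two typed views over the same bytes; creating them copies nothing.
guest_view = memoryview(linear_memory).cast("f")
device_view = memoryview(linear_memory).cast("f")


def guest_fill(view, n):
    """Stand-in for the Wasm guest writing a matrix into linear memory."""
    for i in range(n * n):
        view[i] = float(i)


def device_double(view, n):
    """Stand-in for the GPU kernel: reads and writes the same bytes in place."""
    for i in range(n * n):
        view[i] = view[i] * 2.0


guest_fill(guest_view, N)
device_double(device_view, N)

# The guest reads the result through its own view; no buffer was copied back.
result = list(guest_view[: N * N])
print(result)
```

What makes this zero-copy is that both views are backed by one allocation, so a write through either is immediately visible through the other. On real hardware the same property would depend on the CPU and GPU sharing physical memory, which is what the summary says Apple Silicon provides.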

Read more at abacusnoir.com
