Zalando replaces shared ingress load balancer with client-side routing for million-request-per-second Product API; cuts latency spikes, improves incident clarity through in-process hash-ring load balancing across 25 European markets

Client-Side Load Balancing at a Million Requests Per Second

Our busiest API ran its high-volume internal traffic through the cluster's shared edge ingress load balancer. For years we could never be sure whether a latency spike came from our own code or from reusing that shared edge router internally.

In a previous post, we described how we built Zalando's Product Read API (PRAPI), serving millions of requests per second with single-digit-millisecond latency across 25 European markets. Every product page, search result, and checkout depends on it. A brief ...