Most of us assume load balancing is a solved problem, but that could not be further from the truth, and the problem gets even worse when you run multiple load balancers. Most of us use “random” or “round-robin” techniques, which have certain advantages but can be highly inefficient. Others use more complex algorithms like “least-conns,” which can be more efficient but have horrific edge cases. “Consistent hashing” is a very useful technique, but it applies only to certain problems.
Several factors, in both theory and practice, make efficient load balancing an exceptionally hard problem: Poisson request arrivals, exponentially distributed response latency, and oscillations when multiple load balancers share data. Luckily, techniques and algorithms have been developed that can make life better. Tyler McMullen explains some of the ways we can do better than “random,” “round-robin,” and naive “least-conns,” even with distributed load balancers.
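
To make the inefficiency concrete, here is a minimal simulation sketch, not material from the talk itself. It assumes a single load balancer in front of identical servers that each process one request at a time in FIFO order, with Poisson arrivals and exponentially distributed service times. The function name `simulate`, its parameters, and the default rates are all hypothetical choices for illustration. The “least-conns” variant here is idealized: it sees exact, instantaneous connection counts, which a real or distributed balancer would not.

```python
import random


def simulate(policy, n_servers=10, n_requests=20000,
             arrival_rate=8.0, service_rate=1.0, seed=1):
    """Return the mean response time (queueing + service) for one policy.

    Assumptions (illustrative only): one load balancer, identical
    single-threaded FIFO servers, Poisson arrivals, exponential service times.
    """
    # Separate streams so every policy sees the same arrivals and service times.
    arrivals = random.Random(seed)
    services = random.Random(seed + 1)
    choices = random.Random(seed + 2)

    free_at = [0.0] * n_servers                 # when each server's backlog drains
    in_flight = [[] for _ in range(n_servers)]  # finish times of open requests
    now = 0.0
    total_response = 0.0

    for k in range(n_requests):
        # Exponential gaps between requests give a Poisson arrival process.
        now += arrivals.expovariate(arrival_rate)

        # Forget requests that completed before this arrival.
        for i in range(n_servers):
            in_flight[i] = [t for t in in_flight[i] if t > now]

        if policy == "random":
            s = choices.randrange(n_servers)
        elif policy == "round_robin":
            s = k % n_servers
        elif policy == "least_conns":
            # Idealized: exact connection counts, ties go to the lowest index.
            s = min(range(n_servers), key=lambda i: len(in_flight[i]))
        else:
            raise ValueError(f"unknown policy: {policy}")

        start = max(now, free_at[s])            # wait behind the server's backlog
        finish = start + services.expovariate(service_rate)
        free_at[s] = finish
        in_flight[s].append(finish)
        total_response += finish - now

    return total_response / n_requests


if __name__ == "__main__":
    for policy in ("random", "round_robin", "least_conns"):
        print(f"{policy:12s} mean response time: {simulate(policy):.2f}")
```

With these defaults each server runs at about 80% utilization. Under that load, “random” tends to produce the longest queues and the idealized “least-conns” the shortest. The sketch deliberately leaves out the edge cases and multi-balancer oscillations mentioned above.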