The Quest Begins (The "Why")

Honestly, I was just trying to keep my tiny side‑project from melting down during a launch‑day traffic spike. I’d thrown together a simple round‑robin proxy, watched the logs fill with 502s, and felt like Neo staring at a wall of green code—confused and a little overwhelmed. The problem wasn’t that we didn’t have enough servers; it was that the traffic wasn’t being spread fairly. Some nodes got hammered while others twiddled their thumbs, and the whole thing started to look like a boss fight where I kept dying on the same pattern.

I asked myself: What if the load balancer could actually see how busy each backend is, and send new requests to the least‑loaded one? That sounded like the secret move I needed to dodge Agent Smith’s barrage of requests.

The Revelation (The Insight)

The breakthrough came when I stopped thinking about static schedules (round robin, weighted round robin) and started thinking about dynamic state. The key insight: measure the current number of active connections (or request latency) on each backend and always pick the one with the smallest value. This is the Least Connections algorithm, and when you add a tiny health‑check layer, it becomes remarkably resilient.