Building a Rate Limiter That Actually Works
Quick context (why I'm writing this)
I was knee‑deep in a side‑project API last month when the service started returning 429s out of nowhere. The client library I was using had a naïve “max requests per second” check: it let the client fire its whole allowance in one burst, hammer downstream services, and then go silent for a full second. I spent three hours staring at logs, wondering why my “simple” limiter was either too lax or too brutal. Then it hit me: the problem wasn’t the limit itself, it was how I was counting time.
If you’ve ever built an endpoint that needs to protect a downstream DB, a third‑party API, or just keep your own service from melting down, you’ve probably run into the same thing. The usual “reset every second” approach feels intuitive, but it hides a subtle flaw that shows up under real traffic.
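To make the “reset every second” pattern concrete, here is a minimal Python sketch of that kind of counter. It is an illustration only, not the code from the client library mentioned above: `FixedWindowLimiter`, `allow`, and `simulate` are hypothetical names, and timestamps are passed in explicitly (rather than read from a clock) so the behavior is reproducible.

```python
import math


class FixedWindowLimiter:
    """Naive "reset every second" counter (hypothetical, for illustration)."""

    def __init__(self, limit, window=1.0):
        self.limit = limit            # max requests allowed per window
        self.window = window          # window length in seconds
        self.current_window = None    # index of the window being counted
        self.count = 0                # requests admitted in that window

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        window_id = math.floor(now / self.window)
        if window_id != self.current_window:
            # A new window has started: forget everything about the old one.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def simulate(timestamps, limit):
    """Feed a list of arrival times through a fresh limiter."""
    limiter = FixedWindowLimiter(limit)
    return [limiter.allow(t) for t in timestamps]


# Three requests in the first 20 ms use up the whole allowance, so everything
# else in that second is rejected until the counter resets at t = 1.0.
print(simulate([0.00, 0.01, 0.02, 0.50, 0.99, 1.00], limit=3))
# -> [True, True, True, False, False, True]
```

With a limit of 3, the whole budget is gone within the first 20 ms and every later request in that window is refused, which is the burst‑then‑silence pattern described above.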
The Insight






