The Problem We Were Actually Solving
I still remember the day our event processing pipeline started falling behind, unable to keep up with the volume of incoming data. We were using Go at the time, and while it had served us well, garbage collection pauses were killing our latency numbers. Our average processing time was around 50ms, but those pauses could stretch up to 200ms, and during each one the queue backed up faster than we could drain it. I knew we needed a change, but I was hesitant to switch languages given the investment we had already made in Go. Profiling the application with pprof settled it: we were spending over 30% of our CPU time in GC.
What We Tried First (And Why It Failed)
Before abandoning Go entirely, I tried tuning the garbage collector, hoping to find a sweet spot that would minimize pauses without sacrificing too much throughput. Go's runtime has no selectable GC modes; the main knob is the GOGC setting, so I swept it from aggressive values that collect often to relaxed values that let the heap grow much larger between cycles. Nothing worked. Latency would improve slightly, but memory usage would skyrocket until the OOM killer stepped in and terminated our process. It became clear that we needed a more fundamental change than another configuration tweak. I also tried replacing some mutexes with sync/atomic operations to cut lock contention, but we were fighting a losing battle.
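For readers who have not touched these settings, here is a rough sketch of what GC tuning looks like from inside a Go program. It uses debug.SetGCPercent (the programmatic form of GOGC) and debug.SetMemoryLimit (the form of GOMEMLIMIT, available from Go 1.19). The gcTuning type, the applyTuning helper, and the values 200 and 1 GiB are assumptions chosen for illustration, not the settings from our pipeline.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// gcTuning groups the two runtime knobs this sketch adjusts.
// The type is hypothetical; the values used below are illustrative only.
type gcTuning struct {
	Percent     int   // GOGC equivalent: heap growth allowed before the next cycle
	MemoryLimit int64 // GOMEMLIMIT equivalent: soft limit in bytes (Go 1.19+)
}

// applyTuning installs the given settings and returns the previous ones,
// so a caller can restore them after an experiment.
func applyTuning(t gcTuning) gcTuning {
	return gcTuning{
		Percent:     debug.SetGCPercent(t.Percent),
		MemoryLimit: debug.SetMemoryLimit(t.MemoryLimit),
	}
}

func main() {
	// A higher percent means fewer collections but a larger heap, which is
	// the trade-off that can end with the OOM killer. The soft memory limit
	// makes the collector work harder as the heap approaches the cap.
	prev := applyTuning(gcTuning{Percent: 200, MemoryLimit: 1 << 30})
	fmt.Printf("previous settings: GOGC=%d, limit=%d bytes\n",
		prev.Percent, prev.MemoryLimit)
}
```

The memory limit is soft: the runtime collects more aggressively near the cap but can still exceed it, so it narrows the trade-off between pauses and memory rather than removing it.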






