The Problem We Were Actually Solving
At the time, we were trying to optimize our server for high-latency environments. We wanted our Treasure Hunt engine to stay stable and fast, even during network partitions or heavy packet loss. The goal was to minimize the impact of event processing on the server's performance. But, as often happens, our optimization effort quickly spiraled out of control.
What We Tried First (And Why It Failed)
We first tried a highly optimized event bus library, one that promised to minimize the overhead of event dispatching. The idea was to decouple our event handling logic from the rest of the server so that we could scale event processing independently of our main business logic. It sounded good in theory, but in practice it was a nightmare. The library had a massive memory footprint, which, combined with the memory leaks in our own server, brought the system to its knees. It was also notoriously difficult to debug, which made it almost impossible to pinpoint the root cause of our problems.
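To make the decoupling idea concrete, here is a minimal sketch of what an event bus does. It is not the library we used and not our actual code: the `EventBus` class, the `treasure_found` topic, and the `player-1` payload are all hypothetical names chosen for illustration. The publisher only knows a topic name, and handlers register themselves separately.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

# A handler is any callable that accepts the event payload.
Handler = Callable[[Any], None]


class EventBus:
    """Hypothetical, minimal synchronous publish/subscribe bus (illustration only)."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers registered for it.
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for a topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Call every handler registered for the topic; return how many were called."""
        # .get() avoids creating an empty entry for topics nobody subscribed to.
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Usage: the publisher never references the handler directly.
bus = EventBus()
found: List[str] = []  # hypothetical sink standing in for real game logic
bus.subscribe("treasure_found", found.append)
delivered = bus.publish("treasure_found", "player-1")
```

Even in a toy version like this, the bus holds a reference to every handler for as long as it is subscribed, so handlers that are never removed stay in memory. That is the general kind of bookkeeping that can make memory problems harder to trace once a bus sits between every producer and consumer.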
The Architecture Decision







