The Problem We Were Actually Solving
In hindsight, the problem we were solving went well beyond event handling. We needed the game state to stay up to date even when players went offline or hit network latency. We needed to prevent cheating by detecting and penalizing players who tried to manipulate that state. On top of that, the system had to scale to thousands of concurrent players and millions of events per second.
We knew that event-driven architectures were the way to go, but we didn't fully appreciate the tradeoffs involved in choosing the right event streaming platform.
What We Tried First (And Why It Failed)
Initially, we chose Apache Kafka as the event streaming platform because of its popularity and strong community support. We soon ran into what we took to be built-in limitations of Kafka: high latency and limited topic partitioning capabilities. Our system consistently showed a 5-second lag between an event being produced and being processed, which compromised the overall gaming experience.
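A lag figure like the 5 seconds above is easier to reason about when it is measured per event rather than estimated. The sketch below is a minimal illustration, not our production code. It assumes each event carries a producer-side timestamp in milliseconds and that the consumer records when it handled the event. The `ObservedEvent` type and its field names are hypothetical, and the code deliberately uses no Kafka client API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedEvent:
    """One event as seen by a consumer (hypothetical shape, for illustration only)."""
    event_id: str
    produced_at_ms: int  # timestamp attached by the producer
    consumed_at_ms: int  # timestamp recorded when the consumer handled the event


def lag_ms(event: ObservedEvent) -> int:
    """End-to-end lag for one event, clamped at zero to absorb small clock skew."""
    return max(0, event.consumed_at_ms - event.produced_at_ms)


def percentile(values: list[int], pct: float) -> int:
    """Nearest-rank percentile; pct must be in the range (0, 100]."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


def summarize(events: list[ObservedEvent]) -> dict[str, int]:
    """Summarize lag across a batch of observed events."""
    lags = [lag_ms(e) for e in events]
    return {
        "p50_ms": percentile(lags, 50),
        "p99_ms": percentile(lags, 99),
        "max_ms": max(lags),
    }
```

Tracking a high percentile alongside the median matters here: a median that looks healthy can hide the slow tail that players actually notice. The clamp in `lag_ms` is a simplification that assumes producer and consumer clocks are roughly synchronized.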







