The Problem We Were Actually Solving
We thought we were solving the classic event-sourcing problem – storing and replaying events from a distributed system to ensure data consistency. Sounds simple, right? In theory, yes, but in practice, it's a minefield. The event store has to be designed to handle a high volume of events, and our system was built on top of a MongoDB instance with a collection optimized for writing events. Sounds okay, but what we didn't realize was that our event schema would evolve rapidly due to changing business requirements.
What We Tried First (And Why It Failed)
Our initial approach was to store events in an unsharded, denormalized collection with a single field for each event attribute. We thought this would speed up queries and avoid costly joins, but we soon realized that our event volume would exceed MongoDB's performance limits, causing our system to slow down significantly. To make matters worse, our unsharded collection caused uneven distribution of writes, leading to hotspots and eventual data inconsistencies.
The Architecture Decision







