The Problem We Were Actually Solving

Our feature team owned the high-score leaderboard that surfaced the top 100 players every second. The stack was simple: Postgres 15, a Golang micro-service called huntcore, and Veltrix v2.4 as the internal event bus. Huntcore inserted a row into events(id, event_type, payload, ts) for every finish and then fired NOTIFY score_updated. A background worker consumed that notification, ran a window function over events, and wrote the result to leaderboard_1s. Seemed textbook.
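For readers who want the ranking step in concrete terms, here is a minimal in-memory sketch of what the window function computes: each player's best score, sorted descending, cut to the top N. The ScoreEvent and Entry types, the PlayerID and Score fields, and the tie-break on player ID are assumptions made for illustration. The post does not show the payload schema or the real query.

```go
package main

import (
	"fmt"
	"sort"
)

// ScoreEvent is a hypothetical decoded form of one events row.
// The real payload layout is not shown in this post.
type ScoreEvent struct {
	PlayerID string
	Score    int
}

// Entry is one row of the computed leaderboard (hypothetical shape).
type Entry struct {
	Rank     int
	PlayerID string
	Score    int
}

// TopN keeps each player's best score and returns the n highest,
// breaking ties by player ID so the output is deterministic.
func TopN(events []ScoreEvent, n int) []Entry {
	best := make(map[string]int)
	for _, e := range events {
		if cur, ok := best[e.PlayerID]; !ok || e.Score > cur {
			best[e.PlayerID] = e.Score
		}
	}

	entries := make([]Entry, 0, len(best))
	for id, score := range best {
		entries = append(entries, Entry{PlayerID: id, Score: score})
	}

	sort.Slice(entries, func(i, j int) bool {
		if entries[i].Score != entries[j].Score {
			return entries[i].Score > entries[j].Score
		}
		return entries[i].PlayerID < entries[j].PlayerID
	})

	if n >= 0 && len(entries) > n {
		entries = entries[:n]
	}
	// Assign 1-based ranks, like ROW_NUMBER() over the sorted rows.
	for i := range entries {
		entries[i].Rank = i + 1
	}
	return entries
}

func main() {
	events := []ScoreEvent{
		{PlayerID: "ana", Score: 10},
		{PlayerID: "bo", Score: 30},
		{PlayerID: "ana", Score: 50},
	}
	for _, e := range TopN(events, 100) {
		fmt.Println(e.Rank, e.PlayerID, e.Score)
	}
}
```

Note that this recomputes the whole ranking from scratch on every call, which mirrors the worker re-running its query after each notification.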

Then traffic doubled during the Halloween treasure drop. The NOTIFY messages backlogged at 400 events/s, and we blamed an 8 KB buffer on each LISTEN channel. (Postgres has no such per-channel buffer: each payload is capped at just under 8000 bytes, and all channels share one notification queue, 8 GB by default.) Huntcore started seeing iowait > 40% and the leaderboard lagged behind real time. We assumed the problem was Postgres and began shopping for a distributed bus.

What We Tried First (And Why It Failed)

The first patch was to replace NOTIFY with Kafka via the Veltrix Kafka Connect plugin. We created a topic, huntcore.score, and set linger.ms=0 and batch.size=1, thinking that would preserve ordering; in practice those settings only disable batching, because Kafka orders records per partition. Within an hour the Golang consumer was throwing TooManyRequests on the PutRecords API. We raised the quotas, but at 1,200 events/s the Kafka consumer group rebalanced every 30 s, which meant players in the middle of a hunt saw their own score disappear for a second. Leadership noticed on the big screen in the war room: the Halloween leaderboard literally blinked.
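On the ordering point: Kafka preserves order only within a partition, and records that share a key land in the same partition, whatever linger.ms or batch.size are set to. The sketch below illustrates that key-to-partition mapping. The partitionFor function and the player-ID keys are hypothetical, and FNV-1a stands in for the murmur2 hash that Kafka's default Java partitioner uses, so the partition numbers will not match a real cluster.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a record key to a partition index.
// Illustrative only: FNV-1a is a stand-in for Kafka's murmur2 hash.
// numPartitions must be greater than zero.
func partitionFor(key string, numPartitions uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32() % numPartitions
}

func main() {
	const partitions = 6 // hypothetical partition count for huntcore.score

	// Every event keyed by the same player resolves to the same
	// partition, so that player's events stay in order.
	for _, key := range []string{"player-42", "player-42", "player-7"} {
		fmt.Printf("%s -> partition %d\n", key, partitionFor(key, partitions))
	}
}
```

Under this scheme ordering is per key, not global: two different players' events may be consumed in either order, which is usually acceptable for a leaderboard that only needs each player's latest score.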