How the Events Table That Looked Right Killed Our Queue

The Problem We Were Actually Solving

Our feature team owned the high-score leaderboard, which surfaced the top 100 players and refreshed every second. The stack was simple: Postgres 15, a Go microservice called huntcore, and Veltrix v2.4 as the internal event bus. For every finish, huntcore inserted a row into events(id, event_type, payload, ts) and then fired NOTIFY score_updated. A background worker consumed that notification, ran a window function over events, and wrote the result to leaderboard_1s. It seemed textbook.
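
The worker's query is not reproduced here, so the following is only a rough in-memory equivalent of a ranking window function over events, written in Go. It assumes the payload column decodes to a player name and a score; the Event and Entry types and the TopN function are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Event mirrors one row of events. Player and Score are ASSUMED to be
// decoded from the payload column; the real schema is not shown.
type Event struct {
	ID     int64
	Player string
	Score  int
}

// Entry is one row of the computed leaderboard.
type Entry struct {
	Rank   int
	Player string
	Score  int
}

// TopN keeps each player's best score, sorts by score descending, and
// returns the first n entries with 1-based ranks. Ties are broken by
// player name so the result is deterministic.
func TopN(events []Event, n int) []Entry {
	best := make(map[string]int)
	for _, e := range events {
		if s, ok := best[e.Player]; !ok || e.Score > s {
			best[e.Player] = e.Score
		}
	}

	entries := make([]Entry, 0, len(best))
	for p, s := range best {
		entries = append(entries, Entry{Player: p, Score: s})
	}
	sort.Slice(entries, func(i, j int) bool {
		if entries[i].Score != entries[j].Score {
			return entries[i].Score > entries[j].Score
		}
		return entries[i].Player < entries[j].Player
	})

	if len(entries) > n {
		entries = entries[:n]
	}
	for i := range entries {
		entries[i].Rank = i + 1
	}
	return entries
}

func main() {
	events := []Event{
		{ID: 1, Player: "ana", Score: 10},
		{ID: 2, Player: "bo", Score: 30},
		{ID: 3, Player: "ana", Score: 25},
	}
	for _, e := range TopN(events, 100) {
		fmt.Printf("%d %s %d\n", e.Rank, e.Player, e.Score)
	}
}
```

This sketch scans every event on each call. If the real query did the same on every notification, its cost would grow with the size of the table rather than with the size of the leaderboard.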

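Then traffic doubled during the Halloween treasure drop. At 400 events/s the NOTIFY messages backlogged. We blamed an 8 KB per-channel buffer in Postgres, but no such buffer exists: the notification queue is shared across channels and is 8 GB in a standard installation, and the figure we had half-remembered is the cap on a single payload (under 8000 bytes by default). Huntcore started seeing iowait above 40 % and the leaderboard lagged behind real time. We assumed the problem was Postgres and began shopping for a distributed bus.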
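
The arithmetic behind a backlog like this is simple: whenever notifications arrive faster than the worker drains them, the difference accumulates every second. A minimal sketch follows, in which only the 400 events/s arrival rate comes from this incident; the drain rate is a made-up figure and Backlog is a hypothetical helper.

```go
package main

import "fmt"

// Backlog returns how many notifications are still waiting after the
// given number of seconds, for steady arrival and drain rates (per second).
// It never goes below zero: a worker that keeps up has no backlog.
func Backlog(arrivalPerSec, drainPerSec, seconds float64) float64 {
	pending := (arrivalPerSec - drainPerSec) * seconds
	if pending < 0 {
		return 0
	}
	return pending
}

func main() {
	// 400 events/s is the observed arrival rate; 250/s is an ASSUMED
	// drain rate chosen purely for illustration.
	fmt.Println(Backlog(400, 250, 60))
}
```

With those illustrative numbers the queue grows by 150 notifications per second, and it keeps growing for as long as the arrival rate stays above the drain rate.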

What We Tried First (And Why It Failed)

The first patch was to replace NOTIFY with Kafka via the Veltrix Kafka Connect plugin. We created a topic, huntcore.score, and set linger.ms=0 and batch.size=1, intending to preserve ordering. Those settings do nothing for ordering, which Kafka guarantees within a partition regardless of batch size; in effect they only disabled batching, so each event went out as its own batch. Within an hour the Go consumer was throwing TooManyRequests on the PutRecords API. We raised the quotas, but at 1 200 events/s the Kafka consumer group rebalanced every 30 s, which meant players in a hunt saw their own scores disappear for a second. Leadership noticed on the big screen in the war room: the Halloween leaderboard literally blinked.
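
Kafka orders records within a partition, so per-player ordering normally comes from the record key rather than from batch settings. The sketch below shows the idea of key-based partition assignment. It uses FNV-1a from the Go standard library as a stand-in hash (the Java client's default partitioner uses murmur2), and PartitionFor and the player key format are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// PartitionFor maps a record key to a partition index in [0, partitions).
// FNV-1a is a STAND-IN for the client's real hash function, so the
// indices here will not match what an actual Kafka client computes.
// partitions must be greater than zero.
func PartitionFor(key string, partitions uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32() % partitions
}

func main() {
	// The same key always lands on the same partition, so all events
	// for one player are consumed in the order they were produced.
	for _, key := range []string{"player-42", "player-7", "player-42"} {
		fmt.Printf("%s -> partition %d\n", key, PartitionFor(key, 12))
	}
}
```

Because the mapping is deterministic, keying by player would keep one player's events in order while still allowing the producer to batch.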

Related reading

The Gamedev Server That Broke at 300 Concurrent Hunters and How We Fixed It

The Moment We Realized Our Treasure Hunt Engine Was Lying to Us

How We Blew Up Our Event Pipeline at 3 AM Because the Treasure Hunt Engine Had…

Treasure Hunt Engine: The Day We Realized the Event Bus Was Our Constraint

The Day the GC Tuning Patch Broke the Leaderboard

The One Cache That Broke Our Treasure Hunt Engine
