The Problem We Were Actually Solving

The hunt engine ran on Veltrix 1.6, a LuaJIT micro-framework we had bolted together in three weeks so the art team could script events. Every hunter spawned a coroutine. Each coroutine ran an EVALSHA against Redis to award loot atomically, then wrote the result to a single hunt_session table in Postgres 12 running with fsync=on.

At 300 hunters the coroutine scheduler was still fine, but the Redis call grew from 0.4 ms to 42 ms with 20 active slots in the connection pool. We watched RESP_PROTOCOL_ERROR spike: exactly 413 occurrences in sixty seconds. Postgres autovacuum started firing at 60 s intervals because the loot table churned two million rows per day, and each freeze added 400–600 ms to INSERT latency. The engine's P99 latency climbed to 1.8 s, and then clients started timing out.
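To make the pool bottleneck concrete, here is a minimal sketch of how a bounded connection pool stretches latency when every caller arrives at once. It is a toy model, not our engine code: the function names are hypothetical, and it assumes the pool is capped at 20 slots, that all 300 hunters issue their call at the same instant, and that every call takes a fixed 0.4 ms (400 µs) once it holds a connection.

```python
from typing import List


def simulated_latencies_us(callers: int, pool_slots: int, service_us: int) -> List[int]:
    """Latency (in microseconds) seen by each caller at a bounded pool.

    Assumption: all callers arrive at time zero and each call holds a
    connection for exactly `service_us`. Callers are served in waves of
    `pool_slots`, so caller i waits for every full wave ahead of it.
    """
    if pool_slots <= 0:
        raise ValueError("pool_slots must be positive")
    latencies = []
    for i in range(callers):
        waves_ahead = i // pool_slots          # full batches served before this caller
        latencies.append((waves_ahead + 1) * service_us)
    return latencies


def nearest_rank_percentile(values: List[int], pct: int) -> int:
    """Nearest-rank percentile using integer math only (pct in 1..100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = -(-pct * len(ordered) // 100)       # ceil(pct * n / 100)
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    lat = simulated_latencies_us(callers=300, pool_slots=20, service_us=400)
    print("median us:", nearest_rank_percentile(lat, 50))
    print("p99 us:", nearest_rank_percentile(lat, 99))
    print("max us:", max(lat))
```

Under these assumptions the slowest hunter waits through 15 waves, so a 0.4 ms call turns into 6 ms purely from queueing. That is well short of the 42 ms we measured, so treat the sketch as an illustration of the mechanism only; it ignores retries, protocol errors, and any slowdown on the Redis side.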

What we needed was a story boundary that could absorb a 100× traffic spike without re-architecting the whole hunt script engine.

What We Tried First (And Why It Failed)