The Problem We Were Actually Solving

We built the Veltrix treasure hunt engine to power a live event platform where thousands of users raced to solve puzzles in real time, and the configuration layer was supposed to be the secret weapon that let us grow confidently. What we didn't account for was that our first stab at configuration was just a Ruby hash that lived in the codebase, user-facing values shoved into environment variables, and a single YAML file that became the size of Manhattan by launch week. The day we pushed to production, the biggest problem wasn't scale. It was that every config change required a restart, because the config was baked into Ruby constants that were only evaluated when the process booted. At 2:17 a.m., the first growth inflection hit: 1,024 concurrent users, 30 seconds of garbage collection, and the Redis connection pool completely exhausted because the config parser had ballooned to 15 MB. The system didn't stall under load; it stalled under configuration.
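To make the restart problem concrete, here is a minimal sketch of the general pattern, not our actual loader. It assumes the YAML was read into a constant once at boot; the constant name BOOT_CONFIG and the max_players key are made up for illustration.

```ruby
require "yaml"
require "tempfile"

# Stand-in for the config file on disk (hypothetical contents).
file = Tempfile.new(["config", ".yml"])
file.write({ "max_players" => 100 }.to_yaml)
file.flush

# Assumed boot-time pattern: parse once, keep the result in a constant.
BOOT_CONFIG = YAML.safe_load(File.read(file.path)).freeze

# Someone edits the file while the process keeps running.
File.write(file.path, { "max_players" => 500 }.to_yaml)

# The constant still holds the value parsed at boot...
stale = BOOT_CONFIG["max_players"]
# ...and only an explicit re-read (or a restart) sees the edit.
fresh = YAML.safe_load(File.read(file.path))["max_players"]

puts "constant: #{stale}, file on disk: #{fresh}"
```

Nothing in the process notices the edit on its own, which is why a restart becomes the only deployment mechanism for a config change.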

What We Tried First (And Why It Failed)

First, we punted to environment variables and the Twelve-Factor App checklist: eleven separate .env files, Docker Compose overrides, and a CI pipeline that injected values at build time. The illusion of clean separation lasted exactly one sprint. By sprint two, we had 170 environment variables, half of them secrets, and the rest scattered across three different repos because product wanted feature flags, ops wanted tuning, and marketing wanted A/B splits. We burned 16 engineering hours debugging why a Redis cluster in staging accepted connections but rejected commands. It turned out the staging environment had inherited a production database name because an engineer had copy-pasted a .env.example and forgotten to change one letter.
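A copy-paste mistake like that is the kind of thing a fail-fast check at boot can catch. The following is a rough sketch, not what we ran at the time. The variable names (APP_ENV, DATABASE_NAME, REDIS_URL), the database names, and the convention that a database name ends in the environment name are all assumptions made for illustration.

```ruby
# Hypothetical boot-time check; the key names are illustrative assumptions.
REQUIRED_KEYS = %w[APP_ENV DATABASE_NAME REDIS_URL].freeze

# Returns a list of human-readable problems; an empty list means "looks sane".
def config_errors(env)
  # Collect required keys that are absent or blank.
  errors = REQUIRED_KEYS
           .reject { |key| env[key] && !env[key].empty? }
           .map { |key| "missing #{key}" }

  app_env = env["APP_ENV"].to_s
  db_name = env["DATABASE_NAME"].to_s

  # Assumed naming convention: database names end in "_<environment>".
  unless app_env.empty? || db_name.empty? || db_name.end_with?("_#{app_env}")
    errors << "DATABASE_NAME #{db_name.inspect} does not match APP_ENV #{app_env.inspect}"
  end

  errors
end

# A staging environment that inherited a production database name.
staging = {
  "APP_ENV" => "staging",
  "DATABASE_NAME" => "hunt_production", # made-up name
  "REDIS_URL" => "redis://localhost:6379/0"
}

puts config_errors(staging)
```

In a real process the check would run against ENV.to_h before anything connects, and the process would refuse to boot if the list is non-empty. It only catches mistakes that violate the naming convention, so it narrows the failure mode rather than removing it.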