GraphQL powers Shopify’s commerce data layer. We use it to serve deeply nested queries whose size grows multiplicatively with each level of nesting: fetching 250 products with 250 variants each can touch up to 62,500 variants, the kind of fan-out that GraphQL APIs frequently guard against. We support these patterns to make technology work for our merchants, never the other way around.
However, such high-cardinality patterns do present real scaling challenges, and when we dug into traces we found an unexpected bottleneck: the majority of request time was frequently spent not on I/O, but on running the field resolvers that build the GraphQL response.
The main culprit was GraphQL's conventional depth-first execution model and its hidden scaling costs. So we built something new: GraphQL Cardinal, a breadth-first execution engine that resolves each field once across all objects instead of once per object.
The result? Large list queries may see 15x faster execution with 90% less memory used, which can shave many seconds off P50 times, and we’re still discovering Cardinal’s full potential.
This post is an open letter to the GraphQL community. We'll walk through the hidden costs that we’ve observed in depth-first traversal, the breadth-first hypothesis that led to Cardinal, how the engine works internally, and what it takes to migrate a massive production stack to an entirely new execution model.
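To make the depth-first versus breadth-first distinction concrete, here is a toy Python sketch that counts resolver invocations under each model. It is not Cardinal's implementation: the `depth_first` and `breadth_first` functions, the `calls` counter, and the product data are all hypothetical, and real resolvers would do far more than read a dictionary key. It only illustrates "once per object" versus "once per field across all objects".

```python
from collections import Counter

# Toy data: 3 products with 2 variants each (a stand-in for 250 x 250).
products = [
    {"title": f"Product {p}", "variants": [{"sku": f"SKU-{p}-{v}"} for v in range(2)]}
    for p in range(3)
]

def depth_first(products, calls):
    """Resolve each field once per object, finishing one subtree before the next."""
    out = []
    for product in products:
        calls["title"] += 1          # one 'title' resolver call per product
        title = product["title"]
        calls["variants"] += 1       # one 'variants' resolver call per product
        variants = []
        for variant in product["variants"]:
            calls["sku"] += 1        # one 'sku' resolver call per variant
            variants.append({"sku": variant["sku"]})
        out.append({"title": title, "variants": variants})
    return out

def breadth_first(products, calls):
    """Resolve each field once for the whole set of objects at that level."""
    calls["title"] += 1              # one call covers every product's title
    titles = [p["title"] for p in products]
    calls["variants"] += 1           # one call covers every product's variants
    variant_lists = [p["variants"] for p in products]
    # Flatten all variants into a single batch so 'sku' is resolved once.
    flat = [v for variants in variant_lists for v in variants]
    calls["sku"] += 1
    skus = iter([v["sku"] for v in flat])
    # Reassemble the nested response shape from the flat batches.
    return [
        {"title": title, "variants": [{"sku": next(skus)} for _ in variants]}
        for title, variants in zip(titles, variant_lists)
    ]

df_calls, bf_calls = Counter(), Counter()
df_result = depth_first(products, df_calls)
bf_result = breadth_first(products, bf_calls)
```

Both functions build the same response, but the number of resolver calls in the depth-first version grows with the number of objects, while in the breadth-first version it stays at one per field. Scaled to 250 products with 250 variants each, this toy depth-first walk would make 62,500 `sku` calls where the breadth-first walk makes one.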
