Picture two merchants that sell the same protein powder. One creates a single product listing with flavor variants. The other creates a separate listing for every flavor. Both are right. And on their own storefronts, neither structure causes any problems.

But Shopify hosts billions of product listings across millions of stores, and there's no shared schema between them. When an AI shopping agent needs to find the best protein powder across the whole dataset, it has to understand that one merchant's single listing and another merchant's twelve listings describe the same product line… without anyone telling it so.

This is a tale of teaching machines to read product data the way a human shopper would, at scale.

Enter the Shopify Catalog: a unified intelligence layer that standardizes product data across the platform and makes it available to developers and AI agents via the Catalog API. (Take a look at all of Catalog's new features coming out of our Spring 2026 Edition.)

At the heart of Catalog is product clustering. To understand it, it helps to know how Shopify's data model works. Merchants organize their offerings into products (e.g., "Classic Denim Flare Jean"), each of which can have multiple variants (e.g., size 28 in light wash, size 30 in dark wash). Different merchants may structure the same real-world item very differently: one might create a single product with all size and color variants, while another might create separate products per color.
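To make the mismatch concrete, here is a minimal Python sketch of the two structures. It is a toy model, not Shopify's actual schema or clustering logic: the `Product` and `Variant` classes, the `attributes` field, and the `sellable_units` helper are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    # Options chosen at the variant level, e.g. {"size": "28"}.
    options: dict[str, str]


@dataclass
class Product:
    merchant: str
    title: str
    variants: list[Variant]
    # Hypothetical field: attributes fixed for the whole product listing.
    attributes: dict[str, str] = field(default_factory=dict)


def sellable_units(products: list[Product]) -> set[tuple[str, str]]:
    """Flatten listings into the (size, color) pairs a shopper can buy."""
    units = set()
    for product in products:
        for variant in product.variants:
            # Variant-level options override product-level attributes.
            merged = {**product.attributes, **variant.options}
            units.add((merged["size"], merged["color"]))
    return units


# Merchant A: one product, with size and color both modeled as variants.
merchant_a = [
    Product(
        merchant="A",
        title="Classic Denim Flare Jean",
        variants=[
            Variant({"size": "28", "color": "light wash"}),
            Variant({"size": "30", "color": "dark wash"}),
        ],
    )
]

# Merchant B: one product per color, with only size modeled as a variant.
merchant_b = [
    Product(
        merchant="B",
        title="Classic Denim Flare Jean (Light Wash)",
        variants=[Variant({"size": "28"})],
        attributes={"color": "light wash"},
    ),
    Product(
        merchant="B",
        title="Classic Denim Flare Jean (Dark Wash)",
        variants=[Variant({"size": "30"})],
        attributes={"color": "dark wash"},
    ),
]

same_items = sellable_units(merchant_a) == sellable_units(merchant_b)
```

Here merchant A has one listing and merchant B has two, yet `same_items` is true because both flatten to the same purchasable items. The sketch assumes the color is already available as clean structured data; in real listings that information may live only in a title or description, which is part of what makes the problem hard.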