In metric collection systems, cardinality is a key factor in the trade-off between performance and cost. I have put together a four-step guide to striking that balance in practice.

In this post, drawing on my own experience, I will explain what metric cardinality is, why it matters, and how we can find the right balance in our systems. We won't stop at theory; we will work through concrete examples and steps.

What is Metric Cardinality and Why Should We Care?

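Metric cardinality is the number of unique label-value combinations that a metric produces in our monitoring systems; each unique combination is stored as a separate time series. Simply put, the more labels we attach to a metric, and the more distinct values each label can take, the higher its cardinality becomes. For example, when monitoring a server's CPU usage, adding labels like instance, job, region, and az multiplies the number of possible series, up to the product of the number of distinct values of each label.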
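To make the definition concrete, here is a minimal Python sketch that counts cardinality in two ways: the theoretical upper bound (the product of distinct values per label) and the number of distinct label sets actually observed. The label values, host names, and function names are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical label values for a single CPU usage metric.
label_values = {
    "instance": [f"host-{i}" for i in range(50)],
    "job": ["node"],
    "region": ["eu-west-1", "us-east-1"],
    "az": ["a", "b", "c"],
}


def max_cardinality(labels):
    """Upper bound: product of the number of distinct values per label."""
    return prod(len(values) for values in labels.values())


def observed_cardinality(series):
    """Actual cardinality: number of distinct label sets seen in the data."""
    return len({frozenset(labels.items()) for labels in series})


# A few observed series; the third one repeats the first label set.
observed = [
    {"instance": "host-0", "job": "node", "region": "eu-west-1", "az": "a"},
    {"instance": "host-1", "job": "node", "region": "eu-west-1", "az": "b"},
    {"instance": "host-0", "job": "node", "region": "eu-west-1", "az": "a"},
]

print(max_cardinality(label_values))   # 50 * 1 * 2 * 3 = 300
print(observed_cardinality(observed))  # 2
```

In practice the observed number is usually below the upper bound, because not every combination occurs (a given host lives in only one region and one az), but the upper bound shows how quickly one extra label can multiply the series count.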

Cardinality has a direct impact on storage space, query performance, and cost. High cardinality requires more disk space and memory, slows queries down because they have to touch more series, and leads to higher bills in cloud environments.
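As a rough illustration of the storage side, the sketch below estimates raw sample storage from the series count. Every number in it is an assumption: the scrape interval, the retention period, and especially the bytes-per-sample figure, which depends heavily on the time series database and its compression. It ignores index and label overhead, so treat the result as an order-of-magnitude estimate rather than a sizing formula.

```python
SECONDS_PER_DAY = 86_400


def estimate_sample_storage_bytes(series_count, scrape_interval_s,
                                  retention_days, bytes_per_sample=2.0):
    """Rough estimate of raw sample storage.

    bytes_per_sample is an assumed average after compression, not a
    measured value; replace it with a figure from your own system.
    """
    samples_per_series = retention_days * SECONDS_PER_DAY // scrape_interval_s
    return series_count * samples_per_series * bytes_per_sample


# Hypothetical comparison: 300 series versus 30,000 series,
# both scraped every 15 seconds and kept for 30 days.
low = estimate_sample_storage_bytes(300, 15, 30)
high = estimate_sample_storage_bytes(30_000, 15, 30)

print(f"{low / 1e6:.1f} MB")   # 103.7 MB
print(f"{high / 1e9:.2f} GB")  # 10.37 GB
```

The relationship is linear: under these assumptions, a hundred times more series means a hundred times more sample storage, before counting the extra index and memory overhead.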