TL;DR. If your LLM bill is one line item on a cloud invoice, you cannot answer "which team spent that." We fixed this by tagging every gateway span with team.id, project.id, and feature.id, plus the OpenInference token-count attributes, shipping those spans through an OTel collector into Tempo, and rolling cost up per team with TraceQL in Grafana. The payoff that sold it internally: one team's monthly spend quietly went from a few hundred dollars to over a thousand because of a retry loop, and the org-level dashboard never flinched. The per-team view caught it in a day. Below is the wiring, the collector config, the rollup query, the alert, and the attributes I tried and threw away.

1. The problem is attribution, not collection

Most teams already collect LLM telemetry. Spans exist, tokens get counted, traces land somewhere. What is missing is the dimension that finance and eng leads actually ask about: who owns this spend. The provider invoice gives you one number per month per API key. If you share keys across services (most people do at some point), that number is useless for chargeback. You cannot tell the platform team's spend from the support-bot team's spend.

So the design goal was narrow. Every LLM call has to carry enough labels that I can group spend by team, by project under that team, and by feature inside that project. Three levels and no more: go deeper than feature and nobody reads the dashboard. I standardized the whole pipeline on OpenTelemetry and OpenInference, and I will state the one opinion plainly: I want the labels, the wire format, and the storage to be things I can swap without rewriting instrumentation. We tag spans with open semantic conventions so that the day we change a backend or a dashboard tool, the gateway code does not move. That is a portability decision, not a verdict on anyone's product.
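To make the three-level idea concrete, here is a minimal in-memory sketch of what one tagged span carries and how spend groups at each level. It is not the gateway code: the function names (span_attributes, span_cost, rollup) and the prices in PRICE_PER_1K are made up for illustration, and plain dicts stand in for real spans. The llm.token_count.* keys follow the OpenInference attribute names; the three label keys are the ones described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

# Hypothetical USD prices per 1,000 tokens. Substitute your provider's real rates.
PRICE_PER_1K = {"prompt": 0.003, "completion": 0.015}

# The three attribution levels, ordered from coarsest to finest.
LABELS = ("team.id", "project.id", "feature.id")


def span_attributes(team: str, project: str, feature: str,
                    prompt_tokens: int, completion_tokens: int) -> Dict[str, object]:
    """Build the attribute set that one gateway span would carry."""
    return {
        "team.id": team,
        "project.id": project,
        "feature.id": feature,
        # OpenInference token-count attribute names.
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }


def span_cost(attrs: Mapping[str, object]) -> float:
    """Price one span from its token counts (illustrative pricing only)."""
    prompt = int(attrs["llm.token_count.prompt"])
    completion = int(attrs["llm.token_count.completion"])
    return (prompt * PRICE_PER_1K["prompt"]
            + completion * PRICE_PER_1K["completion"]) / 1000


def rollup(spans: Iterable[Mapping[str, object]],
           level: str = "team.id") -> Dict[Tuple[str, ...], float]:
    """Sum cost per group, keyed by every label down to `level`."""
    depth = LABELS.index(level) + 1
    totals: Dict[Tuple[str, ...], float] = defaultdict(float)
    for attrs in spans:
        # An unlabeled span is exactly the attribution gap we want to surface.
        missing = [k for k in LABELS if not attrs.get(k)]
        if missing:
            raise ValueError(f"span is missing labels: {missing}")
        key = tuple(str(attrs[k]) for k in LABELS[:depth])
        totals[key] += span_cost(attrs)
    return dict(totals)


if __name__ == "__main__":
    spans = [
        span_attributes("platform", "gateway", "routing", 1000, 1000),
        span_attributes("support", "bot", "summarize", 2000, 0),
        span_attributes("support", "bot", "reply", 0, 1000),
    ]
    for key, cost in sorted(rollup(spans, "team.id").items()):
        print(key, round(cost, 6))
```

Calling rollup(spans, "feature.id") on the same spans keys each total by the full (team, project, feature) tuple, which is the drill-down the three levels exist for. Rejecting unlabeled spans outright is one possible choice; bucketing them under an "unattributed" key would work too, as long as they stay visible.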