Tracking LLM costs per customer means attributing every model-provider charge to a specific customer inside a multi-tenant product. Aggregate dashboards hide which customers are unprofitable; per-customer attribution surfaces them. This article covers instrumentation patterns, the attribution choices that scale, and what to do with the data.

Why tracking LLM costs per customer matters

Most teams cannot do this. According to CloudZero's 2025 State of AI Costs report, only 43% of organizations can attribute AI cost to a customer, and only 22% can attribute it to a transaction (CloudZero, May 2025). The FinOps Foundation's State of FinOps 2026 reports that 98% of organizations now include AI in the scope of their FinOps practice, up from 31% in 2024, so the recognition is there. The instrumentation is not.

The cost is structural. At scaling-stage AI B2B companies, model inference runs at roughly 23% of revenue and does not decline meaningfully with scale (ICONIQ AI B2B Operating Index, January 2026). A 23%-of-revenue cost center that the team cannot break down per customer is the operating equivalent of running a SaaS business without knowing which seats are paid.

The tail is what kills the average. A user who is profitable at the median can flip to break-even at the 75th percentile and to a monthly loss at the 90th as query volume compounds (Todd Gagne, Wildfire Labs, March 2026). Replit reported a production case in February 2026 where gross margin swung from 36% to -14% after its agent consumed more LLM inference than its pricing covered. Aggregate dashboards do not show this. A per-customer ledger does.
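
To make the gap between an aggregate view and a per-customer ledger concrete, here is a minimal Python sketch. The customer IDs, the flat $50 plan, and the cost figures are invented for illustration and do not come from any of the reports cited above; `CustomerMonth` and `margin` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class CustomerMonth:
    """One row of a hypothetical per-customer ledger for a single month."""
    customer_id: str
    revenue_usd: float   # what the customer paid this month
    llm_cost_usd: float  # model-provider charges attributed to this customer


def margin(revenue_usd: float, cost_usd: float) -> float:
    """Gross margin as a fraction of revenue, e.g. 0.30 means 30%."""
    if revenue_usd <= 0:
        raise ValueError("revenue must be positive to compute a margin")
    return (revenue_usd - cost_usd) / revenue_usd


# Illustrative numbers only: five customers on an assumed flat $50 plan,
# with LLM usage that grows heavily toward the tail.
ledger = [
    CustomerMonth("cust_a", 50.0, 5.0),
    CustomerMonth("cust_b", 50.0, 10.0),
    CustomerMonth("cust_c", 50.0, 20.0),
    CustomerMonth("cust_d", 50.0, 50.0),
    CustomerMonth("cust_e", 50.0, 90.0),
]

# Aggregate view: a single blended margin across all customers.
total_revenue = sum(row.revenue_usd for row in ledger)
total_cost = sum(row.llm_cost_usd for row in ledger)
aggregate_margin = margin(total_revenue, total_cost)

# Per-customer view: one margin per ledger row.
per_customer_margin = {
    row.customer_id: margin(row.revenue_usd, row.llm_cost_usd)
    for row in ledger
}

# Customers whose attributed LLM cost exceeds what they pay.
unprofitable = [cid for cid, m in per_customer_margin.items() if m < 0]

print(f"aggregate margin: {aggregate_margin:.0%}")
for cid, m in per_customer_margin.items():
    print(f"{cid}: {m:.0%}")
print("unprofitable:", unprofitable)
```

With these made-up figures the blended margin is 30%, which looks acceptable on a dashboard, while `cust_d` sits at break-even and `cust_e` costs more than it pays. Only the per-customer breakdown exposes that.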