As generative AI (genAI) tools and services have become ubiquitous (and popular), the appetite for tokens has grown insatiable, and the cost of using those tools is going through the roof.
Tokens are a common way to measure and price AI use. Much as English text is built from letters and words, large language models (LLMs) process a sentence or query by breaking it into tokens, which can be whole words, parts of words, or punctuation.
With the AI explosion well under way, tokens are now “the fundamental units of data our models process, many representing a problem being solved,” according to Google CEO Sundar Pichai. (Google, by the way, processes about 3.2 quadrillion tokens a month.)
But as the price of all those tokens adds up, business and IT execs are looking for ways to cut costs while keeping corporate productivity up. Uncontrolled token use has already landed one company with an unexpected $500 million AI bill.
There are a number of ways companies can rein in the price of AI at the model, infrastructure, silicon, and business levels. Here’s a look at how some of those savings might actually be achieved.