400 billion tokens. That was a monthly expense of $78,000, which worked out to roughly $1 million a year. A large enterprise in a regulated industry ran up this bill.
“This was an eye-wateringly high number for the firm,” said Karan Kirpalani, chief product officer, Neysa Networks. It was not as if the company, one of Neysa’s key clients, was using the most advanced model.
The strict internal usage protocol didn’t bring costs down. Latency issues persisted despite the adoption of models launched by frontier labs. The villain here is the unoptimised AI workload.
Talk to any CTO and they will tell you that tokenmaxxing is one of the biggest challenges they face today. While MakeMyTrip’s Sanjay Mohan tells ET the monthly consumption is in the millions, Mohit Saxena of InMobi Group says it is in the billions.
The staggering levels of token spend are forcing Indian companies to explore ways to rein in AI costs. The immediate response is to opt for open-source models and small language models. Another option is to turn to startups such as Pipeshift and Divyam.ai. Through a range of solutions for inference optimisation, GPU orchestration and model routing, they help improve usage and cut costs.
Sandeep Kohli, cofounder, Divyam.ai, said this comes on the back of huge adoption of AI, as large firms invest significantly in building AI-native solutions. As intelligence is spread across multiple systems in an enterprise, inefficiencies can creep in. Not every aspect of the business requires the latest models, which are token-hungry and therefore drive up costs.