Qualcomm CEO Cristiano Amon just put a number on how fast the AI engine is accelerating. During his COMPUTEX 2026 keynote in Taipei, Amon projected that AI token generation will hit 1.27 trillion tokens every 10 seconds by 2030, a roughly 40-fold increase from today’s pace of approximately 31.7 billion tokens every 10 seconds.
To be clear: these are not crypto tokens. These are AI inference tokens, the basic units of text, code, and reasoning that large language models produce every time they respond to a query. The distinction matters, because the infrastructure required to process that kind of volume has enormous implications for chipmakers, cloud providers, and anyone building products that depend on AI.
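For readers who want to check the arithmetic behind the keynote figures, here is a minimal Python sketch. It uses only the two numbers quoted above (both per 10-second window). The constant and function names are illustrative, not from Qualcomm, and the result is only as good as the projection itself.

```python
# Figures as quoted from the keynote; both are per 10-second window.
TOKENS_2030_PER_10S = 1.27e12   # projected pace in 2030
TOKENS_TODAY_PER_10S = 31.7e9   # approximate current pace
WINDOW_SECONDS = 10


def per_second(tokens_per_window: float, window_seconds: float = WINDOW_SECONDS) -> float:
    """Convert a tokens-per-window figure into tokens per second."""
    return tokens_per_window / window_seconds


def growth_multiple(future: float, present: float) -> float:
    """Return how many times larger the future figure is than the present one."""
    return future / present


today_rate = per_second(TOKENS_TODAY_PER_10S)
future_rate = per_second(TOKENS_2030_PER_10S)
multiple = growth_multiple(TOKENS_2030_PER_10S, TOKENS_TODAY_PER_10S)

print(f"Today: {today_rate:,.0f} tokens per second")
print(f"2030:  {future_rate:,.0f} tokens per second")
print(f"Growth multiple: {multiple:.1f}x")
```

Put per second, that is about 3.17 billion tokens today against 127 billion in 2030, which matches the roughly 40-fold increase Amon cited.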
From answers to autonomy
Today’s AI models mostly generate answers. You ask a question and get a response, maybe a few hundred tokens long. What’s coming is different. Agentic AI, meaning systems that don’t just answer but make decisions and take actions autonomously, will demand dramatically more token throughput, because a single task can involve many model calls for planning, tool use, and checking results rather than one reply.
Amon framed this not as a distant possibility but as an infrastructure challenge that the semiconductor industry needs to solve now.