FIRST LOOK: OpenAI is taking the wraps off Jalapeño, a custom 'intelligence processor' built with Broadcom to make its large language models cheaper and more efficient to run. The company even used its own AI models to help design the chip. Jalapeño is a purpose-built ASIC for inference, rather than for the broader mix of workloads GPUs usually handle.

By designing a chip around how tokens move through transformer architectures, OpenAI is trying to reduce its heavy dependence on Nvidia hardware and move toward a stack it controls end to end.

Engineering samples of Jalapeño are already running production-class workloads, including a model called GPT-5.3-Codex-Spark, and are meeting the power and performance targets OpenAI set for the project. The company says early testing shows Jalapeño "will deliver performance per watt substantially better than current state-of-the-art." Broadcom CEO Hock Tan has said the chip matches Nvidia's Blackwell chips and Google's Tensor Processing Units on performance.

Because the chip is aimed at inference, its designers can make more aggressive trade-offs. The architecture is tuned around LLM kernels, memory movement, networking, and serving patterns rather than general-purpose compute, with the goal of improving tokens per watt on the kinds of requests that dominate ChatGPT and API traffic.
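
For readers who want the metric pinned down: "tokens per watt" is usually read as serving throughput divided by power draw, which works out to tokens per joule. The sketch below is illustrative arithmetic only. The function name and every number in it are hypothetical placeholders, not figures from OpenAI or Broadcom.

```python
def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Throughput per unit of power; numerically equal to tokens per joule."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return tokens_per_second / power_watts


# Hypothetical placeholder numbers, chosen only to show the arithmetic.
baseline = tokens_per_watt(tokens_per_second=10_000, power_watts=1_000)
candidate = tokens_per_watt(tokens_per_second=12_000, power_watts=800)
improvement = candidate / baseline

print(f"baseline:  {baseline:.1f} tokens/s per watt")
print(f"candidate: {candidate:.1f} tokens/s per watt")
print(f"ratio:     {improvement:.2f}x")
```

In this made-up case, a chip that serves 20% more tokens while drawing 20% less power comes out 1.5x ahead on the metric, which is why gains on either side of the ratio count.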