(Image credit: OpenAI)

OpenAI and Broadcom have introduced Jalapeño, a custom-built inference processor designed specifically for modern large language models and future agentic AI workloads. The companies claim it is designed to deliver higher performance per watt than today's leading-edge hardware. OpenAI considers its hardware project a strategic one and envisions Jalapeño as the first generation of its inference hardware.

Not another AI accelerator

OpenAI stresses that Jalapeño is a purpose-built inference ASIC and not a repurposed training accelerator or a general-purpose AI processor. OpenAI says the architecture of Jalapeño was designed based on its understanding of LLM behavior and is meant to address practical bottlenecks that matter for inference at scale, including costly data movement, the balance between compute and memory resources, networking efficiency, and overall behavior. OpenAI also states that the processor is designed to combine high throughput with low latency (which is why it uses a huge compute chiplet and HBM memory rather than the cheaper types of DRAM used by many other inference accelerators), which will be particularly handy for reasoning and agentic workloads.

In addition, OpenAI and Broadcom claim the processor is built to deliver higher effective utilization than conventional AI accelerators and performance that is close to the theoretical maximum, which would mean very high efficiency in terms of both cost and power.
Meanwhile, the companies did not disclose performance targets for their Jalapeño ASIC, so these claims should be taken with a grain of salt.

Engineering samples are already operating in the lab at target clock speed and power (though Broadcom and OpenAI do not disclose details about this, either), and OpenAI says it is running machine learning workloads, such as GPT-5.3-Codex-Spark.

The two companies also claim that early internal testing indicates that Jalapeño's performance per watt is substantially better than that of 'current state-of-the-art hardware,' although no hard numbers, benchmarks, memory configuration, or other details are disclosed, so again, we will have to take the claims with a grain of salt. In addition, one must bear in mind that while Jalapeño can purportedly beat AMD's existing Instinct MI350-series and Nvidia's Blackwell-based accelerators, it remains to be seen how competitive it will be against AMD's Instinct MI400-series and Nvidia's Rubin-based offerings.

"Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers," said Richard Ho, who leads OpenAI's hardware program. "We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits."

A massive chip with six HBM modules

While Broadcom and OpenAI did not disclose specifications of Jalapeño, they did show its wafer and packaging, so we can do a brief analysis. The package appears to contain one large compute chiplet surrounded by six HBM modules and another chiplet that likely packs input/output interfaces and is surrounded by two structural dummy dies.