OpenAI, Broadcom debut custom Jalapeño chip for AI inference

OpenAI Group PBC today revealed a custom chip called Jalapeño that it will use to power its large language models.

The processor is the fruit of a collaboration with Broadcom Inc., which is no stranger to custom silicon design. The chipmaker helped Google LLC develop its TPU line of artificial intelligence accelerators. In April, the search giant extended its chip collaboration with Broadcom to 2031.

Nvidia Corp.’s flagship Rubin graphics cards can run both training and inference workloads. By contrast, Jalapeño is only designed for the latter use case. According to OpenAI, early testing indicates that the chip can perform inference with significantly higher performance per watt than “current state-of-the-art,” which may be a reference to Nvidia chips.

OpenAI has shared few details about Jalapeño’s design. However, the blog post in which it announced the chip specifies that the underlying “architecture reduces data movement.” That suggests the chip may be designed to limit traffic between its logic circuits and off-chip memory, one of the main performance bottlenecks in inference clusters.