Hewlett Packard Enterprise boosts Private Cloud AI token throughput by 20% with Nvidia collaboration

On March 16, Hewlett Packard Enterprise announced updates to its Private Cloud AI platform, co-engineered with Nvidia, that deliver up to a 20% improvement in token throughput for AI inference. New network expansion racks will let the platform scale to 128 GPUs, with availability slated for July 2026.

What’s actually changing

Token throughput measures how many tokens, the chunks of text (or other data) a model works with, an AI system can process per second. A gain of up to 20% means enterprises running generative AI or agentic AI workloads can get faster responses, or serve more requests in the same time, without swapping out hardware.
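To put the throughput figure in concrete terms, here is a rough back-of-the-envelope sketch in Python. The baseline rate and the token count are hypothetical values chosen only for illustration, not figures from HPE or Nvidia; the 20% gain is the upper bound cited in the announcement.

```python
def processing_time_s(tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to process `tokens` at a steady throughput."""
    return tokens / tokens_per_second


# Hypothetical baseline throughput; NOT a published HPE or Nvidia number.
BASELINE_TPS = 1000.0
# "Up to 20%" from the announcement, treated here as the best case.
GAIN = 0.20

improved_tps = BASELINE_TPS * (1 + GAIN)

# Hypothetical workload size, chosen for round numbers.
tokens = 60_000

before = processing_time_s(tokens, BASELINE_TPS)
after = processing_time_s(tokens, improved_tps)

# Share of wall-clock time saved for the same workload.
time_saved_pct = (1 - after / before) * 100

print(f"before: {before:.1f} s, after: {after:.1f} s")
print(f"time saved: {time_saved_pct:.1f}%")
```

One detail worth noting: a 20% throughput gain does not cut processing time by 20%. Because time is tokens divided by throughput, the same workload finishes in 1/1.2 of the original time, a saving of roughly 16.7%.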

The platform now supports Nvidia RTX PRO 6000 Blackwell Server Edition GPUs, which are designed for enterprise data center deployments rather than the workstation or consumer market.

Scaling to 128 GPUs through the new expansion racks allows enterprises to run bigger models or serve more concurrent users. For organizations that started small with Private Cloud AI and need to grow, this removes what was previously a hard constraint.
