Prime Intellect Releases prime-rl 0.6.0 to Train Trillion-Parameter MoE Models on Agentic RL Workloads

Prime Intellect releases prime-rl 0.6.0, which trains trillion-parameter MoE models on agentic RL workloads using FP8 and disaggregated inference

Tuesday, June 23, 2026


Prime Intellect has released prime-rl version 0.6.0. The framework targets reinforcement learning on trillion-parameter Mixture-of-Experts (MoE) models. It focuses on heavy agentic workloads, like long-horizon software-engineering tasks.

The research team trained GLM-5 on software-engineering (SWE) tasks at sequence lengths of up to 131k. With a batch size of 256 rollouts, step times stayed under five minutes, and the run used only 28 H200 nodes.
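To put those figures in proportion, here is a rough back-of-envelope sketch of the largest token count a single step could hold. It assumes that "131k" means 131,072 tokens and that every rollout fills the whole window, which real rollouts will usually not do. The function names are illustrative and are not part of prime-rl.

```python
# Back-of-envelope ceiling on tokens per RL step, from the figures reported above.
ROLLOUTS_PER_STEP = 256   # batch size stated in the article
SEQ_LEN = 131_072         # assumption: "131k" read as 2**17 tokens
NODES = 28                # H200 nodes stated in the article


def max_tokens_per_step(rollouts: int = ROLLOUTS_PER_STEP,
                        seq_len: int = SEQ_LEN) -> int:
    """Ceiling on tokens in one step if every rollout reaches seq_len."""
    return rollouts * seq_len


def max_tokens_per_node(nodes: int = NODES) -> float:
    """Even split of that ceiling across nodes (ignores how work is actually placed)."""
    return max_tokens_per_step() / nodes


if __name__ == "__main__":
    print(f"max tokens per step: {max_tokens_per_step():,}")
    print(f"max tokens per node: {max_tokens_per_node():,.0f}")
```

Under these assumptions the ceiling is about 33.5 million tokens per step, or roughly 1.2 million per node. It is an upper limit on the workload size, not a measured throughput.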

TL;DR

prime-rl 0.6.0 trains trillion-parameter MoE models on agentic RL workloads.

GLM-5 was trained on SWE tasks at up to 131k sequence length, with sub-5-minute steps on 28 H200 nodes.

