Diffusion AI is most common in image generation, but it can make text outputs much faster.

DiffusionGemma generates text up to 4x faster than traditional models by producing entire blocks simultaneously, achieving roughly 1,479 tokens per second.

The new DiffusionGemma open model generates text in parallel, not one token at a time, and is optimized to run on the NVIDIA RTX PRO platform, NVIDIA DGX Spark systems and…

Developers building real-time AI (such as chat assistants, copilots, and agentic workflows) are often constrained by token-by-token generation speed. This limits responsiveness,…

An overview of DiffusionGemma, a text generation model that runs up to 4x faster than traditional models.

Google’s experimental DiffusionGemma model uses text diffusion to generate blocks of text in parallel, targeting faster local AI inference for developers.

Google AI releases DiffusionGemma, a 26B MoE open text diffusion model generating 256-token blocks in parallel, up to 4x faster.

Google released DiffusionGemma, a 26-billion-parameter model that generates text not token by token but through diffusion, similar to how image AI turns noise into a picture.…

DiffusionGemma hits 1,000 tokens per second by ditching word-by-word generation entirely. It just doesn't run on most people's machines yet.

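On June 10 local time, Google released DiffusionGemma, an experimental open-source model built on a text diffusion architecture. On dedicated GPUs, it generates text up to 4x faster than traditional autoregressive large language models, and it is released under the Apache 2.0 license. Google says DiffusionGemma is positioned as an experimental model for researchers and developers, with overall output quality lower than that of standard Gemma…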

Google open-sources speedy DiffusionGemma text diffusion model - SiliconANGLE

Google DeepMind has introduced DiffusionGemma, a groundbreaking AI model that processes text in parallel, delivering up to 4x faster performance on local hardware like gaming GPUs.

Google DeepMind has announced DiffusionGemma, an experimental open-source model based on text generation via diffusion. Thanks to the parallel production of blocks…

What: Google released DiffusionGemma, an open-weight model whose headline trick is parallel...

DiffusionGemma generates blocks of text in parallel instead of one token at a time. It is open source, also runs on consumer GPUs, and can exceed 1,000 tokens per second.

Google's open-source diffusion language model generates 256 tokens in parallel and self-corrects, hitting 4x speed on one GPU at a cost to quality.