WARPTECHNEWS · LAB

Warptech Lab News aggregates the most relevant news from more than 700 international sources, with AI classification, concise TL;DRs, and cluster timelines for individual stories.

© 2026 Sparktech S.R.L. All rights reserved. Site operated and maintained by Sparktech S.R.L.

Registered office: Corso Libertà 55, 13100 Vercelli (VC), Italy · VAT no. / tax code 02835910023 · Contact: admin@warptechlab.com

Story in 18 sources

DiffusionGemma: How Google's New Open LLM Hits 1,000 Tokens/sec and Changes Inference Economics

DiffusionGemma generates text up to 4x faster than autoregressive LLMs, hits 1,000+ tokens/sec on a single H100, and runs on a consumer RTX 4090. Here is what changed, what the trade-offs are, and how to deploy it today.

Reported by cryptobriefing.com, blogs.nvidia.com, developer.nvidia.com, blog.google, marktechpost.com, the-decoder.com, arstechnica.com, decrypt.co, simonwillison.net, 36kr.com, siliconangle.com, newsbytesapp.com, and 6 others

Source comparison

6 perspectives on the same story
AI · summaries
dev.to · Currently reading · 4 h ago

DiffusionGemma: How Google's New Open LLM Hits 1,000 Tokens/sec and Changes Inference Economics

DiffusionGemma generates text up to 4x faster than autoregressive LLMs, hits 1,000+ tokens/sec on a single H100, and runs on a consumer RTX 4090. Here is what changed, what the trade-offs are, and how to deploy it today.

Chronological timeline

  1. Wednesday, June 10, 2026 · cryptobriefing.com

    DiffusionGemma offers 4x faster output with simultaneous text generation

    DiffusionGemma generates text up to 4x faster than traditional models by producing entire blocks simultaneously, achieving roughly 1,479 tokens per second.

  2. Wednesday, June 10, 2026 · blogs.nvidia.com

    NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

    The new DiffusionGemma open model generates text in parallel — not one token at a time — and is optimized to run on the NVIDIA RTX PRO platform, NVIDIA DGX Spark systems and…

decrypt.co · 2 d ago

Google's DiffusionGemma AI Hits 1,000 Tokens Per Second—And It's Free - Decrypt

DiffusionGemma hits 1,000 tokens per second by ditching word-by-word generation entirely. It just doesn't run on most people's machines yet.

marktechpost.com · 2 d ago

Google AI Releases DiffusionGemma, a 26B MoE Open Model Using Text Diffusion for Up to 4x Faster Generation

Google AI releases DiffusionGemma, a 26B MoE open text diffusion model generating 256-token blocks in parallel, up to 4x faster.

the-decoder.com · 2 d ago

Google's new open model DiffusionGemma generates text from noise instead of word by word

Google released DiffusionGemma, a 26-billion-parameter model that generates text not token by token but through diffusion, similar to how image AI turns noise into a picture. According to Nvidia, it hits about 1,000…

cryptobriefing.com · 2 d ago

Google launches DiffusionGemma open model for faster local AI workflows

Google’s experimental DiffusionGemma model uses text diffusion to generate blocks of text in parallel, targeting faster local AI inference for developers.

siliconangle.com · 1 d ago

Google open-sources speedy DiffusionGemma text diffusion model - SiliconANGLE

  • Wednesday, June 10, 2026 · developer.nvidia.com

    Run DiffusionGemma on NVIDIA for Developer-Ready, High-Throughput Text Generation | NVIDIA Technical Blog

    Developers building real-time AI—such as chat assistants, copilots, and agentic workflows—are often constrained by token-by-token generation speed. This limits responsiveness,…

  • Wednesday, June 10, 2026 · blog.google

    DiffusionGemma: 4x faster text generation

    An overview of DiffusionGemma, an exceptionally fast text generation model with up to 4x faster speeds.

  • Wednesday, June 10, 2026 · cryptobriefing.com

    Google launches DiffusionGemma open model for faster local AI workflows

    Google’s experimental DiffusionGemma model uses text diffusion to generate blocks of text in parallel, targeting faster local AI inference for developers.

  • Wednesday, June 10, 2026 · marktechpost.com

    Google AI Releases DiffusionGemma, a 26B MoE Open Model Using Text Diffusion for Up to 4x Faster Generation

    Google AI releases DiffusionGemma, a 26B MoE open text diffusion model generating 256-token blocks in parallel, up to 4x faster.

  • Wednesday, June 10, 2026 · the-decoder.com

    Google's new open model DiffusionGemma generates text from noise instead of word by word

    Google released DiffusionGemma, a 26-billion-parameter model that generates text not token by token but through diffusion, similar to how image AI turns noise into a picture.…

  • Wednesday, June 10, 2026 · arstechnica.com

    Google's latest DiffusionGemma open AI model comes with a 4x speed boost

    Diffusion AI is most common in image generation, but it can make text outputs much faster.

  • Wednesday, June 10, 2026 · decrypt.co

    Google's DiffusionGemma AI Hits 1,000 Tokens Per Second—And It's Free - Decrypt

    DiffusionGemma hits 1,000 tokens per second by ditching word-by-word generation entirely. It just doesn't run on most people's machines yet.

  • Wednesday, June 10, 2026 · simonwillison.net

    Gemini Diffusion

    Another of the announcements from Google I/O yesterday was Gemini Diffusion, Google's first LLM to use diffusion (similar to image models like Imagen and Stable Diffusion) in…

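  • Thursday, June 11, 2026 · 36kr.com

    Google launches open-source DiffusionGemma model - 36Kr

    On June 10 local time, Google released the experimental open-source model DiffusionGemma, which uses a text diffusion architecture and generates text up to 4x faster than traditional autoregressive large language models on dedicated GPUs; the model is released under the Apache 2.0 license. Google says DiffusionGemma is positioned as an experimental model for researchers and developers, and that its overall output quality is lower than that of standard Gemma…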

  • Thursday, June 11, 2026 · siliconangle.com

    Google open-sources speedy DiffusionGemma text diffusion model - SiliconANGLE

  • Thursday, June 11, 2026 · newsbytesapp.com

    Google's latest AI model creates text like an image generator

    Google DeepMind has introduced DiffusionGemma, a groundbreaking AI model that processes text in parallel, delivering up to 4x faster performance on local hardware like gaming GPUs.

  • Thursday, June 11, 2026 · hwupgrade.it

    DiffusionGemma challenges traditional LLMs: parallel generation and up to 4 times faster on GPU

    Google DeepMind has announced DiffusionGemma, an experimental open-source model based on text generation via diffusion. Thanks to the parallel production of blocks…

  • Thursday, June 11, 2026 · dev.to

    Google Releases DiffusionGemma: Parallel Block Decoding

    What: Google released DiffusionGemma, an open-weight model whose headline trick is parallel...

  • Thursday, June 11, 2026 · dday.it

    Google releases DiffusionGemma, the open model that generates text "like images"

    DiffusionGemma generates blocks of text in parallel instead of one token at a time. It is open source, also runs on consumer GPUs, and can exceed 1,000 tokens per second.

  • Thursday, June 11, 2026 · venturebeat.com

    Google's DiffusionGemma runs text 4x faster

    Google's open-source diffusion language model generates 256 tokens in parallel and self-corrects, hitting 4x speed on one GPU at a cost to quality.

  • Friday, June 12, 2026 · computerworld.com

    Google unveils DiffusionGemma, an AI model that breaks free of left-to-right processing

    Rather than generating text word by word, Google's experimental open-source model drafts entire passages simultaneously using diffusion, resulting in up to 4x faster inference.

  • Friday, June 12, 2026 · 20minutos.es

    Google has a new AI that writes text much faster: this is how DiffusionGemma works

    Google presents DiffusionGemma, an experimental, open-source AI model that generates text up to four times faster than traditional models.

  • Friday, June 12, 2026 · dev.to

    DiffusionGemma: How Google's New Open LLM Hits 1,000 Tokens/sec and Changes Inference Economics

    DiffusionGemma generates text up to 4x faster than autoregressive LLMs, hits 1,000+ tokens/sec on a single H100, and runs on a consumer RTX 4090. Here is what changed, what the…