Autoregressive Generation

15d

Google's DiffusionGemma generates 256 tokens in parallel and self-corrects as it goes

Google's open-source diffusion language model generates 256 tokens in parallel and self-corrects, hitting 4x speed on one GPU ...

Developer Tech

NVIDIA: DFlash block diffusion accelerates autoregressive LLMs

Deploying DFlash block diffusion on NVIDIA hardware accelerates autoregressive LLMs during latency-sensitive inference.

16d

Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster

Another day, another AI model from Google. This time, Google DeepMind has released a new member of the Gemma 4 open model ...

Forbes

Beyond Autoregression: A New Model For Text Generation

Every time a language model like GPT-4, Claude or Mistral generates a sentence, it does something deceptively simple: It picks one word at a time. This word-by-word approach is what gives ...

XDA Developers on MSN

I tried Google's new DiffusionGemma, and watching it generate text like an image is unlike any local LLM

Google recently released DiffusionGemma, and it's weird in the best way.

Crypto Briefing

DiffusionGemma offers 4x faster output with simultaneous text generation

DiffusionGemma generates text up to 4x faster than traditional models by producing entire blocks simultaneously, achieving roughly 1,479 tokens per second.

Tech Times

Google’s DiffusionGemma Generates Text 4x Faster: Diffusion Replaces Token-by-Token Output

Google DeepMind released DiffusionGemma on June 10, 2026, an experimental open-weights model that writes text using discrete ...
