Story from 1 source

Orthrus: Parallel Token Generation That Doesn't Change Your Model's Output

Orthrus injects diffusion attention into each layer of a frozen autoregressive Transformer to generate 32 tokens in parallel, without altering the base model's output distribution.

Reported by

dev.to

Chronological timeline

Thursday, May 28, 2026 · dev.to
Orthrus: Parallel Token Generation That Doesn't Change Your Model's Output