A Transformer Decoder does not generate a sentence all at once.

It predicts one token.

Then it appends that token to its input and predicts the next one.

That simple loop is the core of modern LLM generation.
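
As a rough illustration of that loop, here is a minimal sketch in Python. It is not a real Transformer: `BIGRAM_SCORES`, `predict_next`, and `generate` are hypothetical names invented for this example, and the "model" is a tiny lookup table that only looks at the last token, whereas an actual decoder conditions on the whole sequence. The sketch also assumes greedy selection (always take the highest-scoring token), which is only one of several possible decoding strategies.

```python
# Toy stand-in for a decoder (an assumption for illustration only):
# it maps the most recent token to scores for candidate next tokens.
BIGRAM_SCORES = {
    "<bos>": {"the": 0.9, "cat": 0.1},
    "the": {"cat": 0.8, "sat": 0.2},
    "cat": {"sat": 0.7, "<eos>": 0.3},
    "sat": {"<eos>": 1.0},
}


def predict_next(tokens):
    """Return the single highest-scoring next token (greedy choice)."""
    scores = BIGRAM_SCORES[tokens[-1]]
    return max(scores, key=scores.get)


def generate(prompt, max_new_tokens=10):
    """Predict one token, append it to the sequence, and repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)
        if next_token == "<eos>":
            break  # the model signalled the end of the sequence
        tokens.append(next_token)  # feed the prediction back as input
    return tokens


print(generate(["<bos>"]))  # ['<bos>', 'the', 'cat', 'sat']
```

Each pass through the loop produces exactly one token, and the next pass sees a sequence that is one token longer. Generation stops when the toy model emits `<eos>` or when `max_new_tokens` is reached.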

Core Idea