Models Intermediate Also known as: Decoder-only model · Decoder only

Decoder-only

A Transformer architecture that keeps only the decoder stack. Its self-attention is causally masked, so each token attends only to itself and the tokens before it, and the model is trained to predict the next token.
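The rule that a token sees only what came before it is usually enforced with a causal mask on the attention scores. The following is a minimal sketch in plain Python, not any particular library's implementation: the function name `causal_attention_weights` and the score values are made up for illustration, and a real model would compute the scores from learned query and key projections.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over scores[i][j], keeping only positions j <= i."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]                     # token i may see positions 0..i
        peak = max(visible)                        # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Future positions get weight 0, as if their score had been -infinity.
        masked = [0.0] * (len(row) - i - 1)
        weights.append([e / total for e in exps] + masked)
    return weights

# Hypothetical raw attention scores for a 4-token sequence.
scores = [
    [0.5, 1.0, 0.2, 0.1],
    [0.3, 0.8, 0.9, 0.4],
    [1.2, 0.1, 0.7, 0.6],
    [0.2, 0.2, 0.2, 0.2],
]
weights = causal_attention_weights(scores)
for row in weights:
    print([round(w, 3) for w in row])
```

The result is lower-triangular: the first token can attend only to itself, while the last token spreads its attention over the whole sequence.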


In practice

It is the architecture behind GPT, Llama, Mistral, Claude, and nearly every modern generative LLM. It contrasts with encoder-only models (BERT, typically used for classification) and encoder-decoder models (T5, typically used for translation). Its simplicity, a single stack trained on one next-token objective, is often cited as a reason it scales so well in pretraining.
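At generation time, a decoder-only model is run in a loop: predict one token, append it to the sequence, and repeat. The sketch below shows only that loop. `toy_next_token` is a hypothetical lookup table standing in for a real model (it looks only at the last token, whereas a real model conditions on the whole prefix), and the `<eos>` marker is an assumption for illustration.

```python
def toy_next_token(prefix):
    """Hypothetical stand-in for a model: pick the next token from a fixed table."""
    table = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eos>"}
    return table.get(prefix[-1], "<eos>")

def generate(prompt, next_token_fn, max_new_tokens=10, eos="<eos>"):
    """Autoregressive decoding: each step sees only the tokens produced so far."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = next_token_fn(tokens)    # conditioned on the prefix, never on future tokens
        if nxt == eos:
            break
        tokens.append(nxt)             # the new token becomes part of the next prefix
    return tokens

print(generate(["the"], toy_next_token))  # ['the', 'cat', 'sat', 'down']
```

Swapping `toy_next_token` for a function that calls a trained model and picks the most likely token would turn this into greedy decoding.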

