Models Intermediate Also known as: Autoregressivo

Autoregressive

A model that generates a sequence one element at a time, each time using the previous output as part of the new input.

In practice

It is how every GPT-style LLM works: each new token depends on all the previous ones. It explains why generation is inherently sequential and hard to parallelize, and is the reason behind tricks like speculative decoding to speed it up.

Seen in the wild

3 entries mentioning it

October 20, 2024

EMU3: a single transformer for text, images, and video

High
May 18, 2023

SoundStorm: Google generates 30 seconds of natural dialogue in half a second

High
January 20, 2023

Speculative Decoding: 2-3x LLM inference speedup without changing output

High

← All terms

In practice

Related terms

Seen in the wild