Training · Intermediate · Also known as: Supervised Fine-Tuning (Italian: Fine-tuning supervisionato)

SFT

/es-ef-tee/

A form of fine-tuning in which the model learns from input-output pairs written by humans, for example questions paired with ideal answers.

In practice

SFT is typically the first step in turning a base model into an instruction-following assistant, and it almost always comes before preference-based methods such as RLHF or DPO. A few thousand high-quality examples can be enough for large gains in a specific domain.
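As a rough illustration of what "learning from input-output pairs" means at the token level, here is a minimal sketch in plain Python. It assumes a common but not universal setup: the prompt and the human-written answer are concatenated, and only the answer tokens are scored. The names `build_labels`, `sft_loss`, and `IGNORE_INDEX`, the token ids, and the probabilities are all hypothetical and chosen for illustration; a real training run would take the probabilities from a model and use a deep learning framework.

```python
import math

# Assumed sentinel meaning "do not score this position" (illustrative choice).
IGNORE_INDEX = -100


def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response, masking the prompt positions."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def sft_loss(target_probs, labels):
    """Mean negative log-likelihood over the unmasked (response) positions.

    target_probs[i] is the probability the model assigned to the correct
    token at position i, given everything before it.
    """
    scored = [p for p, y in zip(target_probs, labels) if y != IGNORE_INDEX]
    if not scored:
        raise ValueError("no response tokens to score")
    return -sum(math.log(p) for p in scored) / len(scored)


# Toy pair: a 3-token question and a 2-token human-written answer.
input_ids, labels = build_labels([11, 12, 13], [21, 22])
# Made-up probabilities; the first three are ignored because of the mask.
loss = sft_loss([0.9, 0.9, 0.9, 0.5, 0.5], labels)
```

Training lowers this loss by raising the probability of each answer token, which is how the model is pushed toward the ideal answers in the dataset.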

Related terms

Fine-tuning · Pretraining · RLHF · DPO · LoRA

