Training · Intermediate
Also known as: Instruction Fine-Tuning · FLAN-style Tuning

Instruction Tuning

Instruction tuning is a training phase in which an already-pretrained LLM is further optimized on (instruction, expected-response) pairs, where each instruction is a natural-language description of the task to perform. Unlike generic supervised fine-tuning, which usually targets one task, it covers many tasks phrased in a consistent instruction format, so that the model learns to follow instructions it was never explicitly trained on. Google's FLAN work (2021) showed that fine-tuning on more than 60 NLP datasets phrased as instructions substantially improves zero-shot performance on held-out task types. Instruction tuning is a core training stage behind models such as Flan-T5, Vicuna, and the models that power ChatGPT.
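As a rough illustration of what "structured as natural-language task descriptions" can mean, the sketch below turns one labeled benchmark-style example into several (instruction, response) pairs. The templates, the label mapping, and the function name `to_instruction_pairs` are hypothetical and chosen only for this example; they are not taken from FLAN or any specific dataset.

```python
# Hypothetical templates: each one phrases the same entailment task
# as a natural-language instruction.
TEMPLATES = [
    "Premise: {premise}\nHypothesis: {hypothesis}\n"
    "Does the premise entail the hypothesis? Answer yes, no, or maybe.",
    "Read the premise and decide whether the hypothesis follows from it "
    "(yes, no, or maybe).\nPremise: {premise}\nHypothesis: {hypothesis}",
]

# Assumed mapping from integer class labels to the text the model should produce.
LABEL_TO_TEXT = {0: "yes", 1: "maybe", 2: "no"}


def to_instruction_pairs(example):
    """Turn one labeled example into one (instruction, response) pair per template."""
    response = LABEL_TO_TEXT[example["label"]]
    return [
        {
            "instruction": template.format(
                premise=example["premise"], hypothesis=example["hypothesis"]
            ),
            "response": response,
        }
        for template in TEMPLATES
    ]


example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outdoors.",
    "label": 0,
}
pairs = to_instruction_pairs(example)
```

Using several phrasings per task is one way to keep the model from latching onto a single prompt wording.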


In practice

You prepare a dataset of thousands of examples in the format 'Instruction: … Response: …', often derived from existing NLP benchmarks reformatted as prompts. The base model is then fine-tuned on this data with a standard next-token cross-entropy objective. A developer adapting an open-weights model (e.g., LLaMA) to a specific domain builds a domain-specific instruction dataset and runs instruction tuning with a framework such as LLaMA-Factory, Axolotl, or Hugging Face TRL. With a small model and a modest dataset, a run can finish in a few hours on a single GPU.
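The objective can be sketched in plain Python. The example below builds one training sequence in the 'Instruction: … Response: …' format and computes a cross-entropy loss over the response tokens only. Masking the prompt tokens is a common choice but an assumption here, as are the `-100` ignore value, the whitespace `toy_tokenize` function, and the uniform stand-in logits. A real run would use the model's tokenizer and outputs through one of the frameworks mentioned above.

```python
import math

IGNORE = -100  # assumed marker for positions excluded from the loss

vocab = {}


def toy_tokenize(text):
    """Hypothetical whitespace tokenizer that assigns ids on first sight."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def build_example(instruction, response, tokenize):
    """Concatenate prompt and response; only response tokens keep real labels."""
    prompt_ids = tokenize(f"Instruction: {instruction} Response:")
    response_ids = tokenize(response)
    input_ids = prompt_ids + response_ids
    labels = [IGNORE] * len(prompt_ids) + response_ids
    return input_ids, labels


def masked_cross_entropy(logits, labels):
    """Mean negative log-likelihood over positions whose label is not IGNORE.

    logits[t] is the score vector used to predict the token at position t
    (the usual one-step shift is assumed to be applied already).
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, labels):
        if target == IGNORE:
            continue
        peak = max(scores)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_z - scores[target]
        count += 1
    return total / max(count, 1)


input_ids, labels = build_example("Translate 'chat' to English.", "cat", toy_tokenize)

# Stand-in for model output: uniform scores over the toy vocabulary.
logits = [[0.0] * len(vocab) for _ in input_ids]
loss = masked_cross_entropy(logits, labels)
```

With uniform scores the loss equals the log of the vocabulary size, which is a handy sanity check before plugging in a real model.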

Related terms

SFT · RLHF · Fine-tuning · Few-shot learning

