In practice
It is typically the first step in turning a base model into an instruction-following assistant. A few thousand high-quality examples are often enough to produce large gains in a given domain. It is almost always the first option tried, before moving on to RLHF or DPO.
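As a rough illustration of what one such training example can look like, the sketch below assumes the entry refers to supervised fine-tuning on prompt/response pairs, with the loss computed only on the response tokens. The names `toy_tokenize` and `build_example` are hypothetical, and the tokenizer is a stand-in rather than a real one. The value -100 matches the default `ignore_index` of PyTorch's cross-entropy loss, but using it here is an assumption, not something the entry specifies.

```python
from typing import Dict, List

# Label value meaning "do not compute loss at this position" (assumed convention).
IGNORE_INDEX = -100


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one integer id per whitespace-separated word."""
    return [sum(ord(ch) for ch in word) % 1000 for word in text.split()]


def build_example(prompt: str, response: str) -> Dict[str, List[int]]:
    """Concatenate prompt and response ids and mask the prompt out of the loss."""
    prompt_ids = toy_tokenize(prompt)
    response_ids = toy_tokenize(response)
    return {
        # The model sees the full sequence as input.
        "input_ids": prompt_ids + response_ids,
        # Only response positions carry real labels; prompt positions are ignored.
        "labels": [IGNORE_INDEX] * len(prompt_ids) + response_ids,
    }


example = build_example(
    "Instruction: translate 'good morning' to French.",
    "Response: bonjour",
)
```

A dataset of a few thousand such records, each built from a carefully written prompt and response, is the kind of input this step consumes. Masking the prompt is one common design choice; training on the full sequence is another.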
Related terms
Seen in the wild
0 entries mentioning it. No archive entry mentions it explicitly; it appears only in broader contexts.