High Foundation Models · 1 min read
InstructGPT: the fine-tuning that teaches GPT to obey
In one sentence: OpenAI introduces InstructGPT, a GPT-3 fine-tuned with human feedback (RLHF) whose 1.3B-parameter version follows instructions better than the 175B base model, despite having over 100 times fewer parameters.
Through 2021, GPT-3, powerful as it was, had to be coaxed into doing what you asked: prompt engineering was a dark art. OpenAI shows that a much smaller model, trained with feedback from real people, can follow instructions better than its bigger sibling.
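The technique is called RLHF (Reinforcement Learning from Human Feedback): humans rank several model responses to the same prompt, and those rankings train a second model, the reward model, that acts as a "judge". The judge's scores then serve as the reward signal for fine-tuning the main model with reinforcement learning.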
It's the recipe that, ten months later, becomes ChatGPT.
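To make the "judge" step concrete, here is a minimal sketch of how a human ranking can become a training signal for the reward model. It assumes the common pairwise formulation, where each ranking is split into (better, worse) pairs and the loss is -log(sigmoid(score_better - score_worse)). The function names and the toy score tables are hypothetical, chosen only for illustration; a real reward model is a neural network, not a lookup table.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(score_preferred: float, score_rejected: float) -> float:
    """-log(sigmoid(margin)): small when the preferred response scores higher."""
    margin = score_preferred - score_rejected
    # Two algebraically equal branches, chosen to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairs_from_ranking(ranked_best_to_worst):
    """Turn one human ranking into every (better, worse) pair it implies."""
    return list(combinations(ranked_best_to_worst, 2))


def mean_ranking_loss(ranked_best_to_worst, score):
    """Average pairwise loss of a scoring function over one ranking."""
    pairs = pairs_from_ranking(ranked_best_to_worst)
    total = sum(pairwise_ranking_loss(score(a), score(b)) for a, b in pairs)
    return total / len(pairs)


# Hypothetical example: a labeler ranked three responses, best first.
ranking = ["clear answer", "vague answer", "off-topic answer"]

# Two toy "judges": one agrees with the labeler, one has it backwards.
agreeing_judge = {"clear answer": 3.0, "vague answer": 2.0, "off-topic answer": 1.0}
reversed_judge = {"clear answer": 1.0, "vague answer": 2.0, "off-topic answer": 3.0}

loss_agreeing = mean_ranking_loss(ranking, agreeing_judge.get)
loss_reversed = mean_ranking_loss(ranking, reversed_judge.get)
print(f"agreeing judge loss: {loss_agreeing:.3f}")
print(f"reversed judge loss: {loss_reversed:.3f}")
```

The judge that agrees with the human ranking gets the lower loss, so minimizing this quantity pushes the reward model toward human preferences. The reinforcement learning stage that then tunes the main model against the judge is not shown here.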
Companies
OpenAI
Tools
InstructGPT
Tags
OpenAI, InstructGPT, RLHF, Alignment, Fine-tuning
Sources