Phi-1 is a small model, with just 1.3 billion parameters, created by Microsoft Research. The surprise is that on coding benchmarks such as HumanEval it beats models roughly ten times larger, like Codex and StarCoder.
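The secret isn't size but data: the team combined "textbook-quality" code filtered from the web, using a classifier trained on GPT-4 annotations, with synthetic programming "textbooks" and exercises generated by GPT-3.5. The result is data much denser in useful concepts than raw GitHub code.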
The paper sparked a debate about what really matters in training: the quantity of raw data, or the pedagogical quality of what you show the model?
Companies
Microsoft
Tools
Phi-1
Tags
Phi-1, Microsoft, Small Models, Synthetic Data, HumanEval
Sources