Phi-1: 1.3B parameters beating models 10x larger on code

In one sentence: Microsoft Research releases Phi-1, a 1.3B-parameter model trained on filtered and synthetic "textbook-quality" data, outperforming models 10x larger on HumanEval.

Phi-1 is a small model, just 1.3 billion parameters, created by Microsoft Research. The surprise is that on programming benchmarks it beats models roughly ten times larger, such as Codex and StarCoder: it reaches 50.6% pass@1 on HumanEval.
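For readers unfamiliar with how HumanEval results are scored, here is a minimal sketch of the pass@k estimator commonly used with that benchmark: for each problem, generate n samples, count the c that pass the unit tests, and estimate the chance that at least one of k drawn samples is correct. The function names and the sample counts below are illustrative assumptions, not taken from the Phi-1 paper or its evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem: n samples generated, c of them correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are wrong, any draw of k contains a correct one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) pairs (hypothetical data)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Made-up counts for three problems, 10 samples each.
    toy_results = [(10, 3), (10, 0), (10, 10)]
    print(f"pass@1 = {benchmark_pass_at_k(toy_results, k=1):.3f}")
```

With k = 1 the estimate reduces to the fraction of correct samples per problem, averaged over all problems, which is what a headline pass@1 figure summarizes.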

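The secret isn't size but data: the team used GPT-3.5 to generate synthetic programming "textbooks" and exercises, and a classifier trained on GPT-4 quality annotations to filter existing web code down to its most instructive parts. The result is data much denser in useful concepts than raw GitHub code.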

The paper sparked a debate about what really matters in training: the quantity of raw data, or the pedagogical quality of what you show the model?