Phi-2: Microsoft's 2.7B model that beats a 13B

In one sentence: Microsoft Research releases Phi-2, a 2.7B-parameter model trained on "textbook-quality" data that beats LLaMA 2 7B and Mistral 7B on reasoning benchmarks and runs on laptops, a showcase for the "small + clean data" philosophy.

Microsoft Research publishes Phi-2, a model with just 2.7 billion parameters (about 2.6× smaller than LLaMA 2 7B). The surprise: on reasoning and code benchmarks, Phi-2 beats models roughly 2.5–5× larger, such as Mistral 7B and LLaMA 2 13B.

The recipe isn't "more data" but "better data". The team, led by Sébastien Bubeck, trains Phi-2 on a specific mix: synthetic "textbook-style" text generated with GPT-3.5, code filtered for didactic quality, and tightly curated web data screened with GPT-4. The thesis of the earlier "Textbooks Are All You Need" paper (June 2023) holds at larger scale.

Practical consequences: a 2.7B model runs comfortably on a laptop CPU and, with quantization, on a Raspberry Pi 5 or a modern smartphone. Phi-2 opens the "Small Language Models" (SLM) lane as a local alternative to cloud giants. Within the Phi family it leads to Phi-3 (April 2024) and Phi-3.5; the same lane includes Gemini Nano and Llama 3.2 1B/3B.
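
To make the "runs on laptops" claim concrete, here is a back-of-the-envelope sketch of how much memory the weights alone need at different precisions. The 2.7B parameter count comes from the article. The list of precisions and the helper names (`weight_gigabytes`, `footprint_table`) are illustrative assumptions, and the figures ignore activations, the KV cache, and the extra metadata that real quantized file formats carry.

```python
# Rough weight-memory estimate for a 2.7B-parameter model.
# Weights only: activations, KV cache, and runtime overhead are ignored.

PARAMS = 2.7e9  # parameter count stated in the article

# Nominal bits per weight for common storage formats (assumed list;
# real quantized formats add some per-block metadata on top).
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_gigabytes(params: float, bits: int) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9


def footprint_table(params: float = PARAMS) -> dict:
    """Map each format name to its estimated weight size in GB."""
    return {name: weight_gigabytes(params, bits)
            for name, bits in BITS_PER_WEIGHT.items()}


if __name__ == "__main__":
    for name, gb in footprint_table().items():
        print(f"{name}: {gb:.2f} GB")
```

Under these assumptions the weights take about 5.4 GB at 16 bits and about 1.35 GB at 4 bits, which is why quantization matters for the smaller devices mentioned above.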