Llama 3.3 70B: Meta brings 70B to 405B-level performance via post-training

In one sentence Meta releases Llama 3.3 70B Instruct: same parameter count as 3.1 70B but reported performance close to 405B thanks to a new post-training pipeline — no new base model.

Verified Official source

ShareLinkedIn X

In July 2024 Meta released Llama 3.1 in three sizes: 8B, 70B, and the huge 405B. The 405B was heavy to run (multiple server GPUs needed) but very good.

In December they ship "Llama 3.3 70B": same base model as 3.1 70B (the same 70 billion parameters), but refined with new post-training techniques. The claim: the 70B now nearly matches the 405B on many benchmarks while being 6× lighter.

In practice: anyone with serious hardware (2 H100 GPUs or a Mac Studio M2 Ultra) can run something rivaling the GPT-4-Turbo API at home, free, on-prem.