Reading path
Open-source developer in the model era
Llama, Mistral, Gemma, DeepSeek: the history of open weights that matter.
You are a developer who does not want to depend on closed APIs and believes in the value of open-weight models: you can inspect the weights, fine-tune, and run everything locally. This path follows the open community's history, from EleutherAI and GPT-Neo through Llama 4 and DeepSeek, covering the milestones that redefined what you can do without paying anyone per token.
- 01
Why it matters to you
EleutherAI's first large open-weight model: proof that open research could challenge OpenAI at scale, before Hugging Face became the center of the world.
High | Open Source Models
GPT-Neo: the first open source clone of GPT-3
EleutherAI releases GPT-Neo 1.3B and 2.7B, open source language models trained on The Pile — the first serious attempt to replicate the GPT-3 architecture with public weights.
- 02
Why it matters to you
176 billion parameters, trained collaboratively by hundreds of researchers: the model that showed a distributed community can compete with private labs.
High | Open Source Models
BLOOM 176B: the first truly open large multilingual LLM
The BigScience collective releases BLOOM, a 176-billion-parameter model trained on 46 natural languages and 13 programming languages, under the open Responsible AI License (RAIL).
- 03
Why it matters to you
The leak that changed everything: LLaMA weights circulate freely and within weeks anyone can run and fine-tune the model on consumer hardware. The modern open ecosystem is born.
High | Open Source Models
LLaMA: Meta opens foundation models to research
Meta releases LLaMA in four sizes (7B, 13B, 33B, 65B), available to researchers on request. One week later, the weights leak publicly.
- 04
Why it matters to you
7 billion parameters that beat Llama 2 13B on almost every benchmark: Mistral proves that efficiency and architecture matter more than raw scale.
High | Open Source Models
Mistral 7B: Europe joins the open-source race
Mistral AI (Paris), a months-old startup founded by ex-Meta and ex-DeepMind researchers, releases Mistral 7B under Apache 2.0. It beats Llama 2 13B on most benchmarks with roughly half the parameters.
- 05
Why it matters to you
Mixture-of-Experts open to the community: the technique that delivers 70B-class quality at roughly the per-token compute of a 13B model becomes accessible to anyone with a serious GPU.
Landmark | Open Source Models
Mixtral 8x7B: open-source Mixture of Experts that beats GPT-3.5
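Mistral drops Mixtral 8x7B via magnet link with no warning: a sparse Mixture-of-Experts (SMoE) model with 8 experts per layer, of which 2 are selected for each token, so only about 13B of its 47B total parameters are active per token. Performance matches or exceeds GPT-3.5. Apache 2.0.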
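Why "8x7B" adds up to 47B rather than 56B can be seen with a toy sketch. It assumes top-2 routing with softmax-normalized gate weights and uses the rounded 13B/47B figures quoted above; the scalar "experts", the function names, and the router logits are hypothetical stand-ins, not Mistral's implementation.

```python
import math

def top2_gate(router_logits):
    """Pick the two highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:2]
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_layer(x, router_logits, experts):
    """Weighted sum of the two selected experts; the other six never run."""
    return sum(w * experts[i](x) for i, w in top2_gate(router_logits))

def split_params(total_b, active_b, n_experts=8, k_active=2):
    """Solve shared + n*e = total and shared + k*e = active (in billions)."""
    per_expert = (total_b - active_b) / (n_experts - k_active)
    shared = active_b - k_active * per_expert
    return shared, per_expert

# Hypothetical toy experts: scalar functions standing in for feed-forward blocks.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]  # made-up router scores

gate = top2_gate(logits)            # which experts fire, and with what weight
y = moe_layer(3.0, logits, experts)

# Rounded figures from the text: 47B total, 13B active per token.
shared_b, per_expert_b = split_params(47.0, 13.0)
print(gate, y)
print(f"shared ~{shared_b:.1f}B, per expert ~{per_expert_b:.1f}B")
```

Under these assumptions each expert accounts for a little under 6B parameters and the shared layers (attention, embeddings) for under 2B. All 47B weights still have to sit in memory, which is why the savings show up in per-token compute rather than in VRAM.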
- 06
Why it matters to you
Google releases open weights optimized for single-GPU deployment: the signal that even big labs must reckon with the open-weight ecosystem.
High | Open Source Models
Gemma: Google enters the open-weights game
Google releases Gemma 2B and 7B, open-weight models derived from Gemini research. For the first time Google competes directly with Llama and Mistral on open ground.
- 07
Why it matters to you
An open-weight model with reasoning competitive with OpenAI o1, trained at a fraction of the cost: the strongest proof yet that open source has reached the frontier.
Landmark | Open Source Models
DeepSeek-R1: open reasoning matches o1 at 1/30 the cost
Chinese startup DeepSeek releases R1, a reasoning model with MIT-licensed open weights. Performance is on par with OpenAI o1, with API pricing of $0.55/$2.19 per 1M input/output tokens (vs. $15/$60 for o1). AI-related Nasdaq stocks lose about $1T in market value in two days.
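The price gap can be checked with a few lines of arithmetic. This is a minimal sketch using only the list prices quoted above; the workload (2M input tokens, 1M output tokens) and the names `PRICES` and `cost_usd` are hypothetical, chosen for illustration.

```python
# USD per 1M tokens as (input, output), taken from the figures quoted above.
PRICES = {
    "deepseek-r1": (0.55, 2.19),
    "openai-o1": (15.00, 60.00),
}

def cost_usd(model, input_tokens, output_tokens):
    """API cost in USD for a given number of input and output tokens."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical workload: 2M input tokens and 1M output tokens.
r1 = cost_usd("deepseek-r1", 2_000_000, 1_000_000)
o1 = cost_usd("openai-o1", 2_000_000, 1_000_000)
ratio = o1 / r1

print(f"R1: ${r1:.2f}  o1: ${o1:.2f}  ratio: {ratio:.1f}x")
```

Because the input and output prices differ by nearly the same factor, the ratio stays close to 27x whatever the input/output mix, so the headline "1/30" is a rounded figure.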