OLMo 2: fully open model that surpasses Llama 3.1 while maintaining transparency
In one sentence: AllenAI releases OLMo 2 at 7B and 13B parameters, trained with staged mid-training and specialized data mixes; it outperforms Llama 3.1 and Qwen 2.5 on some instruction-following benchmarks while keeping its data, code, and checkpoints fully public.
The first OLMo, released in 2024, was a completely open model, but it was not the most capable in its class. With OLMo 2, AllenAI corrected course: this time the goal is not just maximum transparency but also performance competitive with the best open models of the moment.
The main novelty is how it was trained: instead of one long training run over all the data, OLMo 2 is trained in phases, each with its own data mix. It is like learning the fundamentals first and then specializing progressively, rather than mixing everything together from the start.
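To make the idea of phased training concrete, here is a minimal sketch of a data sampler whose source weights depend on the current training phase. It is only an illustration: the phase names, step counts, source names, and weights are hypothetical placeholders, not the actual OLMo 2 recipe, and the helper functions are invented for this example.

```python
import random

# Hypothetical schedule: every name, weight, and step count below is an
# illustrative assumption, not the real OLMo 2 configuration.
PHASES = [
    {"name": "pretraining", "steps": 900,
     "mix": {"web": 0.90, "code": 0.05, "academic": 0.05}},
    {"name": "mid-training", "steps": 100,
     "mix": {"filtered_web": 0.5, "math": 0.3, "instruction": 0.2}},
]


def phase_for_step(step, phases=PHASES):
    """Return the phase whose step range contains `step`."""
    start = 0
    for phase in phases:
        if step < start + phase["steps"]:
            return phase
        start += phase["steps"]
    raise ValueError(f"step {step} is beyond the schedule")


def sample_source(step, rng, phases=PHASES):
    """Pick the data source for one batch, weighted by the current phase's mix."""
    phase = phase_for_step(step, phases)
    sources = list(phase["mix"])
    weights = list(phase["mix"].values())
    return rng.choices(sources, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 899, 900, 999):
        print(step, phase_for_step(step)["name"], sample_source(step, rng))
```

The point of the sketch is the contrast with a single fixed mix: the sampler draws from broad sources early on and switches to a narrower, more specialized mix once the step counter crosses into the later phase.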
The result is a model that, on certain comprehension and instruction-following tests, surpasses models such as Meta's Llama 3.1 and Alibaba's Qwen 2.5, all with completely public data, code, and checkpoints.
Companies
AllenAI
Tags
Sources