Gemma 2: Google's second-gen open model with Gemini distillation
In one sentence Google releases Gemma 2 (9B and 27B), a second-gen open family with Gemini-derived architecture, soft attention capping, knowledge distillation, and class-leading performance in the <30B range.
Google releases the second generation of its open Gemma family, after February's first. Two sizes: 9 billion and 27 billion parameters, both free to download.
The new trick: instead of training them from scratch, they used "distillation." A much bigger model (an internal Gemini) acted as a "teacher" for Gemma 2, which learned from it directly rather than only from raw data.
Result: Gemma 2 27B beats Llama 3 70B on many benchmarks while being less than half the size.
Companies
Google, Google DeepMind
Tools
Gemma 2, Gemma 2 9B, Gemma 2 27B
Tags
Sources