February 5, 2025 High Foundation Models · 1 min read

Gemini 2.0 Flash GA: Google ships its fast multimodal model to production

In one sentence Google makes Gemini 2.0 Flash generally available, introduces cheaper Flash-Lite, and previews Gemini 2.0 Pro Experimental with a 2M-token context window.

Needs review Official source

ShareLinkedIn X

Reading level

Google ships the Gemini 2.0 family to production after a December 2024 preview. Three variants:

Flash: the workhorse. Fast, cheap, already used in Google AI Studio and Vertex AI;
Flash-Lite: even cheaper, designed for high-volume use cases on a budget;
Pro Experimental: the flagship, with a huge context window (2 million tokens, i.e. it can read a 1,500-page book).

All models are natively multimodal: they understand text, images, audio and video as input. Compared to a year ago, the difference is that they're now fast enough for real-time apps (voice chat, live image analysis).

Companies

Google DeepMind

Tools

Gemini 2.0 Flash, Gemini 2.0 Flash-Lite, Gemini 2.0 Pro