May 28, 2025 High Multimodal AI · 1 min read

Llama 4 Scout: 109B multimodal MoE with 10M context and vision SOTA

In one sentence: Meta releases Llama 4 Scout, a 109B MoE model with 17B active parameters, a 10M-token context window, multi-image input support, and state-of-the-art vision benchmark results among open models.

Verified Official source



Llama 4 Scout is the first Llama model with strong image understanding. It has 109 billion total parameters, but its Mixture of Experts (MoE) architecture activates only 17 billion of them per token, so its per-token compute is close to that of a much smaller dense model, even though all 109 billion parameters must still be held in memory. The 10 million token context window is far larger than that of earlier Llama releases: you can give it hours of video, hundreds of images, or enormous documents in a single prompt. It sets new records among open models on visual comprehension benchmarks, bringing Llama to the top tier of multimodal AI.
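To make the total-versus-active distinction concrete, here is a minimal back-of-the-envelope sketch in Python. Only the 109B and 17B figures come from the article. The bytes-per-parameter values and the "about 2 FLOPs per active parameter per token" rule of thumb are assumptions added for illustration, and the memory estimate covers weights only (no KV cache or activations).

```python
# Figures from the article.
TOTAL_PARAMS = 109e9   # all experts combined
ACTIVE_PARAMS = 17e9   # parameters used for any single token


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that participate in one token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params * bytes_per_param / 1e9


def forward_gflops_per_token(active: float) -> float:
    """Rough estimate using the common ~2 FLOPs per active parameter rule."""
    return 2 * active / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active per token: {frac:.1%} of parameters")
    print(f"Rough forward cost: {forward_gflops_per_token(ACTIVE_PARAMS):.0f} GFLOPs/token")

    # Assumed storage precisions; memory scales with TOTAL parameters,
    # because every expert must be resident even if only a few are used.
    for name, nbytes in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
        print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, nbytes):.1f} GB of weights")
```

The point of the sketch is that compute tracks the 17B active parameters while memory tracks the full 109B, which is why an MoE model can be cheap per token yet still demanding to host.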

Companies