AImpact
High Open Source Models · 1 min read

Llama 4: Meta moves to MoE and native multimodal, but the community is unimpressed

In one sentence: Meta releases Llama 4 Scout (17B active / 109B total parameters) and Maverick (17B active / 400B total), both multimodal MoE models, with a 10M-token context window for Scout. Behemoth (2T parameters) is still in training. The community contests Meta's benchmark claims.

Needs review Official source

With Llama 4, Meta makes a major architectural change: for the first time it uses a Mixture-of-Experts (MoE) design, as DeepSeek and Mistral already do. Three models are announced:

  • Scout: the small one, with 109B total parameters but only 17B active per token. It promises a 10M-token context window (Meta calls it "industry leading");
  • Maverick: the medium one, with 400B total parameters and 17B active. Pitched as a general "worker";
  • Behemoth: the giant, with 2 trillion parameters, still in training at the time of the announcement.
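
To make the active/total distinction concrete, here is a minimal sketch that computes what share of each model's parameters is used per token, using only the counts quoted above. The dictionary layout and the helper name `active_fraction` are illustrative choices, not anything published by Meta. Behemoth is left out because this article gives no active-parameter count for it.

```python
# Parameter counts in billions, as quoted in the list above.
# The structure and names here are illustrative assumptions.
MODELS = {
    "Scout": {"active_b": 17, "total_b": 109},
    "Maverick": {"active_b": 17, "total_b": 400},
}


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of total parameters that are active for one token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, params in MODELS.items():
        frac = active_fraction(params["active_b"], params["total_b"])
        print(f"{name}: {frac:.1%} of parameters active per token")
```

Both models do roughly the same amount of work per token (17B active parameters), but Maverick spreads that work across a much larger pool of experts. In a typical MoE deployment the full set of weights still has to be stored, so the total count, not the active one, drives the memory requirement.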

The catch: right after release, independent developers test the models and find real-world performance below Meta's claims. The version submitted to LMArena turns out to be an "experimental variant" that differs from the publicly released weights. This opens a debate on benchmark gaming and on how far vendor-reported numbers can be trusted.

Companies

Meta

Tools

Llama 4 Scout, Llama 4 Maverick, Llama 4 Behemoth

Tags

Meta, Llama 4, MoE, Multimodal, Open Source

Sources