Llama 4: Meta moves to MoE and native multimodal, but the community is unimpressed

In one sentence: Meta releases Llama 4 Scout (17B active / 109B total parameters) and Maverick (17B active / 400B total), both multimodal MoE models, with a 10M-token context window for Scout. Behemoth (2T parameters) is still in training. The community contests Meta's benchmark claims.



With Llama 4, Meta makes a major architectural change: for the first time the family uses a Mixture-of-Experts (MoE) design, as DeepSeek and Mistral already do. Meta announces three models:

Scout: the smallest, with 109B total parameters but only 17B active per token. Meta promises a 10M-token context window, which it calls "industry leading";
Maverick: the mid-sized model, with 400B total parameters and 17B active. Pitched as a general-purpose "workhorse";
Behemoth: the giant, with 2 trillion parameters, still in training at announcement time.
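To make the gap between "total" and "active" parameters concrete, here is a minimal sketch that computes the share of each model's weights used per token, from the figures quoted above. The function and variable names are illustrative, not from Meta. It also assumes the usual MoE reading of these numbers: per-token compute scales roughly with active parameters, while memory footprint scales with total parameters. Behemoth is left out because only its total size is given here.

```python
# Parameter counts in billions, as quoted in the announcement above.
# Behemoth is omitted: only its total size (2T) is given in this article.
MODELS = {
    "Scout": {"total_b": 109, "active_b": 17},
    "Maverick": {"total_b": 400, "active_b": 17},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of a model's parameters used for each token."""
    return active_b / total_b


def summarize(models: dict) -> dict:
    """Map each model name to its active share, as a percentage (2 decimals)."""
    return {
        name: round(100 * active_fraction(**params), 2)
        for name, params in models.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(MODELS).items():
        print(f"{name}: {pct}% of parameters active per token")
```

Scout activates about 15.6% of its weights per token and Maverick about 4.25%, so the two models do similar work per token even though Maverick is nearly four times larger in total.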

The problems: right after release, independent developers test the models and find real-world performance below Meta's claims. The version evaluated on LMArena turns out to be an "experimental variant" that differs from the publicly released weights. This sets off a debate about benchmark gaming and how far vendor-reported numbers can be trusted.