Skip to content
AImpact
IT EN
High Open Source Models · 1 min read

Llama 3.2: Meta brings vision and edge to open models

In one sentence Meta releases Llama 3.2 in 4 sizes: 1B and 3B for edge/mobile, 11B and 90B multimodal (vision). First time Meta seriously enters open multimodal + on-device.

Verified Official source
ShareLinkedInX
Reading level

Meta updates the Llama family with two big additions. First: two very small models (1 and 3 billion parameters) designed to run on a phone or Raspberry Pi. Second: for the first time Llama "sees": the 11B and 90B versions accept images as input, so you can show them a chart, a receipt, a photo and ask questions.

For open-source developers this matters: until now, doing vision with an open model meant stitching pieces together (Llava, Bunny, etc.) of variable quality. Now there's an official Meta baseline comparable to GPT-4o on the vision side.

A note: the vision models (11B and 90B) are not distributed in the EU due to regulatory issues (AI Act), opening a debate on how much European regulation is slowing access to open models.

Companies

Meta

Tools

Llama 3.2 1B, Llama 3.2 3B, Llama 3.2 11B Vision, Llama 3.2 90B Vision

Tags

MetaLlama 3.2MultimodalVisionEdge AI

Sources