AImpact
Medium Multimodal AI · 1 min read

OpenFlamingo (LAION/UW): open reproduction of Flamingo with multi-image few-shot visual learning

In one sentence: LAION and the University of Washington release OpenFlamingo, an open-source reproduction of DeepMind's Flamingo that learns visual tasks few-shot from interleaved image and text examples and comes in 3B and 9B parameter variants, making it one of the first open models of its kind available for multimodal research.


In 2022, DeepMind introduced Flamingo, a model that could learn new visual tasks simply from a few examples placed in the prompt. Show it three photos of dogs labeled with their breeds, then give it an unlabeled photo: it infers the task and names the breed.

The problem: Flamingo was proprietary. DeepMind released neither the code nor the model weights, so university researchers and budget-limited teams had no practical way to study it or build on it.

LAION (the German community that had already built LAION-5B, the massive dataset used to train Stable Diffusion) and the University of Washington set out to reproduce Flamingo using public resources.

The result is OpenFlamingo: two variants (3B and 9B parameters), fully open code, freely downloadable weights. Given a set of examples in the form "image, description, image, description, image, ?", the model completes the pattern.

This few-shot capability mattered because it required no additional fine-tuning: the task is specified entirely by how the prompt is structured. With open code and weights, researchers without access to massive GPU clusters could experiment with this class of models.
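The "image, description, image, description, image, ?" pattern can be sketched as a small prompt builder. This is a minimal illustration, not the project's own code: the `<image>` and `<|endofchunk|>` markers are assumed from the OpenFlamingo README and should be checked against the release you use, while `Shot`, `build_few_shot_prompt`, `image_order`, and the file names are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List

# Special tokens assumed from the OpenFlamingo README; verify for your release.
IMAGE_TOKEN = "<image>"
END_OF_CHUNK = "<|endofchunk|>"


@dataclass
class Shot:
    """One in-context example: an image plus the text that describes it."""
    image_path: str  # hypothetical path, used only to track image order
    caption: str


def build_few_shot_prompt(shots: List[Shot], query_prefix: str = "") -> str:
    """Interleave image markers and captions, ending with an open query slot."""
    parts = [f"{IMAGE_TOKEN}{s.caption}{END_OF_CHUNK}" for s in shots]
    # The final image has no caption; the model is expected to complete it.
    parts.append(f"{IMAGE_TOKEN}{query_prefix}")
    return "".join(parts)


def image_order(shots: List[Shot], query_image: str) -> List[str]:
    """Images must be supplied in the same order as their markers in the text."""
    return [s.image_path for s in shots] + [query_image]


shots = [
    Shot("dog1.jpg", "A photo of a beagle."),
    Shot("dog2.jpg", "A photo of a poodle."),
]
prompt = build_few_shot_prompt(shots, query_prefix="A photo of a")
images = image_order(shots, "unknown_dog.jpg")
```

The resulting string and the ordered image list would then be passed to the model's tokenizer and image processor; no weights are updated at any point, which is what distinguishes this from fine-tuning.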

Companies

LAION, University of Washington

Tags

OpenFlamingo, Flamingo, open source, few-shot, visual Q&A, LAION, multi-image
