OpenFlamingo (LAION/UW): open reproduction of Flamingo with multi-image few-shot visual learning

In one sentence: LAION and the University of Washington release OpenFlamingo, an open-source reproduction of DeepMind's Flamingo that learns visual tasks few-shot from interleaved image and text examples, ships in 3B and 9B parameter variants, and is among the first open models to enable this kind of multimodal research without relying on a proprietary model.

In 2022, DeepMind introduced Flamingo, a model that could learn new visual tasks from just a few examples supplied in the prompt. Show it three photos of dogs labeled with their breeds, then an unlabeled photo, and it infers that you want the breed of the new dog.

The problem: Flamingo was proprietary. DeepMind released neither the code nor the weights, so for university researchers or budget-limited teams it was practically out of reach.

LAION (the German community that had already built LAION-5B, the massive dataset used to train Stable Diffusion) and the University of Washington set out to reproduce Flamingo using public resources.

The result is OpenFlamingo: two variants (3B and 9B parameters), fully open code, and freely downloadable weights. Given a sequence of examples in the form "image, description, image, description, image, ?", the model completes the pattern by generating the description for the final image.
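The "image, description, image, description, image, ?" pattern can be sketched as a small prompt builder. This is a minimal illustration, not the project's own code: the helper `build_few_shot_prompt` and the file names are hypothetical, and the `<image>` and `<|endofchunk|>` marker strings are assumed from the OpenFlamingo README, so check them against the release you use.

```python
from typing import List, Sequence, Tuple

# Assumed special tokens (taken from the OpenFlamingo README, not from this article).
IMAGE_TOKEN = "<image>"
END_OF_CHUNK = "<|endofchunk|>"


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    query_image: str,
    query_prefix: str = "",
) -> Tuple[List[str], str]:
    """Hypothetical helper: interleave (image, description) pairs with a final query.

    Returns the image paths in the order they appear in the prompt, plus the
    text prompt in which each image is represented by one IMAGE_TOKEN.
    """
    # Images must be passed to the model in the same order as their markers.
    images = [image for image, _ in examples] + [query_image]

    # Each solved example is "image marker + description + end-of-chunk marker".
    text = "".join(
        f"{IMAGE_TOKEN}{description}{END_OF_CHUNK}" for _, description in examples
    )

    # The query image has no description yet: the model is expected to continue it.
    text += f"{IMAGE_TOKEN}{query_prefix}"
    return images, text


images, prompt = build_few_shot_prompt(
    examples=[("beagle.jpg", "A beagle."), ("poodle.jpg", "A poodle.")],
    query_image="unknown_dog.jpg",
    query_prefix="A",
)
print(prompt)
```

The prompt ends on an open description, so the model's continuation is its answer for the last image. No weights are updated at any point; only the prompt changes from task to task.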

This few-shot learning capability was notable because it required no additional fine-tuning: structuring the prompt correctly was enough. For the first time, researchers without access to massive GPU clusters could work on this class of models.