April 29, 2022 High Multimodal AI · 1 min read

DeepMind Flamingo: the first few-shot visual language model

In one sentence: Flamingo brings few-shot learning to vision, reaching state-of-the-art results on visual question answering (VQA) and captioning with no task-specific fine-tuning.

Flamingo is a visual language model from DeepMind that processes text and images together. Given only a few examples in its prompt, it can answer questions about images or describe them, with no task-specific fine-tuning. It was the first model to achieve state-of-the-art results on visual benchmarks using just a handful of demonstrations, and it paved the way for modern multimodal assistants.
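To make "seeing only a few examples" concrete, here is a minimal sketch of how a few-shot visual question answering prompt could be assembled: solved examples (image, question, answer) are interleaved ahead of an unsolved query, and the model is expected to continue the text. This is an illustration only, not DeepMind's code or API. The `Shot` class, the `build_few_shot_prompt` function, the `<image>` placeholder, the prompt wording, and the file names are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical placeholder marking where an image is interleaved in the text.
IMAGE_TOKEN = "<image>"


@dataclass
class Shot:
    """One image/question pair; `answer` is None for the unsolved query."""
    image_path: str
    question: str
    answer: Optional[str] = None


def build_few_shot_prompt(support: List[Shot], query: Shot) -> Tuple[str, List[str]]:
    """Return (prompt_text, image_paths) with the solved examples first.

    The i-th IMAGE_TOKEN in the text corresponds to the i-th image path.
    """
    parts: List[str] = []
    images: List[str] = []
    for shot in support:
        if shot.answer is None:
            raise ValueError("every support example needs an answer")
        parts.append(f"{IMAGE_TOKEN}Question: {shot.question} Answer: {shot.answer}")
        images.append(shot.image_path)
    # The query ends with an open "Answer:" for the model to complete.
    parts.append(f"{IMAGE_TOKEN}Question: {query.question} Answer:")
    images.append(query.image_path)
    return "\n".join(parts), images


if __name__ == "__main__":
    # Illustrative file names and questions, not real data.
    support = [
        Shot("cat.jpg", "What animal is this?", "A cat"),
        Shot("bus.jpg", "What color is the bus?", "Red"),
    ]
    query = Shot("dog.jpg", "What animal is this?")
    text, images = build_few_shot_prompt(support, query)
    print(text)
    print(images)
```

No model weights change in this setup: the only task-specific input is the handful of solved examples placed in the prompt, which is what distinguishes few-shot prompting from fine-tuning.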

Companies: DeepMind

Tools: Flamingo