Reading level
Flamingo is a model from DeepMind that understands both text and images together. Remarkably, it can answer questions about images or describe them by seeing only a few examples, without being retrained from scratch. It was the first model to achieve state-of-the-art results on visual benchmarks using just a handful of demonstrations. It paved the way for modern multimodal assistants.
Companies
DeepMind
Tools
Flamingo
Tags
Visual Language ModelFew-Shot LearningVQAImage CaptioningDeepMind
Sources