Reading level
OpenAI ships two models on the same day: DALL·E, which paints images from a text description, and CLIP, which figures out which caption best fits an image.
DALL·E isn't public yet, but the demos (an avocado armchair, a daikon walking a dog) spread everywhere. CLIP gets open-sourced and immediately becomes a building block of half the generative research that follows.
This is the first time a computer "sees" and "writes" using the same logic. From here on, every generative image model uses CLIP variants to understand prompts.
Companies
OpenAI
Tools
DALL-E, CLIP
Tags
OpenAIDALL-ECLIPText-to-ImageMultimodal
Sources