AImpact
High Image & Video Gen · 1 min read

Imagen: Google enters text-to-image generation

In one sentence: Google Research unveils Imagen, a text-to-image diffusion model that uses a frozen T5 text encoder and beats DALL-E 2 on benchmarks for photorealistic fidelity.

Verified Official source

Less than two months after DALL-E 2, Google shows its own image generator: write a sentence, it paints. It's called Imagen.

The new trick is how it understands the text: it uses a large language model, already trained to read and write, and lets it "explain" the sentence to the painter. Simple idea, big effect.
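To make the "frozen encoder" idea concrete, here is a toy sketch in plain Python. It is not Imagen's architecture: `FrozenTextEncoder` and `ToyDenoiser` are hypothetical names, the encoder is a hash-based stand-in for T5 that averages token vectors into one embedding, and the "image" is just four numbers. The only point it illustrates is the division of labor: the text encoder stays fixed, and training only changes the part that learns to remove noise.

```python
import hashlib
import random


class FrozenTextEncoder:
    """Toy stand-in for a pretrained text encoder (assumption: not T5).

    It has no trainable state, so nothing in training can change it.
    """

    def __init__(self, dim=4):
        self.dim = dim  # must be <= 32, the length of a SHA-256 digest

    def encode(self, text):
        vecs = []
        for tok in text.lower().split():
            digest = hashlib.sha256(tok.encode("utf-8")).digest()
            # Map the first `dim` bytes of the hash to values in [-0.5, 0.5].
            vecs.append([b / 255.0 - 0.5 for b in digest[: self.dim]])
        if not vecs:
            return [0.0] * self.dim
        # Average the token vectors into a single sentence embedding.
        return [sum(col) / len(vecs) for col in zip(*vecs)]


class ToyDenoiser:
    """Trainable part: predicts the noise added to a pixel, given the text."""

    def __init__(self, dim):
        self.w_x = 0.0
        self.w_c = [0.0] * dim
        self.b = 0.0

    def predict(self, x_noisy, cond):
        text_term = sum(w * c for w, c in zip(self.w_c, cond))
        return self.w_x * x_noisy + text_term + self.b

    def train_step(self, pixels, cond, rng, lr=0.05):
        """One gradient step on the mean squared error of the noise estimate."""
        n = len(pixels)
        g_x, g_b, loss = 0.0, 0.0, 0.0
        g_c = [0.0] * len(cond)
        for x in pixels:
            eps = rng.gauss(0.0, 1.0)   # noise we add, and want to predict
            x_noisy = x + eps           # simplified: no noise schedule
            err = self.predict(x_noisy, cond) - eps
            loss += err * err / n
            g_x += 2 * err * x_noisy / n
            g_b += 2 * err / n
            for j, c in enumerate(cond):
                g_c[j] += 2 * err * c / n
        # Only the denoiser's own weights are updated.
        self.w_x -= lr * g_x
        self.b -= lr * g_b
        self.w_c = [w - lr * g for w, g in zip(self.w_c, g_c)]
        return loss


encoder = FrozenTextEncoder(dim=4)
denoiser = ToyDenoiser(dim=4)
rng = random.Random(0)

prompt = "a corgi riding a bicycle"
cond_before = encoder.encode(prompt)
pixels = [0.2, 0.8, 0.5, 0.1]           # a made-up four-pixel "image"

for _ in range(100):
    denoiser.train_step(pixels, cond_before, rng)

cond_after = encoder.encode(prompt)      # same embedding as before training
```

After the loop, the denoiser's weights have moved while the prompt's embedding is exactly what it was at the start. That is the sense in which the language model only "explains" the sentence and never gets retrained for painting.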

Google doesn't open it to the public, though. No DALL-E-style website, no public demo. Concerns about fake imagery and problematic content hold the launch back. The result: everyone talks about it, few use it, and in the meantime Stable Diffusion and Midjourney take the audience.

Companies

Google Research

Tools

Imagen

Tags

Google, Imagen, Text-to-Image, Diffusion, T5

Sources