Reading path
Videomakers and motion designers
From the first AI text-to-clip to cinematic generative photorealism.
You are a videomaker, motion designer, or creative director who wants to understand how generative video is reshaping creative production. This path follows the key leaps: from Meta's and Google's early text-to-video experiments to commercial models (Sora, Veo, Runway, Luma) and on to Veo 3's native synchronized audio, where the entire pipeline of a short film becomes accessible to a single author.
- 01
Why it matters to you
The first public model generating coherent clips from text: it proves generative video is a solvable problem, not speculation.
Medium · Image & Video Gen
Make-A-Video: Meta unveils the first credible text-to-video
Meta AI shows Make-A-Video, a system that generates short animated clips from a text description by reusing a pre-existing text-to-image model.
- 02
Why it matters to you
Google confirms the direction with visual quality surpassing predecessors: big tech enters the AI video race in earnest.
Medium · Image & Video Gen
Imagen Video and Phenaki: Google answers on text-to-video
A week after Make-A-Video, Google Research unveils Imagen Video and, around the same time, Phenaki: two different approaches to text-to-video, with longer, more coherent clips.
- 03
Why it matters to you
The first long, physically coherent, cinematic AI video: storyboards, B-roll and visual pitches change their nature from this moment on.
Landmark · Image & Video Gen
Sora: OpenAI shows cinema-quality AI video
OpenAI announces Sora, a text-to-video model producing 1080p clips up to 60 seconds long with temporal consistency, plausible physics, and realistic camera moves. The release is limited to red-teamers and selected artists.
- 04
Why it matters to you
The first publicly accessible image-to-video tool with fluid, realistic motion: it enters the real workflow of motion designers and videomakers.
Medium · Image & Video Gen
Luma Dream Machine: the first publicly accessible high-quality video generator
Luma AI launches Dream Machine, a text-to-video model freely accessible on the web (with a queue) that generates 5-second 1280×720 clips: the consumer answer to Sora, which was still unreleased.
- 05
Why it matters to you
Veo 2 raises the bar of cinematic control: definable camera movements, credible physical adherence, production-house quality.
High · Image & Video Gen
Google Veo 2 and Imagen 3: the response to Sora Turbo with 4K video and improved physics
Google DeepMind announces Veo 2, a text-to-video model with up to 4K output and 2-minute clips, and updates Imagen 3. The models are released through VideoFX and ImageFX, and later in the Gemini app stack.
- 06
Why it matters to you
OpenAI generative video finally available in production: professional workflows begin integrating AI generation as a real pipeline stage.
High · Image & Video Gen
Sora Turbo: ten months after the demo, OpenAI ships video gen to the public
OpenAI ships Sora Turbo to ChatGPT Plus/Pro users: videos up to 20 seconds at 1080p, with image-to-video, remix, and storyboard features. It is a faster, less faithful version of the model shown in the February demo.
- 07
Why it matters to you
Veo 3 introduces native synchronized audio (dialogue, SFX, music): for the first time a complete clip comes from a single prompt, revolutionizing pre-production.
High · Image & Video Gen
Veo 3 at Google I/O: video generation with native synced audio
At Google I/O 2025, DeepMind unveils Veo 3 (video gen with native audio, dialogue, effects), Imagen 4 (more detailed images), and Flow (AI video tool for creators).