AImpact

Reading path

Creator, marketing and content

From the first generated image to real-time AI video.

You are a designer, content creator, copywriter or marketer and you need to understand how generative AI is rewriting your workflow. This path covers the key jumps from image (DALL·E 2, Stable Diffusion, Midjourney) to voice (ElevenLabs) to video (Sora, Veo 3) and on to conversational multimodal models (GPT-4o).

  1. 01

    Why it matters to you

    For the first time, text-to-image output can pass for a real photograph: the debate over stock photography and craft starts here.

    High Image & Video Gen

    DALL·E 2: the quality leap in image generation

    OpenAI announces DALL·E 2, a diffusion-based text-to-image model producing photorealistic 1024×1024 images. Access is initially waitlist-only; a beta opens to waitlisted users in July 2022, and the waitlist is removed that September.

  2. 02

    Why it matters to you

    Open weights make image generation free and customizable: creators start fine-tuning their own visual style.

    Landmark Image & Video Gen

    Stable Diffusion: image generation goes open

    Stability AI publicly releases weights and code of a text-to-image latent diffusion model that runs on a consumer GPU. AI image generation leaves the cloud.

  3. 03

    Why it matters to you

    Defines a recognizable, mainstream aesthetic: it permanently changes moodboards, concept art and editorial illustration.

    High Image & Video Gen

    Midjourney opens public beta on Discord

    Midjourney opens its public beta with a text-to-image model accessible via a Discord bot. Its strong aesthetic default and community turn image generation into a mass phenomenon.

  4. 04

    Why it matters to you

    The first long, coherent, cinematic AI video: storyboards and visual pitches will never be the same.

    Landmark Image & Video Gen

    Sora: OpenAI shows cinema-quality AI video

    OpenAI announces Sora, a text-to-video model producing 1080p clips up to 60 seconds with temporal consistency, plausible physics, and realistic camera moves. Limited release to red-teamers and selected artists.

  5. 05

    Why it matters to you

    Native multimodality in chat: you go from brief to images, audio and variations in a single session, without tool switching.

    High Multimodal AI

    GPT-4o: text, voice and images in a single model

    OpenAI unveils GPT-4o (omni), a single model that natively handles text, audio, and images with ~320 ms average voice latency and GPT-4-class text quality. It is also available to ChatGPT free-tier users.

  6. 06

    Why it matters to you

    Generative video is finally usable by the public for real projects, not just demos.

    High Image & Video Gen

    Sora Turbo: ten months after the demo, OpenAI ships video gen to the public

    OpenAI ships Sora Turbo to ChatGPT Plus/Pro users: videos up to 20 seconds at 1080p, with image-to-video, remix, and storyboard tools. It is faster but less faithful than the Sora shown in the February demo.

  7. 07

    Why it matters to you

    Veo 3 raises the bar for video photorealism: it becomes hard to tell an AI ad from a traditional one, with all that implies for your craft.

    High Image & Video Gen

    Veo 3 at Google I/O: video generation with native synced audio

    At Google I/O 2025, DeepMind unveils Veo 3 (video generation with native audio, including dialogue and sound effects), Imagen 4 (more detailed images), and Flow (an AI video tool for creators).