AImpact
High Image & Video Gen · 1 min read

Stable Diffusion 3: Diffusion Transformer architecture and improved text rendering

In one sentence: Stability AI announces SD3 with a Multi-Modal Diffusion Transformer (MMDiT) architecture, text rendering competitive with Imagen 2 and DALL-E 3, and visual quality superior to SDXL.

Verified Official source

Stable Diffusion 3 is not an incremental update: it is an architecture change. It abandons the UNet backbone of previous models and adopts a Transformer as the main engine, the same family of architecture used in text language models.
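To make the architecture change concrete, here is a minimal sketch of the joint-attention idea behind a multi-modal diffusion transformer: text tokens and image-patch tokens each get their own projection weights, and the two sequences are then concatenated for a single attention operation. This is a toy illustration based on the publicly described MMDiT design, not Stability AI's code. The names (`joint_attention`, `rand_matrix`, `DIM`), the token counts, and the random weights are all assumptions chosen for the example, and real models add multiple heads, normalization, timestep conditioning, and MLP layers.

```python
import math
import random

random.seed(0)
DIM = 8  # toy embedding width (assumption, real models are far wider)


def rand_matrix(rows, cols):
    """Random matrix as a list of lists; stands in for learned weights."""
    return [[random.gauss(0.0, 0.3) for _ in range(cols)] for _ in range(rows)]


def project(tokens, weight):
    """Multiply an (n x DIM) token matrix by a (DIM x DIM) weight matrix."""
    return [
        [sum(tok[k] * weight[k][j] for k in range(len(weight)))
         for j in range(len(weight[0]))]
        for tok in tokens
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(text_tokens, image_tokens, text_w, image_w):
    """Single-head attention over the concatenated text + image sequence.

    Each modality uses its own q/k/v weights, but attention is computed
    over the joined sequence, so every image token can attend to every
    text token and vice versa.
    """
    q = project(text_tokens, text_w["q"]) + project(image_tokens, image_w["q"])
    k = project(text_tokens, text_w["k"]) + project(image_tokens, image_w["k"])
    v = project(text_tokens, text_w["v"]) + project(image_tokens, image_w["v"])

    scale = 1.0 / math.sqrt(DIM)
    out = []
    for q_i in q:
        scores = [scale * sum(a * b for a, b in zip(q_i, k_j)) for k_j in k]
        weights = softmax(scores)
        out.append([sum(w * v_j[d] for w, v_j in zip(weights, v))
                    for d in range(DIM)])

    # Split the joint output back into the two modality streams.
    n_text = len(text_tokens)
    return out[:n_text], out[n_text:]


# Hypothetical inputs: 3 prompt tokens and 4 latent image patches.
text_tokens = rand_matrix(3, DIM)
image_tokens = rand_matrix(4, DIM)
text_w = {name: rand_matrix(DIM, DIM) for name in ("q", "k", "v")}
image_w = {name: rand_matrix(DIM, DIM) for name in ("q", "k", "v")}

text_out, image_out = joint_attention(text_tokens, image_tokens, text_w, image_w)
print(len(text_out), len(image_out))
```

The point of the sketch is the data flow rather than the numbers: unlike a UNet, where the prompt is injected through cross-attention layers, both modalities here sit in one sequence and are updated by the same attention step.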

This brings two concrete advantages: text rendered inside images is far more legible and accurate, and overall visual quality (composition, proportions, details) improves noticeably compared to Stable Diffusion XL.

The model is announced as an early-access preview, with open weights planned for later. The community awaits it as a potential new open-source standard.

Companies

Stability AI

Tools

Stable Diffusion 3, SD3

Tags

Stability AI, Stable Diffusion 3, MMDiT, Diffusion Transformer, Text Rendering

Sources