Mixture of Denoisers
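A pretraining strategy (UL2, Google 2022) that trains a single model on a mixture of denoising objectives: regular span corruption (R-denoising: T5-style masking of short spans at a low corruption rate), extreme span corruption (X-denoising: long spans or high corruption rates), and sequential denoising (S-denoising: prefix language modeling, where the model continues a given prefix). Each training example is corrupted by one denoiser drawn from the mixture. This combines the strengths of GPT-style and T5-style pretraining. A mode token prepended to the input tells the model which objective the example uses, so the same weights can later be steered toward infilling or open-ended generation.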
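The idea of mixing denoisers can be sketched in a few lines of Python. This is an illustrative toy, not the UL2 data pipeline: the names `DENOISERS`, `span_corrupt`, `prefix_split`, and `make_example` are hypothetical, the span lengths and corruption rates are placeholder values, spans have a fixed length, and the `[R]`, `[S]`, `[X]` prefixes and `<extra_id_N>` sentinels are assumed here following the T5 and UL2 naming conventions.

```python
import random

# Hypothetical denoiser settings; the numbers are illustrative only.
DENOISERS = {
    "R": {"token": "[R]", "span_len": 3, "rate": 0.15},  # short spans, low rate
    "X": {"token": "[X]", "span_len": 8, "rate": 0.50},  # long spans, high rate
    "S": {"token": "[S]"},                               # prefix in, continuation out
}

def span_corrupt(tokens, span_len, rate, rng):
    """Mask fixed-length spans; return (inputs, targets) in T5 sentinel style."""
    inputs, targets = [], []
    i, sentinel_id = 0, 0
    while i < len(tokens):
        # Scale the start probability by span length. The resulting share of
        # masked tokens is only approximately `rate`.
        if rng.random() < rate / span_len:
            span = tokens[i:i + span_len]
            sentinel = f"<extra_id_{sentinel_id}>"
            inputs.append(sentinel)       # placeholder left in the input
            targets.append(sentinel)      # same placeholder introduces the span
            targets.extend(span)
            sentinel_id += 1
            i += len(span)
        else:
            inputs.append(tokens[i])
            i += 1
    return inputs, targets

def prefix_split(tokens, rng):
    """Split into a visible prefix and a continuation (needs >= 2 tokens)."""
    cut = rng.randint(1, len(tokens) - 1)
    return tokens[:cut], tokens[cut:]

def make_example(tokens, rng):
    """Draw one denoiser uniformly, corrupt the tokens, and prepend its mode token."""
    name = rng.choice(sorted(DENOISERS))
    cfg = DENOISERS[name]
    if name == "S":
        inputs, targets = prefix_split(tokens, rng)
    else:
        inputs, targets = span_corrupt(tokens, cfg["span_len"], cfg["rate"], rng)
    return [cfg["token"]] + inputs, targets

if __name__ == "__main__":
    rng = random.Random(0)
    words = "the quick brown fox jumps over the lazy dog again and again".split()
    for _ in range(3):
        print(make_example(words, rng))
```

Uniform sampling over the three denoisers is a simplification; a real mixture would weight them and vary span lengths within each denoiser.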
In practice
A researcher who wants one model for both completion and question answering can use a UL2 or Flan-UL2 checkpoint instead of choosing between T5-style and GPT-style pretraining. What is unified is the training objective, not the architecture: the released UL2 20B checkpoint is itself an encoder-decoder model. With the original UL2 checkpoint, a mode token (`[S2S]`, `[NLU]`, or `[NLG]`) must be prepended to the prompt to activate the matching mode. Omitting it is a common mistake that noticeably degrades output quality. Flan-UL2 was trained further so that it no longer requires mode tokens.
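Because the prefix is easy to forget, it can help to add it in one place. The following is a minimal sketch of such a helper; `MODE_TOKENS` and `with_mode_token` are hypothetical names, and the single space between the token and the prompt is an assumption. Which token suits which task should be checked against the model card of the checkpoint in use.

```python
# Mode tokens named in the text above; the dictionary keys are made up for this sketch.
MODE_TOKENS = {"s2s": "[S2S]", "nlu": "[NLU]", "nlg": "[NLG]"}

def with_mode_token(prompt: str, mode: str) -> str:
    """Prepend the mode token for `mode` unless the prompt already starts with one."""
    token = MODE_TOKENS[mode.lower()]  # raises KeyError for an unknown mode
    stripped = prompt.lstrip()
    if stripped.startswith(tuple(MODE_TOKENS.values())):
        return stripped  # already prefixed; leave the caller's choice alone
    return f"{token} {stripped}"

if __name__ == "__main__":
    print(with_mode_token("Translate to German: Good morning.", "s2s"))
```

The returned string would then be passed to the tokenizer as usual; the helper deliberately does not load or call any model.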