Training · Advanced · Also known as: Diffusion-based Imitation Learning

Diffusion Policy

An imitation learning method for robots in which the policy is a conditional denoising diffusion model: given an observation, it starts from a random action sequence and iteratively denoises it into the actions to execute. Unlike deterministic policies, diffusion policies can represent multi-modal action distributions, so they handle tasks with multiple valid solutions without averaging them into an invalid one. In the original Diffusion Policy paper (Chi et al., 2023), the method improved on prior state-of-the-art robot learning methods by an average of 46.9% across 12 tasks from 4 manipulation benchmarks.
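The denoising loop can be sketched with the standard DDPM reverse update. This is a minimal illustration, not the reference implementation: `toy_noise_predictor` is a hypothetical closed-form stand-in for the trained, observation-conditioned network, and the two demonstrated action sequences in `modes` (all names and schedule values here are assumptions) play the role of what the policy would have learned for one observation.

```python
import math
import random

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.2):
    """Linear beta schedule (assumed values); index 0 is the least noisy step."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return alphas, alpha_bars

def toy_noise_predictor(x, t, modes, alpha_bars):
    """Hypothetical stand-in for a trained eps_theta(x_t, t, observation).

    Returns the exact optimal noise estimate when the demonstrated actions
    are an equal mixture of the sequences in `modes`.
    """
    ab = alpha_bars[t]
    var = 1.0 - ab
    scale = math.sqrt(ab)
    # Posterior weight of each mode given the noisy sample x (stable softmax).
    logits = [-sum((xi - scale * mi) ** 2 for xi, mi in zip(x, m)) / (2.0 * var)
              for m in modes]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * (xi - scale * m[i]) for w, m in zip(weights, modes)) / math.sqrt(var)
            for i, xi in enumerate(x)]

def sample_action_sequence(modes, rng, num_steps=50):
    """Start from Gaussian noise and denoise it into an action sequence."""
    alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(len(modes[0]))]
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, t, modes, alpha_bars)
        coef = (1.0 - alphas[t]) / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(1.0 - alphas[t])  # sigma_t^2 = beta_t
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added on the final step
    return x

# Two equally valid demonstrated action sequences (hypothetical 4-step horizon).
left = [-1.0, -1.0, -1.0, -1.0]
right = [1.0, 1.0, 1.0, 1.0]

rng = random.Random(0)
samples = [sample_action_sequence([left, right], rng) for _ in range(200)]
num_right = sum(1 for s in samples if s[0] > 0)
print(f"right: {num_right}, left: {len(samples) - num_right}")
```

Each sampled sequence lands on one demonstrated mode in its entirety rather than on a blend of the two. In a real system the predictor would be a neural network trained on demonstrations, and typically only the first few actions of each sampled sequence are executed before re-planning.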


In practice

A robotics researcher collecting human demonstrations for an assembly task trains a Diffusion Policy on that data. The model learns that 'place the piece on the left' and 'place it on the right' are both valid solutions and commits to one of them on each rollout, instead of producing the (wrong) average movement that a deterministic behavioral cloning policy trained by regression tends to output. Libraries like Columbia's diffusion_policy or Hugging Face's LeRobot offer ready-to-use implementations.
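The averaging failure is easy to reproduce on toy data. The sketch below assumes a hypothetical one-dimensional action (the lateral placement offset, -1.0 for left and +1.0 for right) and fits a single deterministic prediction with mean-squared error; drawing from the demonstrations stands in for a generative policy and is not a diffusion model.

```python
import random

rng = random.Random(0)

# Hypothetical demonstrations: half the operators place left, half place right.
demos = [rng.choice([-1.0, 1.0]) for _ in range(1000)]

# Deterministic policy: one constant action fitted by gradient descent on MSE.
pred = 0.5
for _ in range(500):
    grad = sum(2.0 * (pred - a) for a in demos) / len(demos)
    pred -= 0.1 * grad

mean_action = sum(demos) / len(demos)

# Stand-in for a generative policy: draw whole actions from the demonstrated set.
sampled = [rng.choice(demos) for _ in range(10)]

print(f"MSE-fitted action: {pred:.3f} (mean of demos: {mean_action:.3f})")
print(f"sampled actions:   {sampled}")
```

The regression output converges to the mean of the demonstrations, which sits between the two placements and matches neither, while every sampled action is one of the demonstrated placements.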

