ALOHA Unleashed: folding clothes and loading the dishwasher with diffusion policies

In one sentence: DeepMind demonstrates zero-shot generalization of diffusion policies on deformable or variable-shape objects like clothes and dishes, tasks at which robots had consistently failed until now.



Folding a t-shirt or loading the dishwasher: trivial tasks for a human, but historically out of reach for robots. The problem is that clothes and dishes are deformable or variable-shape objects: the robot cannot predict where they will end up after it touches them, and every attempt can lead to a different configuration.

DeepMind's ALOHA Unleashed uses diffusion policies (the same type of model used to generate AI images) to predict not a single trajectory but a distribution of possible trajectories. From that distribution, the policy produces the trajectory best suited to the current configuration of the object.
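To make the idea of "a distribution of trajectories" concrete, here is a minimal sketch of diffusion-style sampling in Python with NumPy. It is not the ALOHA Unleashed implementation. Every name in it (`make_schedule`, `candidate_trajectories`, `toy_noise_predictor`, `sample_trajectory`) is hypothetical, and the noise predictor is an analytic stand-in for what would be a trained neural network in a real system. The toy assumes there are exactly two equally good trajectories for a given observation and shows that denoising from random noise lands on one of them.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.2):
    """Linear DDPM-style noise schedule (values chosen for this toy only)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def candidate_trajectories(observation, horizon=8):
    """Hypothetical stand-in for 'what the demonstrations look like':
    two valid 2D paths to the observed target, one arcing up, one down."""
    s = np.linspace(0.0, 1.0, horizon)[:, None]           # (horizon, 1)
    straight = s * observation[None, :]                   # (horizon, 2)
    bump = np.sin(np.pi * s) * np.array([[0.0, 1.0]])     # (horizon, 2)
    return np.stack([straight + bump, straight - bump])   # (2, horizon, 2)

def toy_noise_predictor(x_t, t, modes, alpha_bars):
    """Analytic noise estimate for a mixture of point-mass trajectories.
    A real diffusion policy would use a learned network here instead."""
    a = alpha_bars[t]
    diffs = x_t[None] - np.sqrt(a) * modes                # (K, horizon, 2)
    log_w = -np.sum(diffs ** 2, axis=(1, 2)) / (2.0 * (1.0 - a))
    w = np.exp(log_w - log_w.max())                       # stable softmax
    w /= w.sum()
    x0_hat = np.tensordot(w, modes, axes=1)               # expected clean path
    return (x_t - np.sqrt(a) * x0_hat) / np.sqrt(1.0 - a)

def sample_trajectory(observation, rng, num_steps=50):
    """Start from Gaussian noise and denoise step by step (DDPM update)."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    modes = candidate_trajectories(observation)
    x = rng.standard_normal(modes.shape[1:])
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, t, modes, alpha_bars)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = np.array([1.0, 0.0])  # assumed observation: a 2D target position
    for _ in range(4):
        traj = sample_trajectory(obs, rng)
        # The sign of the mid-path y value shows which arc was sampled.
        print("arc up" if traj[3, 1] > 0 else "arc down")
```

Repeated calls with the same observation can return different trajectories, which is the point: the model represents several valid ways to act rather than averaging them into one path that may not work at all.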

The most surprising result is zero-shot generalization: the robot handles clothing and dish configurations it never saw during training.