ALOHA Unleashed: folding clothes and loading the dishwasher with diffusion policies
In one sentence: DeepMind demonstrates zero-shot generalization of diffusion policies on deformable or variable-shape objects such as clothes and dishes, tasks at which robots had systematically failed until now.
Folding a t-shirt or loading the dishwasher: trivial tasks for a human, but historically out of reach for robots. The problem is that clothes and dishes are deformable or variable-shape objects: the robot cannot predict where they will be after it touches them, and every attempt can lead to a different configuration.
DeepMind's ALOHA Unleashed uses diffusion policies (the same family of models used to generate AI images) to predict not a single trajectory but a distribution over possible trajectories, from which it samples one conditioned on the current configuration of the object.
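To make the idea of "a distribution of trajectories" concrete, here is a minimal sketch of DDPM-style sampling for an action trajectory, written with NumPy. It is not DeepMind's implementation: the names `candidate_trajectories`, `toy_noise_predictor`, and `sample_trajectory`, the noise schedule, and the two hand-written "pass left / pass right" trajectories are all assumptions made for illustration. In a real diffusion policy the noise predictor is a neural network trained on demonstrations; here it is replaced by the exact formula for a toy distribution with two equally valid trajectories.

```python
import numpy as np

HORIZON, ACTION_DIM, NUM_STEPS = 8, 2, 50

# Linear noise schedule (an assumed, commonly used DDPM-style choice).
betas = np.linspace(1e-4, 0.2, NUM_STEPS)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)


def candidate_trajectories(obs):
    """Hypothetical stand-in for what a trained policy has learned:
    two equally valid trajectories (pass left or right of an object)."""
    steps = np.linspace(0.0, 1.0, HORIZON)
    forward = obs[0] + steps
    left = np.stack([forward, obs[1] + np.sin(np.pi * steps)], axis=1)
    right = np.stack([forward, obs[1] - np.sin(np.pi * steps)], axis=1)
    return np.stack([left, right])  # shape (2, HORIZON, ACTION_DIM)


def toy_noise_predictor(noisy, obs, t):
    """Exact noise estimate when the data distribution is the two modes above.
    A real policy would replace this function with a neural network."""
    modes = candidate_trajectories(obs)
    a_bar = alpha_bars[t]
    # Posterior weight of each mode given the noisy trajectory.
    sq_dist = ((noisy - np.sqrt(a_bar) * modes) ** 2).sum(axis=(1, 2))
    logits = -sq_dist / (2.0 * (1.0 - a_bar))
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    expected_clean = np.tensordot(weights, modes, axes=1)
    return (noisy - np.sqrt(a_bar) * expected_clean) / np.sqrt(1.0 - a_bar)


def sample_trajectory(obs, rng):
    """Start from Gaussian noise and denoise step by step, conditioned on obs."""
    actions = rng.standard_normal((HORIZON, ACTION_DIM))
    for t in reversed(range(NUM_STEPS)):
        eps = toy_noise_predictor(actions, obs, t)
        mean = (actions - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No fresh noise is added on the final step.
        noise = rng.standard_normal(actions.shape) if t > 0 else 0.0
        actions = mean + np.sqrt(betas[t]) * noise
    return actions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    trajectory = sample_trajectory(np.zeros(2), rng)
    print(trajectory.round(2))
```

Calling `sample_trajectory` several times with the same observation should return sometimes the left trajectory and sometimes the right one, rather than their average (which would go straight through the object). This is the property that makes diffusion policies attractive when several different motions are equally valid.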
The most surprising result is zero-shot generalization: the robot handles clothing and dish configurations it never saw during training.
Companies
Google DeepMind
Tools
ALOHA Unleashed, Diffusion Policy, ALOHA 2