AImpact
High Multimodal AI · 1 min read

Gato: DeepMind tries a single agent for 600+ tasks

In one sentence: DeepMind unveils Gato, a 1.2-billion-parameter Transformer that, with a single set of weights, plays Atari games, controls a robot arm, captions images, and chats.

Verified Official source

Most AI systems do only one thing: one translates, one plays chess, another recognizes images. DeepMind builds Gato to do the opposite: a single neural network that handles hundreds of different tasks.

Using the same parameters for every task, Gato plays old video games, captions photos, holds conversations, and moves a robot arm to stack blocks.

It does not match a specialist at any of these tasks, but the message is bold: "maybe we don't need a thousand models, we need a generalist". That idea returns in force in later years, when people start talking about "AI agents".

Companies

DeepMind

Tools

Gato

Tags

DeepMind, Gato, Generalist Agent, Multimodal, Transformer

Sources