Reading path

Robotics engineer in the Physical AI era

Foundation models for robots, VLAs, π0, Figure, Gemini Robotics: the milestones of embodied AI.

You are a robotics engineer or embodied AI researcher who wants to understand how foundation models are radically changing the design of robotic systems: from manual reward engineering to generalist policies trained on heterogeneous data. This path follows the releases that have shifted the boundary between simulation and real-world deployment.

01

Why it matters to you

MuJoCo becomes free: the reference physics simulator opens to the entire community, accelerating research on control policies and reinforcement learning for robots.

October 18, 2021 Medium Robotics

DeepMind acquires MuJoCo and makes it free

DeepMind announces it has acquired MuJoCo, a physics simulator widely used in RL and robotics research, and commits to making it free for everyone, a first step toward the full open-source release in 2022.

02

Why it matters to you

Codex shows that transformers pre-trained on natural language can be adapted to a structured output domain, code: an early indication that foundation models can learn behaviors from large, loosely curated datasets, which is the premise behind VLAs.

July 7, 2021 High AI Coding

Codex paper: OpenAI publishes HumanEval and the model behind Copilot

OpenAI releases the paper "Evaluating Large Language Models Trained on Code," which describes Codex (the model powering GitHub Copilot) and introduces HumanEval, which went on to become a standard benchmark for code generation.

03

Why it matters to you

The Figure 01 demo shows a humanoid robot that reasons and plans actions using an LLM in closed loop: an early, convincing deployment of language as a planning layer on real hardware.

March 13, 2024 High Robotics

Figure 01 + OpenAI: first end-to-end LLM-driven humanoid demo

Figure publishes a video of its Figure 01 humanoid conversing, recognizing objects, and manipulating them using OpenAI models for language and vision, in an end-to-end pipeline.

04

Why it matters to you

Physical Intelligence's π0 is presented as a foundation model for generalist robots: a pre-trained cross-embodiment policy that adapts to new tasks with relatively little fine-tuning.

October 31, 2024 High Robotics

Physical Intelligence's π0: the first cross-embodiment robotic foundation model

Startup Physical Intelligence (co-founders include Karol Hausman and Sergey Levine) releases π0, a 3B-parameter generalist robotic foundation model trained on 10k+ hours of cross-embodiment data, capable of skills like folding laundry and making coffee.

05

Why it matters to you

Figure's Helix introduces an end-to-end VLA (Vision-Language-Action) model on a humanoid: evidence that aligning vision, language, and action can scale to complex bodies in unstructured environments.

February 20, 2025 High Robotics

Figure Helix: first generalist VLA driving a full-body humanoid

Figure announces Helix, a proprietary Vision-Language-Action model that controls the Figure 02 humanoid at 200 Hz, down to the individual fingers, and lets two robots collaborate on a task. Demos: folding laundry and tidying a kitchen from language instructions alone.

06

Why it matters to you

π0.5 extends generalization to real domestic scenes across different morphologies: a signal that robotic foundation models are leaving the lab and moving toward in-the-wild deployment.

April 22, 2025 High Robotics

Physical Intelligence π0.5: first policy that generalizes to new homes

Physical Intelligence publishes π0.5, an evolution of the π0 VLA. New: zero-shot deployment in homes never seen during training (cleaning unknown kitchens, putting groceries away).

07

Why it matters to you

Gemini Robotics integrates Google's multimodal model directly into the control loop: an architecture that unifies visual perception, natural language, and motor action in a single model.

November 25, 2025 High Robotics

Gemini Robotics: DeepMind brings foundation models into the physical world

Google DeepMind updates Gemini Robotics and Gemini Robotics-ER: generalist models built on Gemini 2.0 that drive industrial arms and humanoids (Apptronik Apollo) zero-shot on previously unseen tasks.