H-Zero: Cross-Humanoid Locomotion Pretraining Enables Few-shot Novel Embodiment Transfer
Humanoid robots have long symbolized the promise of versatile, general‑purpose machines — capable of walking, balancing, and interacting in human environments. But as any robotics researcher will tell you, getting a humanoid to walk well is hard: even state‑of‑the‑art controllers tend to be painstakingly tuned for each specific robot design, with custom rewards, dynamics parameters, and training regimes. Changing the embodiment — a longer leg, a different joint layout or mass distribution — can mean starting from scratch.
What if we could capture the essence of locomotion across many robots, and reuse that knowledge to quickly adapt to new ones? That’s the promise of H‑Zero, a cross‑humanoid pretraining framework that learns a generalized base walking policy and enables zero‑shot and few‑shot transfer to novel robots with minimal fine‑tuning.
The Core Challenge: From Specific Controllers to Shared Skills
Modern deep reinforcement learning has delivered impressive locomotion behaviors in simulation and in the real world, but most methods treat each robot as its own task. This creates two bottlenecks:
- Policies are morphology‑specific, trained from scratch for each robot.
- Small changes in shape or dynamics degrade performance sharply.
With the rapid rise of new humanoid designs and customized platforms, this brittleness limits scalability. H‑Zero asks: can we learn a policy that understands “locomotion” at an abstract level?
H‑Zero at a Glance: Unified Pretraining for Cross‑Embodiment Control

H‑Zero reframes humanoid control as a cross‑embodiment learning problem:
Unified Control Semantics: Different robots speak different “control languages”: they differ in joint definitions, action dimensions, and observation spaces. H‑Zero introduces transformation layers that map each robot's inputs and outputs into a shared observation/action space, enabling a single policy to operate across many designs.
Diverse Embodiment Pretraining: The framework pretrains the policy on a curriculum of robots that vary in both morphology and dynamics, using domain randomization, varied physics parameters, and mixed real‑world and simulated conditions. The result is a base policy that captures shared locomotion strategies rather than overfitting to any one robot.
Zero‑Shot & Few‑Shot Transfer: When presented with a new robot, H‑Zero can often produce a stable walking gait without any additional training (zero‑shot). With a short fine‑tuning phase (30 minutes in our experiments), it reaches performance close to that of scratch‑trained controllers while using far fewer samples.
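To make the first two ideas concrete, here is a minimal Python sketch of how a per-robot adapter and per-episode dynamics randomization might look. It is an illustration under stated assumptions, not the H‑Zero implementation: the canonical joint list, the `Embodiment` class, the `randomize_dynamics` helper, and all parameter ranges are hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass

# Hypothetical canonical joint ordering shared by all robots.
# This layout is an assumption for illustration only.
CANONICAL_JOINTS = [
    "l_hip_pitch", "l_knee", "l_ankle_pitch",
    "r_hip_pitch", "r_knee", "r_ankle_pitch",
    "waist_yaw",
]


@dataclass
class Embodiment:
    name: str
    joints: list       # this robot's own joint order (canonical names)
    base_mass: float   # nominal total mass in kg

    def to_shared(self, joint_values):
        """Scatter robot-ordered values into canonical slots, plus a mask."""
        index = {j: i for i, j in enumerate(self.joints)}
        shared = [joint_values[index[j]] if j in index else 0.0
                  for j in CANONICAL_JOINTS]
        mask = [1.0 if j in index else 0.0 for j in CANONICAL_JOINTS]
        return shared, mask

    def from_shared(self, shared_action):
        """Gather a canonical action back into this robot's joint order."""
        slot = {j: i for i, j in enumerate(CANONICAL_JOINTS)}
        return [shared_action[slot[j]] for j in self.joints]


def randomize_dynamics(robot, rng):
    """Sample per-episode physics parameters (ranges are made up)."""
    return {
        "mass": robot.base_mass * rng.uniform(0.8, 1.2),
        "friction": rng.uniform(0.4, 1.2),
        "motor_strength": rng.uniform(0.9, 1.1),
    }


robots = [
    Embodiment("robot_a", list(CANONICAL_JOINTS), 35.0),
    Embodiment("robot_b", ["r_knee", "r_hip_pitch", "l_knee", "l_hip_pitch"], 20.0),
]

rng = random.Random(0)
robot = rng.choice(robots)                # pick an embodiment for this episode
physics = randomize_dynamics(robot, rng)  # then perturb its dynamics

# A robot with fewer joints fills only some canonical slots; the mask
# tells the policy which slots are real.
shared_obs, mask = robots[1].to_shared([0.1, 0.2, 0.3, 0.4])
robot_action = robots[1].from_shared(shared_obs)
```

The mask is one simple way to let a fixed-size policy input cope with robots that lack some joints. A learned transformation layer could replace the fixed scatter/gather shown here.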
What We Found: Transfer That Works
In experiments across many humanoid models with diverse kinematics and dynamics:
- The pretrained policy maintained up to 81% of the full episode duration on unseen robots without retraining.
- With just 30 minutes of fine‑tuning, the policy adapted to new humanoids and even upright quadrupeds, reaching stability and control quality comparable to bespoke controllers.
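One way to read the zero-shot number is as the fraction of an evaluation episode the robot completes before falling. The Python sketch below computes that ratio under this assumption. The function names and the step counts are hypothetical and are not taken from the paper's evaluation code or results.

```python
def completion_ratio(steps_survived, max_steps):
    """Fraction of the episode completed before a fall or timeout."""
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    return min(steps_survived, max_steps) / max_steps


def mean_completion_ratio(episodes, max_steps):
    """Average completion ratio over a batch of evaluation episodes."""
    if not episodes:
        raise ValueError("need at least one episode")
    return sum(completion_ratio(s, max_steps) for s in episodes) / len(episodes)


# Made-up step counts for three rollouts, not results from the paper.
example = mean_completion_ratio([1000, 500, 750], max_steps=1000)
```

Averaging a ratio capped at 1.0 keeps the metric comparable across robots even if their episodes use different step budgets.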
These results suggest something powerful: rather than viewing each robot as an isolated control problem, we can leverage shared structure in locomotion to build transferable skills — bridging robots with very different physical attributes.
Why This Matters
Humanoid locomotion sits at the intersection of reinforcement learning, control theory, and real‑world robotics. H‑Zero contributes to each of these areas:
- Reinforcement learning benefits from a move toward foundation policies for embodied agents, akin to pretrained models in vision and language.
- Robotics engineering gains a tool to reduce the tuning burden for new hardware designs.
- Sim‑to‑real transfer becomes more reliable when the policy already understands a broad range of dynamics.
In essence, H‑Zero turns locomotion into a shared skill rather than a bespoke product of tuning — a shift that we believe will accelerate progress in general‑purpose robotic control.
Looking Ahead
The H‑Zero paradigm opens several exciting directions:
- Can these pretrained policies be combined with perception for end‑to‑end embodied navigation?
- What happens when we include manipulation tasks along with locomotion?
- How far can the transfer go — to radically different bodies or real‑world terrains?
By focusing on generalization and transfer, H‑Zero points toward a future where robots can learn once and adapt anywhere.
Cite
@article{lin2025h,
title={H-Zero: Cross-Humanoid Locomotion Pretraining Enables Few-shot Novel Embodiment Transfer},
author={Lin, Yunfeng and Liu, Minghuan and Xue, Yufei and Zhou, Ming and Yu, Yong and Pang, Jiangmiao and Zhang, Weinan},
journal={arXiv preprint arXiv:2512.00971},
year={2025}
}