Project Documentation

This is the technical documentation for a research workspace that reproduces, analyzes, and extends the ETH Zurich Robotic World Model pipeline for quadruped locomotion, with the eventual target of Unitree Go2 integration.

The project builds on two papers by Li, Krause, and Hutter (ETH Zurich):

  1. Robotic World Model (RWM) with the MBPO-PPO policy optimizer, trained online with environment interaction.
  2. Uncertainty-Aware Robotic World Model (RWM-U) with the MOPO-PPO policy optimizer, trained fully offline with ensemble-based uncertainty penalization.
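To make "ensemble-based uncertainty penalization" concrete, the following is a minimal sketch of a generic MOPO-style penalty, in which the reward is reduced in proportion to how much the ensemble members disagree. It is not taken from the upstream codebase: the function name `penalized_reward`, the parameter `penalty_coef`, and the choice of mean per-dimension standard deviation as the uncertainty estimate are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def penalized_reward(reward, ensemble_predictions, penalty_coef=1.0):
    """Return the reward minus a disagreement penalty (illustrative sketch).

    ensemble_predictions holds one predicted next state (a list of floats)
    per ensemble member. The uncertainty estimate used here, the mean
    per-dimension standard deviation across members, is an assumption and
    may differ from what the paper or the upstream code uses.
    """
    if len(ensemble_predictions) < 2:
        raise ValueError("need at least two ensemble members")
    # Transpose so each tuple holds one state dimension across all members.
    per_dimension = zip(*ensemble_predictions)
    uncertainty = mean(pstdev(values) for values in per_dimension)
    return reward - penalty_coef * uncertainty


# Hypothetical predictions from a three-member ensemble.
agree = [[0.10, 0.20], [0.10, 0.20], [0.10, 0.20]]
disagree = [[0.10, 0.20], [0.40, -0.10], [-0.20, 0.50]]

r_agree = penalized_reward(1.0, agree)        # no disagreement, no penalty
r_disagree = penalized_reward(1.0, disagree)  # disagreement lowers the reward
```

The intent of such a penalty is to discourage the policy from exploiting regions where the learned model is unreliable, which matters most when no further environment interaction is available to correct the model.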

The upstream codebase contains two model-based training pipelines that share backbone components but have separate entry scripts, configs, environments, and runners. The manager-based pipeline implements online RWM with MBPO-PPO; the standalone model_based/ pipeline implements offline RWM-U with MOPO-PPO. The project exercises both. The two-pipeline structure is documented in detail in Uncertainty-Aware Implementation Analysis §1.

For high-level project status and the milestone checklist, see the repository README. This documentation site is the technical companion to that README.

Reading paths

Different readers will want different starting points.

Documentation layout

  • Project: goal, repository layout, status, roadmap.
  • Setup: local environment, hardware specs, lab workstation migration.
  • Validation: what has been executed locally and what has been verified.
  • Robotic World Model: paper, code, synthesis, task structure, runtime flow, and the cross-paper bridge to RWM-U.
  • Uncertainty-Aware Robotic World Model: paper, code, and synthesis for the offline RWM-U + MOPO-PPO pipeline.
  • Development: submodule and fork strategy.

Conventions

The documentation makes two kinds of claims, each with its own vocabulary.

Execution claims describe whether code runs as expected:

  • Validated: a concrete command was executed, the expected outcome was observed, and the run is cited.
  • Validated at reduced scale: a concrete run completed end-to-end with reduced hyperparameters or smaller-than-paper-scale assets, sufficient to confirm the code path but not the paper's quantitative results.
  • Qualitatively validated at reduced scale: a paper claim has been supported in trend or sign at reduced scale, without reproducing the paper's quantitative numbers.
  • Structurally understood: the code path has been traced and is consistent with its expected role, but it has not been executed end-to-end.
  • Not verified: the claim has neither been executed nor traced sufficiently to make a judgment.

Mapping claims describe whether code corresponds to what the paper states:

  • Mapped: the paper concept has been identified in the code, with a file path or symbol reference.
  • Partially mapped: the concept is identified but the code differs from the paper in scope or simplification, and the difference is documented.
  • Discrepancy noted: the code diverges from the paper in a meaningful way (for example, a different loss formulation or an inactive component) and the divergence is documented.
  • Not mapped: no code location has been identified yet for the paper concept.

The Reproduction Status page tracks execution claims. The Paper-to-Code Synthesis page (RWM) and the Paper-to-Code Synthesis page (RWM-U) track mapping claims and discrepancies for their respective papers.
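The two vocabularies above can also be treated as closed sets of labels. The sketch below shows one possible way to encode them so that status tables could be checked mechanically. It is not part of the project's tooling: the class names, the `run_citation` field, and the rule enforced in `__post_init__` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionStatus(Enum):
    VALIDATED = "Validated"
    VALIDATED_REDUCED = "Validated at reduced scale"
    QUALITATIVE_REDUCED = "Qualitatively validated at reduced scale"
    STRUCTURAL = "Structurally understood"
    NOT_VERIFIED = "Not verified"


class MappingStatus(Enum):
    MAPPED = "Mapped"
    PARTIALLY_MAPPED = "Partially mapped"
    DISCREPANCY = "Discrepancy noted"
    NOT_MAPPED = "Not mapped"


@dataclass(frozen=True)
class ExecutionClaim:
    """One row of a hypothetical status table."""

    subject: str
    status: ExecutionStatus
    run_citation: str = ""

    def __post_init__(self):
        # "Validated" is defined above as requiring a cited run, so reject
        # a Validated claim that carries no citation.
        if self.status is ExecutionStatus.VALIDATED and not self.run_citation:
            raise ValueError("a Validated claim must cite the run")


# Labels round-trip from the exact wording used in the documentation.
status = ExecutionStatus("Structurally understood")
claim = ExecutionClaim(subject="example component", status=status)
```

Keeping execution and mapping labels in separate types mirrors the convention that the two kinds of claims are tracked on separate pages and should not be mixed.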

Naming convention used throughout: RWM and MBPO-PPO refer to the first paper's method and policy optimizer; RWM-U and MOPO-PPO refer to the uncertainty-aware extension and its policy optimizer.