A 2D navigation task through doorways

Description

A 2D navigation task where a circular agent must reach a target position in another room by navigating through doorways. Rendering and collision detection are implemented in PyTorch. A wall divides the space into two rooms connected by one to three doors.

The agent starts in one room and must navigate to the target in the opposite room. The task requires planning a path through the door openings rather than simple point-to-point navigation.
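To illustrate why a door waypoint is needed, here is a minimal sketch of a two-stage greedy controller. It is not the library's built-in policy. The function names are hypothetical, and it assumes that a vertical wall (`wall_axis=1`) sits at x = 112, a horizontal wall at y = 112, and that door centers are given as (x, y) pairs. Wall thickness and agent radius are ignored.

```python
import math

def pick_waypoint(agent, target, doors, wall_axis=1, wall_center=112.0):
    """Return the next point to head for: a door center while the wall
    separates agent and target, otherwise the target itself."""
    # Coordinate that decides the room (assumption): x for a vertical wall,
    # y for a horizontal wall.
    k = 0 if wall_axis == 1 else 1
    # An agent exactly on the wall line counts as already through the door.
    same_room = (agent[k] - wall_center) * (target[k] - wall_center) >= 0
    if same_room:
        return target
    # Choose the door that minimises the detour agent -> door -> target.
    return min(doors, key=lambda d: math.dist(agent, d) + math.dist(d, target))

def direction_action(agent, waypoint):
    """Unit vector toward the waypoint; both components lie in [-1, 1]."""
    dx, dy = waypoint[0] - agent[0], waypoint[1] - agent[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return (0.0, 0.0)
    return (dx / norm, dy / norm)
```

Heading straight for the target from the other room would run into the wall, so the sketch first steers to the door with the shortest total detour and switches to the target once the agent has crossed the wall line.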

Success criteria: The episode terminates when the agent is within 16 pixels of the target.
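The success check can be reproduced outside the environment as a plain distance test. This is a sketch: the helper name is hypothetical, and whether the boundary case (exactly 16 pixels) counts as success is an assumption.

```python
import math

SUCCESS_RADIUS = 16.0  # pixels, taken from the success criterion above

def reached_target(pos_agent, pos_target, radius=SUCCESS_RADIUS):
    """True when the agent center is within `radius` pixels of the target.

    Using <= for the boundary is an assumption; the text only says "within".
    """
    return math.dist(pos_agent, pos_target) <= radius
```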

import stable_worldmodel as swm

# Create a world with 4 parallel TwoRoom environments
world = swm.World('swm/TwoRoom-v1', num_envs=4, image_shape=(128, 128))

Environment Specs

| Property | Value |
| --- | --- |
| Action Space | Box(-1, 1, shape=(2,)): 2D velocity direction |
| Observation Space | Box(0, 224, shape=(10,)): state vector |
| Reward | 0 (sparse) |
| Episode Length | Until target reached or timeout |
| Render Size | 224×224 (fixed) |
| Physics | Torch-based, 10 Hz control |

Fixed Geometry Constants

| Constant | Value | Description |
| --- | --- | --- |
| IMG_SIZE | 224 | Image dimensions |
| BORDER_SIZE | 14 | Border thickness |
| WALL_CENTER | 112 | Wall position (center) |
| MAX_DOOR | 3 | Maximum number of doors |
| MAX_SPEED | 10.5 | Maximum agent speed |
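A short worked calculation shows what these constants imply for episode length. It assumes the border occupies BORDER_SIZE pixels on each side, that MAX_SPEED is measured in pixels per step (it matches the upper bound of agent.speed), and that the control rate is 10 Hz. The agent radius is ignored.

```python
import math

IMG_SIZE = 224
BORDER_SIZE = 14
MAX_SPEED = 10.5   # assumed to be pixels per step
CONTROL_HZ = 10    # control rate from the environment specs

# Width of the area inside the borders (border assumed on both sides).
playable = IMG_SIZE - 2 * BORDER_SIZE                  # 196

# Top speed expressed per second instead of per step.
max_speed_per_second = MAX_SPEED * CONTROL_HZ          # 105.0

# Fewest steps needed to cross the playable area in a straight line.
min_steps_to_cross = math.ceil(playable / MAX_SPEED)   # 19
```

Under these assumptions, even the fastest agent needs about 19 steps to cross the arena in a straight line, so values of task.min_steps near the top of its range can only be met by slower agents or longer paths.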

Observation Details

The observation is a flat state vector of shape (10,):

| Index | Description |
| --- | --- |
| 0-1 | Agent position (x, y) |
| 2-3 | Target position (x, y) |
| 4-9 | Door center positions (up to 3 doors × 2 coords) |
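The index layout above can be unpacked into named fields as follows. This is a sketch with hypothetical names (`TwoRoomState`, `unpack_observation`). How unused door slots are encoded is not stated here, so all three slots are returned unfiltered.

```python
from typing import List, NamedTuple, Sequence, Tuple

Point = Tuple[float, float]

class TwoRoomState(NamedTuple):
    agent: Point
    target: Point
    doors: List[Point]  # always 3 slots; unused slots are not filtered out

def unpack_observation(obs: Sequence[float]) -> TwoRoomState:
    """Split a flat (10,) state vector into agent, target and door centers."""
    if len(obs) != 10:
        raise ValueError(f"expected 10 values, got {len(obs)}")
    values = [float(v) for v in obs]
    agent = (values[0], values[1])    # indices 0-1
    target = (values[2], values[3])   # indices 2-3
    # Indices 4-9: three (x, y) pairs, one per door slot.
    doors = [(values[i], values[i + 1]) for i in range(4, 10, 2)]
    return TwoRoomState(agent, target, doors)
```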

Info Dictionary

The info dict returned by step() and reset() contains:

| Key | Description |
| --- | --- |
| target | Target image (3, H, W): agent rendered at target position |
| pos_agent | Current agent position as numpy array |
| pos_target | Target position as numpy array |
| target_pos | Target position from variation space |
| state | Agent position tensor |
| distance_to_target | Euclidean distance to target |

Variation Space

The environment supports extensive customization through the variation space:

| Factor | Type | Description |
| --- | --- | --- |
| agent.color | RGBBox | Agent color (default: red) |
| agent.radius | Box(7, 14) | Agent radius in pixels |
| agent.position | Box | Starting position (can be either room) |
| agent.speed | Box(1.75, 10.5) | Movement speed in pixels/step |
| target.color | RGBBox | Target color (default: green) |
| target.radius | Box(7, 14) | Target radius in pixels |
| target.position | Box | Target position (opposite room from agent) |
| wall.color | RGBBox | Wall color (default: black) |
| wall.thickness | Discrete(7, 35) | Wall thickness in pixels |
| wall.axis | Discrete(2) | 0: horizontal, 1: vertical |
| wall.border_color | RGBBox | Border color (default: black) |
| door.color | RGBBox | Door color (default: white) |
| door.number | Discrete(1, 3) | Number of doors |
| door.size | MultiDiscrete(1, 21) | Half-extent size of each door |
| door.position | MultiDiscrete(0, 224) | Center position of each door along wall |
| background.color | RGBBox | Background color (default: white) |
| rendering.render_target | Discrete(2) | Whether to render target (0: no, 1: yes) |
| task.min_steps | Discrete(15, 100) | Minimum steps required to reach target |

Constraints

  • Agent position: Must not overlap with wall (collision-constrained)
  • Target position: Must be in opposite room from agent
  • Door size: At least one door must fit the agent (door_size ≥ 1.1 × agent_radius)
  • Min steps: Target is sampled such that path length / speed ≥ min_steps
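The two numeric constraints above can be written as simple predicates. This is a sketch with hypothetical helper names, not the environment's own sampling code. It assumes door sizes are half-extents in pixels and that path length and speed use the same pixel units.

```python
def door_fits_agent(door_sizes, agent_radius, margin=1.1):
    """True if at least one door half-extent is >= margin * agent_radius."""
    return any(size >= margin * agent_radius for size in door_sizes)

def satisfies_min_steps(path_length, speed, min_steps):
    """True if covering path_length at `speed` pixels/step takes at least
    min_steps steps."""
    return path_length / speed >= min_steps
```

For example, an agent of radius 10 needs a door with a half-extent of at least 11 pixels, and a 200-pixel path at 10.5 pixels/step takes about 19 steps, which satisfies a min_steps of 15.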

Default Variations

By default, these factors are randomized at each reset:

  • agent.position
  • target.position

To randomize additional factors:

# Randomize colors for domain randomization
world.reset(options={'variation': ['agent.color', 'target.color', 'background.color']})

# Randomize everything
world.reset(options={'variation': ['all']})

Datasets

| Name | Episodes | Policy | Download |
| --- | --- | --- | --- |
| tworoom_expert | 1000 | Weak Expert | |

Expert Policy

This environment includes a built-in weak expert policy for data collection:

from stable_worldmodel.envs.two_room import ExpertPolicy

policy = ExpertPolicy()
world.set_policy(policy)