Model-based planning solvers for action optimization

[ Base Class ]

Solver

Bases: Protocol

Protocol for model-based planning solvers.

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment and planning specifications.

Parameters:

  • action_space (Space) –

    The action space of the environment.

  • n_envs (int) –

    Number of parallel environments.

  • config (Any) –

    Planning configuration object.

solve

solve(
    info_dict: dict, init_action: Tensor | None = None
) -> dict

Solve the planning optimization problem to find optimal actions.

Parameters:

  • info_dict (dict) –

    Dictionary containing environment state information.

  • init_action (Tensor | None, default: None ) –

    Optional initial action sequence to warm-start the solver.

Returns:

  • dict

    Dictionary containing optimized actions and other solver-specific info.

action_dim property

action_dim: int

Flattened action dimension including action_block grouping.

n_envs property

n_envs: int

Number of parallel environments being planned for.

horizon property

horizon: int

Planning horizon length in timesteps.

[ Implementations ]

CEMSolver

CEMSolver(
    model: Costable,
    batch_size: int = 1,
    num_samples: int = 300,
    var_scale: float = 1,
    n_steps: int = 30,
    topk: int = 30,
    device: str | device = 'cpu',
    seed: int = 1234,
)

Cross Entropy Method solver for action optimization.

Parameters:

  • model (Costable) –

    World model implementing the Costable protocol.

  • batch_size (int, default: 1 ) –

    Number of environments to process in parallel.

  • num_samples (int, default: 300 ) –

    Number of action candidates to sample per iteration.

  • var_scale (float, default: 1 ) –

    Initial variance scale for the action distribution.

  • n_steps (int, default: 30 ) –

    Number of CEM iterations.

  • topk (int, default: 30 ) –

    Number of elite samples to keep for distribution update.

  • device (str | device, default: 'cpu' ) –

    Device for tensor computations.

  • seed (int, default: 1234 ) –

    Random seed for reproducibility.

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment specifications.

solve

solve(
    info_dict: dict, init_action: Tensor | None = None
) -> dict

Solve the planning problem using Cross Entropy Method.
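
For intuition, the core CEM iteration (sample from a Gaussian, keep the topk lowest-cost elites, refit the Gaussian to them) can be sketched in a few lines. This is an illustrative standalone loop, not the solver's actual implementation:

```python
import torch


def cem_step(mean, std, cost_fn, num_samples=300, topk=30):
    """One CEM iteration: sample, score, refit the Gaussian to the elites.

    mean, std: (H, D) parameters of the sampling distribution.
    cost_fn:   maps samples of shape (S, H, D) to costs of shape (S,).
    """
    samples = mean + std * torch.randn(num_samples, *mean.shape)
    costs = cost_fn(samples)                              # (S,)
    elite_idx = costs.topk(topk, largest=False).indices   # lowest-cost samples
    elites = samples[elite_idx]                           # (topk, H, D)
    return elites.mean(dim=0), elites.std(dim=0)


# Iterating cem_step for n_steps concentrates the distribution around
# low-cost action sequences; the final mean is returned as the plan.
```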

ICEMSolver

ICEMSolver(
    model: Costable,
    batch_size: int = 1,
    num_samples: int = 300,
    var_scale: float = 1,
    n_steps: int = 30,
    topk: int = 30,
    noise_beta: float = 2.0,
    alpha: float = 0.1,
    n_elite_keep: int = 5,
    return_mean: bool = True,
    device: str | device = 'cpu',
    seed: int = 1234,
)

Improved Cross Entropy Method (iCEM) solver with colored noise and elite retention. iCEM improves sample efficiency over standard CEM and was introduced in [1] for real-time planning.

Parameters:

  • model (Costable) –

    World model implementing the Costable protocol.

  • batch_size (int, default: 1 ) –

    Number of environments to process in parallel.

  • num_samples (int, default: 300 ) –

    Number of action candidates to sample per iteration.

  • var_scale (float, default: 1 ) –

    Initial variance scale for the action distribution.

  • n_steps (int, default: 30 ) –

    Number of CEM iterations.

  • topk (int, default: 30 ) –

    Number of elite samples to keep for distribution update.

  • noise_beta (float, default: 2.0 ) –

    Colored noise exponent. 0 = white (standard CEM), >0 = more low-frequency noise.

  • alpha (float, default: 0.1 ) –

    Momentum for mean/std EMA update.

  • n_elite_keep (int, default: 5 ) –

    Number of elites carried from previous iteration.

  • return_mean (bool, default: True ) –

    If False, return best single trajectory instead of mean.

  • device (str | device, default: 'cpu' ) –

    Device for tensor computations.

  • seed (int, default: 1234 ) –

    Random seed for reproducibility.

[1] C. Pinneri, S. Sawant, S. Blaes, J. Achterhold, J. Stueckler, M. Rolinek and G. Martius. "Sample-efficient Cross-Entropy Method for Real-time Planning". Conference on Robot Learning, 2020.
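
The effect of noise_beta can be illustrated with a standalone power-law noise sampler, the same idea iCEM uses for its colored-noise proposals. This is a sketch, not the library's code:

```python
import torch


def colored_noise(beta: float, size: tuple, horizon: int) -> torch.Tensor:
    """Gaussian noise with power spectral density ~ 1/f**beta over the horizon.

    Returns a tensor of shape (*size, horizon). beta=0 reproduces white noise
    (standard CEM); larger beta concentrates power at low frequencies, yielding
    smoother action sequences. Illustrative sketch of the iCEM sampling idea.
    """
    freqs = torch.fft.rfftfreq(horizon)
    freqs[0] = freqs[1]  # avoid the zero frequency blowing up f**(-beta/2)
    scale = freqs ** (-beta / 2)
    # Random complex spectrum with power-law amplitudes, then back to time domain.
    spectrum = scale * (
        torch.randn(*size, len(freqs)) + 1j * torch.randn(*size, len(freqs))
    )
    noise = torch.fft.irfft(spectrum, n=horizon, dim=-1)
    return noise / noise.std(dim=-1, keepdim=True)  # unit variance per sequence
```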

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment specifications.

solve

solve(
    info_dict: dict, init_action: Tensor | None = None
) -> dict

Solve the planning problem using improved Cross Entropy Method.

MPPISolver

MPPISolver(
    model: Costable,
    batch_size: int = 1,
    num_samples: int = 300,
    var_scale: float = 1.0,
    n_steps: int = 30,
    topk: int = 30,
    temperature: float = 0.5,
    device: str | device = 'cpu',
    seed: int = 1234,
)

Model Predictive Path Integral solver for action optimization.

Parameters:

  • model (Costable) –

    World model implementing the Costable protocol.

  • batch_size (int, default: 1 ) –

    Number of environments to process in parallel.

  • num_samples (int, default: 300 ) –

    Number of action candidates to sample per iteration.

  • var_scale (float, default: 1.0 ) –

    Initial variance scale for action noise.

  • n_steps (int, default: 30 ) –

    Number of MPPI iterations.

  • topk (int, default: 30 ) –

    Number of elite samples for weighted averaging.

  • temperature (float, default: 0.5 ) –

    Temperature parameter for softmax weighting.

  • device (str | device, default: 'cpu' ) –

    Device for tensor computations.

  • seed (int, default: 1234 ) –

    Random seed for reproducibility.

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment specifications.

solve

solve(
    info_dict: dict, init_action: Tensor | None = None
) -> dict

Solve the planning problem using MPPI.
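
The distinguishing step of MPPI is the exponentiated-cost weighting: instead of averaging elites uniformly, candidates are combined with softmax weights controlled by temperature. A standalone sketch of that update (not the solver's actual code):

```python
import torch


def mppi_update(samples: torch.Tensor, costs: torch.Tensor,
                temperature: float = 0.5, topk: int = 30) -> torch.Tensor:
    """Combine action candidates by exponentiated-cost weighting.

    samples: (S, H, D) candidate action sequences, costs: (S,).
    temperature -> 0 approaches picking the single best sample; a large
    temperature approaches a plain average over the topk elites.
    """
    elite_costs, elite_idx = costs.topk(topk, largest=False)
    # Subtract the minimum cost for numerical stability before the softmax.
    weights = torch.softmax(-(elite_costs - elite_costs.min()) / temperature, dim=0)
    return (weights[:, None, None] * samples[elite_idx]).sum(dim=0)
```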

GradientSolver

GradientSolver(
    model: Costable,
    n_steps: int,
    batch_size: int | None = None,
    var_scale: float = 1,
    num_samples: int = 1,
    action_noise: float = 0.0,
    device: str | device = 'cpu',
    seed: int = 1234,
    optimizer_cls: type[Optimizer] = SGD,
    optimizer_kwargs: dict | None = None,
)

Bases: Module

Gradient-based solver using backpropagation through the world model.

Parameters:

  • model (Costable) –

    World model implementing the Costable protocol.

  • n_steps (int) –

    Number of gradient descent iterations.

  • batch_size (int | None, default: None ) –

    Number of environments to process in parallel.

  • var_scale (float, default: 1 ) –

    Initial variance scale for action perturbations.

  • num_samples (int, default: 1 ) –

    Number of action samples to optimize in parallel.

  • action_noise (float, default: 0.0 ) –

    Noise added to actions during optimization.

  • device (str | device, default: 'cpu' ) –

    Device for tensor computations.

  • seed (int, default: 1234 ) –

    Random seed for reproducibility.

  • optimizer_cls (type[Optimizer], default: SGD ) –

    PyTorch optimizer class to use.

  • optimizer_kwargs (dict | None, default: None ) –

    Keyword arguments for the optimizer.

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment specifications.

solve

solve(
    info_dict: dict, init_action: Tensor | None = None
) -> dict

Solve the planning problem using gradient descent.
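
The GradientSolver's inner loop is ordinary gradient descent on the action sequence, backpropagating through a differentiable cost. The standalone sketch below mirrors that pattern with a plain cost function standing in for the world model's get_cost(); it is illustrative, not the solver's implementation:

```python
import torch


def gradient_plan(cost_fn, init_action, n_steps=30, lr=0.05, action_noise=0.0):
    """Optimize an action sequence by descending a differentiable cost.

    init_action: (B, H, D) starting plan; cost_fn maps (B, H, D) -> (B,).
    """
    actions = init_action.clone().requires_grad_(True)
    opt = torch.optim.SGD([actions], lr=lr)  # any torch optimizer works here
    for _ in range(n_steps):
        opt.zero_grad()
        evaluated = actions
        if action_noise > 0:
            # Exploration noise on the evaluated (not the stored) actions.
            evaluated = actions + action_noise * torch.randn_like(actions)
        cost_fn(evaluated).sum().backward()
        opt.step()
    return actions.detach()
```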

PGDSolver

PGDSolver(
    model: Costable,
    n_steps: int,
    batch_size: int | None = None,
    var_scale: float = 1,
    num_samples: int = 1,
    action_noise: float = 0.0,
    device: str | device = 'cpu',
    seed: int = 1234,
)

Bases: Module

Projected Gradient Descent solver for discrete action optimization.

Parameters:

  • model (Costable) –

    World model implementing the Costable protocol.

  • n_steps (int) –

    Number of gradient descent iterations.

  • batch_size (int | None, default: None ) –

    Number of environments to process in parallel.

  • var_scale (float, default: 1 ) –

    Initial variance scale for action perturbations.

  • num_samples (int, default: 1 ) –

    Number of action samples to optimize in parallel.

  • action_noise (float, default: 0.0 ) –

    Noise added to actions during optimization.

  • device (str | device, default: 'cpu' ) –

    Device for tensor computations.

  • seed (int, default: 1234 ) –

    Random seed for reproducibility.

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment specifications.

solve

solve(
    info_dict: dict,
    init_action: Tensor | None = None,
    from_scalar: bool = False,
) -> dict

Solve the planning problem using projected gradient descent.
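
Projected gradient descent alternates a gradient step with a projection back onto the feasible set. The sketch below uses nearest-value snapping onto a finite set as a stand-in projection for a discrete action space; the library's actual projection may differ:

```python
import torch


def pgd_step(actions, cost_fn, lr=0.1, valid_values=None):
    """One projected-gradient step: descend the cost, then project.

    valid_values: 1-D tensor of allowed action values (a stand-in for a
    discrete action set). Illustrative sketch, not the library's code.
    """
    actions = actions.detach().requires_grad_(True)
    cost_fn(actions).sum().backward()
    with torch.no_grad():
        stepped = actions - lr * actions.grad
        if valid_values is not None:
            # Project each entry onto its nearest allowed discrete value.
            dist = (stepped.unsqueeze(-1) - valid_values).abs()
            stepped = valid_values[dist.argmin(dim=-1)]
    return stepped
```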

LagrangianSolver

LagrangianSolver(
    model: Costable,
    n_steps: int,
    n_outer_steps: int = 5,
    batch_size: int | None = None,
    num_samples: int = 1,
    var_scale: float = 1.0,
    action_noise: float = 0.0,
    rho_init: float = 1.0,
    rho_max: float = 10000.0,
    rho_scale: float = 2.0,
    persist_multipliers: bool = True,
    device: str | device = 'cpu',
    seed: int = 1234,
    optimizer_cls: type[Optimizer] = Adam,
    optimizer_kwargs: dict | None = None,
)

Bases: Module

Augmented Lagrangian solver for constrained action optimization.

The model's get_cost returns the cost tensor of shape (B, S). If the model also implements get_constraints, that method must return the constraint values of shape (B, S, C), where C is the number of constraints; constraint i is satisfied when g_i <= 0. The solver minimizes the augmented Lagrangian objective:

L = cost + sum_{i=1}^C lambda_i * g_i + rho * sum_{i=1}^C max(0, g_i)^2

To enforce an equality constraint, convert it into two inequalities: for g_i == 0, add the constraints g_i <= 0 and -g_i <= 0.
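
The objective can be evaluated directly from the tensors described above. A sketch (shapes follow the protocol; this is illustrative, not the solver's internal code):

```python
import torch


def augmented_lagrangian(cost, g, lam, rho):
    """Evaluate the objective the solver minimizes over actions.

    cost: (B, S) task cost, g: (B, S, C) constraint values (feasible when <= 0),
    lam:  (B, C) Lagrange multipliers, rho: scalar penalty coefficient.
    """
    violation = g.clamp(min=0)                        # max(0, g_i)
    linear = (lam.unsqueeze(1) * g).sum(dim=-1)       # sum_i lambda_i * g_i
    quadratic = rho * violation.pow(2).sum(dim=-1)    # rho * sum_i max(0, g_i)^2
    return cost + linear + quadratic                  # (B, S)
```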

Parameters:

  • model (Costable) –

    World model implementing the Costable protocol. Its get_cost() returns a plain cost tensor (B, S). If it also has get_constraints(), that method returns constraints of shape (B, S, C).

  • n_steps (int) –

    Number of gradient descent steps per outer iteration.

  • n_outer_steps (int, default: 5 ) –

    Number of dual ascent (outer) iterations.

  • batch_size (int | None, default: None ) –

    Number of environments to process in parallel.

  • num_samples (int, default: 1 ) –

    Number of action samples to optimize in parallel.

  • var_scale (float, default: 1.0 ) –

    Initial variance scale for action perturbations.

  • action_noise (float, default: 0.0 ) –

    Noise added to actions during optimization.

  • rho_init (float, default: 1.0 ) –

    Initial penalty coefficient for the quadratic constraint term.

  • rho_max (float, default: 10000.0 ) –

    Maximum value of the penalty coefficient.

  • rho_scale (float, default: 2.0 ) –

    Multiplicative growth factor for rho after each outer step.

  • persist_multipliers (bool, default: True ) –

    Whether to warm-start Lagrange multipliers across solve() calls.

  • device (str | device, default: 'cpu' ) –

    Device for tensor computations.

  • seed (int, default: 1234 ) –

    Random seed for reproducibility.

  • optimizer_cls (type[Optimizer], default: Adam ) –

    PyTorch optimizer class to use.

  • optimizer_kwargs (dict | None, default: None ) –

    Keyword arguments for the optimizer.

configure

configure(
    *, action_space: Space, n_envs: int, config: Any
) -> None

Configure the solver with environment specifications.

solve

solve(
    info_dict: dict, init_action: Tensor | None = None
) -> dict

Solve the planning problem using augmented Lagrangian gradient descent.

[ Example: Constrained Planning with LagrangianSolver ]

The LagrangianSolver extends gradient-based planning to handle inequality constraints of the form g(a) ≤ 0. It uses the augmented Lagrangian method: dual variables (λ) are maintained per environment and updated via dual ascent after each inner optimization loop, while a quadratic penalty term (controlled by rho) enforces feasibility.

import dataclasses
import torch
import gymnasium as gym
import numpy as np
from stable_worldmodel.solver import LagrangianSolver
from stable_worldmodel.policy import PlanConfig


# ── 1. Define a world model with cost and optional constraints ──────────────

class MyModel(torch.nn.Module):
    """Minimal example: cost is MSE to a goal; two inequality constraints."""

    def get_cost(self, info_dict, action_candidates):
        # action_candidates: (B, S, H, D)
        # returns:           (B, S)
        goal = torch.zeros(action_candidates.shape[-1])
        return (action_candidates.mean(dim=2) - goal).pow(2).mean(dim=-1)

    def get_constraints(self, info_dict, action_candidates):
        # returns: (B, S, C)  — violated when > 0
        # g0: action L2 norm <= 1
        g0 = action_candidates.norm(dim=-1).mean(dim=2) - 1.0
        # g1: first action dimension <= 0.5
        g1 = action_candidates[..., 0].mean(dim=2) - 0.5
        return torch.stack([g0, g1], dim=-1)


# ── 2. Build and configure the solver ──────────────────────────────────────

model = MyModel()

solver = LagrangianSolver(
    model=model,
    n_steps=30,            # inner gradient steps per outer iteration
    n_outer_steps=10,      # dual-ascent (outer) iterations
    num_samples=8,         # parallel action candidates per env
    rho_init=1.0,          # initial quadratic penalty coefficient
    rho_scale=2.0,         # rho doubles each outer step
    rho_max=1e4,
    persist_multipliers=True,  # warm-start λ across planning calls
    optimizer_kwargs={"lr": 0.05},
)

action_space = gym.spaces.Box(low=-np.inf, high=np.inf,
                              shape=(1, 4), dtype=np.float32)
config = PlanConfig(horizon=10, receding_horizon=1, action_block=1)
solver.configure(action_space=action_space, n_envs=2, config=config)


# ── 3. Solve ────────────────────────────────────────────────────────────────

info_dict = {"obs": torch.zeros(2, 4)}  # current env observations
out = solver.solve(info_dict)

print(out["actions"].shape)        # (2, 10, 4)  — best action per env
print(out["lambdas"])              # (2, 2)       — dual variables
print(out["constraint_violation"]) # mean ReLU(g) across samples


# ── 4. Receding-horizon planning (warm start) ───────────────────────────────

# Execute the first step, shift the plan, re-plan
executed_steps = 1
remaining = out["actions"][:, executed_steps:, :]   # (2, 9, 4)
out2 = solver.solve(info_dict, init_action=remaining)

Key parameters

Parameter            Default   Description
n_steps              required  Inner gradient steps per outer iteration
n_outer_steps        5         Dual-ascent iterations
rho_init             1.0       Initial quadratic penalty weight
rho_scale            2.0       Multiplicative growth for rho each outer step
rho_max              1e4       Upper bound on rho
persist_multipliers  True      Keep λ across solve() calls (warm start)
num_samples          1         Parallel candidate trajectories per environment
action_noise         0.0       Gaussian noise injected each inner step

Constraint protocol

Your model must implement get_constraints(info_dict, action_candidates) -> Tensor returning shape (B, S, C). A constraint is satisfied when its value is ≤ 0.

To enforce an equality h(a) = 0, add two constraints: h(a) ≤ 0 and -h(a) ≤ 0.