projects.gym_baselines.experiments.mujoco.gym_mujoco_inverteddoublependulum_ddppo

GymMuJoInvertedDoublePendulumConfig

class GymMuJoInvertedDoublePendulumConfig(GymMuJoCoPPOConfig)

GymMuJoInvertedDoublePendulumConfig.create_model

 | @classmethod
 | create_model(cls, **kwargs) -> nn.Module

We define our ActorCriticModel agent using MemorylessActorCritic, a lightweight implementation with separate MLPs for the actor and the critic.
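As a rough illustration of the "separate MLPs" idea, the PyTorch sketch below builds an actor and a critic that share no parameters and keep no recurrent state. It is not AllenAct's MemorylessActorCritic: the names `make_mlp` and `TinyActorCritic`, the layer sizes, and the observation and action dimensions (11 and 1) are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn


def make_mlp(in_dim: int, hidden_dim: int, out_dim: int) -> nn.Sequential:
    """Build an MLP with two tanh hidden layers (hypothetical helper)."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.Tanh(),
        nn.Linear(hidden_dim, hidden_dim),
        nn.Tanh(),
        nn.Linear(hidden_dim, out_dim),
    )


class TinyActorCritic(nn.Module):
    """Hypothetical stand-in: separate actor and critic networks, no memory."""

    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 64):
        super().__init__()
        # The actor maps an observation to one mean per action dimension.
        self.actor = make_mlp(obs_dim, hidden_dim, action_dim)
        # The critic maps the same observation to a scalar value estimate.
        self.critic = make_mlp(obs_dim, hidden_dim, 1)

    def forward(self, obs: torch.Tensor):
        # No hidden state is passed in or returned: the model is memoryless.
        return self.actor(obs), self.critic(obs)


model = TinyActorCritic(obs_dim=11, action_dim=1)  # assumed dimensions
obs = torch.zeros(4, 11)  # batch of 4 dummy observations
means, values = model(obs)
```

Keeping the two networks separate means gradients from the value loss never flow through the policy's weights, at the cost of not sharing any learned features between them.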

Because this model is for continuous control, its superclass is ActorCriticModel[GaussianDistr] rather than ActorCriticModel[CategoricalDistr]: actions are sampled from a Gaussian distribution.
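To make the difference between the two distribution types concrete, the sketch below samples from each using `torch.distributions` directly, rather than AllenAct's GaussianDistr and CategoricalDistr wrappers. The means, standard deviations, and logits are made-up values used only for illustration.

```python
import torch
from torch.distributions import Categorical, Normal

torch.manual_seed(0)

# Continuous control: the policy outputs a mean (and a standard deviation)
# per action dimension, and each action is a real-valued sample.
mean = torch.tensor([[0.2], [-0.5]])  # batch of 2, one action dimension
std = torch.full_like(mean, 0.1)
gaussian = Normal(mean, std)
cont_action = gaussian.sample()  # shape (2, 1), floating point
# Sum per-dimension log-probabilities to get one value per batch element.
cont_log_prob = gaussian.log_prob(cont_action).sum(dim=-1)

# Discrete control: the policy outputs one logit per action, and each
# action is an integer index into the action set.
logits = torch.tensor([[1.0, 0.0, -1.0], [0.0, 0.0, 0.0]])
categorical = Categorical(logits=logits)
disc_action = categorical.sample()  # shape (2,), integer indices
disc_log_prob = categorical.log_prob(disc_action)
```

In both cases a policy-gradient method such as PPO only needs `sample` and `log_prob`, which is why the choice of distribution can be expressed as a type parameter on the model.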