projects.gym_baselines.experiments.mujoco.gym_mujoco_invertedpendulum_ddppo
GymMuJoCoInvertedPendulumConfig
class GymMuJoCoInvertedPendulumConfig(GymMuJoCoPPOConfig)
GymMuJoCoInvertedPendulumConfig.create_model
| @classmethod
| create_model(cls, **kwargs) -> nn.Module
We define our ActorCriticModel agent using MemorylessActorCritic, a lightweight
implementation with separate MLPs for the actor and the critic.
Because this is a model for continuous control, note that the
superclass of our model is ActorCriticModel[GaussianDistr]
rather than ActorCriticModel[CategoricalDistr], since we'll use
a Gaussian distribution to sample actions.
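
To illustrate why the distribution type matters, here is a minimal, framework-free sketch of sampling from a Gaussian versus a categorical policy. It is not the MemorylessActorCritic implementation: the helper names (`sample_gaussian`, `sample_categorical`, `gaussian_log_prob`) and all numeric values are hypothetical, chosen only to show that a continuous-control policy emits real-valued actions while a discrete one emits an index.

```python
import math
import random


def sample_categorical(probs, rng):
    """Discrete control: pick one action index from a finite set."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_gaussian(means, stds, rng):
    """Continuous control: draw one real value per action dimension."""
    return [rng.gauss(m, s) for m, s in zip(means, stds)]


def gaussian_log_prob(action, means, stds):
    """Log-density of an action under a diagonal Gaussian, summed over dimensions."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for a, m, s in zip(action, means, stds)
    )


rng = random.Random(0)

# Hypothetical actor outputs for a 1-D continuous action space.
means, stds = [0.25], [0.5]
action = sample_gaussian(means, stds, rng)         # a list with one float
log_prob = gaussian_log_prob(action, means, stds)  # used by the PPO loss

# For contrast, a hypothetical 3-way discrete policy yields an integer index.
discrete_action = sample_categorical([0.2, 0.5, 0.3], rng)
```

In an actual model, the means (and possibly the standard deviations) would come from the actor MLP, while the separate critic MLP would produce the value estimate.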