


Defines the reinforcement learning OnPolicyRLEngine.


class OnPolicyRLEngine(object)


The reinforcement learning primary controller.

This OnPolicyRLEngine class handles all training, validation, and testing as well as logging and checkpointing. You are not expected to instantiate this class yourself; instead, you should define an experiment which will then be used to instantiate an OnPolicyRLEngine and perform any desired tasks.
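As an illustration of this division of responsibilities, here is a heavily simplified, hypothetical sketch (everything except the mode strings is invented for illustration; the real engine is driven by an experiment configuration and does far more):

```python
class MiniEngine:
    """Hypothetical stand-in for OnPolicyRLEngine, illustrating only the
    mode-based split between training (which also checkpoints) and
    validation/testing (which only evaluate)."""

    VALID_MODES = ("train", "valid", "test")

    def __init__(self, mode: str = "train"):
        if mode not in self.VALID_MODES:
            raise ValueError(f"mode must be one of {self.VALID_MODES}")
        self.mode = mode
        self.checkpoints = []  # names of checkpoints saved so far

    def run(self, steps: int) -> str:
        # Only the training engine writes checkpoints; valid/test engines
        # would instead load checkpoints produced by the trainer.
        if self.mode == "train":
            self.checkpoints.append(f"ckpt_step_{steps}.pt")
        return f"{self.mode} ran for {steps} steps"
```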


 | __init__(experiment_name: str, config: ExperimentConfig, results_queue: mp.Queue, checkpoints_queue: Optional[mp.Queue], checkpoints_dir: str, mode: str = "train", seed: Optional[int] = None, deterministic_cudnn: bool = False, mp_ctx: Optional[BaseContext] = None, worker_id: int = 0, num_workers: int = 1, device: Union[str, torch.device, int] = "cpu", distributed_port: int = 0, deterministic_agents: bool = False, max_sampler_processes_per_worker: Optional[int] = None, **kwargs)




  • config : The ExperimentConfig defining the experiment to run.
  • output_dir : Root directory at which checkpoints and logs should be saved.
  • seed : Seed used to encourage deterministic behavior (it is difficult to ensure completely deterministic behavior due to CUDA issues and nondeterminism in environments).
  • mode : "train", "valid", or "test".
  • deterministic_cudnn : Whether or not to use deterministic cudnn. If True this may lower training performance; however, it is necessary (but not sufficient) if you desire deterministic behavior.
  • extra_tag : An additional label to add to the experiment when saving tensorboard logs.


 | @staticmethod
 | worker_seeds(nprocesses: int, initial_seed: Optional[int]) -> List[int]


Create a collection of seeds for workers without modifying the RNG state.
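One way to achieve this, sketched below as a hypothetical re-implementation (not the actual source), is to draw the seeds from a local random.Random instance so that the global RNG state is never touched:

```python
import random
from typing import List, Optional


def worker_seeds(nprocesses: int, initial_seed: Optional[int]) -> List[int]:
    """Sketch of generating per-worker seeds without side effects on the
    global `random` module state (illustrative, not the library's code)."""
    # A *local* RNG instance: seeding it leaves random's module-level
    # state untouched. Passing None seeds from OS entropy.
    rng = random.Random(initial_seed)
    return [rng.randint(0, 2**31 - 1) for _ in range(nprocesses)]
```

With a fixed initial_seed the same worker seeds are reproduced on every call, while the caller's global RNG state is provably unchanged:

```python
state_before = random.getstate()
seeds = worker_seeds(4, 12345)
assert random.getstate() == state_before   # global state untouched
assert worker_seeds(4, 12345) == seeds     # reproducible given the same seed
```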