Defines `OnPolicyRLEngine`, the primary reinforcement learning controller.
The `OnPolicyRLEngine` class handles all training, validation, and
testing, as well as logging and checkpointing. You are not expected
to instantiate this class yourself; instead, you should define an
experiment, which will then be used to instantiate an
`OnPolicyRLEngine` and perform any desired tasks.
`__init__(experiment_name: str, config: ExperimentConfig, results_queue: mp.Queue, checkpoints_queue: Optional[mp.Queue], checkpoints_dir: str, mode: str = "train", seed: Optional[int] = None, deterministic_cudnn: bool = False, mp_ctx: Optional[BaseContext] = None, worker_id: int = 0, num_workers: int = 1, device: Union[str, torch.device, int] = "cpu", distributed_port: int = 0, deterministic_agents: bool = False, max_sampler_processes_per_worker: Optional[int] = None, **kwargs)`
- config : The ExperimentConfig defining the experiment to run.
- output_dir : Root directory at which checkpoints and logs should be saved.
- seed : Seed used to encourage deterministic behavior (it is difficult to ensure completely deterministic behavior due to CUDA issues and nondeterminism in environments).
- mode : "train", "valid", or "test".
- deterministic_cudnn : Whether or not to use deterministic cudnn. If `True`, this may lower training performance; it is necessary (but not sufficient) if you desire deterministic behavior.
- extra_tag : An additional label to add to the experiment when saving tensorboard logs.
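To illustrate the role of the `seed` parameter, here is a minimal stdlib-only sketch of seeding for reproducibility. The `set_seed` helper is hypothetical and not part of the engine's API; a real setup would also seed numpy and torch, and enable deterministic cudnn when requested:

```python
import random

def set_seed(seed: int) -> None:
    # Hypothetical helper, not part of OnPolicyRLEngine's API: seed the
    # stdlib RNG. A real setup would also seed numpy/torch and, with
    # deterministic_cudnn=True, configure torch's cudnn backend.
    random.seed(seed)

set_seed(0)
first = random.random()
set_seed(0)
# Re-seeding reproduces the same draw.
assert first == random.random()
```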
`@staticmethod`
`worker_seeds(nprocesses: int, initial_seed: Optional[int]) -> List[int]`
Create a collection of seeds for workers without modifying the RNG state.
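A minimal sketch of one way such a function can avoid mutating the global RNG state: snapshot the state, draw the seeds, then restore it. This is an illustration under that assumption, not the library's actual implementation:

```python
import random
from typing import List, Optional

def worker_seeds(nprocesses: int, initial_seed: Optional[int]) -> List[int]:
    # Snapshot the global RNG state so drawing worker seeds leaves it intact.
    rng_state = random.getstate()
    if initial_seed is not None:
        random.seed(initial_seed)
    seeds = [random.randint(0, 2**31 - 1) for _ in range(nprocesses)]
    random.setstate(rng_state)  # restore: caller's RNG state is unchanged
    return seeds
```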