utils.experiment_utils#

Utility classes and functions for running and designing experiments.

recursive_update#

recursive_update(original: Union[Dict, collections.abc.MutableMapping], update: Union[Dict, collections.abc.MutableMapping])

Recursively updates the original dictionary with entries from the update dictionary.

Parameters

  • original : Original dictionary to be updated.
  • update : Dictionary with additional or replacement entries.

Returns

Updated original dictionary.
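
The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: the name recursive_update_sketch is hypothetical, and the handling of a nested update whose counterpart in original is not a mapping (it is simply replaced) is an assumption.

```python
from collections.abc import MutableMapping


def recursive_update_sketch(original: MutableMapping, update: MutableMapping) -> MutableMapping:
    """Merge `update` into `original` in place, descending into nested mappings."""
    for key, value in update.items():
        if isinstance(value, MutableMapping) and isinstance(original.get(key), MutableMapping):
            # Both sides are mappings: merge them instead of replacing the whole subtree.
            recursive_update_sketch(original[key], value)
        else:
            # Otherwise the entry is added or replaced outright (assumed behavior).
            original[key] = value
    return original


config = {"lr": 0.1, "model": {"hidden": 64, "layers": 2}}
result = recursive_update_sketch(config, {"model": {"hidden": 128}, "seed": 1})
```

Note that, in this sketch, a flat dict.update would have discarded the "layers" entry, whereas the recursive merge keeps it.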

Builder#

class Builder(tuple, typing.Generic[ToBuildType])

Used to instantiate a given class with (default) parameters.

Helper class that stores a class, default parameters for that class, and keyword arguments that (possibly) overwrite the defaults. Calling an object of the Builder class returns an instance of class_type with parameters specified by the attributes default and kwargs (and possibly additional, overwriting, keyword arguments).

Attributes

  • class_type: The class to be instantiated when calling the object.
  • kwargs: Keyword arguments used to instantiate an object of type class_type.
  • default: Default parameters used when instantiating the class.

Builder.__new__#

 | __new__(cls, class_type: ToBuildType, kwargs: Optional[Dict[str, Any]] = None, default: Optional[Dict[str, Any]] = None)

Create a new Builder.

For parameter descriptions see the class documentation. Note that kwargs and default can be None, in which case they are set to empty dictionaries.

Builder.__call__#

 | __call__(**kwargs) -> ToBuildType

Build and return a new instance of class_type.

Parameters

  • kwargs : Additional keyword arguments to use when instantiating the object. These overwrite any matching arguments already in the self.kwargs and self.default attributes.

Returns

An instance of self.class_type with parameters taken from self.default, self.kwargs, and any keyword arguments additionally passed to __call__.
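
The precedence among default, kwargs, and call-time keyword arguments can be sketched as follows. This is a simplified stand-in, not the library's Builder: BuilderSketch is a plain class rather than a tuple subclass, the merge here is shallow (an assumption), and Point is a hypothetical class used only for illustration.

```python
from typing import Any, Dict, Generic, Optional, Type, TypeVar

T = TypeVar("T")


class BuilderSketch(Generic[T]):
    """Stores a class plus two layers of keyword arguments for later instantiation."""

    def __init__(
        self,
        class_type: Type[T],
        kwargs: Optional[Dict[str, Any]] = None,
        default: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.class_type = class_type
        # None is replaced by an empty dictionary, as described for __new__.
        self.kwargs = {} if kwargs is None else kwargs
        self.default = {} if default is None else default

    def __call__(self, **kwargs: Any) -> T:
        # Later entries win: default < self.kwargs < call-time kwargs.
        merged = {**self.default, **self.kwargs, **kwargs}
        return self.class_type(**merged)


class Point:
    """Hypothetical class to build."""

    def __init__(self, x: int, y: int, z: int = 0) -> None:
        self.x, self.y, self.z = x, y, z


builder = BuilderSketch(Point, kwargs={"y": 2}, default={"x": 0, "y": 0})
p = builder(z=5)  # x from default, y from kwargs, z from the call
```

Deferring instantiation this way lets a configuration describe an object (for example an optimizer) before the arguments it needs are all available.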

ScalarMeanTracker#

class ScalarMeanTracker(object)

Track a collection of scalar key -> mean pairs.

ScalarMeanTracker.add_scalars#

 | add_scalars(scalars: Dict[str, Union[float, int]], n: int = 1) -> None

Add additional scalars to track.

Parameters

  • scalars : A dictionary of scalar key -> value pairs.

ScalarMeanTracker.pop_and_reset#

 | pop_and_reset() -> Dict[str, float]

Return tracked means and reset.

On resetting all previously tracked values are discarded.

Returns

A dictionary of scalar key -> current mean pairs corresponding to those values added with add_scalars.
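
A running-mean tracker with this interface could look roughly like the sketch below. The class name is hypothetical, and because the role of n is not documented above, treating it as the number of observations each value represents (a weight) is an assumption.

```python
from typing import Dict, Union


class ScalarMeanTrackerSketch:
    """Keeps per-key sums and counts so that means can be read out on demand."""

    def __init__(self) -> None:
        self._sums: Dict[str, float] = {}
        self._counts: Dict[str, int] = {}

    def add_scalars(self, scalars: Dict[str, Union[float, int]], n: int = 1) -> None:
        # Assumption: each value counts as `n` observations of that key.
        for key, value in scalars.items():
            self._sums[key] = self._sums.get(key, 0.0) + n * value
            self._counts[key] = self._counts.get(key, 0) + n

    def pop_and_reset(self) -> Dict[str, float]:
        means = {key: self._sums[key] / self._counts[key] for key in self._sums}
        # Discard everything tracked so far.
        self._sums.clear()
        self._counts.clear()
        return means


tracker = ScalarMeanTrackerSketch()
tracker.add_scalars({"loss": 1.0})
tracker.add_scalars({"loss": 3.0, "reward": 2})
means = tracker.pop_and_reset()
after_reset = tracker.pop_and_reset()
```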

LinearDecay#

class LinearDecay(object)

Linearly decay between two values over some number of steps.

Obtain the value corresponding to the i-th step by calling an instance of this class with the value i.

Parameters

  • steps : The number of steps over which to decay.
  • startp : The starting value.
  • endp : The ending value.

LinearDecay.__init__#

 | __init__(steps: int, startp: float = 1.0, endp: float = 0.0) -> None

Initializer.

See class documentation for parameter definitions.

LinearDecay.__call__#

 | __call__(epoch: int) -> float

Get the decayed value for epoch number of steps.

Parameters

  • epoch : The number of steps.

Returns

Decayed value for epoch number of steps.
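
As a rough illustration of the schedule, the sketch below interpolates linearly from startp to endp. The class name is hypothetical, and clamping the step count to the range [0, steps] (so the value stays at endp once the decay has finished) is an assumption not stated above.

```python
class LinearDecaySketch:
    """Linear interpolation from `startp` to `endp` over `steps` steps."""

    def __init__(self, steps: int, startp: float = 1.0, endp: float = 0.0) -> None:
        self.steps = steps
        self.startp = startp
        self.endp = endp

    def __call__(self, epoch: int) -> float:
        # Assumed: values outside [0, steps] are clamped to the nearest endpoint.
        epoch = max(min(epoch, self.steps), 0)
        return self.startp + (self.endp - self.startp) * (epoch / float(self.steps))


decay = LinearDecaySketch(steps=10)
values = [decay(0), decay(5), decay(10), decay(20)]
```

A schedule like this is a natural fit for quantities such as a teacher-forcing probability that should shrink as training progresses.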

set_deterministic_cudnn#

set_deterministic_cudnn() -> None

Makes cuDNN deterministic.

This may slow down computations.

set_seed#

set_seed(seed: Optional[int] = None) -> None

Set seeds for multiple (CPU) sources of randomness.

Sets seeds for (CPU) PyTorch, Python's built-in random module, and NumPy.

Parameters

  • seed : The seed to set. If set to None, keep using the current seed.
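
A seeding helper along these lines might be sketched as below. The function name is hypothetical, and guarding the NumPy and PyTorch imports is an addition made here only so that the snippet runs where those packages are missing; it is not a claim about the library's code.

```python
import random
from typing import Optional


def set_seed_sketch(seed: Optional[int] = None) -> None:
    """Seed the common sources of randomness; do nothing when `seed` is None."""
    if seed is None:
        return  # keep using whatever state the generators already have
    random.seed(seed)
    try:
        import numpy as np  # optional in this sketch

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional in this sketch

        torch.manual_seed(seed)
    except ImportError:
        pass


set_seed_sketch(123)
first_draw = random.random()
set_seed_sketch(123)
second_draw = random.random()  # same value as first_draw after reseeding
```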

EarlyStoppingCriterion#

class EarlyStoppingCriterion(abc.ABC)

Abstract class for objects that determine whether training should stop early in a particular pipeline stage.

EarlyStoppingCriterion.__call__#

 | @abc.abstractmethod
 | __call__(stage_steps: int, total_steps: int, training_metrics: ScalarMeanTracker, test_valid_metrics: List[Tuple[str, int, Union[float, np.ndarray]]]) -> bool

Returns True if training should be stopped early.

Parameters

  • stage_steps: Total number of steps taken in the current pipeline stage.
  • total_steps: Total number of steps taken during training so far (includes steps taken in prior pipeline stages).
  • training_metrics: Metrics accumulated over some fixed number of training steps (see the metric_accumulate_interval attribute in the TrainingPipeline class).
  • test_valid_metrics: A list of tuples (key, steps, value) where key is the metric's name prefixed by either "valid/" or "test/", steps is the total number of steps that the validation/test model was trained for, and value is the value of the metric.

NeverEarlyStoppingCriterion#

class NeverEarlyStoppingCriterion(EarlyStoppingCriterion)

Implementation of EarlyStoppingCriterion which never stops early.
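
To show how the __call__ contract might be used, here is a self-contained sketch with a stand-in base class. All class names are hypothetical, training_metrics is typed as Any to avoid depending on the library, and metric values are assumed to be plain floats (the real signature also allows NumPy arrays).

```python
import abc
from typing import Any, List, Tuple


class EarlyStoppingCriterionSketch(abc.ABC):
    """Stand-in for the abstract base class described above."""

    @abc.abstractmethod
    def __call__(
        self,
        stage_steps: int,
        total_steps: int,
        training_metrics: Any,
        test_valid_metrics: List[Tuple[str, int, float]],
    ) -> bool:
        raise NotImplementedError


class NeverStopSketch(EarlyStoppingCriterionSketch):
    """Mirrors the 'never stop early' behavior."""

    def __call__(self, stage_steps, total_steps, training_metrics, test_valid_metrics) -> bool:
        return False


class StopWhenMetricReached(EarlyStoppingCriterionSketch):
    """Stops once a named validation/test metric reaches a threshold."""

    def __init__(self, key: str, threshold: float) -> None:
        self.key = key
        self.threshold = threshold

    def __call__(self, stage_steps, total_steps, training_metrics, test_valid_metrics) -> bool:
        # Each entry is (key, steps, value); only the matching key is inspected.
        return any(
            key == self.key and value >= self.threshold
            for key, _, value in test_valid_metrics
        )


criterion = StopWhenMetricReached("valid/success", threshold=0.9)
keep_going = criterion(100, 100, None, [("valid/success", 100, 0.5)])
stop_now = criterion(200, 200, None, [("valid/success", 200, 0.95)])
```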

OffPolicyPipelineComponent#

class OffPolicyPipelineComponent(NamedTuple)

An off-policy component for a PipelineStage.

Attributes

  • data_iterator_builder: A function to instantiate a Data Iterator (with a next(self) method).
  • loss_names: List of unique names assigned to off-policy losses.
  • updates: Number of off-policy updates between on-policy rollout collections.
  • loss_weights: A list of floating point numbers describing the relative weights applied to the losses referenced by loss_names. Should be the same length as loss_names. If this is None, all weights will be assumed to be one.
  • data_iterator_kwargs_generator: Optional generator of keyword arguments for data_iterator_builder (useful for distributed training). It takes a cur_worker int value, a rollouts_per_worker list giving the number of samplers per training worker, and an optional random seed shared by all workers, which can be None.
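
For illustration only, a data_iterator_kwargs_generator with the three inputs described above might be written as follows. The function name and every key in the returned dictionary are hypothetical: what a real data iterator expects depends on the builder passed as data_iterator_builder.

```python
from typing import Any, Dict, List, Optional


def example_kwargs_generator(
    cur_worker: int, rollouts_per_worker: List[int], seed: Optional[int]
) -> Dict[str, Any]:
    """Derive per-worker keyword arguments so that each worker reads its own slice."""
    # Index of the first sampler owned by this worker (hypothetical convention).
    first_sampler_index = sum(rollouts_per_worker[:cur_worker])
    return {
        "worker_id": cur_worker,
        "num_workers": len(rollouts_per_worker),
        "first_sampler_index": first_sampler_index,
        "num_samplers": rollouts_per_worker[cur_worker],
        "seed": seed,  # passed through unchanged; may be None
    }


worker_kwargs = example_kwargs_generator(1, [4, 2, 3], seed=7)
```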

PipelineStage#

class PipelineStage(object)

A single stage in a training pipeline.

Attributes

  • loss_name: A collection of unique names assigned to losses. These will reference the Loss objects in a TrainingPipeline instance.
  • max_stage_steps: Either the total number of steps agents should take in this stage or a Callable object (e.g. a function)
  • early_stopping_criterion: An EarlyStoppingCriterion object which determines if training in this stage should be stopped early. If None then no early stopping occurs. If early_stopping_criterion is not None then we do not guarantee reproducibility when restarting a model from a checkpoint (as the EarlyStoppingCriterion object may store internal state which is not saved in the checkpoint).
  • loss_weights: A list of floating point numbers describing the relative weights applied to the losses referenced by loss_name. Should be the same length as loss_name. If this is None, all weights will be assumed to be one.
  • teacher_forcing: If applicable, defines the probability an agent will take the expert action (as opposed to its own sampled action) at a given time point.

TrainingPipeline#

class TrainingPipeline(object)

Class defining the stages (and global parameters) in a training pipeline.

The training pipeline can be used as an iterator to go through the pipeline stages in, for instance, a loop.

Attributes

  • named_losses: Dictionary mapping the name of a loss to either an instantiation of that loss or a Builder that, when called, will return that loss.
  • pipeline_stages: A list of PipelineStages. Each of these defines how the agent will be trained, and they are executed sequentially.
  • optimizer_builder: Builder object to instantiate the optimizer to use during training.
  • num_mini_batch: The number of mini-batches to break a rollout into.
  • update_repeats: The number of times we will cycle through the mini-batches corresponding to a single rollout doing gradient updates.
  • max_grad_norm: The maximum "inf" norm of any gradient step (gradients are clipped to not exceed this).
  • num_steps: Total number of steps a single agent takes in a rollout.
  • gamma: Discount factor applied to rewards (should be in [0, 1]).
  • use_gae: Whether or not to use generalized advantage estimation (GAE).
  • gae_lambda: The additional parameter used in GAE.
  • save_interval: The frequency with which to save (in total agent steps taken). If None then no checkpoints will be saved. Otherwise, in addition to the checkpoints being saved every save_interval steps, a checkpoint will always be saved at the end of each pipeline stage. If save_interval <= 0 then checkpoints will only be saved at the end of each pipeline stage.
  • metric_accumulate_interval: The frequency with which training/validation metrics are accumulated (in total agent steps). Metrics accumulated in an interval are logged (if should_log is True) and used by the stage's early stopping criterion (if any).
  • should_log: True if metrics accumulated during training should be logged to the console as well as to a tensorboard file.
  • lr_scheduler_builder: Optional builder object to instantiate the learning rate scheduler used through the pipeline.

TrainingPipeline.__init__#

 | __init__(named_losses: Dict[str, Union[Loss, Builder[Loss]]], pipeline_stages: List[PipelineStage], optimizer_builder: Builder[optim.Optimizer], num_mini_batch: int, update_repeats: int, max_grad_norm: float, num_steps: int, gamma: float, use_gae: bool, gae_lambda: float, advance_scene_rollout_period: Optional[int], save_interval: Optional[int], metric_accumulate_interval: int, should_log: bool = True, lr_scheduler_builder: Optional[Builder[optim.lr_scheduler._LRScheduler]] = None)

Initializer.

See class docstring for parameter definitions.
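
To make the roles of gamma and gae_lambda concrete, the following sketch computes advantages for a single rollout using the standard GAE recursion. It is a textbook-style illustration, not the library's code: the function name is hypothetical, and episode terminations inside the rollout are ignored for brevity.

```python
from typing import List


def gae_advantages_sketch(
    rewards: List[float],
    values: List[float],
    last_value: float,
    gamma: float,
    gae_lambda: float,
) -> List[float]:
    """Generalized advantage estimates for one rollout of length len(rewards)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value  # value estimate for the state after the final step
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at step t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * gae_lambda * running
        advantages[t] = running
        next_value = values[t]
    return advantages


full_returns = gae_advantages_sketch([1.0, 1.0], [0.0, 0.0], 0.0, gamma=1.0, gae_lambda=1.0)
one_step = gae_advantages_sketch([1.0, 1.0], [0.0, 0.0], 0.0, gamma=0.5, gae_lambda=0.0)
```

With gae_lambda set to 0 the estimate reduces to the one-step TD error, while values closer to 1 give more weight to longer-horizon returns.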