


Baseline models for use in the Arm Point Navigation task.

Arm Point Navigation is currently available as a Task in ManipulaTHOR.


class DisjointArmPointNavBaselineActorCritic(ActorCriticModel[CategoricalDistr])


Disjoint baseline recurrent actor-critic model for ArmPointNav.


  • action_space: The space of actions available to the agent. Currently only discrete actions are allowed (so this space will always be of type gym.spaces.Discrete).
  • observation_space: The observation space expected by the agent. This observation space may optionally include 'rgb' and 'depth' images, and must include a component for the goal, keyed by goal_sensor_uuid.
  • goal_sensor_uuid: The uuid of the sensor of the goal object. See GoalObjectTypeThorSensor as an example of such a sensor.
  • hidden_size: The hidden size of the GRU RNN.
  • object_type_embedding_dim: The dimensionality of the embedding corresponding to the goal object type.


 | __init__(action_space: gym.spaces.Discrete, observation_space: SpaceDict, hidden_size=512, obj_state_embedding_size=512, trainable_masked_hidden_state: bool = False, num_rnn_layers=1, rnn_type="GRU")



See class documentation for parameter definitions.


 | @property
 | recurrent_hidden_state_size() -> int


The recurrent hidden state size of the model.


 | @property
 | num_recurrent_layers() -> int


Number of recurrent hidden layers.
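The two properties above together determine how much recurrent state the model carries per sampler. The following is a minimal sketch of that relationship, not the library's code. It assumes a (layer, sampler, hidden) layout for the hidden state tensor, and `hidden_state_shape` is a hypothetical helper introduced only for illustration.

```python
from typing import Tuple


def hidden_state_shape(
    num_recurrent_layers: int,
    num_samplers: int,
    recurrent_hidden_state_size: int,
) -> Tuple[int, int, int]:
    """Return the assumed (layer, sampler, hidden) shape of the recurrent state.

    Hypothetical helper: the layout is an assumption made for this sketch.
    """
    return (num_recurrent_layers, num_samplers, recurrent_hidden_state_size)


# Defaults from the __init__ signature (num_rnn_layers=1, hidden_size=512),
# with 4 parallel samplers chosen arbitrarily for the example.
shape = hidden_state_shape(1, 4, 512)
num_values = shape[0] * shape[1] * shape[2]
```

With these example values the state holds one 512-dimensional vector per sampler, so the storage cost grows linearly with the number of samplers and layers.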


 | forward(observations: ObservationType, memory: Memory, prev_actions: torch.Tensor, masks: torch.FloatTensor) -> Tuple[ActorCriticOutput[DistributionType], Optional[Memory]]


Processes input batched observations to produce new actor and critic values. The inputs are the batched observations together with prior hidden states, previous actions, and masks denoting which recurrent hidden states should be masked. The output is an ActorCriticOutput object containing the model's policy (a distribution over actions) and its evaluation of the current state (value).


  • observations : Batched input observations.
  • memory : Memory containing the hidden states from initial timepoints.
  • prev_actions : Tensor of previous actions taken.
  • masks : Masks applied to hidden states. See RNNStateEncoder.

Returns

Tuple of the ActorCriticOutput and recurrent hidden state.
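The role of masks in forward can be illustrated with a small sketch. It assumes the common convention that a mask value of 0.0 marks a sampler whose episode has just started (so its carried-over hidden state is zeroed) and 1.0 keeps the state unchanged. This is plain Python written for illustration, not the RNNStateEncoder implementation, and `mask_hidden_states` is a hypothetical name.

```python
from typing import List


def mask_hidden_states(
    hidden: List[List[float]], masks: List[float]
) -> List[List[float]]:
    """Zero the hidden vector of every sampler whose mask is 0.0.

    hidden: one hidden vector per sampler (batch entry).
    masks: 1.0 keeps the carried-over state, 0.0 resets it (assumed convention).
    """
    if len(hidden) != len(masks):
        raise ValueError("need exactly one mask value per sampler")
    # Elementwise multiply each sampler's hidden vector by its mask value.
    return [[h * m for h in vec] for vec, m in zip(hidden, masks)]


# Two samplers: the first continues its episode, the second just reset.
hidden = [[0.5, -1.0], [2.0, 3.0]]
masks = [1.0, 0.0]
masked = mask_hidden_states(hidden, masks)
```

Under this assumption, the second sampler starts its new episode from an all-zero hidden state while the first keeps its memory, which is why masks must be supplied alongside memory on every call.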