projects.manipulathor_baselines.armpointnav_baselines.models.disjoint_arm_pointnav_models
Baseline models for use in the Arm Point Navigation task.
Arm Point Navigation is currently available as a Task in ManipulaTHOR.
DisjointArmPointNavBaselineActorCritic
class DisjointArmPointNavBaselineActorCritic(ActorCriticModel[CategoricalDistr])
Disjoint baseline recurrent actor-critic model for ArmPointNav.
Attributes

- `action_space` : The space of actions available to the agent. Currently only discrete actions are allowed (so this space will always be of type `gym.spaces.Discrete`).
- `observation_space` : The observation space expected by the agent. This observation space should include (optionally) 'rgb' images and 'depth' images and is required to have a component corresponding to the goal `goal_sensor_uuid`.
- `goal_sensor_uuid` : The uuid of the sensor of the goal object. See `GoalObjectTypeThorSensor` as an example of such a sensor.
- `hidden_size` : The hidden size of the GRU RNN.
- `object_type_embedding_dim` : The dimensionality of the embedding corresponding to the goal object type.
DisjointArmPointNavBaselineActorCritic.__init__
| __init__(action_space: gym.spaces.Discrete, observation_space: SpaceDict, hidden_size=512, obj_state_embedding_size=512, trainable_masked_hidden_state: bool = False, num_rnn_layers=1, rnn_type="GRU")
Initializer.
See class documentation for parameter definitions.
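For orientation, a minimal construction sketch follows. It is not taken from the repository: the observation-space keys and shapes, the goal-sensor uuid, and the number of discrete actions are all illustrative assumptions; only the constructor signature and the module path come from this page.

```python
# Hedged sketch: constructing the baseline model. Keys, shapes, and the
# action count below are illustrative assumptions, not verified values.
import gym.spaces
import numpy as np
from gym.spaces import Dict as SpaceDict

from projects.manipulathor_baselines.armpointnav_baselines.models.disjoint_arm_pointnav_models import (
    DisjointArmPointNavBaselineActorCritic,
)

observation_space = SpaceDict(
    {
        # Optional visual inputs (assumed shapes).
        "rgb": gym.spaces.Box(low=0.0, high=1.0, shape=(224, 224, 3), dtype=np.float32),
        "depth": gym.spaces.Box(low=0.0, high=5.0, shape=(224, 224, 1), dtype=np.float32),
        # Component corresponding to the goal sensor (assumed uuid and space).
        "goal_object_type_ind": gym.spaces.Discrete(10),
    }
)

model = DisjointArmPointNavBaselineActorCritic(
    action_space=gym.spaces.Discrete(13),  # assumed number of discrete actions
    observation_space=observation_space,
    hidden_size=512,
    obj_state_embedding_size=512,
    num_rnn_layers=1,
    rnn_type="GRU",
)
```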
DisjointArmPointNavBaselineActorCritic.recurrent_hidden_state_size
| @property
| recurrent_hidden_state_size() -> int
The recurrent hidden state size of the model.
DisjointArmPointNavBaselineActorCritic.num_recurrent_layers
| @property
| num_recurrent_layers() -> int
Number of recurrent hidden layers.
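Together these two properties determine the shape of the model's recurrent state. A small sketch, assuming the conventional (num_layers, num_samplers, hidden_size) layout; `num_samplers` is a hypothetical batch dimension:

```python
import torch

# Hedged sketch: allocating a zeroed initial hidden state from the two
# properties above; the (num_layers, num_samplers, hidden_size) layout
# is an assumption.
num_samplers = 4
h0 = torch.zeros(
    model.num_recurrent_layers,         # 1 for the default single-layer GRU
    num_samplers,
    model.recurrent_hidden_state_size,  # 512 with the default hidden_size
)
```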
DisjointArmPointNavBaselineActorCritic.forward
| forward(observations: ObservationType, memory: Memory, prev_actions: torch.Tensor, masks: torch.FloatTensor) -> Tuple[ActorCriticOutput[DistributionType], Optional[Memory]]
Processes batched input observations (along with prior hidden states, previous actions, and masks denoting which recurrent hidden states should be masked) to produce new actor and critic values: it returns an `ActorCriticOutput` object containing the model's policy (a distribution over actions) and its evaluation of the current state (a value).
Parameters

- observations : Batched input observations.
- memory : `Memory` containing the hidden states from initial timepoints.
- prev_actions : Tensor of previous actions taken.
- masks : Masks applied to hidden states. See `RNNStateEncoder`.

Returns

Tuple of the `ActorCriticOutput` and recurrent hidden state.
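A hedged sketch of a single forward step follows. The (steps, samplers, ...) tensor layout and the `ActorCriticOutput` fields (`distributions`, `values`) follow common AllenAct conventions and are assumptions here, not taken from this page; `observations_batch` and `memory` stand in for the batched sensor readings and the `Memory` returned by the previous step.

```python
import torch

# Hedged sketch of one rollout step; shapes and field names are assumed.
nsteps, nsamplers = 1, 4
prev_actions = torch.zeros(nsteps, nsamplers, dtype=torch.long)
masks = torch.ones(nsteps, nsamplers, 1)  # 0.0 wherever an episode just reset

ac_output, memory = model.forward(
    observations=observations_batch,  # placeholder: dict of batched tensors
    memory=memory,                    # placeholder: Memory from the prior step
    prev_actions=prev_actions,
    masks=masks,
)

actions = ac_output.distributions.sample()  # sample from the policy
values = ac_output.values                   # critic's state-value estimates
```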