# `allenact.embodiedai.mapping.mapping_utils.point_cloud_utils`


### `camera_space_xyz_to_world_xyz`

``````
camera_space_xyz_to_world_xyz(camera_space_xyzs: torch.Tensor, camera_world_xyz: torch.Tensor, rotation: float, horizon: float) -> torch.Tensor
``````


Transforms xyz coordinates in the camera's coordinate frame to the world-space (global) xyz frame.

This code has been adapted from https://github.com/devendrachaplot/Neural-SLAM.

IMPORTANT: We use the conventions from the Unity game engine. In particular:

• A rotation of 0 corresponds to facing north.
• Positive rotations correspond to CLOCKWISE rotations. That is a rotation of 90 degrees corresponds to facing east. THIS IS THE OPPOSITE CONVENTION OF THE ONE GENERALLY USED IN MATHEMATICS.
• When facing NORTH (rotation==0) moving ahead by 1 meter results in the z coordinate increasing by 1. Moving to the right by 1 meter corresponds to increasing the x coordinate by 1. Finally, moving upwards by 1 meter corresponds to increasing the y coordinate by 1. Having x,z as the ground plane in this way is common in computer graphics but is different from the usual mathematical convention of having z be "up".
• The horizon corresponds to how far below the horizontal the camera is facing. I.e. a horizon of 30 corresponds to the camera being angled downwards at an angle of 30 degrees.

Parameters

• camera_space_xyzs : A 3xN matrix of xyz coordinates in the camera's reference frame.
• Here `x, y, z = camera_space_xyzs[:, i]` should equal the xyz coordinates for the ith point.
• camera_world_xyz : The camera's xyz position in the world reference frame.
• rotation : The world-space rotation (in degrees) of the camera.
• horizon : The horizon (in degrees) of the camera.

Returns

A 3xN tensor whose entry `[:, i]` is the xyz world-space coordinate corresponding to the camera-space coordinate `camera_space_xyzs[:, i]`.
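Under the conventions above, the transform amounts to undoing the camera's horizon (a rotation about the x-axis), applying the clockwise yaw rotation about the y-axis, and then translating by the camera's world position. A minimal NumPy sketch of that math (an illustrative re-implementation for clarity, not the library's torch code):

```python
import numpy as np

def camera_to_world_sketch(camera_xyzs, camera_world_xyz, rotation, horizon):
    """Map 3xN camera-space points to world space (Unity conventions).

    camera_xyzs: 3xN array; rotation/horizon in degrees.
    Illustrative NumPy sketch of the documented math, not the torch API.
    """
    # Undo the camera's downward pitch (horizon) about the x-axis.
    psi = -np.radians(horizon)
    pitch = np.array([
        [1, 0, 0],
        [0, np.cos(psi), np.sin(psi)],
        [0, -np.sin(psi), np.cos(psi)],
    ])
    # Apply the agent's clockwise yaw about the y-axis.
    phi = -np.radians(rotation)
    yaw = np.array([
        [np.cos(phi), 0, -np.sin(phi)],
        [0, 1, 0],
        [np.sin(phi), 0, np.cos(phi)],
    ])
    # Rotate, then translate by the camera's world position.
    return yaw @ (pitch @ camera_xyzs) + np.asarray(camera_world_xyz)[:, None]
```

For example, a point 1 meter straight ahead of a camera at the origin facing east (rotation=90, horizon=0) lands at world coordinate (1, 0, 0), matching the clockwise-rotation convention above.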

### `depth_frame_to_camera_space_xyz`

``````
depth_frame_to_camera_space_xyz(depth_frame: torch.Tensor, mask: Optional[torch.Tensor], fov: float = 90) -> torch.Tensor
``````


Transforms an input depth map into a collection of xyz points (i.e. a point cloud) in the camera's coordinate frame.

Parameters

• depth_frame : A square depth map, i.e. an MxM matrix with entry `depth_frame[i, j]` equaling the distance from the camera to the nearest surface at pixel (i,j).
• mask : An optional boolean mask of the same size (MxM) as the input depth frame. Only pixels where this mask is true will be included in the returned matrix of xyz coordinates. If `None` then no pixels will be masked out (so the returned matrix of xyz points will have dimension 3x(M*M)).
• fov : The field of view, in degrees, of the camera.

Returns

A 3xN matrix whose entry `[:, i]` equals the xyz coordinates (in the camera's coordinate frame) of a point in the point cloud corresponding to the input depth frame.
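The unprojection itself is standard pinhole-camera math: each pixel center is mapped to a ray direction on the z=1 image plane, scaled so the plane spans tan(fov/2), and then multiplied by the pixel's depth. A hedged NumPy sketch of this idea, with `mask=None` (the sign conventions follow the description above; details may differ from the library's implementation):

```python
import numpy as np

def depth_to_camera_xyz_sketch(depth_frame, fov=90.0):
    """Unproject a square MxM depth map to a 3x(M*M) camera-space point cloud.

    Illustrative NumPy sketch of standard pinhole unprojection,
    not the library's torch implementation.
    """
    m = depth_frame.shape[0]
    # Pixel-center offsets from the image center, in pixels.
    ii, jj = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
    y = -(ii.ravel() + 0.5 - m / 2.0)  # rows increase downward, so flip y
    x = jj.ravel() + 0.5 - m / 2.0
    # Scale so the image plane spans tan(fov/2) at z == 1.
    scale = (2.0 / m) * np.tan(np.radians(fov / 2.0))
    directions = np.stack([x * scale, y * scale, np.ones(m * m)])
    # Scale each unit-z ray by its pixel's depth value.
    return directions * depth_frame.ravel()[None, :]
```

Note that because each ray has z=1 before scaling, the z row of the output simply equals the flattened depth values.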

### `depth_frame_to_world_space_xyz`

``````
depth_frame_to_world_space_xyz(depth_frame: torch.Tensor, camera_world_xyz: torch.Tensor, rotation: float, horizon: float, fov: float)
``````


Transforms an input depth map into a collection of xyz points (i.e. a point cloud) in the world-space coordinate frame.

IMPORTANT: We use the conventions from the Unity game engine. In particular:

• A rotation of 0 corresponds to facing north.
• Positive rotations correspond to CLOCKWISE rotations. That is a rotation of 90 degrees corresponds to facing east. THIS IS THE OPPOSITE CONVENTION OF THE ONE GENERALLY USED IN MATHEMATICS.
• When facing NORTH (rotation==0) moving ahead by 1 meter results in the z coordinate increasing by 1. Moving to the right by 1 meter corresponds to increasing the x coordinate by 1. Finally, moving upwards by 1 meter corresponds to increasing the y coordinate by 1. Having x,z as the ground plane in this way is common in computer graphics but is different from the usual mathematical convention of having z be "up".
• The horizon corresponds to how far below the horizontal the camera is facing. I.e. a horizon of 30 corresponds to the camera being angled downwards at an angle of 30 degrees.

Parameters

• depth_frame : A square depth map, i.e. an MxM matrix with entry `depth_frame[i, j]` equaling the distance from the camera to the nearest surface at pixel (i,j).
• mask : An optional boolean mask of the same size (MxM) as the input depth frame. Only pixels where this mask is true will be included in the returned matrix of xyz coordinates. If `None` then no pixels will be masked out (so the returned matrix of xyz points will have dimension 3x(M*M)).
• camera_space_xyzs : A 3xN matrix of xyz coordinates in the camera's reference frame.
• Here `x, y, z = camera_space_xyzs[:, i]` should equal the xyz coordinates for the ith point.
• camera_world_xyz : The camera's xyz position in the world reference frame.
• rotation : The world-space rotation (in degrees) of the camera.
• horizon : The horizon (in degrees) of the camera.
• fov : The field of view, in degrees, of the camera.

Returns

A 3xN matrix whose entry `[:, i]` equals the xyz coordinates (in the world coordinate frame) of a point in the point cloud corresponding to the input depth frame.
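Conceptually this function composes the two functions above: unproject the depth frame into camera space, then rotate and translate into world space. A compact, self-contained NumPy sketch of the whole pipeline (illustrative only; the actual torch implementation may differ in details):

```python
import numpy as np

def depth_to_world_xyz_sketch(depth_frame, camera_world_xyz, rotation, horizon, fov):
    """Square depth map -> 3x(M*M) world-space point cloud (Unity conventions).

    Illustrative NumPy sketch composing pinhole unprojection with the
    horizon/rotation transform described above; not the torch API.
    """
    m = depth_frame.shape[0]
    # Unproject: pixel-center rays on the z == 1 plane, scaled by depth.
    ii, jj = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
    scale = (2.0 / m) * np.tan(np.radians(fov / 2.0))
    x = (jj.ravel() + 0.5 - m / 2.0) * scale
    y = -(ii.ravel() + 0.5 - m / 2.0) * scale
    cam = np.stack([x, y, np.ones(m * m)]) * depth_frame.ravel()[None, :]
    # Undo horizon (pitch about x), apply clockwise yaw (about y), translate.
    psi, phi = -np.radians(horizon), -np.radians(rotation)
    pitch = np.array([[1, 0, 0],
                      [0, np.cos(psi), np.sin(psi)],
                      [0, -np.sin(psi), np.cos(psi)]])
    yaw = np.array([[np.cos(phi), 0, -np.sin(phi)],
                    [0, 1, 0],
                    [np.sin(phi), 0, np.cos(phi)]])
    return yaw @ (pitch @ cam) + np.asarray(camera_world_xyz)[:, None]
```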

### `project_point_cloud_to_map`

``````
project_point_cloud_to_map(xyz_points: torch.Tensor, bin_axis: str, bins: Sequence[float], map_size: int, resolution_in_cm: int, flip_row_col: bool)
``````


Bins an input point cloud into a map tensor with the bins equaling the channels.

This code has been adapted from https://github.com/devendrachaplot/Neural-SLAM.

Parameters

• xyz_points : (x,y,z) pointcloud(s) as a torch.Tensor of shape (... x height x width x 3). All operations are vectorized across the `...` dimensions.
• bin_axis : Either "x", "y", or "z", the axis which should be binned by the values in `bins`. If you have generated your point clouds with any of the other functions in the `point_cloud_utils` module you almost certainly want this to be "y" as this is the default upwards dimension.
• bins: The values by which to bin along `bin_axis`, see the `bins` parameter of `np.digitize` for more info.
• map_size : The axes not specified by `bin_axis` will be divided by `resolution_in_cm / 100` and then rounded to the nearest integer. They are then expected to have their values within the interval [0, ..., map_size - 1].
• resolution_in_cm : The resolution, in cm, of the map output from this function. Every grid square of the map corresponds to a (`resolution_in_cm` x `resolution_in_cm`) square in space.
• flip_row_col: Should the rows/cols of the map be flipped? See the 'Returns' section below for more info.

Returns

A collection of maps of shape (... x map_size x map_size x (len(bins)+1)). Note that `bin_axis` has been moved to the last index of this returned map; the other two axes stay in their original order unless `flip_row_col` is true, in which case they are reversed (useful as rows should often correspond to y or z instead of x).
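The binning can be pictured with `np.digitize`: each point's `bin_axis` value selects a channel, while its remaining two coordinates (divided by the map resolution) select a grid cell. A simplified NumPy sketch for a single (N x 3) point cloud binned along y (illustrative only; it ignores the batching and `flip_row_col` handling of the real function):

```python
import numpy as np

def project_to_map_sketch(xyz_points, bins, map_size, resolution_in_cm):
    """Bin an (N, 3) point cloud into a (map_size, map_size, len(bins)+1) map.

    Simplified NumPy sketch: bins along y, no batching, no row/col flipping.
    """
    # Which height channel each point falls into (values in 0 .. len(bins)).
    channel = np.digitize(xyz_points[:, 1], bins)
    # Convert x/z (in meters) to integer grid indices at the map resolution.
    xz = np.round(xyz_points[:, [0, 2]] / (resolution_in_cm / 100.0)).astype(int)
    world_map = np.zeros((map_size, map_size, len(bins) + 1))
    for (gx, gz), c in zip(xz, channel):
        if 0 <= gx < map_size and 0 <= gz < map_size:
            world_map[gx, gz, c] += 1  # count points per cell and channel
    return world_map
```

For instance, with `bins=[0.5, 1.5]` a point with y=0.2 lands in channel 0, y=1.0 in channel 1, and y=9.9 in channel 2, so the channels can be read as "below 0.5m", "0.5m to 1.5m", and "above 1.5m" occupancy counts.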