Homework 04 - ICP SLAM

Simultaneous Localization and Mapping (SLAM)

When a robot explores an unknown environment, it has to simultaneously acquire a map of the environment and localize itself within the map, hence the name Simultaneous Localization and Mapping (SLAM). We will focus on the online SLAM problem in which we are interested only in the current robot pose and map given all previous measurements. In other words, past robot poses are not updated when new measurements arrive.

One can find many approaches both in literature and deployed in field. Main differences come from the sensors used (camera, lidar, sonar, etc.) and the assumptions taken (2D vs 3D, static vs dynamic world). A few of them can be seen in action below:

We will implement a ROS node similar to the ICP Mapper above but we will restrict ourselves to two dimensions, localizing and mapping only in horizontal x-y plane, as the TurtleBots in the labs are only equipped with 2D lidar and can operate only on a flat ground.

The main building block for localization is an algorithm for alignment of two point clouds. Robot can be localized either by aligning each new input point cloud to some previous input point cloud (frame-to-frame) or to a map built from previously aligned measurements (frame-to-map).

Iterative Closest Point (ICP)

Optimal least-squares alignment $\mathtt{R}^*, \mathtt{t}^* = \mathrm{argmin}_{\mathtt{R}, \mathtt{t}} \sum_i \|\mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}_i \|^2$ of corresponding points $\{(\mathtt{x}_i, \mathtt{y}_i)\}_i$, $\mathtt{x}_i \in \mathcal{X} \subseteq \mathbb{R}^D$, $\mathtt{y}_i \in \mathcal{Y} \subseteq \mathbb{R}^D$, can be found using the absolute orientation [1] algorithm described in the lecture.

The Iterative Closest Point (ICP) [2] algorithm provides an approximate solution when the correspondences $(\mathtt{x}_i, \mathtt{y}_i)$ are not known in advance. Instead, it solves the correspondence and alignment problem iteratively, with ad hoc correspondences $(\mathtt{x}_i, \mathrm{argmin}_{\mathtt{y}\in\mathcal{Y}}\|\mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}\|)$ using alignment $\mathtt{R}, \mathtt{t}$ from the previous iteration. In the first iteration, an initial estimate or identity $\mathtt{R} = \mathtt{I}$, $\mathtt{t} = \mathtt{0}$ can be used.

Recall the ICP algorithm from the lecture. Note that we will provide the ICP algorithm with an initial estimate of the transform which does not have to be identity. This initial alignment is to be applied to the input points $\mathtt{x}_i$ prior the nearest neighbor search.

Algorithm Outline

While not done
- Establish correspondences $\mathcal{C} \gets \{(\mathtt{x}, \mathrm{argmin}_{\mathtt{y}\in\mathcal{Y}}\|\mathtt{R}\mathtt{x} + \mathtt{t} - \mathtt{y}\|) \mid \mathtt{x} \in \mathcal{X} \}$.
- Compute inlier distance threshold $d_\mathrm{inl}$ and let $d_i = \|\mathtt{x}_i - \mathtt{y}_i\|$ be the distance between the corresponding points.
- Construct set of inliers $\mathcal{C}' \gets \{(\mathtt{x}_i, \mathtt{y}_i) \mid (\mathtt{x}_i, \mathtt{y}_i) \in \mathcal{C} \land d_i \le d_\text{inl} \}$.
- Solve absolute orientation $\mathtt{R}, \mathtt{t} \gets [1](\mathcal{C}')$ for the inliers.

Outlier Rejection

When the robot is exploring an unknown environment, some measured points correspond to the previously unseen areas and for these points true correspondences cannot be found. In some cases, such as going through a narrow opening such as a door, the relative overlap can easily decrease almost to a half. To handle incomplete overlaps between the point clouds, we will employ the following outlier rejection strategy in the ICP algorithm.

Let $\mathcal{C} = \{(\mathtt{x}_i$, $\mathtt{y}_i)\}_i$ be the set of correspondences from the nearest neighbor search and $d_i$ the distance between corresponding points, $d_i = \|\mathtt{x}_i - \mathtt{y}_i\|$. Inlier distance threshold can be adjusted adaptively using $p$-quantile of the distances $d_i$, $Q_d(p)$: $d_\mathrm{inl} = \min(r Q_d(p), d_\mathrm{max})$ where $r$ is a multiplier and $d_\mathrm{max}$ the maximum allowed inlier distance.

In our icp_slam node, these constants corresponds to the following parameters:

$p =$ inlier_ratio,
$r =$ inlier_dist_mult, and
$d_\mathrm{max} =$ max_inlier_dist.

Improvements

Multi-channel input point clouds $\mathcal{X}$ and $\mathcal{Y}$.
- E.g., $\mathcal{X} = \{ x_i \}_i, x_i = (\mathtt{x}_i,\mathtt{n}_i, \ldots)$.
Establish correspondences using a generic descriptor:
- $\mathcal{C} \gets \{ (x, \mathrm{argmin}_{y\in\mathcal{Y}} \|\mathrm{desc}(\mathtt{T}(\mathcal{x})) - \mathrm{desc}(y) \| ) \mid x \in \mathcal{X} \}$,
  where descriptor function $\mathrm{desc}: \mathcal{X}^N \mapsto \mathbb{R}^{N \times D}$ produces $D$-dimensional descriptor vectors suitable for nearest neighbor search.
Allow minimizing point-to-plane loss (see the lecture):
- Instead of $\mathrm{argmin}_{\mathtt{R}, \mathtt{t}} \sum_i \| \mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}_i\|^2$, find $\mathrm{argmin}_{\mathtt{R}, \mathtt{t}} \sum_i (\mathtt{n}_i^\intercal( \mathtt{R}\mathtt{x}_i + \mathtt{t} - \underline{\mathtt{y}}_i))^2$ where $\underline{\mathtt{y}}_i$ is the point on plane $\mathtt{n}_i^\intercal (\mathtt{x} - \mathtt{y}_i) = 0$ closest to $\mathtt{R} \mathtt{x}_i + \mathtt{t}$ from the previous iteration (so that the absolute_orientation function can be used).

Occupancy Mapping

Localization is used to build occupancy grid map for frontier detection and path planning from registered scans. Occupancy mapping is implemented in a separate node, occupancy_mapper, which is provided and need not be changed.

The occupancy model of each grid cell $b_i$ is updated with every scan: the cells containing the measured points have their probability of being occupied increased and other cells incident with the rays from the sensor towards these points have their probability of being occupied decreased.

Additive updates take form of $\Delta b_i = \log \frac{p(o_i|\mathtt{z}_{t})}{p(\neg o_i|\mathtt{z}_{t})} + \log \frac{p(\neg o_i)}{p(o_i)} $. See the lecture.

The occupancy values are clipped to interval $[b_\mathrm{low}, b_\mathrm{high}]$ to avoid accumulating overconfident occupancy belief and allow to handle dynamic objects. We use unit updates $\Delta b_i \in \{-1, 1\}$ and let parameters $b_\text{low}$, $b_\text{high}$, $b_\text{occupied}$ be tuned to get the required trade-off between noise removal and response time to dynamic changes in the environment.

Major limitation of the provided implementation is the fact that the scene is discretized to cells of a given size and spatial distribution of surfaces within the cells are not modeled. As a result, a ray with high angle of incidence may pass through many cells, which it marks as empty, before it reaches the measured cell, and the previously occupied cells it passes through may incorrectly switch to empty state. A workaround in frontier detection may be needed to deal with such cases.

Prerequisities

TF Transforms

TF transforms between reference frames are organized in a directed tree and distributed on topics /tf and /tf_static.
Reference frames are identified by IDs (msg.header.frame_id).
Each transform comes with a timestamp. API allows interpolating transforms based on these timestamps.
Common tree structure for a mobile robot is
- map ← odom ← base_footprint ← … if SLAM corrects an already provided odometry, or
- map ← base_footprint ← … if SLAM provides complete localization.
- Additional frames may follow, with the sensors typically being at leaves.

API:

tf_buffer = tf2_ros.Buffer()
tf_sub = tf2_ros.TransformListener(tf_buffer)
tf_pub = tf2_ros.TransformBroadcaster()
# ...
try:
    transform = tf_buffer.lookup_transform(...)
    transform = tf_buffer.lookup_transform_full(...)
except tf2_ros.TransformException as ex:
    # ...

Debugging tools
- rosrun rqt_tf_tree rqt_tf_tree
Limitations of tf2_ros.Buffer
- Parent frame cannot change.
- Transforms cannot change at the given timestamp.

Point Cloud Representation

Serialized and transmitted as ''sensor_msgs/PointCloud2'' ROS messages.
Convenient representations in Python:
- unstructured matrix array, which allows easy transformations of point positions or normal vectors, or
- structured array with several channels, providing full representation (without ROS header).

API:

import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured, unstructured_to_structured
from ros_numpy import msgify, numpify
from sensor_msgs.msg import PointCloud2
# ...
input = numpify(input_msg)
position = structured_to_unstructured(input[['x', 'y', 'z']])
normal = structured_to_unstructured(input[['normal_%s' % f for f in 'xyz']])
# ...
map_mgs = msgify(PointCloud2, map)
msg.header.frame_id = 'map'
msg.header.stamp = ros.Time.now()

Nearest Neighbor Search

Nearest neighbors can efficiently be found using KD-tree and its query method.

Assignment

Workspace Configuration

Enter Singularity shell:

singularity shell deploy/images/robolab_noetic.simg

or

deploy/scripts/start_singularity_aro

Configure workspace with the provided aro_slam package. Note, that if you already have an existing workspace from previous assignments, just replace the aro_slam package and rebuild the workspace:

ws=~/workspace/aro
mkdir -p "${ws}/src"
cd "${ws}/src"
curl -sSL https://cw.fel.cvut.cz/b222/_media/courses/aro/tutorials/hw_04_pkgs.zip -O
unzip hw_04_pkgs.zip
curl -sSL https://cw.fel.cvut.cz/b222/_media/courses/aro/tutorials/aro_slam.zip -O
unzip aro_slam.zip
wstool init
wstool merge aro_slam/dependencies.rosinstall
wstool up -j 4
cd "${ws}"
catkin init
catkin config --extend /opt/ros/aro
catkin config --cmake-args -DCMAKE_BUILD_TYPE=Release
catkin build -c

Package Structure

Get familiar with the package. The structure of the package is as follows:

launch/ provides convenient launchers for the nodes below:
- aro_slam.launch launches the whole SLAM pipeline, including visualization,
- bag.launch launches the pipeline along with a bag file(s) to provide inputs, and
- sim.launch launches the pipeline along with the Gazebo simulator.
scripts/ contains executable nodes or scripts:
- cloud_filter node filters point clouds and estimates surface normals,
- icp_slam node provides robot localization,
- occupancy_mapper creates the occupancy map to be later used in planning and eventually also for evaluation of exploration,
- cloud_color node adds color attributes to points in a cloud by projecting pixel values from cameras images, and
- tf_metrics allows evaluating accuracy of localization.
src/aro_slam/ contains Python package with the following modules:
- cloud.py contains helpers for handling point clouds,
- icp.py contains implementation of absolute orientation and ICP,
- io.py provides support for reading and writing Python objects,
- occupancy.py provides support for occupancy mapping, and
- utils.py contains common utility functions.
CMakeLists.txt describes package build.
dependencies.rosinstall lists dependencies which can directly be added into the workspace.
package.xml contains the Catkin package description.

Package Usage

Don't forget to source your workspace:

source "${ws}/devel/setup.bash"

To test the icp_slam node in simulator:

roslaunch aro_slam sim.launch world:=aro_maze_1 keyboard:=true

It also runs keyboard (and joypad) controls, so you can manually move the robot using keyboard (keys WASDX) if the terminal is focused. Worlds to test: stage_1, stage_2, stage_3, stage_4, world, aro_maze_1, aro_maze_2.

Or, launch the SLAM pipeline alone (no controls):

roslaunch aro_slam aro_slam.launch

See the launch files for possible arguments.

In order to debug your implementations of Absolute Orientation and ICP algorithms:

python -m aro_slam.icp

Absolute Orientation (2 points)

Implement (and test) absolute orientation algorithm for alignment with known correspondences in the absolute_orientation function in file aro_slam/src/aro_slam/icp.py:

in SE(2) domain as described in the lecture (1 point),

in SE(3) domain as described in the lecture (1 point).

Feel free to use absorient_demo function to test your implementation.

Point-to-Point ICP (1 point)

Implement (and test) the ICP algorithm for alignment without known correspondences using position descriptors and point-to-point loss in the icp function in file aro_slam/src/aro_slam/icp.py.

Feel free to use icp_demo function to test your implementation with point cloud scans recorded in the FEE corridor at Charles Square.

Test and debug above functionality within the icp_slam node.

Point-to-Plane ICP (1 point)

Add point-to-plane loss to the icp function as described in the lecture.

Mapping with Frame-to-Map Alignment (1 point)

Build a map from registered measurements and use it as a reference for aligning the input point clouds, instead of a previous key frame. Add only the points which are at least mapping_min_dist (= node parameter) meters far from the current points in the map. Implement this functionality in the icp_slam node itself (file path aro_slam/scripts/icp_slam).

A possible way to implement the frame-to-map alignment:

For all points in the current scan (defined as a structured array in x_struct variable) find correspondences in the point cloud map.
Select only “new points” from the current scan. These points should not have a close enough neighbor (defined by the mapping_min_dist parameter).
Concatenate the “new points” with the map.
Convert the map to ROS message (of type PointCloud2) to be published.

You can test your implementation by launching the simulator with alignment argument:

   roslaunch aro_slam sim.launch alignment:=frame_to_map keyboard:=true world:=aro_maze_1

Move the robot around and make sure that the history of point cloud scans is being stored. Visualize the /map topic in RViz to see the result.

Submission Instructions

Compress the whole aro_slam package with your implementation and results excluding bag files into a zip archive and upload it into Brute.

The implementation should be based on the provided ''aro_slam'' package template.

See video for an example run with the robot.

References

[1] K. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d point sets,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. PAMI-9, no. 5, pp. 698–700, 1987.

[2] Y. Chen and G. Medioni, “Object modelling by registration of multiple range images,” Image and Vision Computing, vol. 10, no. 3, pp. 145–155, 1992.

Table of Contents