Homework 03 - ICP SLAM

TODO: Deadline: 24 March 2024, 23:59 CET (Monday labs) / 27 March 2024, 23:59 CET (Thursday labs)

Your task will be to finish an ICP SLAM node that provides the robot with basic localization and mapping.

Responsible lecturer: Martin Pecka (peckama2@fel.cvut.cz)

Simultaneous Localization and Mapping (SLAM)

When a robot explores an unknown environment, it has to simultaneously acquire a map of the environment and localize itself within the map, hence the name Simultaneous Localization and Mapping (SLAM). We will focus on the online SLAM problem in which we are interested only in the current robot pose and map given all previous measurements. In other words, past robot poses are not updated when new measurements arrive.

One can find many approaches both in literature and deployed in field. Main differences come from the sensors used (camera, lidar, sonar, etc.) and the assumptions taken (2D vs 3D, static vs dynamic world). A few of them can be seen in action below:

We will implement a ROS node similar to the ICP Mapper above but we will restrict ourselves to two dimensions, localizing and mapping only in horizontal x-y plane, as the TurtleBots in the labs are only equipped with 2D lidar and can operate only on a flat ground. However, it might be useful to do at least parts of the localization in 3D, because at least the real TurtleBots can tilt a bit when accelerating, which results in suboptimal localization performance when done solely in 2D.

The main building block for localization is an algorithm for alignment of two point clouds. Robot can be localized either by aligning each new input point cloud to some previous input point cloud (frame-to-frame) or to a map built from previously aligned measurements (frame-to-map).

Iterative Closest Point (ICP)

Optimal least-squares alignment $\mathtt{R}^*, \mathtt{t}^* = \underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum} \|\mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_i \|^2$ of corresponding points $\{(\mathtt{p}_i, \mathtt{q}_i)\}_i$, $\mathtt{p}_i \in \mathcal{P} \subseteq \mathbb{R}^3$, $\mathtt{q}_i \in \mathcal{Q} \subseteq \mathbb{R}^3$, can be found using the absolute orientation [1] algorithm described in the lecture.

The Iterative Closest Point (ICP) [2] algorithm provides an approximate solution when the correspondences $(\mathtt{p}_i, \mathtt{q}_i)$ are not known in advance. Instead, it solves the correspondence and alignment problem iteratively, with ad hoc correspondences $(\mathtt{p}_i, \mathrm{argmin}_{\mathtt{q}\in\mathcal{Q}}\|\mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}\|)$ using alignment $\mathtt{R}, \mathtt{t}$ from the previous iteration. In the first iteration, either an initial estimate from wheel odometry, or identity $\mathtt{R} = \mathtt{I}$, $\mathtt{t} = \mathtt{0}$, can be used.

Recall the ICP algorithm from the lecture. Note that we will provide the ICP algorithm with an initial estimate of the transform which does not have to be identity. This initial alignment is to be applied to the input points $\mathtt{p}_i$ prior to the nearest neighbor search.

ICP Algorithm Outline

Start with empty map $\mathcal{Q} \gets \{\}$.
For each new scan $\mathcal{P}$:
- Initial transformation $\mathtt{R}^\ast \gets \mathtt{R_0}, \mathtt{t}^\ast \gets \mathtt{t_0}$ (e.g., from wheel odometry)
- If map $\mathcal{Q}$ is empty, $\mathcal{Q} \gets \mathcal{P}$.
- While maximum number of iterations is not reached:
  - Establish correspondences mapping $\mathtt{c}^\ast = \underset{\mathtt{c}}{\mathrm{argmin}} \underset{i}{\sum} || \mathtt{R}^\ast\mathtt{p}_i + \mathtt{t}^\ast − \mathtt{q}_{\mathtt{c}(i)} ||^2$.
  - Let $d_i = \|\mathtt{p}_i - \mathtt{q}_{\mathtt{c}^\ast(i)}\|$ be the distance (loss) between corresponding points.
  - Compute inlier distance threshold $d_\mathrm{inl}$ to be the median (or a different quantile) of the distances $d_i$.
  - Construct set of inliers $\mathcal{C}' \gets \{i \mid d_i \le d_\text{inl} \}$.
  - If mean inlier error $\text{inl_err} = \underset{i\in\mathcal{C}'}{\sum} d_i / ||\mathcal{C}'||$ is not changing much, stop iterating, ICP has converged.
  - Solve absolute orientation $\mathtt{R}^\ast, \mathtt{t}^\ast \gets \underset{\mathtt{R}\in\mathcal{SO3},\mathtt{t}}{\mathrm{argmin}} \underset{i\in\mathcal{C}'}{\sum} || \mathtt{R}\mathtt{p}_i + \mathtt{t} − \mathtt{q}_{\mathtt{c}^\ast(i)} ||^2$ for the inliers.
- Update map $\mathcal{Q}$ (either just replace it with $\mathcal{P}$ (frame-to-frame), or add novel points from $\mathcal{P}$ to $\mathcal{Q}$ (frame-to-map).

There are two basic paradigms for ICP. Both work equally well, but you need to make sure you are not mixing them together:

T = T' = T0
for i:
  p = transform(T', p)
  inl = q.query(p)
  T' = absor(p[inl], q[inl])
  T = T * T'

or

T = T0
for i:
  p_aligned = transform(T, p)
  inl = q.query(p_aligned)
  T = absor(p[inl], q[inl])  # do NOT use p_aligned!

Assignment

You should already have the ARO ROS environment installed from Lab 01. All the necessary files are already there.
Find places in aro_exploration/src/aro_slam/icp.py with comment TODO HW 3: or FIXME: and fill in the missing pieces so that the ICP SLAM works. There are several functionalities you can get points for.
Test your implementation both using the tests described in Visualization and Testing and also interactively in simulation or on the real robots. The Brute testing dataset contains several difficult instances which will really test the quality of your implementation. Good implementation will achieve localization accuracy about 5 cm on nice worlds, while it may shoot to several meters on ICP-unfriendly worlds.
Also make sure that the callback for one scan is fast enough when using the simulator, usually under 150 ms (you can see the ICP duration printed into console when debug_slam is true).

Evaluation

Absolute Orientation in SE(2) (1 point, required): Finish implementation of the absolute_orientation() function in file aro_exploration/src/aro_slam/icp.py. The function should work in SE(2) domain as described in the lecture.
Absolute Orientation in SE(3) (1 point): Finish implementation of the absolute_orientation() function in file aro_exploration/src/aro_slam/icp.py. The function should work in SE(3) domain as described in the lecture.
Point-to-Point ICP (1 point, required): Finish implementation for point-to-point loss in the icp() function in file aro_exploration/src/aro_slam/icp.py so that the ICP alignment works.
Point-to-Plane ICP (1 point): Add support for point-to-plane loss to the icp() function as described in the lecture.
Mapping with Frame-to-Map Alignment (1 point): Finish implementation for frame_to_map alignment in the update_map() function in file aro_exploration/src/aro_slam/icp.py so that a map of the environment is built and scans are aligned to the map in icp().
1. Add only the points which are at least mapping_min_dist (= node parameter) meters far from the current points in the map.
2. Visualize the /map topic in RViz to see the computed map.

Submission Instructions

Run script aro_exploration/homeworks/create_hw_03_package.sh to create an archive for Brute. Upload the ZIP to Brute. Your icp.py file will be used on a set of static datasets to find out it works correctly. Only the icp.py and the YAML config will be used from your codebase - all other files will be ignored.

Relevant files

These files are relevant for the ICP SLAM. It should be enough to edit just icp.py, but you might also want to change the YAML config. Other files should not be modified.

aro_exploration/config/slam.yaml contains various configurations of the package. You can edit this file to change the default loss, absolute orientation domain, or choosed frame-to-frame or frame-to-map alignment mode.
aro_exploration/launch/slam provides convenient launchers for the nodes below:
- aro_slam.launch launches the ICP SLAM nodes.
- aro_slam_sim.launch launches everything needed for interactive testing with Gazebo and ROS.
aro_exploration/nodes/slam/icp_slam contains the ROS node that uses icp.py module and feeds it with data coming from ROS.
aro_exploration/src/aro_slam/icp.py contains implementation of ICP SLAM. This is the file you should edit.

Visualization and Testing

Everything related to this homework has to be done inside the provided Singularity container and with your ARO workspace sourced, i.e. after calling deploy/scripts/start_singularity_aro.

You can directly run the Python module using python3 aro_exploration/src/aro_slam/icp.py (or by launching it from IDE as a script). This will run a few demonstrations which will test your code.

You can also run python3 -m unittest aro_exploration/src/aro_slam/icp.py (or instruct the IDE to run unit tests). There are some unit tests your code will be tested with. Passing these tests is mandatory (but not sufficient) for getting points for the tested functionalities.

You can also run roslaunch aro_exploration aro_slam_sim.launch world:=aro_maze_1 to test your code interactively in various simulated worlds.

To use your random walking algorithm from HW 02, call the roslaunch like this: roslaunch aro_exploration aro_slam_sim.launch world:=aro_maze_1 reactive_control:=true

The launch file has also parameters loss, alignment and absorient_domain which you can use to easily change these important algorithm parameters when launching the simulation.

The launch file has also argument debug_slam which enables debug (green) messages in the console from the icp_slam node. This argument is set to true when launching aro_slam_sim.launch and defaults to false when the ICP SLAM is launched by later homeworks or the semestral work. However, you are free to pass debug_slam:=true even to these later homeworks to see how the ICP SLAM is doing. If, in your code, you use loginfo() or logwarn() to log some information, it will be shown regardless of the setting of this argument. Only logdebug() prints are affected.

See video for an example run with the robot.

Difficult worlds

To test the localization on a world where ICP odometry may fail horribly, test with world:=aro_hallway_1 and loss:=point_to_plane ICP alignment. There is not much ICP can do about that. We'll solve that with Factorgraph-based localization in the next homework! More worlds for testing can be found in aro_sim/worlds.

Gamepad teleoperation

If you have a gamepad that appears as `/dev/js0` device in Linux, you should be able to use it to control the simulated robot. Keep button A pressed all the time and use left thumbstick to control the robot. XBox gamepads can be used with the xow or xone drivers (do not mix both!).

It can happen that your gamepad model is not recognized. In such case, the console would get spammed by a lot of error messages. To get rid of them, disable gamepad control by adding roslaunch argument joy_teleop:=false.

Keyboard teleoperation

If you add roslaunch argument keyboard_teleop:=true, it opens a window in which you can control the robot using WXAD keys to test your algorithm.

RViz visualization

The launch file aro_exploration/launch/slam/aro_slam_sim.launch starts also a preconfigured RViz window that shows the current ICP and occupancy map, estimated surface normals etc. ICP Odom Markers visualize $\mathtt{p}$ and $\mathtt{R}\mathtt{q} + \mathtt{t}$ clouds and how does the correspondence matching work.

In the top left corner, you can also see a history of the ICP localization error. The error history is also saved to file /tmp/tf_metrics.csv.

Real robot caveats

There are a few differences between the simulator and the real robot:

The real robot tilts back a bit when accelerating. This can corrupt the ICP result if done only in SE(2).
Quality of the wheel-IMU odometry used for ICP initialization differs between the simulation and the real robot.
In simulation, ground truth position is available for free. To get something similar for the real robot, a complicated setup of the room would be required.

Task Details

Nearest Neighbor Search

The nearest neighbor search has in general complexity $O(d N)$, for $d$ dimensions and $N$ points. However, KD-trees are an efficient data structure that offers a lookup in $O(\log N)$ time. We pay for this speedup by $O(d N\log N)$ build time of the tree. But if there are much more lookups than rebuilds, it really pays off. And this is exactly the case of ICP.

In Python, be sure to use class scipy.spatial.cKDTree and not just scipy.spatial.KDTree. The KDTree class is much slower than its cKDTree alternative. The cKDTree class has method query(p) which returns distances and indices of the nearest neighbors in the tree. This output can be directly used in the ICP implementation as $d_i$ and $\mathtt{c}^\ast(i)$.

Point Cloud Representation

Serialized and transmitted as sensor_msgs/PointCloud2 ROS messages.
Convenient representations in Python:
- unstructured matrix array, which allows easy transformations of point positions or normal vectors, or
- structured array with several channels, providing full representation (without ROS header).

API:

import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured, unstructured_to_structured
from ros_numpy import msgify, numpify
from sensor_msgs.msg import PointCloud2
# ...
input_struct = numpify(input_msg)
position = structured_to_unstructured(input_struct[['x', 'y', 'z']])
normal = structured_to_unstructured(input_struct[['normal_%s' % f for f in 'xyz']])
# ...
output_msg = msgify(PointCloud2, input_struct)
output_msg.header.frame_id = 'map'
output_msg.header.stamp = rospy.Time.now()

Outlier Rejection

When the robot is exploring an unknown environment, some measured points correspond to the previously unseen areas and for these points true correspondences cannot be found. In some cases, such as going through a narrow opening such as a door, the relative overlap can easily decrease almost to a half. To handle incomplete overlaps between the point clouds, we will employ the following outlier rejection strategy in the ICP algorithm.

Let $\mathtt{c}^\ast(i)$ be the set of correspondences from the nearest neighbor search and $d_i$ the distances between corresponding points, $d_i = \|\mathtt{p}_i - \mathtt{q}_i\|$. Inlier distance threshold $d_\mathrm{inl}$ can be adjusted adaptively using $p$-quantile of the distances, denoted $Q_d(p)$ (i.e. $Q_{0.5}(p)$ is median): $d_\mathrm{inl} = \min(r Q_d(p), d_\mathrm{max})$ where $r$ is a multiplier and $d_\mathrm{max}$ the maximum allowed inlier distance.

In our icp_slam node, these constants correspond to the following parameters:

$p = \text{inlier_ratio}$,
$r = \text{inlier_dist_mult}$, and
$d_\mathrm{max} = \text{max_inlier_dist}$.

Point-to-Plane Loss

In many situations, it is more precise to minimize the point-to-plane loss (see the lecture) instead of point-to-point:

Instead of $\underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum} \| \mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_i\|^2$, compute $\underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum}(\mathtt{n}_i^\intercal( \mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_i))^2$.

This formulation has no closed-form solution, but we can rephrase it as $\underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum} \| \mathtt{R}\mathtt{p}_i + \mathtt{t} - \hat{\mathtt{q}}_i\|^2$ where $\hat{\mathtt{q}}_i = \mathtt{p}_i - \mathtt{n}_i \ast \mathtt{n}_i^\intercal(\mathtt{p}_i - \mathtt{q}_i)$ is the point on plane $\mathtt{n}_i^\intercal (\mathtt{p}_i - \mathtt{q}_i) = 0$ closest to $\mathtt{R} \mathtt{p}_i + \mathtt{t}$ from the previous iteration (so that the absolute_orientation function can be used).

The new formulation is nothing but application of absolute orientation, only to a slightly different set of point $\{\mathtt{p}_i, \hat{\mathtt{q}}_i\}$ instead of $\{\mathtt{p}_i, \mathtt{q}_i\}$.

If you explicitly compute a distance of (p-q) projected onto the normal, you usually get signed distance. Make sure you turn it into unsigned distance when passing it to the rest of the code.

Occupancy Mapping

Localization is used to build occupancy grid map for frontier detection and path planning from registered scans. Occupancy mapping is implemented in a separate node, occupancy_mapper, which is provided and need not be changed.

The occupancy model of each grid cell $b_i$ is updated with every scan: the cells containing the measured points have their probability of being occupied increased and other cells incident with the rays from the sensor towards these points have their probability of being occupied decreased.

Additive updates take form of $\Delta b_i = \log \frac{p(o_i|\mathtt{z}_{t})}{p(\neg o_i|\mathtt{z}_{t})} + \log \frac{p(\neg o_i)}{p(o_i)} $. See the lecture.

The occupancy values are clipped to interval $[b_\mathrm{low}, b_\mathrm{high}]$ to avoid accumulating overconfident occupancy belief and allow to handle dynamic objects. We use unit updates $\Delta b_i \in \{-1, 1\}$ and let parameters $b_\text{low}$, $b_\text{high}$, $b_\text{occupied}$ be tuned to get the required trade-off between noise removal and response time to dynamic changes in the environment.

Major limitation of the provided implementation is the fact that the scene is discretized to cells of a given size and spatial distribution of surfaces within the cells are not modeled. As a result, a ray with high angle of incidence may pass through many cells, which it marks as empty, before it reaches the measured cell, and the previously occupied cells it passes through may incorrectly switch to empty state. A workaround in frontier detection may be needed to deal with such cases.

References

[1] K. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d point sets,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. PAMI-9, no. 5, pp. 698–700, 1987.

[2] Y. Chen and G. Medioni, “Object modelling by registration of multiple range images,” Image and Vision Computing, vol. 10, no. 3, pp. 145–155, 1992.

Table of Contents