====== Homework 03 - ICP SLAM ======

**Deadline:** 24 March 2024, 23:59 CET (Monday labs) / 27 March 2024, 23:59 CET (Thursday labs)

Your task will be to finish an ICP SLAM node that provides the robot with basic localization and mapping.

**Responsible lecturer:** Martin Pecka ([[mailto:peckama2@fel.cvut.cz?subject=ARO HW03|peckama2@fel.cvut.cz]])

====== Simultaneous Localization and Mapping (SLAM) ======

When a robot explores an unknown environment, it has to simultaneously acquire a map of the environment and localize itself within the map, hence the name //Simultaneous Localization and Mapping// (SLAM). We will focus on the //online SLAM problem//, in which we are interested only in the current robot pose and map given all previous measurements. In other words, past robot poses are not updated when new measurements arrive.

One can find many approaches both in the literature and deployed in the field. The main differences come from the sensors used (camera, lidar, sonar, etc.) and the assumptions taken (2D vs 3D, static vs dynamic world). A few of them can be seen in action below:

  * [[https://www.youtube.com/watch?v=F8pdObV_df4&t=8s|2D lidar + IMU, indoor (Hector)]]
  * [[https://www.youtube.com/watch?v=qpTS7kg9J3A|Stereo, outdoor (RTAB-Map)]]
  * [[https://www.youtube.com/watch?v=08GTGfNneCI&t=8s|3D lidar, indoor (BLAM)]]
  * [[https://www.youtube.com/watch?v=cMgLyLpnsoU&t=19s|3D lidar, outdoor with dynamic obstacles (ICP Mapper)]]
  * [[https://www.youtube.com/watch?v=9FhKgAEQTOg&t=40s|3D lidar, outdoor + indoor (KISS-ICP)]]

We will implement a ROS node similar to the //ICP Mapper// above, but we will restrict ourselves to two dimensions, localizing and mapping only in the horizontal //x-y// plane, as the TurtleBots in the labs are only equipped with a 2D lidar and can operate only on flat ground. However, it might be useful to do at least parts of the localization in 3D, because the real TurtleBots can tilt a bit when accelerating, which results in suboptimal localization performance when done solely in 2D.

The main building block for localization is an algorithm for the alignment of two point clouds. The robot can be localized either by aligning each new input point cloud to some previous input point cloud (//frame-to-frame//) or to a map built from previously aligned measurements (//frame-to-map//).

====== Iterative Closest Point (ICP) ======

The optimal least-squares alignment $\mathtt{R}^*, \mathtt{t}^* = \underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum} \|\mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_i \|^2$ of corresponding points $\{(\mathtt{p}_i, \mathtt{q}_i)\}_i$, $\mathtt{p}_i \in \mathcal{P} \subseteq \mathbb{R}^3$, $\mathtt{q}_i \in \mathcal{Q} \subseteq \mathbb{R}^3$, can be found using the //absolute orientation// [1] algorithm described in [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/00_absolute_orientation.pdf#page=1|the lecture]].

The //Iterative Closest Point// (ICP) [2] algorithm provides an approximate solution when the correspondences $(\mathtt{p}_i, \mathtt{q}_i)$ are not known in advance. Instead, it solves the correspondence and alignment problems iteratively, with ad hoc correspondences $(\mathtt{p}_i, \mathrm{argmin}_{\mathtt{q}\in\mathcal{Q}}\|\mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}\|)$ using the alignment $\mathtt{R}, \mathtt{t}$ from the previous iteration. In the first iteration, either an initial estimate from wheel odometry or the identity $\mathtt{R} = \mathtt{I}$, $\mathtt{t} = \mathtt{0}$ can be used.
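For reference, below is a minimal NumPy sketch of the SVD-based absolute orientation solution from [1], written for a generic dimension $d$ so it covers both the 2D and 3D cases. The graded ''absolute_orientation()'' in ''icp.py'' has a different interface, so treat this only as an illustration of the algorithm, not as a drop-in solution:

<code python>
import numpy as np

def absolute_orientation(p, q):
    """Least-squares R, t aligning points p (N, d) to q (N, d),
    returned as a (d+1)x(d+1) homogeneous transformation matrix."""
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    H = (p - p_mean).T @ (q - q_mean)        # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    s = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    D = np.diag([1.0] * (p.shape[1] - 1) + [s])
    R = Vt.T @ D @ U.T                       # optimal rotation
    t = q_mean - R @ p_mean                  # optimal translation
    T = np.eye(p.shape[1] + 1)
    T[:-1, :-1], T[:-1, -1] = R, t
    return T
</code>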
Recall the ICP algorithm from the [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/01_icp.pdf#page=1|lecture]]. Note that we will provide the ICP algorithm with an initial estimate of the transform, which does not have to be the identity. This initial alignment is to be applied to the input points $\mathtt{p}_i$ prior to the nearest neighbor search.

===== ICP Algorithm Outline =====

  * Start with an empty map $\mathcal{Q} \gets \{\}$.
  * For each new scan $\mathcal{P}$:
    * Initial transformation $\mathtt{R}^\ast \gets \mathtt{R_0}, \mathtt{t}^\ast \gets \mathtt{t_0}$ (e.g., from wheel odometry).
    * If the map $\mathcal{Q}$ is empty, $\mathcal{Q} \gets \mathcal{P}$.
    * While the maximum number of iterations is not reached:
      * Establish the correspondence mapping $\mathtt{c}^\ast = \underset{\mathtt{c}}{\mathrm{argmin}} \underset{i}{\sum} \| \mathtt{R}^\ast\mathtt{p}_i + \mathtt{t}^\ast - \mathtt{q}_{\mathtt{c}(i)} \|^2$.
      * Let $d_i = \|\mathtt{R}^\ast\mathtt{p}_i + \mathtt{t}^\ast - \mathtt{q}_{\mathtt{c}^\ast(i)}\|$ be the distance (loss) between corresponding points.
      * Compute the inlier distance threshold $d_\mathrm{inl}$ as the median (or a different quantile) of the distances $d_i$.
      * Construct the set of inliers $\mathcal{C}' \gets \{i \mid d_i \le d_\mathrm{inl} \}$.
      * If the mean inlier error $\mathrm{inl\_err} = \underset{i\in\mathcal{C}'}{\sum} d_i / |\mathcal{C}'|$ is not changing much, stop iterating; ICP has converged.
      * Solve the absolute orientation $\mathtt{R}^\ast, \mathtt{t}^\ast \gets \underset{\mathtt{R}\in\mathrm{SO}(3),\mathtt{t}}{\mathrm{argmin}} \underset{i\in\mathcal{C}'}{\sum} \| \mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_{\mathtt{c}^\ast(i)} \|^2$ for the inliers.
    * Update the map $\mathcal{Q}$ (either just replace it with $\mathcal{P}$ (//frame-to-frame//), or add novel points from $\mathcal{P}$ to $\mathcal{Q}$ (//frame-to-map//)).

There are two basic paradigms for implementing the ICP loop. Both work equally well, but you need to make sure you are not mixing them together:

<code python>
# Paradigm 1: accumulate incremental transforms; the points are
# repeatedly overwritten by the newest partial alignment.
T = T_delta = T0
for i in range(max_iters):
    p = transform(T_delta, p)        # p is updated in place each iteration
    inl = q.query(p)                 # correspondences + inlier selection
    T_delta = absor(p[inl], q[inl])  # incremental alignment of the updated p
    T = T_delta * T                  # compose; left-multiplied, as p' = T @ p
</code>

or

<code python>
# Paradigm 2: recompute the full transform from the original points.
T = T0
for i in range(max_iters):
    p_aligned = transform(T, p)      # p itself is never overwritten
    inl = q.query(p_aligned)         # correspondences + inlier selection
    T = absor(p[inl], q[inl])        # do NOT use p_aligned here!
</code>
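To tie the outline and the second paradigm together, here is a minimal frame-to-frame sketch, reusing the ''absolute_orientation()'' sketch above. The names, defaults, and the convergence test are illustrative and differ from the template in ''icp.py'':

<code python>
import numpy as np
from scipy.spatial import cKDTree

def icp(p, q, T0, max_iters=100, inlier_ratio=0.5, tol=1e-6):
    """Align scan p (N, d) to map q (M, d); T0 is an initial guess
    as a (d+1)x(d+1) homogeneous matrix."""
    T = T0
    tree = cKDTree(q)               # built once; q does not change here
    prev_err = np.inf
    for _ in range(max_iters):
        # Always transform the ORIGINAL points by the current full T.
        p_aligned = p @ T[:-1, :-1].T + T[:-1, -1]
        d, c = tree.query(p_aligned)
        # Adaptive inlier threshold: a quantile of the distances.
        inl = d <= np.quantile(d, inlier_ratio)
        err = d[inl].mean()
        if abs(prev_err - err) < tol:
            break                   # mean inlier error stopped improving
        prev_err = err
        # Absolute orientation on the original inlier points (not p_aligned).
        T = absolute_orientation(p[inl], q[c[inl]])
    return T
</code>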
====== Assignment ======

  - You should already have the ARO ROS environment installed from [[courses:aro:tutorials:lab01|Lab 01]]. All the necessary files are already there.
  - Find places in ''aro_exploration/src/aro_slam/icp.py'' with the comment ''TODO HW 3:'' or ''%%FIXME:%%'' and fill in the missing pieces so that the ICP SLAM works. There are several functionalities you can get points for.
  - Test your implementation both using the tests described in [[homework03#visualization_and_testing|Visualization and Testing]] and interactively in simulation or on the real robots. The Brute testing dataset contains several difficult instances that will thoroughly test the quality of your implementation. A good implementation will achieve a localization accuracy of about 5 cm in friendly worlds, while the error may grow to several meters in ICP-unfriendly worlds.
  - Also make sure that the callback for one scan is fast enough when using the simulator, usually under 150 ms (you can see the ICP duration printed to the console when ''debug_slam'' is true).

===== Evaluation =====

  - **Absolute Orientation in SE(2) (1 point, required):** Finish the implementation of the ''absolute_orientation()'' function in file ''aro_exploration/src/aro_slam/icp.py''. The function should work in the SE(2) domain as described in [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/00_absolute_orientation.pdf#page=23|the lecture]].
  - **Absolute Orientation in SE(3) (1 point):** Finish the implementation of the ''absolute_orientation()'' function in file ''aro_exploration/src/aro_slam/icp.py''. The function should work in the SE(3) domain as described in [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/00_absolute_orientation.pdf#page=24|the lecture]].
  - **Point-to-Point ICP (1 point, required):** Finish the implementation of the point-to-point loss in the ''icp()'' function in file ''aro_exploration/src/aro_slam/icp.py'' so that the ICP alignment works.
  - **Point-to-Plane ICP (1 point):** Add support for the point-to-plane loss to the ''icp()'' function as described in [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/01_icp.pdf#page=27|the lecture]].
  - **Mapping with Frame-to-Map Alignment (1 point):** Finish the implementation of //frame_to_map// alignment in the ''update_map()'' function in file ''aro_exploration/src/aro_slam/icp.py'' so that a map of the environment is built and scans are aligned to the map in ''icp()''.
    * Add only the points which are at least //mapping_min_dist// (a node parameter) meters far from the current points in the map.
    * Visualize the ''/map'' topic in RViz to see the computed map.

===== Submission Instructions =====

Run the script ''aro_exploration/homeworks/create_hw_03_package.sh'' to create an archive for Brute. Upload the ZIP to Brute. Your ''icp.py'' file will be run on a set of static datasets to verify that it works correctly. Only ''icp.py'' and the YAML config will be used from your codebase - all other files will be ignored.

===== Relevant files =====

These files are relevant for the ICP SLAM. It should be enough to edit just ''icp.py'', but you might also want to change the YAML config. Other files should not be modified.

  * ''aro_exploration/config/slam.yaml'' contains various configurations of the package. You can edit this file to change the default loss, the absolute orientation domain, or to choose the frame-to-frame or frame-to-map alignment mode.
  * ''aro_exploration/launch/slam'' provides convenient launchers for the nodes below:
    * ''aro_slam.launch'' launches the ICP SLAM nodes.
    * ''aro_slam_sim.launch'' launches everything needed for interactive testing with Gazebo and ROS.
  * ''aro_exploration/nodes/slam/icp_slam'' contains the ROS node that uses the ''icp.py'' module and feeds it with data coming from ROS.
  * ''aro_exploration/src/aro_slam/icp.py'' contains the implementation of ICP SLAM. This is the file you should edit.

====== Visualization and Testing ======

Everything related to this homework has to be done inside the provided Singularity container and with your ARO workspace sourced, i.e. after calling ''%%deploy/scripts/start_singularity_aro%%''.

You can directly run the Python module using ''python3 aro_exploration/src/aro_slam/icp.py'' (or by launching it from an IDE as a script). This will run a few demonstrations which will test your code.

You can also run ''python3 -m unittest aro_exploration/src/aro_slam/icp.py'' (or instruct the IDE to run unit tests). These are some of the unit tests your code will be evaluated with. Passing these tests is mandatory (but not sufficient) for getting the points for the tested functionalities.
You can also run ''roslaunch aro_exploration aro_slam_sim.launch world:=aro_maze_1'' to test your code interactively in various simulated worlds. To use your random walking algorithm from HW 02, call the roslaunch like this: ''roslaunch aro_exploration aro_slam_sim.launch world:=aro_maze_1 reactive_control:=true''

The launch file also has arguments ''loss'', ''alignment'' and ''absorient_domain'' which you can use to easily change these important algorithm parameters when launching the simulation.

The launch file also has an argument ''debug_slam'' which enables debug (green) messages in the console from the ''icp_slam'' node. This argument is set to true when launching ''aro_slam_sim.launch'' and defaults to false when the ICP SLAM is launched by the later homeworks or the semestral work. However, you are free to pass ''debug_slam:=true'' to these later homeworks as well to see how the ICP SLAM is doing. If you use ''loginfo()'' or ''logwarn()'' in your code to log some information, it will be shown regardless of the setting of this argument. Only ''logdebug()'' prints are affected.

See the [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/tutorials/icp_slam.mp4|video]] for an example run with the robot.

**Difficult worlds**

To test the localization on a world where ICP odometry may fail horribly, test with ''world:=aro_hallway_1'' and the ''loss:=point_to_plane'' ICP alignment. There is not much ICP can do about that. We'll solve that with factorgraph-based localization in the next homework! More worlds for testing can be found in ''aro_sim/worlds''.

**Gamepad teleoperation**

If you have a gamepad that appears as a ''/dev/input/js0'' device in Linux, you should be able to use it to control the simulated robot. Keep button ''A'' pressed all the time and use the left thumbstick to control the robot. Xbox gamepads can be used with the [[https://github.com/medusalix/xow|xow]] or [[https://github.com/medusalix/xone|xone]] drivers (do not mix both!). It can happen that your gamepad model is not recognized. In such a case, the console gets spammed by a lot of error messages. To get rid of them, disable gamepad control by adding the roslaunch argument ''joy_teleop:=false''.

**Keyboard teleoperation**

If you add the roslaunch argument ''keyboard_teleop:=true'', a window opens in which you can control the robot using the WXAD keys to test your algorithm.

**RViz visualization**

The launch file ''aro_exploration/launch/slam/aro_slam_sim.launch'' also starts a preconfigured RViz window that shows the current ICP and occupancy maps, estimated surface normals, etc. //ICP Odom Markers// visualize the $\mathtt{R}\mathtt{p} + \mathtt{t}$ and $\mathtt{q}$ clouds and how the correspondence matching works. In the top left corner, you can also see a history of the ICP localization error. The error history is also saved to the file ''/tmp/tf_metrics.csv''.

{{ :courses:aro:tutorials:rviz_icp.png?nolink |}}

====== Real robot caveats ======

There are a few differences between the simulator and the real robot:

  * The real robot tilts back a bit when accelerating. This can corrupt the ICP result if localization is done only in SE(2).
  * The quality of the wheel-IMU odometry used for ICP initialization differs between the simulation and the real robot.
  * In simulation, the ground truth position is available for free. To get something similar for the real robot, a complicated setup of the room would be required.

====== Task Details ======

===== Nearest Neighbor Search =====

The nearest neighbor search has, in general, complexity $O(d N)$ per query, for $d$ dimensions and $N$ points. However, a KD-tree is an efficient data structure that offers lookups in $O(\log N)$ time. We pay for this speedup with the $O(d N\log N)$ build time of the tree. But if there are many more lookups than rebuilds, it really pays off, and this is exactly the case in ICP.

In Python, be sure to use the class [[https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.html|scipy.spatial.cKDTree]] and not just //scipy.spatial.KDTree//. The //KDTree// class is much slower than its //cKDTree// alternative. The //cKDTree// class has a method [[https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.query.html#scipy.spatial.cKDTree.query|query(p)]] which returns distances and indices of the nearest neighbors in the tree. This output can be directly used in the ICP implementation as $d_i$ and $\mathtt{c}^\ast(i)$.
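For illustration, a minimal usage sketch (with random stand-in data instead of real scans) might look like this:

<code python>
import numpy as np
from scipy.spatial import cKDTree

q = np.random.rand(1000, 3)   # map points (stand-in data)
p = np.random.rand(500, 3)    # scan points, already transformed

tree = cKDTree(q)             # build once per map update
d, c = tree.query(p)          # d[i] ~ d_i, c[i] ~ c*(i)
</code>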
===== Point Cloud Representation =====

  * Serialized and transmitted as [[http://docs.ros.org/en/noetic/api/sensor_msgs/html/msg/PointCloud2.html|sensor_msgs/PointCloud2]] ROS messages.
  * Convenient representations in Python:
    * an unstructured matrix (plain NumPy array), which allows easy transformations of point positions or normal vectors, or
    * a [[https://numpy.org/doc/stable/user/basics.rec.html|structured array]] with several channels, providing the full representation (without the ROS header).
  * API:

<code python>
import numpy as np
import rospy
from numpy.lib.recfunctions import structured_to_unstructured, unstructured_to_structured
from ros_numpy import msgify, numpify
from sensor_msgs.msg import PointCloud2

# ...
input_struct = numpify(input_msg)
position = structured_to_unstructured(input_struct[['x', 'y', 'z']])
normal = structured_to_unstructured(input_struct[['normal_%s' % f for f in 'xyz']])
# ...
output_msg = msgify(PointCloud2, input_struct)
output_msg.header.frame_id = 'map'
output_msg.header.stamp = rospy.Time.now()
</code>

===== Outlier Rejection =====

When the robot is exploring an unknown environment, some measured points correspond to previously unseen areas, and for these points true correspondences cannot be found. In some cases, such as passing through a narrow opening like a door, the relative overlap can easily decrease to almost a half.

To handle incomplete overlaps between the point clouds, we will employ the following outlier rejection strategy in the ICP algorithm. Let $\mathtt{c}^\ast(i)$ be the correspondences from the nearest neighbor search and $d_i$ the distances between the corresponding points, $d_i = \|\mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_{\mathtt{c}^\ast(i)}\|$. The inlier distance threshold $d_\mathrm{inl}$ can be adjusted adaptively using the $p$-quantile of the distances, denoted $Q_d(p)$ (i.e., $Q_d(0.5)$ is the median):

$d_\mathrm{inl} = \min(r\, Q_d(p), d_\mathrm{max})$,

where $r$ is a multiplier and $d_\mathrm{max}$ is the maximum allowed inlier distance. In our //icp_slam// node, these constants correspond to the following parameters:

  * $p = \text{inlier\_ratio}$,
  * $r = \text{inlier\_dist\_mult}$, and
  * $d_\mathrm{max} = \text{max\_inlier\_dist}$.

===== Point-to-Plane Loss =====

In many situations, it is more precise to minimize the //point-to-plane// loss (see [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/01_icp.pdf#page=20|the lecture]]) instead of //point-to-point//: instead of $\underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum} \| \mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_i\|^2$, compute $\underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum}(\mathtt{n}_i^\intercal( \mathtt{R}\mathtt{p}_i + \mathtt{t} - \mathtt{q}_i))^2$.

This formulation has no closed-form solution, but we can rephrase it as $\underset{\mathtt{R}, \mathtt{t}}{\mathrm{argmin}} \underset{i}{\sum} \| \mathtt{R}\mathtt{p}_i + \mathtt{t} - \hat{\mathtt{q}}_i\|^2$, where $\hat{\mathtt{q}}_i = \mathtt{p}_i - \mathtt{n}_i \mathtt{n}_i^\intercal(\mathtt{p}_i - \mathtt{q}_i)$ is the point on the plane $\{\mathtt{x} \mid \mathtt{n}_i^\intercal(\mathtt{x} - \mathtt{q}_i) = 0\}$ closest to $\mathtt{p}_i$, with the alignment from the previous iteration already applied to $\mathtt{p}_i$ (so that the //absolute_orientation// function can be used).

{{ :courses:aro2024:tutorials:point_to_plane.png?nolink&400 |}}

The new formulation is nothing but an application of absolute orientation, only to a slightly different set of points $\{(\mathtt{p}_i, \hat{\mathtt{q}}_i)\}_i$ instead of $\{(\mathtt{p}_i, \mathtt{q}_i)\}_i$. If you explicitly compute the distance of $(\mathtt{p}_i - \mathtt{q}_i)$ projected onto the normal, you usually get a signed distance. Make sure you turn it into an unsigned distance when passing it to the rest of the code.
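As a sketch of this reformulation (the function name and array shapes are illustrative, not part of the assignment template), the substitute targets $\hat{\mathtt{q}}_i$ and the unsigned distances can be computed as follows:

<code python>
import numpy as np

def point_to_plane_targets(p, q, n):
    """p, q, n: (N, 3) arrays; p already transformed by the current
    alignment, q the corresponding map points, n their unit normals."""
    d_signed = np.einsum('ij,ij->i', n, p - q)  # n_i^T (p_i - q_i)
    q_hat = p - n * d_signed[:, None]           # projection onto the plane
    return q_hat, np.abs(d_signed)              # unsigned distances for inliers
</code>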
====== Occupancy Mapping ======

The localization is used to build an occupancy grid map for frontier detection and path planning from the registered scans. Occupancy mapping is implemented in a separate node, ''occupancy_mapper'', which is provided and need not be changed.

The occupancy model of each grid cell $b_i$ is updated with every scan: the cells containing the measured points have their probability of being occupied increased, and the other cells incident with the rays from the sensor towards these points have their probability of being occupied decreased. The additive updates take the form of $\Delta b_i = \log \frac{p(o_i|\mathtt{z}_{t})}{p(\neg o_i|\mathtt{z}_{t})} + \log \frac{p(\neg o_i)}{p(o_i)}$. See [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/01_icp.pdf#page=33|the lecture]].

The occupancy values are clipped to the interval $[b_\mathrm{low}, b_\mathrm{high}]$ to avoid accumulating overconfident occupancy belief and to allow handling dynamic objects. We use unit updates $\Delta b_i \in \{-1, 1\}$ and let the parameters $b_\text{low}$, $b_\text{high}$, $b_\text{occupied}$ be tuned to get the required trade-off between noise removal and response time to dynamic changes in the environment.

A major limitation of the provided implementation is the fact that the scene is discretized into cells of a given size and the spatial distribution of surfaces within the cells is not modeled. As a result, a ray with a high angle of incidence may pass through many cells, which it marks as empty, before it reaches the measured cell, and the previously occupied cells it passes through may incorrectly switch to the empty state. A workaround in frontier detection may be needed to deal with such cases.

====== References ======

[1] K. Arun, T. S. Huang, and S. D. Blostein, "Least-squares fitting of two 3-D point sets," //IEEE Transactions on Pattern Analysis and Machine Intelligence//, vol. PAMI-9, no. 5, pp. 698–700, 1987. \\
[2] Y. Chen and G. Medioni, "Object modelling by registration of multiple range images," //Image and Vision Computing//, vol. 10, no. 3, pp. 145–155, 1992.