====== Homework 03 - ICP SLAM ======

Error in ''aro_slam/launch/icp_slam.launch'': \\
Line ''loss: point_to_point'' should read ''loss: $(arg loss)''.

====== Simultaneous Localization and Mapping (SLAM) ======

When a robot explores an unknown environment, it has to simultaneously acquire a map of the environment and localize itself within that map, hence the name //Simultaneous Localization and Mapping// (SLAM). We will focus on the //online SLAM problem//, in which we are interested only in the current robot pose and map given all previous measurements. In other words, past robot poses are not updated when new measurements arrive.

Many approaches can be found both in the literature and deployed in the field. The main differences come from the sensors used (camera, lidar, sonar, etc.) and the assumptions taken (2D vs 3D, static vs dynamic world). A few of them can be seen in action below:
  * [[https://www.youtube.com/watch?v=F8pdObV_df4&t=8s|2D lidar + IMU, indoor (Hector)]]
  * [[https://www.youtube.com/watch?v=qpTS7kg9J3A|Stereo, outdoor (RTAB-Map)]]
  * [[https://www.youtube.com/watch?v=08GTGfNneCI&t=8s|3D lidar, indoor (BLAM)]]
  * [[https://www.youtube.com/watch?v=cMgLyLpnsoU&t=19s|3D lidar, outdoor with dynamic obstacles (ICP Mapper)]]

We will implement a ROS node similar to the //ICP Mapper// above, but we will restrict ourselves to two dimensions, localizing and mapping only in the horizontal //x-y// plane, as the TurtleBots in the labs are only equipped with a 2D lidar and can operate only on flat ground. The main building block for localization is an algorithm for alignment of two point clouds. The robot can be localized either by aligning each new input point cloud to some previous input point cloud (//frame-to-frame//) or to a map built from previously aligned measurements (//frame-to-map//).

====== Iterative Closest Point (ICP) ======

The optimal least-squares alignment $\mathtt{R}^*, \mathtt{t}^* = \mathrm{argmin}_{\mathtt{R}, \mathtt{t}} \sum_i \|\mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}_i \|^2$ of corresponding points $\{(\mathtt{x}_i, \mathtt{y}_i)\}_i$, $\mathtt{x}_i \in \mathcal{X} \subseteq \mathbb{R}^D$, $\mathtt{y}_i \in \mathcal{Y} \subseteq \mathbb{R}^D$, can be found using the //absolute orientation// [1] algorithm described in [[https://cw.fel.cvut.cz/wiki/_media/courses/b3m33aro/lectures/01_transformations_and_lidar_calibration.pdf#page=48|the lecture]].

The //Iterative Closest Point// (ICP) [2] algorithm provides an approximate solution when the correspondences $(\mathtt{x}_i, \mathtt{y}_i)$ are not known in advance. Instead, it solves the correspondence and alignment problems iteratively, with ad hoc correspondences $(\mathtt{x}_i, \mathrm{argmin}_{\mathtt{y}\in\mathcal{Y}}\|\mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}\|)$ using the alignment $\mathtt{R}, \mathtt{t}$ from the previous iteration. In the first iteration, an initial estimate or the identity $\mathtt{R} = \mathtt{I}$, $\mathtt{t} = \mathtt{0}$ can be used. Recall the ICP algorithm from the [[https://cw.fel.cvut.cz/wiki/_media/courses/b3m33aro/lectures/02_localization_lidar.pdf#page=32|lecture]]. Note that we will provide the ICP algorithm with an initial estimate of the transform, which does not have to be the identity. This initial alignment is to be applied to the input points $\mathtt{x}_i$ prior to the nearest neighbor search.

===== Algorithm Outline =====

  * While not done:
    * Establish correspondences $\mathcal{C} \gets \{(\mathtt{x}, \mathrm{argmin}_{\mathtt{y}\in\mathcal{Y}}\|\mathtt{R}\mathtt{x} + \mathtt{t} - \mathtt{y}\|) \mid \mathtt{x} \in \mathcal{X} \}$.
    * Let $d_i = \|\mathtt{x}_i - \mathtt{y}_i\|$ be the distance between the corresponding points and compute the inlier distance threshold $d_\mathrm{inl}$ (see //Outlier Rejection// below).
    * Construct the set of inliers $\mathcal{C}' \gets \{(\mathtt{x}_i, \mathtt{y}_i) \mid (\mathtt{x}_i, \mathtt{y}_i) \in \mathcal{C} \land d_i \le d_\text{inl} \}$.
    * Solve the absolute orientation $\mathtt{R}, \mathtt{t} \gets [1](\mathcal{C}')$ for the inliers (a minimal sketch of the whole loop follows below).
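The loop above, together with the adaptive inlier threshold described in the next section, can be sketched in a few lines of NumPy. This is only an illustrative, self-contained sketch with a point-to-point loss; the function and parameter names mirror the node parameters (//inlier_ratio//, //inlier_dist_mult//, //max_inlier_dist//), but the signatures required by the ''aro_slam'' template may differ.

<code python>
# Minimal illustrative sketch of the ICP loop (point-to-point loss, adaptive
# inlier threshold). Names and signatures are examples, not the template's API.
import numpy as np
from scipy.spatial import cKDTree


def absolute_orientation(x, y):
    """Least-squares R, t aligning corresponding points x to y ([1], Arun et al.).
    x, y: (N, D) arrays of corresponding points."""
    x_mean = x.mean(axis=0)
    y_mean = y.mean(axis=0)
    H = (x - x_mean).T @ (y - y_mean)            # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    S = np.diag([1.0] * (x.shape[1] - 1) + [d])
    R = Vt.T @ S @ U.T
    t = y_mean - R @ x_mean
    return R, t


def icp(x, y, R=None, t=None, num_iters=50,
        inlier_ratio=0.9, inlier_dist_mult=1.5, max_inlier_dist=np.inf):
    """Align point cloud x (N, D) to reference y (M, D), starting from R, t."""
    D = x.shape[1]
    R = np.eye(D) if R is None else R
    t = np.zeros(D) if t is None else t
    index = cKDTree(y)
    for _ in range(num_iters):
        x_aligned = x @ R.T + t                  # apply current estimate
        dists, idx = index.query(x_aligned)      # nearest-neighbor correspondences
        d_inl = min(inlier_dist_mult * np.quantile(dists, inlier_ratio),
                    max_inlier_dist)             # adaptive inlier threshold
        inl = dists <= d_inl
        R, t = absolute_orientation(x[inl], y[idx[inl]])
    return R, t
</code>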
===== Outlier Rejection =====

When the robot is exploring an unknown environment, some measured points correspond to previously unseen areas, and for these points true correspondences cannot be found. In some cases, such as passing through a narrow opening like a door, the relative overlap can easily decrease almost to a half. To handle incomplete overlaps between the point clouds, we will employ the following outlier rejection strategy in the ICP algorithm.

Let $\mathcal{C} = \{(\mathtt{x}_i, \mathtt{y}_i)\}_i$ be the set of correspondences from the nearest neighbor search and $d_i = \|\mathtt{x}_i - \mathtt{y}_i\|$ the distance between the corresponding points. The inlier distance threshold can be adjusted adaptively using the $p$-quantile of the distances $d_i$, $Q_d(p)$: $d_\mathrm{inl} = \min(r\, Q_d(p), d_\mathrm{max})$, where $r$ is a multiplier and $d_\mathrm{max}$ the maximum allowed inlier distance. In our //icp_slam// node, these constants correspond to the following parameters:
  * $p =$ //inlier_ratio//,
  * $r =$ //inlier_dist_mult//, and
  * $d_\mathrm{max} =$ //max_inlier_dist//.

===== Improvements =====

  * Multi-channel input point clouds $\mathcal{X}$ and $\mathcal{Y}$.
    * E.g., $\mathcal{X} = \{ x_i \}_i$, $x_i = (\mathtt{x}_i, \mathtt{n}_i, \ldots)$.
  * Establish correspondences using a generic descriptor:
    * $\mathcal{C} \gets \{ (x, \mathrm{argmin}_{y\in\mathcal{Y}} \|\mathrm{desc}(\mathtt{T}(x)) - \mathrm{desc}(y) \| ) \mid x \in \mathcal{X} \}$, \\ where the descriptor function $\mathrm{desc}: \mathcal{X}^N \mapsto \mathbb{R}^{N \times D}$ produces $D$-dimensional descriptor vectors suitable for nearest neighbor search.
  * Allow minimizing the point-to-plane loss (see [[https://cw.fel.cvut.cz/b202/_media/courses/aro/lectures/mapping_lidar_camera.pdf#page=33|the lecture]]):
    * Instead of minimizing $\sum_i (\mathtt{n}_i^\intercal( \mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}_i))^2$ directly, find $\mathrm{argmin}_{\mathtt{R}, \mathtt{t}} \sum_i \| \mathtt{R}\mathtt{x}_i + \mathtt{t} - \underline{\mathtt{y}}_i\|^2$, where $\underline{\mathtt{y}}_i$ is the point on the plane $\mathtt{n}_i^\intercal (\mathtt{x} - \mathtt{y}_i) = 0$ closest to $\mathtt{R} \mathtt{x}_i + \mathtt{t}$ from the previous iteration (so that the //absolute_orientation// function can be reused). A sketch of this projection is shown below.
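The projection onto the corresponding planes can be written compactly. The following is only an illustrative sketch; the helper name //project_to_planes// and the array shapes are our assumptions, not part of the template.

<code python>
# Illustrative sketch: project the transformed points onto their corresponding
# planes so that the point-to-point solver (absolute orientation) can be reused
# for the point-to-plane loss. Names and shapes are assumptions, not template API.
import numpy as np


def project_to_planes(x_aligned, y, n):
    """x_aligned: (N, D) points R x_i + t from the previous iteration,
    y: (N, D) corresponding points, n: (N, D) unit normals at y.
    Returns the closest points on the planes n_i^T (p - y_i) = 0."""
    # signed point-to-plane distances
    d = np.sum(n * (x_aligned - y), axis=1, keepdims=True)
    # move each point back along its normal onto the plane
    return x_aligned - d * n
</code>

These projected points then replace $\mathtt{y}_i$ in the call to the point-to-point solver.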
====== Occupancy Mapping ======

Localization is used to build an occupancy grid map from the registered scans, for frontier detection and path planning. Occupancy mapping is implemented in a separate node, ''occupancy_mapper'', which is provided and need not be changed.

The occupancy model of each grid cell $b_i$ is updated with every scan: the cells containing the measured points have their probability of being occupied increased, and the other cells incident with the rays from the sensor towards these points have their probability of being occupied decreased. The additive updates take the form $\Delta b_i = \log \frac{p(o_i|\mathtt{z}_{t})}{p(\neg o_i|\mathtt{z}_{t})} + \log \frac{p(\neg o_i)}{p(o_i)}$. See [[https://drive.google.com/file/d/1HHYr86MQaFV1ZypDYWfuOhDIFbeq__jO/view|the lecture]]. The occupancy values are clipped to the interval $[b_\mathrm{low}, b_\mathrm{high}]$ to avoid accumulating overconfident occupancy belief and to allow handling dynamic objects. We use unit updates $\Delta b_i \in \{-1, 1\}$ and let the parameters $b_\text{low}$, $b_\text{high}$, $b_\text{occupied}$ be tuned to get the required trade-off between noise removal and response time to dynamic changes in the environment.

A major limitation of the provided implementation is the fact that the scene is discretized into cells of a given size and the spatial distribution of surfaces within the cells is not modeled. As a result, a ray with a high angle of incidence may pass through many cells, which it marks as empty, before it reaches the measured cell, and previously occupied cells it passes through may incorrectly switch to the empty state. A workaround in frontier detection may be needed to deal with such cases.

====== Prerequisites ======

===== TF Transforms =====

  * TF transforms between reference frames are organized in a directed tree and distributed on topics ''/tf'' and ''/tf_static''.
  * Reference frames are identified by IDs (''msg.header.frame_id'').
  * Each transform comes with a timestamp. The API allows interpolating transforms based on these timestamps.
  * A common tree structure for a mobile robot is
    * ''map'' <- ''odom'' <- ''base_footprint'' <- ... if SLAM corrects an already provided odometry, or
    * ''map'' <- ''base_footprint'' <- ... if SLAM provides complete localization.
    * Additional frames may follow, with the sensors typically being at the leaves.
  * API:
<code python>
import tf2_ros

tf_buffer = tf2_ros.Buffer()
tf_sub = tf2_ros.TransformListener(tf_buffer)
tf_pub = tf2_ros.TransformBroadcaster()
# ...
try:
    transform = tf_buffer.lookup_transform(...)
    transform = tf_buffer.lookup_transform_full(...)
except tf2_ros.TransformException as ex:
    pass  # ...
</code>
  * [[http://wiki.ros.org/tf/Debugging%20tools|Debugging tools]]
    * ''rosrun rqt_tf_tree rqt_tf_tree''
  * Limitations of ''tf2_ros.Buffer''
    * The parent frame cannot change.
    * Transforms cannot change at a given timestamp.

===== Point Cloud Representation =====

  * Serialized and transmitted as [[http://docs.ros.org/en/noetic/api/sensor_msgs/html/msg/PointCloud2.html|''sensor_msgs/PointCloud2'']] ROS messages.
  * Convenient representations in Python:
    * an unstructured matrix array, which allows easy transformations of point positions or normal vectors, or
    * a structured array with several channels, providing the full representation (without the ROS header).
  * API (a combined usage example follows this section):
<code python>
import numpy as np
import rospy
from numpy.lib.recfunctions import structured_to_unstructured, unstructured_to_structured
from ros_numpy import msgify, numpify
from sensor_msgs.msg import PointCloud2
# ...
input = numpify(input_msg)
position = structured_to_unstructured(input[['x', 'y', 'z']])
normal = structured_to_unstructured(input[['normal_%s' % f for f in 'xyz']])
# ...
map_msg = msgify(PointCloud2, map)
map_msg.header.frame_id = 'map'
map_msg.header.stamp = rospy.Time.now()
</code>
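Putting the TF and point cloud APIs above together, a node might transform the measured positions into the ''map'' frame as in the sketch below. This is illustrative only and meant to run inside a node; it assumes that ''numpify'' also converts a ''geometry_msgs/Transform'' into a 4x4 homogeneous matrix, and the frame names and timeout are examples.

<code python>
# Illustrative sketch: look up the TF transform valid at the scan time and apply
# it to the point positions. Frame names, the timeout, and the numpify(Transform)
# conversion to a 4x4 matrix are assumptions used for illustration.
import numpy as np
import rospy
import tf2_ros
from numpy.lib.recfunctions import structured_to_unstructured
from ros_numpy import numpify

tf_buffer = tf2_ros.Buffer()
tf_listener = tf2_ros.TransformListener(tf_buffer)


def transform_cloud_to_map(cloud_msg):
    cloud = numpify(cloud_msg)
    pts = structured_to_unstructured(cloud[['x', 'y', 'z']])    # (N, 3) positions
    try:
        tf_msg = tf_buffer.lookup_transform('map', cloud_msg.header.frame_id,
                                            cloud_msg.header.stamp,
                                            rospy.Duration(0.5))
    except tf2_ros.TransformException as ex:
        rospy.logwarn('Could not transform the cloud: %s', ex)
        return None
    T = numpify(tf_msg.transform)               # assumed 4x4 homogeneous matrix
    return pts @ T[:3, :3].T + T[:3, 3]         # rotate and translate positions
</code>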
===== Nearest Neighbor Search =====

Nearest neighbors can be found efficiently using a [[https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.html|KD-tree]] and its [[https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.query.html#scipy.spatial.cKDTree.query|query]] method.

====== Assignment ======

===== Workspace Configuration =====

Enter the Singularity shell:
<code bash>
singularity shell /opt/singularity/robolab/noetic
</code>

Configure the workspace with the provided [[https://cw.fel.cvut.cz/b212/_media/courses/b3m33aro/tutorials/aro_slam.zip|aro_slam]] package:
<code bash>
ws=~/workspace/aro
mkdir -p "${ws}/src"
cd "${ws}/src"
curl -sSL https://cw.fel.cvut.cz/b212/_media/courses/b3m33aro/tutorials/hw_02_pkgs.zip -O
unzip hw_02_pkgs.zip
curl -sSL https://cw.fel.cvut.cz/b212/_media/courses/b3m33aro/tutorials/aro_slam.zip -O
unzip aro_slam.zip
wstool init
wstool merge aro_slam/dependencies.rosinstall
wstool up -j 4
cd "${ws}"
catkin init
catkin config --extend /opt/ros/aro
catkin config --cmake-args -DCMAKE_BUILD_TYPE=Release
catkin build -c
</code>

===== Package Structure =====

Get familiar with the package. Its structure is as follows:
  * ''launch/'' provides convenient launchers for the nodes below:
    * ''aro_slam.launch'' launches the whole SLAM pipeline, including visualization,
    * ''bag.launch'' launches the pipeline along with a bag file(s) to provide inputs, and
    * ''sim.launch'' launches the pipeline along with the Gazebo simulator.
  * ''scripts/'' contains executable nodes or scripts:
    * ''cloud_filter'' node filters point clouds and estimates surface normals,
    * ''icp_slam'' node provides robot localization,
    * ''occupancy_mapper'' creates the occupancy map to be later used in planning and eventually also for evaluation of exploration,
    * ''odom_to_tf'' converts odometry messages to transforms, and
    * ''tf_metrics'' allows evaluating accuracy of localization.
  * ''src/aro_slam/'' contains the Python package with the following modules:
    * ''cloud.py'' contains helpers for handling point clouds,
    * ''icp.py'' contains the implementation of absolute orientation and ICP,
    * ''io.py'' provides support for reading and writing Python objects,
    * ''occupancy.py'' provides support for occupancy mapping, and
    * ''utils.py'' contains common utility functions.
  * ''CMakeLists.txt'' describes the package build.
  * ''dependencies.rosinstall'' lists dependencies which can directly be added into the workspace.
  * ''package.xml'' contains the Catkin package description.

===== Package Usage =====

Don't forget to source your workspace:
<code bash>
source "${ws}/devel/setup.bash"
</code>

To test the //icp_slam// node in the simulator:
<code bash>
roslaunch aro_slam sim.launch odom:=true world:=aro_maze_1 keyboard:=true
</code>
It also runs keyboard (and joypad) controls, so you can manually move the robot using the keyboard (keys WASDX) if the terminal is focused. Worlds to test: stage_1, stage_2, stage_3, stage_4, world, aro_maze_1, aro_maze_2.

Or, launch the SLAM pipeline alone (no controls):
<code bash>
roslaunch aro_slam aro_slam.launch
</code>
See the launch files for possible arguments.

===== Absolute Orientation (2 points) =====

Implement (and test) the absolute orientation algorithm for alignment with known correspondences in the //absolute_orientation// function in file ''aro_slam/src/aro_slam/icp.py''. A possible sanity check is sketched below.
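A quick self-test on synthetic correspondences can catch most sign and transpose errors. The sketch below assumes a signature of the form ''absolute_orientation(x, y) -> (R, t)'' over ''(N, D)'' arrays; adapt it to the actual interface used in the template.

<code python>
# Sanity check: recover a known planar rotation and translation from
# corresponding points. The import path and signature are assumptions; adjust
# them to the actual implementation in aro_slam/src/aro_slam/icp.py.
import numpy as np
from aro_slam.icp import absolute_orientation

rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 5.0, (100, 2))             # random 2D source points
phi = np.radians(30.0)
R_true = np.array([[np.cos(phi), -np.sin(phi)],
                   [np.sin(phi),  np.cos(phi)]])
t_true = np.array([1.0, -2.0])
y = x @ R_true.T + t_true                        # corresponding target points

R, t = absolute_orientation(x, y)
assert np.allclose(R, R_true, atol=1e-6)
assert np.allclose(t, t_true, atol=1e-6)
print('absolute_orientation OK')
</code>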
===== Point-to-Point ICP (3 points) =====

Implement (and test) the ICP algorithm for alignment without known correspondences, using position descriptors and the point-to-point loss, in the //icp// function in file ''aro_slam/src/aro_slam/icp.py''. Test and debug the above functionality within the //icp_slam// node.

===== Point-to-Plane ICP (2 points) =====

Add the point-to-plane loss to the //icp// function as described above.

===== Custom Descriptor (1 point) =====

Allow using a custom descriptor in the //icp// function. Use the estimated surface normal vectors along with the point positions as descriptors. See the //position// function, used as the default descriptor in the //icp// function, as an example.

===== Mapping with Frame-to-Map Alignment (1 point) =====

Build a map from the registered measurements and use it as a reference for aligning the input point clouds, instead of a previous //key frame//. Add only the points which are at least //mapping_min_dist// (= node parameter) meters away from the current points in the map. Implement this functionality in the //icp_slam// node itself (file path ''aro_slam/scripts/icp_slam'').

===== SLAM Accuracy Evaluation (1 point) =====

Record bag files with the robot exploring various worlds and evaluate various SLAM configurations on the bag files. You can use the ''scripts/eval_slam'' script as a template for testing various parameter settings and computing accuracy statistics. Include the CSV file with the reported results, ''aro_slam/data/results/summary.csv'', and briefly summarize the results in the text file ''aro_slam/data/results/summary.txt''.

Recording can be done using the additional ''record:=true'' argument to ''sim.launch'':
<code bash>
roslaunch aro_slam sim.launch odom:=true world:=aro_maze_1 keyboard:=true record:=true
</code>

===== Instructions =====

Compress the whole //aro_slam// package with your implementation and results, excluding bag files, into a //zip// archive and upload it to Brute. The implementation should be based on the provided [[https://cw.fel.cvut.cz/b212/_media/courses/b3m33aro/tutorials/aro_slam.zip|''aro_slam'' package]] template. See the [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/tutorials/icp_slam.mp4|video]] for an example run with the robot.

====== References ======

[1] K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-D point sets,” //IEEE Transactions on Pattern Analysis and Machine Intelligence//, vol. PAMI-9, no. 5, pp. 698–700, 1987. \\
[2] Y. Chen and G. Medioni, “Object modelling by registration of multiple range images,” //Image and Vision Computing//, vol. 10, no. 3, pp. 145–155, 1992.