======= Homework 03 - ICP SLAM =======


====== Simultaneous Localization and Mapping ======

When a robot explores an unknown environment, it has to solve localization and mapping simultaneously, hence the name //Simultaneous Localization and Mapping// (SLAM).
We will focus on the //online SLAM problem// in which we are interested only in the current robot pose and map given all previous measurements.
In other words, past robot poses are not updated when new measurements arrive.

One can find many approaches both in literature and deployed in field.
Main differences come from the sensors used (camera, lidar, sonar, etc.) and the assumptions taken (2D vs 3D, static vs dynamic world).
A few of them can be seen in action below:
  * [[https://www.youtube.com/watch?v=F8pdObV_df4&t=8s|2D lidar + IMU, indoor (Hector)]]
  * [[https://www.youtube.com/watch?v=qpTS7kg9J3A|Stereo, outdoor (RTAB-Map)]]
  * [[https://www.youtube.com/watch?v=08GTGfNneCI&t=8s|3D lidar, indoor (BLAM)]]
  * [[https://www.youtube.com/watch?v=cMgLyLpnsoU&t=19s|3D lidar, outdoor with dynamic obstacles (ICP Mapper)]]


====== Assignment: ICP-based 2D SLAM ======

We will implement a ROS node similar to the //ICP Mapper// above but we will restrict ourselves to two dimensions, localizing and mapping only in horizontal //x-y// plane.
We assume that the robot operates on a flat ground since the TurtleBots used in the labs cannot traverse larger obstacles anyway.

Your task is to get familiar with the provided [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/tutorials/aro_slam.zip|aro_slam]] ROS package and to implement the //absolute_orientation// and //icp// functions in the //aro_slam.icp// Python module (file path ''aro_slam/src/aro_slam/icp.py'').
Then compress the whole //aro_slam// package excluding the data folder and any bag files into a //zip// archive and upload it into Brute.

Many parts of the package, including the //icp_slam_2d// node (file path ''aro_slam/scripts/icp_slam_2d'')
with subscribers and publishers, map updating etc., are already provided, nevertheless, feel free to experiment with these as well.
The node builds a global point map from lidar scans which it first aligns with the map via ICP.
Odometry provides an initial estimate for the alignment.
See [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/tutorials/icp_slam.mp4|video]] for an example run with the robot.

The transforms related to navigation, organized in a directed tree structure, are //map// $\to$ //odom// $\to$ //base_link//.
The transform //odom// $\to$ //base_link// is provided by odometry of the TurtleBot platform, by fusing the information from encoders in wheels and an possibly //inertial measurement unit// (IMU).
The transform //map// $\to$ //odom// is therefore only a correction of odometry-only localization, which would slowly drift by accumulating errors over time.

While building the map, the node avoids excessive map updates and reduces map drift by keeping only the first measurement in every cell.
It also keeps only the points from stable occupied cells, the rest is discarded each time the map is updated.


====== Iterative Closest Point ======

Optimal least-squares alignment $\mathtt{R}, \mathtt{t}$ of corresponding points $\mathtt{y}_i = \mathtt{R}\mathtt{x}_i + \mathtt{t}$, $i = 1, \ldots, N$, $\mathtt{x}_i \in \mathcal{X} \subseteq \mathbb{R}^D$, $\mathtt{y}_i \in \mathcal{Y} \subseteq \mathbb{R}^D$, can be found using the absolute orientation algorithm [1] described in [[https://cw.fel.cvut.cz/b202/_media/courses/aro/lectures/robot_lidars_tf.pdf#page=43|the lecture]].

The //Iterative Closest Point// (ICP) [2] algorithm provides an approximate solution when the correspondences $(\mathtt{x}_i, \mathtt{y}_i)$ are not known in advance.
Instead, it solves the correspondence and alignment problem iteratively, with
ad hoc correspondences $(\mathtt{x}_i, \arg\min_{\mathtt{y}\in\mathcal{Y}}\|\mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}\|)$ using alignment $\mathtt{R}, \mathtt{t}$ from the previous iteration.
In the first iteration, an initial estimate or identity $\mathtt{R} = \mathtt{I}$, $\mathtt{t} = \mathtt{0}$ can be used.

Recall the ICP and absolute orientation algorithms from  [[https://cw.fel.cvut.cz/wiki/_media/courses/aro/lectures/mapping_lidar_camera.pdf#page=17|the lecture]].
Note that we will provide the ICP algorithm with an initial estimate different from the identity used in the lecture.
This initial alignment is to be applied to the input points $\mathtt{x}_i$ prior the nearest neighbor search.
Thus a small change to the algorithm from the lecture is needed.

**Input:**
  * input points $\mathcal{X} = \{ \mathtt{x}_i \}_i$ to align,
  * reference points $\mathcal{Y} = \{ \mathtt{y}_j \}_j$ to align to,
  * inlier ratio $p$, inlier distance multiplier $r > 0$ (see below),
  * initial alignment $\mathtt{R}$, $\mathtt{t}$.

**Algorithm:**
  * While not done
    * Establish correspondences $\mathcal{C} \gets \{(\mathtt{x}, \arg\min_{\mathtt{y}\in\mathcal{Y}}\|\mathtt{R}\mathtt{x}_i + \mathtt{t} - \mathtt{y}\|) \mid \mathtt{x} \in \mathcal{X} \}$.
    * Compute distance threshold for inliers $d_\text{inl} \gets r Q_d(p)$ where $Q_d(p)$ is the $p$-quantile of distances $d_i = \|\mathtt{x}_i - \mathtt{y}_i\|$, $(\mathtt{x}_i, \mathtt{y}_i) \in \mathcal{C}$.
    * Construct set of inliers $\mathcal{C}' \gets \{(\mathtt{x}_i, \mathtt{y}_i) \mid (\mathtt{x}_i, \mathtt{y}_i) \in \mathcal{C} \land d_i \le d_\text{inl} \}$.
    * Solve absolute orientation $\mathtt{R}, \mathtt{t} \gets [1](\mathcal{C}')$, for the inliers.

**Output:**
  * alignment $\mathtt{R}$, $\mathtt{t}$,
  * average distance for the inliers $(\sum_{d_i \le d_\text{inl}} d_i) / (\sum_{d_i \le d_\text{inl}} 1)$,
  * inlier indicators $a_i = [d_i \le d_\text{inl}]$.


Rigid transform $\mathtt{R}, \mathtt{t}$ is represented by matrix
$$\mathtt{T} = \left[
\begin{array}{c c}
\mathtt{R} & \mathtt{t} \\
\mathtt{0}^\mathsf{T} & 1
\end{array} \right]
$$.


====== Occupancy Map ======

To identify the stable occupied cells, an extra occupancy grid is maintained and updated with aligned scans.
The occupancy model of each grid cell $b_i$ is updated with every scan:
the cells containing the measured points have their probability of being occupied increased and other cells incident with the rays from the sensor towards these points have their probability of being occupied decreased.
A point $\mathtt{x}_i$ is kept in the point map only if the corresponding occupancy $b_i$ is higher than threshold $b_\mathrm{occupied}$.

Additive updates take form of
$\Delta b_i = \log \frac{p(o_i|\mathtt{z}_{t})}{p(\neg o_i|\mathtt{z}_{t})} + \log \frac{p(\neg o_i)}{p(o_i)} $. See [[https://drive.google.com/file/d/1HHYr86MQaFV1ZypDYWfuOhDIFbeq__jO/view|the lecture]].

The occupancy values are clipped to interval $[b_\mathrm{low}, b_\mathrm{high}]$ to avoid accumulating overconfident occupancy belief and allow to handle dynamic objects.
We use unit updates $\Delta b_i \in \{-1, 1\}$ and let parameters $b_\text{low}$, $b_\text{high}$, $b_\text{occupied}$ be tuned to get the required trade-off between noise removal and response time to dynamic changes in the environment.


====== Outlier Correspondence Rejection ======


In some cases, such as going through a narrow opening such as a door, large portions of the new measurements can correspond to the previously unseen areas, i.e., for many points from the input point cloud a true correspondence in the map cannot be found.
In such cases, the relative overlap can easily decrease to as low as $1/2$.
To handle smaller overlaps between the point clouds, we will employ the following outlier rejection strategy in the ICP algorithm.


Let $\mathcal{C} = \{(\mathtt{x}_i$, $\mathtt{y}_i)\}_i$ be the set of correspondences from the nearest neighbor search, $i\in\{1, \ldots, N\}$, and $d_i$ the distance between corresponding points, $d_i = \|\mathtt{x}_i - \mathtt{y}_i\|$.
Construct the set of inlier correspondences $\mathcal{C}'$ to be used in optimization of $\mathtt{R}, \mathtt{t}$,
$\mathcal{C}' = \{(\mathtt{x}_i, \mathtt{y}_i) \mid (\mathtt{x}_i, \mathtt{y}_i) \in \mathcal{C} \land d_i \le r Q_d(p) \}$,
using $p$-quantile of the distances $d_i$, $Q_d(p)$, and a multiplier $r$.
See the algorithm above.

Handling larger relative overlaps can be tuned with parameters $p =$ //inlier_ratio// and $r =$ //inlier_dist_mult// in the //icp_slam_2d// node.


====== Workspace Configuration ======

Enter Singularity shell:
<code>
singularity shell /opt/singularity/robolab/melodic
</code>

Configure workspace:
<code>
ws=~/workspace/aro
mkdir -p "${ws}/src"
cd "${ws}/src"
curl -sSL https://cw.fel.cvut.cz/wiki/_media/courses/aro/tutorials/aro_slam.zip -O
unzip *.zip
cd "${ws}"
catkin init
catkin config --extend /opt/ros/aro
catkin config --cmake-args -DCMAKE_BUILD_TYPE=Release
catkin build -c
</code>


====== Package Usage ======

Don't forget to source your workspace:
<code>
source "${ws}/devel/setup.bash"
</code>

To test the //icp_slam_2d// node in simulator:
<code>
roslaunch aro_slam sim.launch world:=stage_4
</code>
It also runs keyboard (and joypad) controls, so you can manually move the robot using keyboard (keys WASDX) if the terminal is focused. Worlds to test: stage_1, stage_2, stage_3, stage_4, world.

Or, launch the //icp_slam_2d// node alone (no controls):
<code>
roslaunch aro_slam icp_slam_2d.launch
</code>


====== Possible SLAM Improvements ======

Some of the possible SLAM improvements are listed below.
These would be evaluated as a part of your semestral work, contributing to explored map and object localization.

    * Stop ICP early once no more improvement can be made ("While not done")
    * Surfel map with point-to-plane metric
            * Allow storing normals in/along the map
            * After kNN search, the nearest point on the corresponding plane would be used instead of point itself
            * Absolute orientation would align to these points on plane
            * Keep backward compatibility of absolute_orientation and icp functions
                * e.g. ''icp(…, metric='point-to-point')'' default would keep the original semantics
                * for ''icp(…, metric='point-to-plane')''
                    * ''x'', ''y'', ''y_index'' could have a different semantics, e.g. ''x[3:6, :]'' could be normals, or
                    * additional 'y_normal' parameter could hold normals
        * Trying different initial estimates for ICP
            * Use constant velocity motion model
                - May work very well with a smooth controller ensuring slow velocity changes
            * Use IMU orientation with constant velocity motion model
            * Keep odom→base_footprint TF as is: ignore it as input, but consider it when outputing map→odom TF: T_r2m = T_o2m * T_r2o
            * IMU message (excerpt):<code>
                orientation: 
                  x: -0.0011246201996952122
                  y: -0.003690314694296936
                  z: 0.29388865004485826
                  w: -0.9558318775714033
                angular_velocity: 
                  x: -0.00011878946665939603
                  y: 4.463900499701507e-05
                  z: 0.0007740670965644818</code>
        * Use two occupancy thresholds b_empty and b_occupied
            * Remove points with $b_i < b_\mathrm{empty}$
            * In localization, consider only points with occupancy $b_i \geq b_\mathrm{occupied}$
        * Try different methods to filter out noise in the point map
            * Assess occupied/empty points based on signed distance from reconstructed surface
            * Reconstruct surface from input measurements (e.g., line segments from pair of points in 2D, polygons in 3D) and compare map points with possibly occluding line segments
            * Similar but comparing measured point with surfels in the map: a surfel may need to filtered out if it occludes a measured point


====== References ======

[1] K. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d
point sets,” //Pattern Analysis and Machine Intelligence, IEEE Transactions
on//, vol. PAMI-9, no. 5, pp. 698–700, 1987. \\

[2] Y. Chen and G. Medioni, “Object modelling by registration of multiple
range images,” //Image and Vision Computing//, vol. 10, no. 3, pp. 145–155,
1992.