Search
In this tutorial, we will go over several heuristics of classical planning and probabilistic planning as a preparation for exam. If you have your own exercises or you struggle with any topic, prepare it for the tutorial.
A relaxed planning task of STRIPS task $\Pi = \langle F, A, s_0, g \rangle$, where $a = \langle pre_a, add_a, del_a \rangle$ is a task $Pi^+ = \langle F, A^+, s_0, g \rangle$, where the delete effects of actions are omitted, i.e., $a^+ = \langle pre_{a^+}, add_{a^+}, \emptyset \rangle$. The optimal heuristic of $\Pi$ is denoted $h^*$ and for $\Pi^+$. However, the heuristic values cannot be computed in polynomial time, therefore, we use estimates: $h^{max}, h^{add}, h^{ff}$. In this tutorial, we consider $\Pi = \Pi^+$ and $h^{max} \leq h^+ = h^* \leq h^{add}$ and $h^+ = h^* \leq h^{ff}$ hold.
Exercise: Consider the following STRIPS task $\Pi = \langle F, A, s_0, g \rangle$, where
Compute the perfect heuristic $h^*$, $h^{max}$, $h^{add}$, and $h^{ff}$ for the initial state, i.e., $h^*(s_0), h^{max}(s_0),h^{add}(s_0)$ and $h^{ff}(s_0)$.
Solution
$s_0=\{a\} \stackrel{a_1}{\longrightarrow} \{a,b,c\} \stackrel{a_2}{\longrightarrow} \{a,b,c,d\} \stackrel{a_6}{\longrightarrow} \{a,b,c,d,g\} \stackrel{a_5}{\longrightarrow} \{a,b,c,d,e,f,g\} \subseteq g$
Thus $h^*(s_0)=\mathsf{cost}(a_1) + \mathsf{cost}(a_2) + \mathsf{cost}(a_6) + \mathsf{cost}(a_5)= 4$.
Compute $h^{LM-Cut}(s_0)$.
Do not forget to extend $\Pi$ by the by new facts $\bot$ and $\top$ so the new initial state is $\{\bot\}$ and the new goal state is $\{\top\}$.
Solution - Extend $\Pi$
$F = \{a,b,c,d,e,f,\top,\bot\}, s_0 = \{\bot\}, g = \{\top\}$, and $A = A \cup \{a_{\bot}, a_{\top}\}$, where
Solution - $1^{st}$-iteration
(If tie-break use and lexicographical order.)
From justification graph $\rightarrow L = \{a_1, a_2\}$ and $\textrm{min}\{a_1,a_2\} = 1$.
Exercise 1: Consider the following FDR task $\Pi = \langle V, A, s_0, g \rangle$, where
Create a causal graph of $\Pi$, where vertices corresnponds to $V$ and edges to actions $a_k$, where for each $v_i,v_j \in V$ either $\mathsf{pre}_{a_k}(v_i)$ and $\mathsf{eff}_{a_k}(v_j)$ are defined or $\mathsf{eff}_{a_k}(v_i)$ and $\mathsf{eff}_{a_k}(v_j)$ are defined. And from the graph select interesting patterns up to size 2.
Solution - interesting patterns
Causal graph:
Interesting patterns of size 1 are $P_1 = \{ \{v_a\}, \{v_e\} \}$, since $g(v_a)$ and $g(v_e)$ are defined. Interesting patterns of size 2 are $P_2 = \{ \{v_a,v_b\}, \{v_a, v_d\}, \{v_a,v_e\}, \{v_b,v_e\}, \{v_c,v_e\}, \{v_d,v_e\} \}$, where each contains variable $v_i$ for which $g(v_i)$ is defined. Hence, all interesting patters $P_{\rm up 2}$ are $P_{\rm up 2} = P_1 \cup P_2$.
Take the interesting patterns $P$ and used them to form a linear program to compute post-hoc heuristic value.
Solution - post-hoc heuristic
0 --(a3)--> 1
0 --(a2)--> 1
00 -(a1)-> 10 | (a3) | v 11
01 -(a3)-> 11
00 -(a2)-> 01 -(a3)-> 11 | ^ | | --------(a1)------
00 -(a2)-> 01
00 -(a1) -> 01 10 -(a2)-> 11
00 -(a4)-> 10 -(a2)-> 11
\[ \min 3X_1 + X_2 + X_3 + X_4 + 2X_5 \qquad\text{subject to} \] \begin{align*} X_3 \geq 1 \\ X_2 \geq 1 \\ X_3 \geq 1 \\ X_3 \geq 1 \\ X_2 + X_3\geq 2 \\ X_2 \geq 1 \\ X_2 \geq 1 \\ X_2 \geq 1 \end{align*}
Solution of LP is 3.
Exercise 2: Consider the following FDR task $\Pi = \langle V, A, s_0, g \rangle$, where
Given projection $P=\{v_3\}$, compute the heuristic $h_{P}(s_0)$ and post-hoc heuristic.
Solution - heuristic
$P=\{v_3\}$, $s_0 = \{v_3=h\}, g=\{v_3=k\}$, and $A$ is
The constructed LTS is following $ \{v_3=h\} \stackrel{a_1}{\longrightarrow} \{v_3=j\} \stackrel{a_3}{\longrightarrow} \{v_3=k\}$. And $h_{P}(s_0) = \mathsf{cost}(a_1) + \mathsf{cost}(a_3) = 2 + 1 = 3$.
\[ \min 2X_1 + X_3 + 5X_5 \qquad\text{subject to} \] \begin{align*} 2X_1 + X_3 + 5X_5 \geq 3 \end{align*}
The optimal value of LP is 3.
Exercise: Wanting to read a book as soon as possible, we decide to go and get it. We can either take bus or bike to the city center, where there are a library as well as a shop. The problem is modelled as the following LTS.
At first, compute the heuristic values of each state $S = \{home, station, bus, library, shop, reading\}$.
Solution - heuristic values
Then perform ILAO$^*$ and get optimal policy $\pi^*$.
Solution - ILAO$^*$
Recall two steps of ILAO$^*$:
Optimal policy is $\pi^*(home) = go to bus$, $\pi^*(station) = wait$, $\pi^*(bus) = take bus$, $\pi^*(library) = borrow$.