===== 10. LP-based heuristics ===== **Exercise 1:** Let $\Pi=\langle V,A,s_0,g\rangle$ be an FDR task where $V=\{v_1,v_2,v_3,v_4\}$ and $\mathsf{dom}(v_i)=\{0,1\}$. The initial state is $s_0=\{v_1=0,v_2=0,v_3=0,v_4=0\}$ and the goal $g=\{v_3=1,v_4=1\}$. The actions $A=\{a_1,a_2,a_3,a_4\}$ are defined as follows: ^ action ^ $\mathsf{pre}$ ^ $\mathsf{eff}$ ^ cost ^ | $a_1$ | $v_1=0$ | $v_1=1$ | $1$ | | $a_2$ | $v_1=1$ | $v_1=0$, $v_2=1$ | $2$ | | $a_3$ | $v_2=1$ | $v_2=0$, $v_3=1$ | $2$ | | $a_4$ | $v_3=1$ | $v_4=1$ | $3$ | Draw the causal graph of $\Pi$ and write down all the interesting patterns up to size $2$. ++++ Solution| Recall that the causal graph is the digraph whose vertices are variables $v_1,v_2,v_3,v_4$ without self-loops. There is a directed edge $(v_i,v_j)$ for $i\neq j$ iff there is $a_k\in A$ such that either $\mathsf{pre}_{a_k}(v_i),\mathsf{eff}_{a_k}(v_j)$ are defined or $\mathsf{eff}_{a_k}(v_i),\mathsf{eff}_{a_k}(v_j)$. The causal graph for $\Pi$ looks as follows: v1 <---> v2 <---> v3 ---> v4 Recall that the interesting patterns $P$ should be weakly connected and causally relevant (i.e., there is a directed path from any $v\in P$ to a variable $u\in P$ such that $g(u)$ is defined). Thus the interesting patterns of size $1$ are just $\{v_3\}$ and $\{v_4\}$. For the size $2$ we have $\{v_2,v_3\}$ and $\{v_3,v_4\}$. That is all. For instance, $P=\{v_2,v_4\}$ is not interesting since the subgraph of the causal graph induced by $P$ is not weakly connected. ++++ **Exercise 2:** Formulate the linear program whose optimum value is the post-hoc heuristic for the initial state $s_0$ in the task from the previous exercise using the interesting patterns up to size $2$. ++++ Solution| We must first compute the PDB heuristic $h_P(s_0)$ for each $P\in\{\{v_3\},\{v_4\},\{v_2,v_3\},\{v_3,v_4\}\}$. We draw the LTS induced by the syntactic projection for each pattern $P$. The LTS for $P=\{v_3\}$ looks as follows (self-loops omitted): a3 0 ----> 1 Thus $h_{\{v_3\}}(s_0)=2$. Similarly, the LTS for $P=\{v_4\}$ looks as follows (self-loops omitted): a4 0 ----> 1 Thus $h_{\{v_4\}}(s_0)=3$. The LTS for $P=\{v_2,v_3\}$ is the following (the state $\{v_2=x,v_3=y\}$ is denoted $xy$, self-loops omitted): a2 -----> 01 <----- 11 ^ a3 | |a3 a2 | 00 -----> 10 Thus $h_{\{v_2,v_3\}}(s_0)=2+2=4$. Finally, the LTS for $P=\{v_3,v_4\}$ is the following one (the state $\{v_3=x,v_4=y\}$ is denoted $xy$, self-loops omitted): a3 01 -----> 11 ^ | |a4 a3 | 00 -----> 10 Thus $h_{\{v_3,v_4\}}(s_0)=2+3=5$. Next, we must find out which actions affect which patterns. Recall that an action $a$ affects a pattern $P$ if there is $v\in P$ such that $\mathsf{eff}_{a}(v)$ is defined. ^ action\pattern ^ $\{v_3\}$ ^ $\{v_4\}$ ^ $\{v_2,v_3\}$ ^ $\{v_3,v_4\}$ ^ | $a_1$ | no | no | no | no | | $a_2$ | no | no | yes | no | | $a_3$ | yes | no | yes | yes | | $a_4$ | no | yes | no | yes | For each action $a_i$, we introduce a variable $X_i$ representing the number of applications of $a_i$. The linear program whose value is the post-hoc heuristic for $s_0$ is the following: \[ \min X_1 + 2X_2 + 2X_3 + 3X_4\qquad\text{subject to} \] \begin{align*} 2X_3 &\geq 2\\ 3X_4 &\geq 3\\ 2X_2+2X_3 &\geq 4\\ 2X_3+3X_4 &\geq 5 \end{align*} It is not so difficult to see that the optimum value of the above LP is $7$. ++++ **Exercise 3:** For the FDR task from Exercise 1, formulate the linear program computing the potential heuristic optimized for the initial state. ++++ Solution| The potential heuristic uses linear programming to find an optimized linear function mapping a state to a heuristic value. The coefficients of the linear map are called potentials. Recall that a fact in an FDR task is a pair $(v,d)$ for $v\in V$ and $d\in\mathsf{dom}(v)$ and we understand a state $s$ as a set of facts. We introduce a variable $X_{(v,d)}$ for each fact $(v,d)$. Its value represents the potential of the fact. The heuristic value is then computed by $h^{\mathrm{pot}}(s)=\sum_{(v,d)\in s} X_{(v,d)}$. The linear program maximizes the heuristic value for a given state; in our case, the initial state $s_0$ subject to constraints ensuring that the heuristic is goal-aware and consistent (and thus admissible). The following are the variables of the LP: \[X_{(v_1,0)},X_{(v_1,1)},X_{(v_2,0)},X_{(v_2,1)},X_{(v_3,0)},X_{(v_3,1)},X_{(v_4,0)},X_{(v_1,4)},\] and auxiliary variables representing the maximum of all potential for a single variable \[M_{v_1}, M_{v_2}, M_{v_3}, M_{v_4}.\] The objective function maximizes the heuristic value for the initial state $s_0=\{(v_1,0),(v_2,0),(v_3,0),(v_4,0)\}$: \[\max X_{(v_1,0)} + X_{(v_2,0)} + X_{(v_3,0)} + X_{(v_4,0)}\] subject to the constraints for the auxiliary variables \begin{align*} X_{(v_1,0)}\leq M_{v_1} && X_{(v_1,1)}\leq M_{v_1} \\ X_{(v_2,0)}\leq M_{v_2} && X_{(v_2,1)}\leq M_{v_2} \\ X_{(v_3,0)}\leq M_{v_3} && X_{(v_3,1)}\leq M_{v_3} \\ X_{(v_4,0)}\leq M_{v_4} && X_{(v_4,1)}\leq M_{v_4} \\ \end{align*} the constraint ensuring the goal-awareness \[M_{v_1}+M_{v_2}+X_{(v_3,1)}+X_{(v_4,1)}\leq 0\] and the constraints ensuring consistency \begin{align*} X_{(v_1,0)} - X_{(v_1,1)} & \leq 1\\ X_{(v_1,1)} + M_{v_2} - X_{(v_1,0)} - X_{(v_2,1)} & \leq 2\\ X_{(v_2,1)} + M_{v_3} - X_{(v_2,0)} - X_{(v_3,1)} & \leq 2\\ M_{v_4} - X_{(v_4,1)} & \leq 3\\ \end{align*} The optimum value for the LP above is $8$. Thus, we get a better value than the post-hoc heuristic. The perfect heuristic $h^*(s_0)=9$. ++++