Warning

# T4c-patrol - Patrolling in polygonal environment

The task is to implement the computation of turning probability in perimeter patrolling.

 Deadline January 7, 23:59 PST Points 3 Label in BRUTE t4c-patrol Evaluation upload strategy computation in strategy.py to brute Resources none

#### Assignment

Implement following algorithm to compute turning probability $p$.

Create a matrix of size [2d + 1, 2d + 1], initialized with 0s
// Fill out all entries in M as follows:
M[2d, 2d] = 1
for i = 0, i < 2d, i++ do
direction = (i mod 2==0 ? 1 : -1)
M[i,(2*direction+i) mod (2d+1)] = p
M[i,(direction+i) mod (2d+1)] = 1-p
end
// M to the power of t to calculate state probabilities after t steps
// in a sense of a matrix multiplication not only elementwise
MT = M^t
Res = vector of size d initialized with 0s
// Collect resulting probabilities
for loc = 0, loc < d, loc++ do
Res[loc] = MT[2loc][2d]
end
return Res

To solve the problem, implement the following two functions, generate_function and optimal_strategy and upload them to BRUTE in file strategy.py. The generate_function function is prescribed as follows:

def generate_function(d, t, p):
"""
Function to generate the probability of stopping an attack for each segment given some p

Parameters
----------
d:int
number of spaces between two closest agents
t:int
the time required by the attacker to breach the fence
p:float
probability of continuing forward

Returns
-------
f:float[d]
list of probabilities of stopping an attack for each segment
example behavior: input: $(3,2,0.5)$, returns: $[0.25,0.25,0.5]$.

example behavior: input: $(4,2,0.25)$, returns: $[0.1875, 0.0, 0.0625, 0.25]$.

The optimal_strategy function has the following prescription

def optimal_strategy(d, t, splits):
"""
Function to generate the probability of stopping an attack for each segment given some p

Parameters
----------
d:int
number of spaces between two closest agents
t:int
the time required by the attacker to breach the fence
splits:int
a number of values of p from 0 to 1 to be created, e.g. for splits = 3, we will have p in [0,0.5,1]

Returns
-------
f_all:float[d][splits]
list of lists, one for each segment, and in the list for the segment are probabilities of stopping an attack for all possible p
f_min:float[splits]
list of minimal probabilities of stopping an attack for each p
strategy:float
value of the optimal p
prob:float
the worst-case probability of stopping an attack given strategy

example behaviors: input: $(3,2,3)$, returns: $([[0.0, 0.25, 0.0], [0.0, 0.25, 1.0], [0.0, 0.5, 1.0]], [0.0, 0.25, 0.0], 0.5, 0.25)$

input: $(3,2,5)$, returns: $([[0.0, 0.1875, 0.25, 0.1875, 0.0], [0.0, 0.0625, 0.25, 0.5625, 1.0], [0.0, 0.25, 0.5, 0.75, 1.0]], [0.0, 0.0625, 0.25, 0.1875, 0.0], 0.5, 0.25)$

#### Evaluation

The strategy computing functions should be submitted to Brute and if there is some mistake it will show you the input and wrong output on public testing scenarios.

##### Detailed description of the problem

The environment comprises a closed polygonal environment $P$ which is threatened by an adversary attack. To prevent this attack, the polygon $P$ is being patrolled by $k$ homogeneous mobile robots. The adversary needs to continuously attack a spot on the polygon for the duration $t$ to penetrate the polygon. The aim of this assignment is to compute such a patrolling strategy that minimizes the probability of a successful attack. Such a strategy comprises dividing the polygon into segments, and changing the direction of the robot patrol.

##### Polygon segmentation

The path around the polygon $P$ may be divided into $N$ segments, where a robot traverses each segment with uniform duration. Therefore, there exists such a time-periodic function that each robot traverses exactly one segment per its period. Moreover, the robot turning action has a non-zero duration of $\tau$. It is recommended to use the turning time $\tau$ as the time unit, i.e., $\tau = 1$.

Figure 1: Polygon environment and the division to segments. Note that on the polygon segments are homogeneously spaced since the segmentation is based on a uniform division of time and some actions, e.g., driving along curves, may take an increased amount of time.

##### The optimal strategy

The authors of [1] show that in an optimal strategy the robots are always uniformly distributed along the fence, and that at any time all the robots are always moving in a same direction. Let $d = \frac{N}{k}$ be the distance in segments between neighbor robots. Then if $t \geq d$ the strategy is trivial since it is sufficient for the robots to move forward in an arbitrary direction as such a behavior will always intercept any attacker. Conversely, if $t < \frac{d+\tau -1}{2}$, $\tau = 1$, and $t < \frac{d}{2}$ the attacker can always select a point on the polygon where the penetration succeeds. Assuming that neither of the trivial cases holds, the optimal strategy is a stochastic one that may be generalized as follows: for each segment, continue moving forward with the probability $p$, or turn all robots at once with the probability $1-p$.

To compute the optimal $p$, consider the segments between two neighbor robots. Note, due to the symmetric nature of the problem, this applies for all such segments. A Markov chain may be used to model the possible states and transitions between states in such system. For each segment $s_i$ in the original path, consider two states in the graphic model G; one state for being on $s_i$ and moving clockwise ($s_i^{cw}$), and one for moving counterclockwise ($s_i^{cc}$). Since $\tau = 1$, the edges of G are defined as follows. The robot direction change is represented by an $1-p$ probability edge, which leads from $s_i^{cc}$ to $s_i^{cw}$. Similarly, an $1-p$ probability edge leads from $s_i^{cw}$ to $s_i^{cc}$. Finally, edges with probability $p$ lead from $s_i^{cw}$ to $s_{i-1}^{cw}$, and from $s_i^{cc}$ to $s_{i+1}^{cc}$, respectively. Note, if the index of the target state of an edge would be lower than one or bigger than $d$, the edge goes instead to the absorbing state $s_{dt}$.

The algorithm returns a list of values, one for each of the $d$ segments. The value denotes the probability of stopping an attack in the segment given $p$. Therefore, to minimize the probability of a successful attack, optimize over values of $p$ from 0 to 1. Note, thousand or even hundred values should be sufficient. For each value of $p$ compute the worst-case segment, i.e., the segment with the lowers probability of stopping the attack. Then, choose $p$ that maximizes this worst-case.

a) All segment probabilities for all $p$s; b) the worst cases. The selected maximum over the worst cases corresponds to $p = 0.75$; c) matrix M after first iteration.

#### Reference

[1] Agmon, Noa, Gal A. Kaminka, and Sarit Kraus. “Multi-robot adversarial patrolling: facing a full-knowledge opponent.” Journal of Artificial Intelligence Research 42 (2011): 887-916.