====== T4c-patrol - Patrolling in polygonal environment ======
The task is to implement the computation of turning probability in perimeter patrolling.
|**Deadline** | January 7, 23:59 PST |
|**Points** | 3 |
|**Label in BRUTE** | t4c-patrol |
|**Evaluation** | upload strategy computation in ''strategy.py'' to brute |
|**Resources** | none |
\\
===Assignment===
Implement following algorithm to compute turning probability $p$.
Create a matrix of size [2d + 1, 2d + 1], initialized with 0s
// Fill out all entries in M as follows:
M[2d, 2d] = 1
for i = 0, i < 2d, i++ do
direction = (i mod 2==0 ? 1 : -1)
M[i,(2*direction+i) mod (2d+1)] = p
M[i,(direction+i) mod (2d+1)] = 1-p
end
// M to the power of t to calculate state probabilities after t steps
// in a sense of a matrix multiplication not only elementwise
MT = M^t
Res = vector of size d initialized with 0s
// Collect resulting probabilities
for loc = 0, loc < d, loc++ do
Res[loc] = MT[2loc][2d]
end
return Res
To solve the problem, implement the following two functions, ''generate_function'' and ''optimal_strategy'' and upload them to BRUTE in file ''strategy.py''.
The ''generate_function'' function is prescribed as follows:
def generate_function(d, t, p):
"""
Function to generate the probability of stopping an attack for each segment given some p
Parameters
----------
d:int
number of spaces between two closest agents
t:int
the time required by the attacker to breach the fence
p:float
probability of continuing forward
Returns
-------
f:float[d]
list of probabilities of stopping an attack for each segment
example behavior: input: $(3,2,0.5)$, returns: $[0.25,0.25,0.5]$.
example behavior: input: $(4,2,0.25)$, returns: $[0.1875, 0.0, 0.0625, 0.25]$.
The ''optimal_strategy'' function has the following prescription
def optimal_strategy(d, t, splits):
"""
Function to generate the probability of stopping an attack for each segment given some p
Parameters
----------
d:int
number of spaces between two closest agents
t:int
the time required by the attacker to breach the fence
splits:int
a number of values of p from 0 to 1 to be created, e.g. for splits = 3, we will have p in [0,0.5,1]
Returns
-------
f_all:float[d][splits]
list of lists, one for each segment, and in the list for the segment are probabilities of stopping an attack for all possible p
f_min:float[splits]
list of minimal probabilities of stopping an attack for each p
strategy:float
value of the optimal p
prob:float
the worst-case probability of stopping an attack given strategy
example behaviors: input: $(3,2,3)$, returns: $([[0.0, 0.25, 0.0], [0.0, 0.25, 1.0], [0.0, 0.5, 1.0]], [0.0, 0.25, 0.0], 0.5, 0.25)$
input: $(3,2,5)$, returns: $([[0.0, 0.1875, 0.25, 0.1875, 0.0], [0.0, 0.0625, 0.25, 0.5625, 1.0], [0.0, 0.25, 0.5, 0.75, 1.0]], [0.0, 0.0625, 0.25, 0.1875, 0.0], 0.5, 0.25)$
\\
=== Evaluation ===
The strategy computing functions should be submitted to Brute and if there is some mistake it will show you the input and wrong output on public testing scenarios.
\\
==Detailed description of the problem==
The environment comprises a closed polygonal environment $P$ which is threatened by an adversary attack.
To prevent this attack, the polygon $P$ is being patrolled by $k$ homogeneous mobile robots.
The adversary needs to continuously attack a spot on the polygon for the duration $t$ to penetrate the polygon.
The aim of this assignment is to compute such a patrolling strategy that minimizes the probability of a successful attack.
Such a strategy comprises dividing the polygon into segments, and changing the direction of the robot patrol.
\\
==Polygon segmentation==
The path around the polygon $P$ may be divided into $N$ segments, where a robot traverses each segment with uniform duration.
Therefore, there exists such a time-periodic function that each robot traverses exactly one segment per its period.
Moreover, the robot turning action has a non-zero duration of $\tau$.
It is recommended to use the turning time $\tau$ as the time unit, i.e., $\tau = 1$.
{{ :courses:uir:projects:polygon.png?1200|Polygon environment division.}}
Figure 1: Polygon environment and the division to segments.
Note that on the polygon segments are homogeneously spaced since the segmentation is based on a uniform division of time and some actions, e.g., driving along curves, may take an increased amount of time.
\\
==The optimal strategy==
The authors of [1] show that in an optimal strategy the robots are always uniformly distributed along the fence, and that at any time all the robots are always moving in a same direction.
Let $d = \frac{N}{k}$ be the distance in segments between neighbor robots.
Then if $t \geq d$ the strategy is trivial since it is sufficient for the robots to move forward in an arbitrary direction as such a behavior will always intercept any attacker.
Conversely, if $t < \frac{d+\tau -1}{2}$, $\tau = 1$, and $t < \frac{d}{2}$ the attacker can always select a point on the polygon where the penetration succeeds.
Assuming that neither of the trivial cases holds, the optimal strategy is a stochastic one that may be generalized as follows: for each segment, continue moving forward with the probability $p$, or turn all robots at once with the probability $1-p$.
To compute the optimal $p$, consider the segments between two neighbor robots.
Note, due to the symmetric nature of the problem, this applies for all such segments.
A Markov chain may be used to model the possible states and transitions between states in such system.
For each segment $s_i$ in the original path, consider two states in the graphic model G; one state for being on $s_i$ and moving clockwise ($s_i^{cw}$), and one for moving counterclockwise ($s_i^{cc}$).
Since $\tau = 1$, the edges of G are defined as follows.
The robot direction change is represented by an $1-p$ probability edge, which leads from $s_i^{cc}$ to $s_i^{cw}$.
Similarly, an $1-p$ probability edge leads from $s_i^{cw}$ to $s_i^{cc}$.
Finally, edges with probability $p$ lead from $s_i^{cw}$ to $s_{i-1}^{cw}$, and from $s_i^{cc}$ to $s_{i+1}^{cc}$, respectively.
Note, if the index of the target state of an edge would be lower than one or bigger than $d$, the edge goes instead to the absorbing state $s_{dt}$.
The algorithm returns a list of values, one for each of the $d$ segments.
The value denotes the probability of stopping an attack in the segment given $p$.
Therefore, to minimize the probability of a successful attack, optimize over values of $p$ from 0 to 1.
Note, thousand or even hundred values should be sufficient.
For each value of $p$ compute the worst-case segment, i.e., the segment with the lowers probability of stopping the attack.
Then, choose $p$ that maximizes this worst-case.
{{:courses:uir:projects:all_functions.png?450 |Plot of functions for all segments.}}{{:courses:uir:projects:min_function.png?450 |Min taken from all the functions.}}
{{ :courses:uir:projects:matrix_patroll.png?400 |Matrix after first for-cycle}}
a) All segment probabilities for all $p$s; b) the worst cases. The selected maximum over the worst cases corresponds to $p = 0.75$; c) matrix M after first iteration.
\\
===Reference===
[1] Agmon, Noa, Gal A. Kaminka, and Sarit Kraus. "Multi-robot adversarial patrolling: facing a full-knowledge opponent." Journal of Artificial Intelligence Research 42 (2011): 887-916.