Local search 2

The goal of this lab exercise is the implementation of a local search algorithm for problems where the solution is represented as a vector of real numbers. This algrithm will serve as one of the baseline algorithms in our comparisons; its parts will be useable in evolutionary algorithms. If you do not finish the tasks during the lab exercise, finish them as a homework!

You are supposed to come up with your own implementation of the tasks below. You can probably find many libraries for EA in each language, but for educational reasons, here we require your own work. Please read Rules of independent work.

Fitness functions

You already should have functions $f_{Sphere}$ a $f_{Rosenbrock}$ implemented from the last week. Today, we will try to optimize them using their natural representation, vector of real numbers. In addition, will shall try to optimize also the functions below.

Again, we assume that $x_i$ denotes $i$th item of the real vector, $D$ is the length of the vector (if not stated otherwise).

Linear

Linear is a basic, easily solvable function of real arguments and serves mostly as a sanity check of your algorithm. It tests a different ability that the sphere function: the results on linear function shows how your algorithm behaves if you initialize it in a wrong way, i.e., when the population does not surround the optimum.

Minimize $f_{Linear}(\mathbf{x}) = a_0 + \sum_{i=1}^D a_ix_i$.

This function can be used without constraints, but then it does not have a finite optimum (which is not an issue for algorithm comparison). If you wanted this function to have a finite optimum, you have to constrain the feasible space. Some algorithms may behave differently if you choose the constraints near the coordinate origin, e.g., $\langle -1,1 \rangle$, or if you choose the feasible space away from origin, e.g., $\langle 99, 101 \rangle$. The signs of $a_i$ determine in which corner of the box-constrained feasible space the optimum lies. You can choose all $a_i=1$, or choose them randomly.

Step

Step function is similar to linear function, but it has a step-wise character. Its gradient (where it exists) does not provide any information where the optimum lies.

Minimize $f_{Step}(\mathbf{x}) = a_0 + \sum_{i=1}^{D} \lfloor a_ix_i \rfloor$.

Constraints and the optimal point - see linear function (the only difference is that the set of global optima is formed by the whole single “stair”).

Rastrigin

Rastrigin's function is separable and multimodal (has many local optima). It looks like a bended eggholder.

Minimize $f_{Rastrigin}(\mathbf{x}) = 10D + \sum_{i=1}^{D} \left[x_i^2 - 10\cos(2\pi x_i) \right]$.

You can use this function in an uncostrained setting: the initialization range is usually chosen as $\langle -5.12, 5.12 \rangle^D$. But you can try to initialize the algorithm in much broader range, e.g., $\langle -100, 100\rangle^D$, and evaluate whether it helps or harms the algorithm.

Griewank

Griewank's function is multimodal too (similarl to Rastrigin), but it has more local optima and is not separable.

Minimize $f_{Griewank}(\mathbf{x}) = 1 + \frac1{4000}\sum_{i=1}^{D} x_i^2 - \prod_{i=1}^D \cos\left(\frac{x_i}{\sqrt{i}}\right)$.

You can use this function in an unconstrained setting, the initialization range is usually set to $\langle -600, 600 \rangle^D$.

Schwefel

Schwefel's is multimodal too, but its local optima are the closer to each other, the further you are from global optimum. Thus, it has some deceptive features.

Minimize $f_{Schwefel}(\mathbf{x}) = -\sum_{i=1}^{D} x_i\sin(\sqrt{|x_i|})$.

The function is usually constrained to the range $\langle -512.03, 511.97 \rangle^D$ and its minimal value is -418.9829*D.

Perturbation, mutation

With normal distribution

Implement the operation of perturbation/mutation for vectors of real numbers. The operator should create the perturbed vector by adding a vector of real numbers, each sampled from a normal distribution $N(0, \sigma^2)$.

Parameter $\sigma$, standard deviation of the used normal distribution.
Input Vector of real numbers.
Output Perturbed vector.

Similarly to binary representation, decide whether you will directly modify the input to the operator, or wheter you will return a perturbed copy.

Any other perturbation

Implement at least one other perturbation operator for real vectors. E.g.,

  • use a different distribution than Gaussian:
    • uniform on an interval centered on the perturbed vector, where the interval size will be a parameter, or use
    • Cauchy distribution in each coordinate with parameter $\gamma$, or
  • use a pattern of points in the neigborhood of the perturbed point, e.g.
    • return any of the $2D$ points in the distance of $\pm d$ from the perturbed point in one of $D$ coordinates, etc.

Apply the algorithm of local search (that you implemented last week) to the above listed functions. If you implemented the algorithm in a suitable way, you should not be required to change the implementation; you should only change the fitness function, initialization procedure and perturbation operator.

Local search and one-fifth rule

You have probably realized that local search with first-improving strategy is actually equivalent to (1+1)-ES. Create an adaptive version of the local search algorithm using the 1/5 rule to adapt the size parameter of the perturbation. The 1/5 rule is described in the lecture on real-valued EAs (somewhere between slides 20-30).

Vizualization

Try to run the simulations, collect the data and create graphs similar to the graphs in the lecture slides. Try to do it for various fitness functions.

Choose any vizualization tool you like. If your chosen language does not provide an easy way to vizualize the data, store the data e.g. in CSV file and vizualize it in MATLAB, Python, Excel or Google Sheets.

courses/a0m33eoa/labs/week_03.txt · Last modified: 2022/10/03 18:22 by xposik