
Exercise 2

Program:

  • Revision: Training and testing error, basis expansion
  • Linear regression and polynomial model
  • Nearest neighbors regression

Downloads:

You will likely not manage to finish implementing all the functions during the exercise in the lab. Finish them as homework.

Revision

Be sure to understand the following concepts:

  • training and testing error, their relation to model flexibility (lecture 1)
  • basis expansion (lecture 2)

Linear regression "theory"

Linear regression is used to model continuous dependent variables. The model has the form of a linear function.

1D case

Assume we have 1D data. Each object is described by a single feature (<latex>$x_i \in X$</latex>) and by the value of the dependent variable (<latex>$y_i \in Y$</latex>). The goal is to find the function (or, more precisely, the parameters <latex>$w_1$</latex> and <latex>$w_0$</latex> of the function)

<latex> $$ f(x) = w_1x + w_0 $$ </latex>

so that the error

<latex> $$ MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - f(x_i))^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - (w_1x_i + w_0))^2 $$ </latex>

which the model makes on the training data is minimal.

In the case of linear regression with MSE minimization, there are explicit formulas to compute the weights from the training data:

<latex> \[ w_1 = \frac{\sum_{i=1}^N (x_i - \bar X)(y_i - \bar Y)}{\sum_{i=1}^N (x_i - \bar X)^2} = \frac{s_{XY}}{s_X^2} \qquad \text{and} \qquad w_0 = \bar Y - w_1\bar X, \] </latex>

where <latex>$s_{XY}$</latex> is the covariance of X and Y and <latex>$s_{X}^2$</latex> is the variance of X.

1D case, homogeneous coordinates

If we introduce so-called homogeneous coordinates (the description of each object is enriched with one feature that always has the value 1), the matrix of input variables will look like this:

<latex> \[ X = \left( \begin{array}{cccc} x_1 & x_2 & \ldots & x_N \\ 1 & 1 & \ldots & 1 \end{array} \right). \] </latex>

Arranging all the coefficients of the linear function into a vector <latex>$\mathbf{w}=(w_1, w_0)^T$</latex>, we can apply the linear function to all objects at once as

<latex> \[ Y = \mathbf{w}^T X. \] </latex>

The weight vector can then be computed as

<latex> \[ \mathbf{w} = (XX^T)^{-1}XY^T \] </latex>

Multidimensional case

In D-dimensional space we can use the same formula. The weight vector will contain D+1 coefficients, and each object will be described by D+1 variables (one of them always equal to 1).

Modeling data using linear regression

Linear regression in MATLAB

Learn how to compute the weight vector in MATLAB; a minimal sketch follows the list below. Do not use the formula above; instead, see

  • help slash and
  • help regress.
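For illustration, a sketch of the 1D fit using the backslash operator might look like this (x and y are assumed to be [1 x N] row vectors of training data):

% Fit f(x) = w1*x + w0 by least squares using the backslash operator.
% Assumes x and y are [1 x N] row vectors of the training data.
X = [x; ones(1, numel(x))];   % homogeneous coordinates, [2 x N]
w = X' \ y';                  % least-squares solution of X'*w = y', [2 x 1]
% w(1) is the slope w1, w(2) is the intercept w0.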

Simple polynomial mapping

Using linear regression, we would also like to fit polynomial models to the data. This can be accomplished by basis expansion, i.e., by first mapping the data into a higher-dimensional space.

Create a function with the following prototype:

function xp = mapForPolynom(x, k)

Inputs:

  • x [1 x N]: 1D description of objects. Values of the input for all objects.
  • k [scalar]: the highest power of x to be present in the output.

Outputs:

  • xp [(k+1) x N]: matrix of powers of x.

Output xp: the first row will be equal to <latex>$x^k$</latex>, the second row to <latex>$x^{k-1}$</latex>, …, the next-to-last row will be equal to <latex>$x$</latex>, and the last row will be all ones.
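One possible implementation sketch:

function xp = mapForPolynom(x, k)
% Map a 1D row vector x to its powers x^k, x^(k-1), ..., x^1, x^0,
% one power per row, so that xp is [(k+1) x N].
xp = zeros(k+1, numel(x));
for p = 0:k
    xp(k+1-p, :) = x .^ p;   % row k+1-p holds x^p; the last row is all ones
end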

Training the polynomial model

We would like to use polynomial models of various degrees to model the data.

Create a function with the following prototype:

function model = trainRegrPolynom(x, y, k)

Inputs:

  • x [1 x N]: 1D description of objects. Values of the independent variable for all training data.
  • y [1 x N]: the vector of values of the dependent variable for all training data.
  • k [scalar]: the degree of the polynomial we would like to use as the model.

Outputs:

  • model [(k+1) x 1]: vector of coefficients of the polynomial model.
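A possible sketch, reusing mapForPolynom and the backslash operator:

function model = trainRegrPolynom(x, y, k)
% Fit a degree-k polynomial by linear regression on the expanded data.
xp = mapForPolynom(x, k);   % basis expansion, [(k+1) x N]
model = xp' \ y';           % least-squares coefficients, [(k+1) x 1]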

Applying the polynomial model to new data

After training a polynomial model, we would also like to apply it to new data.

Create a function with the following prototype:

function yp = predRegrPolynom(model, x)

Inputs:

  • model [(k+1) x 1]: vector of coefficients of the polynomial model.
  • x [1 x N]: 1D description of objects. Values of the input for all test objects.

Outputs:

  • yp [1 x N]: the vector of predictions of the dependent variable for all test objects.
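A possible sketch; note that the test data must be expanded with the same mapping as in training:

function yp = predRegrPolynom(model, x)
% Apply the polynomial model to new data.
k = numel(model) - 1;       % degree implied by the number of coefficients
xp = mapForPolynom(x, k);   % the same basis expansion as in training
yp = model' * xp;           % predictions, [1 x N]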

Putting it all together

Create a script that will

  • load the data,
  • split them into a training and a testing set,
  • train a polynomial model of a chosen degree, and
  • compute the error on both training and testing data.

Once this script works perfectly, enrich it with the ability to

  • model the data with polynomials of increasing degree (say, from 0 to 10), and
  • plot the error on training and testing data as a function of the polynomial degree (a possible skeleton is sketched below).
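A possible skeleton of the whole script; it assumes the data are already loaded into row vectors x and y, and all variable names are only illustrative:

% Split the data into training and testing halves at random.
N = numel(x);
perm = randperm(N);
itr = perm(1:floor(N/2));
itst = perm(floor(N/2)+1:end);

% Train polynomial models of degrees 0 to 10 and record both errors.
degs = 0:10;
trainMSE = zeros(size(degs));
testMSE = zeros(size(degs));
for k = degs
    model = trainRegrPolynom(x(itr), y(itr), k);
    trainMSE(k+1) = mean((y(itr) - predRegrPolynom(model, x(itr))).^2);
    testMSE(k+1) = mean((y(itst) - predRegrPolynom(model, x(itst))).^2);
end

% Plot both errors as a function of the polynomial degree.
plot(degs, trainMSE, 'b.-', degs, testMSE, 'r.-');
xlabel('degree of polynomial'); ylabel('MSE');
legend('training error', 'testing error');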

Nearest-neighbors method

The k-nearest neighbors (kNN) method is a simple and universal modeling method. It forms its predictions in the following way:

  • for each object x in the test set
    • it looks up the k most similar objects in the training set, and
    • it aggregates their values of the dependent variable to form the prediction. Usually, the aggregation means
      • (weighted) averaging (or taking the median) in the case of regression modeling,
      • (weighted) majority voting in the case of classification.

In the case of polynomial modeling, the model was a vector of weights. What is the model in the kNN method?

Helper function

We need to be able to find, for the points of one set, their k nearest neighbors in another set.

Create a function with the following prototype:

function iknn = findkNN(xtr, xtst, k)

Inputs:

  • xtr [D x Ntr]: matrix of training (prototype) vectors; we search for the neighbors among them.
  • xtst [D x Ntst]: matrix of testing (query) vectors; we search for the neighbors for them.
  • k [scalar]: the number of neighbors.

Outputs:

  • iknn [k x Ntst]: matrix of indices of the nearest neighbors.

Output iknn: for each testing vector, it will contain indices into the matrix of training vectors.

  • iknn(:, 3) will be a column vector of indices of k nearest neighbors of the 3rd testing vector.
  • xtr(:, iknn(1,3) ) should be the nearest neighbor of the 3rd testing vector found in the training points.

A few hints (one possible sketch follows the list):

  • Do you remember the distmat() function we created last week?
  • What does the function min return when called in the following way: [foo, imin] = min(dm, [], 1)?
  • Can the distance matrix be modified somehow so that an additional call to min returns the second, third, etc. lowest element?
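One possible sketch that uses sort instead of repeated calls to min; Euclidean distance is assumed, and the distance computation could equally be delegated to distmat():

function iknn = findkNN(xtr, xtst, k)
% For each column of xtst, find the indices of its k nearest columns of xtr.
Ntr = size(xtr, 2);
Ntst = size(xtst, 2);
dm = zeros(Ntr, Ntst);       % dm(i, j): distance from xtr(:,i) to xtst(:,j)
for j = 1:Ntst
    d = xtr - repmat(xtst(:, j), 1, Ntr);
    dm(:, j) = sqrt(sum(d .^ 2, 1))';
end
[foo, idx] = sort(dm, 1);    % sort each column by increasing distance
iknn = idx(1:k, :);          % keep the k nearest, [k x Ntst]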

Training the kNN model

Create a function with the following prototype:

function model = trainRegrKNN(x, y, k)

Inputs:

  • x [D x N]: values of the inputs for all objects; in this case, D = 1.
  • y [1 x N]: the vector of values of the dependent variable for all objects.
  • k [scalar]: the number of nearest neighbors we would like to use in the model.

Outputs:

  • model [struct]: the training data x, y and the number of neighbors k.
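A possible sketch; as the question above suggests, "training" a kNN model amounts to storing the data:

function model = trainRegrKNN(x, y, k)
% kNN has no real training phase: the model just stores the training data.
model.x = x;
model.y = y;
model.k = k;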

Applying the kNN model to new data

After training the kNN model, we would also like to apply it to new data.

Create a function with the following prototype:

function yp = predRegrKNN(model, x)

Inputs:

  • model [struct]: model for the kNN method; the output of the trainRegrKNN function.
  • x [D x N]: values of the inputs for all test objects; in this case, D = 1.

Outputs:

  • yp [1 x N]: the vector of predictions of the dependent variable for all test objects.
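A possible sketch that aggregates the neighbors by plain (unweighted) averaging:

function yp = predRegrKNN(model, x)
% Predict by averaging y over the k nearest training neighbors of each point.
iknn = findkNN(model.x, x, model.k);   % neighbor indices, [k x N]
yp = mean(model.y(iknn), 1);           % average over the k neighbors, [1 x N]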

Putting it all together

Create a script that will

  • load the data,
  • split them into a training and a testing set,
  • train a kNN model with a chosen k, and
  • compute the error on both training and testing data.

Once this script works perfectly, enrich it with the ability to

  • model the data with the kNN method for increasing k (say, from 1 to 20), and
  • plot the error on training and testing data as a function of the number of nearest neighbors k (a possible sketch follows).
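The experiment loop might look like this sketch, reusing the train/test split itr, itst from the polynomial script above:

% Train kNN models with k = 1..20 and record both errors.
ks = 1:20;
trainMSE = zeros(size(ks));
testMSE = zeros(size(ks));
for k = ks
    model = trainRegrKNN(x(:, itr), y(itr), k);
    trainMSE(k) = mean((y(itr) - predRegrKNN(model, x(:, itr))).^2);
    testMSE(k) = mean((y(itst) - predRegrKNN(model, x(:, itst))).^2);
end

% Plot both errors as a function of the number of neighbors k.
plot(ks, trainMSE, 'b.-', ks, testMSE, 'r.-');
xlabel('number of neighbors k'); ylabel('MSE');
legend('training error', 'testing error');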