Warning
This page is located in the archive.

Exercise 6

Program:

  • Introduction to STPR toolbox
  • Classification using SVM from STPRtoolbox:
    • Classification of the wedge dataset
    • Classification of the XOR dataset
    • Classification of the hand-written digits

Downloads:

You will likely not manage to finish the implementation of all the functions during the exercise in the lab. Finish them as homework.

STPR toolbox installation

  1. Download the STPR toolbox, unpack it to a directory.
  2. In MATLAB, go to that directory and run the script stprpath which sets the needed paths to the individual parts of the toolbox.
  3. In the same directory, run the script compilemex which compiles the parts of toolbox written in C.
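The installation steps above can be sketched directly at the MATLAB prompt (the directory name is just an example):

    cd stprtool        % the directory where the toolbox was unpacked
    stprpath           % adds the toolbox subdirectories to the MATLAB path
    compilemex         % compiles the C parts of the toolbox (MEX files)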

Answer the following questions:

  • What is the purpose of the following functions:
    • svm2()
    • bsvm2()
    • svmclass()
  • Do you understand the meaning of their arguments?
  • In what format does the toolbox expect the data? Columns or rows? Is it the same format we used in previous exercises?
  • Do you see the analogy with functions you used last week?
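To check your understanding of the arguments, a minimal training call might look like this (the options field names follow the toolbox conventions; verify them with help svm2):

    % data.X is a d-by-n matrix (one example per column),
    % data.y is a 1-by-n vector of class labels 1/2
    options.ker = 'linear';   % kernel type
    options.C = 10;           % SVM regularization constant
    model = svm2(data, options);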

The ''model'' structure

The output of all training algorithms in the STPR toolbox is a structure called a model. After using the svm2 function, the model structure may look like this:

model = 
 
       Alpha: [40x1 double]
           b: -2.3417
          sv: [1x1 struct]
         nsv: 40
           W: [64x1 double]
     options: [1x1 struct]
      kercnt: 46909
      trnerr: 0
      errcnt: 0
    exitflag: 2
        stat: [1x1 struct]
     cputime: 0.1993
         fun: 'svmclass'

The structure contains all the information needed to use an SVM classifier:

  • model.Alpha is a vector of the Lagrange multipliers alpha. Each training example has its own alpha. Many of these alphas may be zero; a non-zero alpha indicates that the respective training example is actually a support vector. The vector model.Alpha contains only the non-zero values, so its length is equal to the number of support vectors.
  • model.b is the absolute term of the model (the bias).
  • model.W is the weight vector of the resulting linear discriminant function. It is part of the model structure only when the linear kernel is used. Using the weight vector, the discriminant function is equal to model.W'*x + model.b.
  • model.sv is a structure containing information about the support vectors. It contains X and y for all support vectors and also indices of the support vectors in the original training set.
  • model.fun is the name of the MATLAB function that should be used to apply the model to new data, i.e. to classify the data in this case. For SVMs, this field will contain either linclass or svmclass.
  • The other fields of the model structure are less important. They contain, e.g., a copy of the options used when training the model, some statistics about the model optimization process, or the error on the training dataset.
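Because model.fun stores the name of the classification function, a trained model can be applied generically; a minimal sketch (assuming tst.X holds the test examples in columns):

    % Classify test data with the function named in model.fun
    ypred = feval(model.fun, tst.X, model);   % e.g. calls svmclass(tst.X, model)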

What exactly does the following command check?

all(model.sv.X * model.Alpha == model.W)

The wedge and XOR datasets

Use the scripts from the last week where you classified the wedge and XOR datasets using neural networks from NETLAB toolbox.

  1. Modify the scripts to classify the datasets using SVM (start with the default linear kernel).
    • The STPR toolbox needs class labels 1 and 2 (instead of our 0 and 1). Recode the y variable.
  2. Visualize the prediction of the models.
    • You can use the functions pboundary(), pareas() and pwpatterns(), e.g. in the following way:

    % Create a new figure
    figure; hold on;
    % Fill areas that belong to classes 1 and 2, respectively
    ha = pareas(model);
    % Plot the data, distinguished by colors
    hx = pwpatterns(data); 
    % Plot circles around the support vectors
    plot(model.sv.X(1,:), model.sv.X(2,:),'ko', 'Linewidth', 2, 'MarkerSize',8);
    % Plot the boundary between classes and make it thicker
    hl = pboundary(model); set(hl,'Linewidth',3,'color','k');
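
The label recoding mentioned in step 1 can be done in one line, assuming the variables X and y from last week's scripts hold the examples and the 0/1 labels:

    data.X = X;       % examples in columns, as the toolbox expects
    data.y = y + 1;   % recode labels: 0 -> 1, 1 -> 2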
 

Explore:

  • What kernel types are available for SVM in STPRtool? Try all of them, set different parameters for them, and look at the effects.
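The kernel type and its parameters are typically selected through the options structure; for instance (field names per the toolbox conventions, see help svm2):

    options.ker = 'rbf';   % radial basis function kernel
    options.arg = 1;       % kernel argument (e.g. the RBF width)
    options.C = 10;        % regularization constant
    model = svm2(trn, options);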

Hand-written digits dataset

Binary classification

Use the hand-written digits dataset as data for classification.

  • Define 2 classes (e.g. 1s against 8s, or {0,1,2,3,4} against {5,6,7,8,9}).
  • Train a linear SVM (optimal separating hyperplane) on the training data.
  • Test the trained SVM on the testing data:
    • Use errRate() and confmat() to compute the misclassification rate and to display the confusion matrix.
  • Try to change the kernel type and kernel parameters. Write down the error rates for training and testing data.
  • Which SVM setting (kernel type + parameter setting) is the best for your task?
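The evaluation steps above can be sketched as follows (assuming errRate() and confmat() take the true and predicted labels in this order; check their help):

    ypred = svmclass(tst.X, model);      % classify the test data
    err = errRate(tst.y, ypred)          % misclassification rate
    C = confmat(tst.y, ypred)            % confusion matrix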

Multiclass classification

Training of a multiclass SVM can be done, e.g., by the bsvm2() function.

  • What are the arguments of bsvm2()? How should the class labels be given to this function?
  • What are the other functions for multiclass SVM training?
  • Try to build an SVM model able to discriminate between all the digits (10 classes):
    • Display the confusion matrix and explore which numbers are hard to discriminate from the others.
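A multiclass run might look like this (assuming trn.y contains the labels 1 through 10; verify the options with help bsvm2):

    options.ker = 'rbf'; options.arg = 5; options.C = 10;
    model = bsvm2(trn, options);                 % one model for all 10 classes
    ypred = feval(model.fun, tst.X, model);      % classify the test digits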

Additional questions

  1. Explore the functions perceptron() and linclass(). Do you see the similarity with the functions trainClassLinearPerceptron() and predClassLinear() that you created during Exercise 3?
  2. Try the demos from the STPR toolbox, demo_linclass and demo_svm.
courses/y33aui/cviceni/cviceni06.txt · Last modified: 2013/10/04 13:02 (external edit)