====== Exercise 6 ======

Program:
  * Introduction to the STPR toolbox
  * Classification using SVMs from the STPR toolbox:
    * Classification of the wedge dataset
    * Classification of the XOR dataset
    * Classification of the hand-written digits

Downloads:
  * [[http://cmp.felk.cvut.cz/cmp/software/stprtool/index.html|STPR toolbox]]
  * {{:courses:y33aui:cviceni:datawedge.txt|Wedge dataset}}
  * {{:courses:y33aui:cviceni:dataxor.txt|XOR dataset}}
  * {{:courses:y33aui:cviceni:optdigits.zip|Hand-written digits dataset}}
  * {{:courses:y33aui:cviceni:visfuncsforstpr.zip|Helper functions for visualization}} (not part of the STPR toolbox)

**You will likely not manage to finish the implementation of all the functions during the exercise in the lab. Finish them as homework.**

===== STPR toolbox installation =====

  - Download the STPR toolbox and unpack it to a directory.
  - In MATLAB, go to that directory and run the script ''stprpath'', which sets the paths to the individual parts of the toolbox.
  - In the same directory, run the script ''compilemex'', which compiles the parts of the toolbox written in C.

Answer the following questions:
  * What is the purpose of the following functions?
    * ''svm2()''
    * ''bsvm2()''
    * ''svmclass()''
  * Do you understand the meaning of their arguments?
  * In what format does the toolbox expect the data: in columns, or in rows? Is it the same format we used in the previous exercises?
  * Do you see the analogy with the functions you used last week?

===== The ''model'' structure =====

The output of every training algorithm in the STPR toolbox is a structure called a ''model''. After using the ''svm2'' function, the ''model'' structure may look like this:

<code>
model = 
       Alpha: [40x1 double]
           b: -2.3417
          sv: [1x1 struct]
         nsv: 40
           W: [64x1 double]
     options: [1x1 struct]
      kercnt: 46909
      trnerr: 0
      errcnt: 0
    exitflag: 2
        stat: [1x1 struct]
     cputime: 0.1993
         fun: 'svmclass'
</code>

The structure contains all the information needed to use an SVM classifier:
  * ''model.Alpha'' is a vector of the Lagrange multipliers //alpha//. Each training example has its own //alpha//, and many of them may be zero; a non-zero //alpha// indicates that the respective training example is actually a //support vector//. The vector ''model.Alpha'' contains only the non-zero values, thus its length is equal to the number of support vectors.
  * ''model.b'' is the absolute term of the model (the bias).
  * ''model.W'' is the weight vector of the resulting linear discriminant function. It is part of the ''model'' structure only if we use the ''linear'' kernel. Using the weight vector, the discriminant function is equal to ''model.W'*x + model.b''.
  * ''model.sv'' is a structure containing information about the support vectors. It contains ''X'' and ''y'' for all support vectors, and also the indices of the support vectors in the original training set.
  * ''model.fun'' is the name of the MATLAB function that should be used to apply the ''model'' to new data, i.e. to classify the data in this case. For SVMs, this field contains either ''linclass'' or ''svmclass''.
  * The remaining fields of the ''model'' structure are less important. They hold e.g. a copy of the ''options'' used when training the model, some statistics about the optimization process, or the error on the training dataset.

What exactly do we check by the following command?

<code matlab>
all(model.sv.X * model.Alpha == model.W)
</code>
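A ''model'' like the one printed above is produced by a call such as the following minimal sketch, which trains an SVM on the wedge data (the sizes of the printed fields will differ). It assumes the usual STPRtool conventions (examples in the columns of ''data.X'', labels 1 and 2 in ''data.y''); the assumption that ''datawedge.txt'' holds two feature columns followed by a 0/1 label column should be checked against your own loading code.

<code matlab>
% Load the wedge dataset -- the file layout (two feature columns plus a
% 0/1 label column) is an assumption; adapt this to your own loading code
raw = load('datawedge.txt');

data.X = raw(:, 1:2)';     % STPRtool expects one example per column
data.y = raw(:, 3)' + 1;   % recode the labels 0/1 to 1/2

% Train a binary SVM with a linear kernel
options.ker = 'linear';    % kernel type
options.arg = 1;           % kernel argument (ignored by the linear kernel)
options.C   = 10;          % regularization constant
model = svm2(data, options)

% Apply the trained model back to the training data
ypred  = svmclass(data.X, model);
trnerr = sum(ypred ~= data.y) / length(data.y)
</code>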
===== The wedge and XOR datasets =====

Use the scripts from last week in which you classified the wedge and XOR datasets using neural networks from the NETLAB toolbox.

  - Modify the scripts to classify the datasets using an SVM; start with the default linear kernel (the sketch above shows a minimal ''svm2'' call).
    * The STPR toolbox needs class labels 1 and 2 (instead of our 0 and 1). Recode the ''y'' variable.
  - Visualize the predictions of the models.
    * You can use the functions ''pboundary()'', ''pareas()'' and ''pwpatterns()'', e.g. in the following way:

<code matlab>
% Create a new figure
figure; hold on;
% Fill the areas that belong to classes 1 and 2, respectively
ha = pareas(model);
% Plot the data, distinguished by colors
hx = pwpatterns(data);
% Plot circles around the support vectors
plot(model.sv.X(1,:), model.sv.X(2,:), 'ko', 'Linewidth', 2, 'MarkerSize', 8);
% Plot the boundary between the classes and make it thicker
hl = pboundary(model);
set(hl, 'Linewidth', 3, 'color', 'k');
</code>

Explore:
  * What types of kernels are available for SVMs in STPRtool? Try all of them, try setting different parameters for them, and look at the effects.

===== Hand-written digits dataset =====

==== Binary classification ====

Use the hand-written digits dataset as the data for classification. (A sketch of both digit tasks appears at the end of this page.)
  * Define 2 classes (e.g. 1s against 8s, or {0,1,2,3,4} against {5,6,7,8,9}).
  * Train a linear SVM (the optimal separating hyperplane) on the training data.
  * Test the trained SVM on the testing data:
    * Use ''errRate()'' and ''confmat()'' to compute the misclassification rate and to display the confusion matrix.
  * Try changing the kernel type and the kernel parameters. Write down the error rates on the training and testing data.
  * Which SVM setting (kernel type + parameter values) is the best for your task?

==== Multiclass classification ====

A multiclass SVM can be trained e.g. by the ''bsvm2()'' function.
  * What are the arguments of ''bsvm2()''? How should the class labels be given to this function?
  * What other functions are there for multiclass SVM training?
  * Try to build an SVM model able to discriminate between all the digits (10 classes):
    * Display the confusion matrix and explore which digits are hard to discriminate from the others.

===== Additional questions =====

  - Explore the functions ''perceptron()'' and ''linclass()''. Do you see the similarity with the functions ''trainClassLinearPerceptron()'' and ''predClassLinear()'' that you created during Exercise 3?
  - Try the demos from the STPR toolbox: ''demo_linclass'' and ''demo_svm''.
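Finally, here is a minimal end-to-end sketch of the two digit tasks above. It assumes you have already loaded the optdigits training and testing sets into STPRtool-style structures ''trn'' and ''tst'' (examples in the columns of ''X'', labels 0..9 in ''y''); the variable names, the label recoding for ''bsvm2()'', and the argument order of ''errRate()''/''confmat()'' are assumptions to be checked against your own files and the helper functions' documentation.

<code matlab>
% trn, tst: structures with fields X (one example per column) and y (labels
% 0..9); loading them from the optdigits files is left to your own script.

% ---- Binary task: 1s against 8s ----
sel   = (trn.y == 1) | (trn.y == 8);
bin.X = trn.X(:, sel);
bin.y = (trn.y(sel) == 8) + 1;       % recode to STPRtool labels 1 and 2

options.ker = 'linear';              % try 'rbf', 'poly', ... as well
options.arg = 1;                     % kernel argument
options.C   = 10;                    % regularization constant
model = svm2(bin, options);

selt  = (tst.y == 1) | (tst.y == 8);
ypred = svmclass(tst.X(:, selt), model);
ytrue = (tst.y(selt) == 8) + 1;
errRate(ypred, ytrue)                % argument order assumed; see its help
confmat(ypred, ytrue)

% ---- Multiclass task: all 10 digits ----
mc.X    = trn.X;
mc.y    = trn.y + 1;                 % assuming bsvm2 expects labels 1..10
mcmodel = bsvm2(mc, options);
ypred   = svmclass(tst.X, mcmodel);  % check mcmodel.fun for the right classifier
errRate(ypred, tst.y + 1)
confmat(ypred, tst.y + 1)
</code>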