====== Exercise 3 ======

Program:
  * Revision
  * Linear classifier (a classifier using a linear discriminant function)
  * Perceptron algorithm for the linear discriminant function
  * Classification of hand-written digits

Downloads:
  * {{optdigits.zip|Hand-written digits dataset}}

**You will probably not manage to implement all the functions during the lab exercise. Finish them as homework.**

===== Revision =====

Refresh your knowledge of
  * the linear discriminant function and how it is used to make predictions,
  * what homogeneous coordinates are.

===== Helper functions =====

A simple function that converts our data to homogeneous coordinates will be useful. Create a function with the following prototype:

  function xh = homog(x)

^ Inputs  | ''x''  | [//D// x //N//] | D-dimensional description of objects. Values of the input variables. |
^ Outputs | ''xh'' | [(//D+1//) x //N//] | Descriptions of objects in homogeneous coordinates. Values of the input variables with a row of 1s appended at the end. |

===== Linear classifier =====

Assume we have a linear classifier represented by a weight vector ''w'' and we would like to use this classifier on new data. Create a function with the following prototype:

  function yp = predClassLinear(model, x)

^ Inputs  | ''model'' | [(//D+1//) x 1] | The linear model, vector of weights. |
^  | ''x'' | [//D// x //N//] | D-dimensional description of objects. Values of the input variables for all test objects. |
^ Outputs | ''yp'' | [1 x //N//] | Vector of predictions of the dependent variable for all test objects. |

===== Perceptron algorithm =====

We have training data (matrix ''x'' and vector ''y'') and we want to use the perceptron algorithm to learn the weight vector ''w'' of the linear classifier. Create a function with the following prototype:

  function model = trainClassLinearPerceptron(x, y)

^ Inputs  | ''x'' | [//D// x //N//] | D-dimensional description of objects. Values of the input variables for all training objects. |
^  | ''y'' | [1 x //N//] | Vector of values of the dependent variable for all training objects. |
^ Outputs | ''model'' | [(//D+1//) x 1] | The linear model, vector of weights. |

===== Classification of hand-written digits =====

Apply the perceptron learning algorithm to the hand-written digits dataset. Each digit is represented as a picture on a grid of 8x8 pixels; each pixel has a value from 0 to 16 describing its shade of gray. The 65th feature is the true class, i.e. the digit depicted in the picture.

Since our algorithm can handle binary classification only, choose 2 classes (e.g. 1 and 8, or 1 and 7, or 3 and 8) and try to train a linear classifier able to distinguish between them. Study the error rate and the confusion matrix on the training and testing data.

Further assignments:
  - Try various definitions of the classes (different pairs of digits, or various sets of digits, e.g. try to distinguish the digits 4 and below from the digits 5 and above). Try to find classes that are easy to separate with the linear classifier and classes that are hard to separate.
  - Try to compare this linear classifier with the nearest neighbors method (which you must first modify to handle classification tasks).
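
The three functions requested in the sections on helper functions, the linear classifier and the perceptron algorithm could look roughly as follows. This is only a minimal sketch under several assumptions that the assignment does not fix: the two classes are encoded as +1 and -1, a score of exactly 0 is assigned to class +1, the weights start at zero, and ''maxEpochs'' is a hypothetical safety cap added here because the perceptron loop does not terminate on data that are not linearly separable. It is written as one script with local functions at its end, which assumes MATLAB R2016b or newer; in the exercise each function would normally live in its own file.

<code matlab>
% Toy, linearly separable data: D = 2 features, N = 4 objects.
% ASSUMPTION: the two classes are encoded as +1 and -1.
x = [0 1 3 4;
     0 1 3 4];
y = [-1 -1 1 1];

model = trainClassLinearPerceptron(x, y);
yp = predClassLinear(model, x);
disp(all(yp == y));   % displays 1 for this separable toy set

function xh = homog(x)
    % Append a row of ones: [D x N] -> [(D+1) x N].
    xh = [x; ones(1, size(x, 2))];
end

function yp = predClassLinear(model, x)
    % Class is given by the sign of the linear discriminant function.
    % ASSUMPTION: a score of exactly 0 is assigned to class +1.
    score = model' * homog(x);          % [1 x N]
    yp = ones(1, size(x, 2));
    yp(score < 0) = -1;
end

function model = trainClassLinearPerceptron(x, y)
    % Fixed-increment perceptron rule, starting from zero weights.
    % ASSUMPTION: maxEpochs is a hypothetical cap, not part of the assignment.
    maxEpochs = 1000;
    xh = homog(x);
    model = zeros(size(xh, 1), 1);
    for epoch = 1:maxEpochs
        nErrors = 0;
        for i = 1:size(xh, 2)
            % Object i is misclassified (or lies on the boundary)
            % if y_i * w' * xh_i <= 0; then move w towards y_i * xh_i.
            if y(i) * (model' * xh(:, i)) <= 0
                model = model + y(i) * xh(:, i);
                nErrors = nErrors + 1;
            end
        end
        if nErrors == 0
            break;                      % every training object is classified correctly
        end
    end
end
</code>

If the cap is reached, the returned weights are simply those from the last update, so the training error should be checked before the model is trusted.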
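
Choosing two digit classes for the binary task, as described in the section on classification of hand-written digits, might be sketched like this. The layout of the files inside ''optdigits.zip'' is not described here, so the sketch does not load them; it assumes a hypothetical matrix ''data'' with one object per row and 65 columns (64 pixel values followed by the class), and fills it with random toy values. The names ''digitA'' and ''digitB'' and the mapping of the two digits to +1 and -1 are illustrative choices as well.

<code matlab>
% ASSUMPTION: "data" is [N x 65], one object per row, class in column 65.
% Toy stand-in for the real dataset: 6 random "pictures" with known classes.
data = [randi([0 16], 6, 64), [1; 8; 3; 1; 8; 7]];

digitA = 1;                              % will become class +1
digitB = 8;                              % will become class -1

labels = data(:, 65);
keep = (labels == digitA) | (labels == digitB);

% The functions in this exercise expect objects in columns: [D x N].
x = data(keep, 1:64)';

% Encode the two chosen digits as +1 / -1 (an assumed convention).
y = ones(1, sum(keep));
y(labels(keep)' == digitB) = -1;

disp(size(x));   % 64 4
disp(y);         % 1 -1 1 -1
</code>

For a task such as "digits 4 and below versus digits 5 and above", only the definition of ''keep'' and the condition used to assign -1 would change.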
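
The error rate and the confusion matrix mentioned in the digit classification task can be computed from the true labels and the predictions alone. The sketch below uses made-up label vectors purely for illustration, and again assumes that the two classes are encoded as +1 and -1; the variable names ''errRate'', ''classes'' and ''confMat'' are not prescribed by the assignment.

<code matlab>
% Toy true labels and predictions (made-up values, not real results).
y  = [ 1  1  1 -1 -1 -1 -1  1];
yp = [ 1 -1  1 -1 -1  1 -1  1];

% Error rate: fraction of objects whose prediction differs from the truth.
errRate = mean(yp ~= y);

% Confusion matrix: rows = true class, columns = predicted class,
% both ordered as listed in "classes".
classes = [1 -1];
confMat = zeros(numel(classes));
for i = 1:numel(classes)
    for j = 1:numel(classes)
        confMat(i, j) = sum(y == classes(i) & yp == classes(j));
    end
end

disp(errRate);   % 0.2500
disp(confMat);   % [3 1; 1 3]
</code>

Computing both quantities separately on the training and on the testing data shows whether the classifier merely fits the training set or also generalizes.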