Deep Convolutional Neural Networks (CNNs) returned to prominence in the computer vision community after the breakthrough paper of Krizhevsky et al. [1], which presented large-scale image category recognition with remarkable success. In 2012, the CNN-based algorithm outperformed competing teams from many renowned institutions by a significant margin. This success sparked enormous interest in neural networks in computer vision, to the extent that most successful methods now use neural networks.
The convolutional network is an extremely flexible classifier that can fit very complex recognition/regression problems while generalizing well. The network consists of a nested composition of non-linear functions. It is usually deep, i.e. it has many layers, and it typically has more parameters than there are samples in the training set. There are mechanisms to prevent overfitting. One of the basic ones is the use of convolutional layers: the network learns shift-invariant filters instead of individual weights for every input pixel. Since the weights are shared, far fewer parameters are required.
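To make the effect of weight sharing concrete, the following minimal sketch counts the parameters of a convolutional layer and of a fully connected layer that produces an output of the same size. The input size (3 channels, 160x160 pixels), the 16 output channels and the 3x3 kernel are illustrative assumptions, not values prescribed by the assignment.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size):
    """One k x k filter per (input, output) channel pair, plus one bias per output channel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels


def dense_param_count(in_features, out_features):
    """One weight per (input, output) pair, plus one bias per output."""
    return in_features * out_features + out_features


# Illustrative sizes (assumptions): RGB input of 160x160 pixels, 16 output maps.
height = width = 160
conv_params = conv2d_param_count(in_channels=3, out_channels=16, kernel_size=3)
dense_params = dense_param_count(3 * height * width, 16 * height * width)

print("convolutional layer:", conv_params)       # 448
print("fully connected layer:", dense_params)    # 31457689600
```

The convolutional layer does not depend on the image size at all, which is why the count stays small.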
Fig. 1: Architecture of a Deep Convolutional Neural Network. Figure adapted from [1].
Usually, the architecture of an image classification CNN is composed of several convolutional layers (which learn a representation) followed by a few fully connected layers (which implement the non-linear classification stage on top of the invariant representation); see Fig. 1.
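The layout described above can be sketched in PyTorch roughly as follows. This is only an illustration of the "convolutional layers, then fully connected layers" pattern; the class name TinyConvNet, the layer sizes and the input resolution are hypothetical and are not the SimpleCNN expected by the assignment template.

```python
import torch
from torch import nn


class TinyConvNet(nn.Module):
    """Hypothetical toy network: convolutional feature extractor + dense classifier."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Convolutional part: learns the image representation.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(4),  # fixed 4x4 output regardless of input size
        )
        # Fully connected part: non-linear classifier on top of the representation.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyConvNet()
logits = model(torch.randn(2, 3, 128, 128))  # batch of 2 random "images"
print(logits.shape)  # torch.Size([2, 10])
```

The adaptive pooling layer is a design choice of this sketch: it lets the same classifier accept images of different sizes.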
In this lab, you will train your own convolutional neural network for image classification from scratch, using the PyTorch library. Training is typically done on GPUs, often multiple ones, as the process is very computationally intensive. This assignment, however, is designed to run on a CPU: a single training epoch takes 1-5 minutes on a CPU, depending on the CNN architecture and hardware.
Specifically, a training epoch takes 90 sec on a mobile i7 CPU and 26 sec on a mobile GT940M GPU.
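If you want to check how long an epoch takes on your own machine, a measurement could look like the sketch below. It is an assumption-laden illustration: the function time_one_epoch, the toy model and the random data are hypothetical stand-ins, not part of the assignment template, and the real dataset will take much longer.

```python
import time

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def time_one_epoch(model, loader, device):
    """Run one training pass over `loader` and return the elapsed time in seconds."""
    model = model.to(device)
    model.train()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    start = time.perf_counter()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
    return time.perf_counter() - start


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Random stand-in data (assumption): 64 images of 32x32 pixels, 10 classes.
data = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
loader = DataLoader(data, batch_size=16, shuffle=True)
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)

elapsed = time_one_epoch(toy_model, loader, device)
print(f"one epoch on {device}: {elapsed:.2f} s")
```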
You will implement a CNN, training and validation loop, custom dataset loader and a couple of helper functions.
First, make sure that you have a conda virtual environment set up. If not, create one from here: https://gitlab.fel.cvut.cz/mishkdmy/mpv-python-assignment-templates/-/tree/master/conda_env_yaml
Second, pull the assignment template from here: https://gitlab.fel.cvut.cz/mishkdmy/mpv-python-assignment-templates/-/blob/master/assignment_6_7_cnn_template/training-imagenette-CNN.ipynb
You can download the data via the respective section of the notebook or directly from https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-160.tgz
For this lab, all explanations are contained in the corresponding notebook; please refer to it.
Introduction into PyTorch Image Processing.
To fulfil this assignment, you need to submit these files (all packed in one .zip file) into the upload system:
results.html: the notebook training-imagenette-CNN.ipynb converted to HTML, which can be generated with the command
jupyter nbconvert --to html training-imagenette-CNN.ipynb --EmbedImagesPreprocessor.resize=small --output results.html
submission.csv
cnn_training.py: the file with the following implemented: get_dataset_statistics, SimpleCNN, weight_init, validate, train_and_val_single_epoch, lr_find, TestFolderDataset, get_predictions
Use the assignment template. When preparing the zip file for the upload system, do not include any directories; the files have to be in the zip file root. Please upload the same zip to both tasks: 01_cnnclf and 02_tourn.
Your code and notebook will be checked manually. submission.csv will be used for two things. First, to evaluate the quality of your trained network (task 08_cnnclf). Second, you will get bonus points based on its performance: 1 point for each 20% quantile, e.g. being in the top 20% brings you 5 points, the top 40% brings 4 points, and so on. To get any points, your classification accuracy must be above 70%.
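The bonus rule can be read as the small calculation below. This is only one interpretation: the function bonus_points is hypothetical, and the handling of ties and of ranks that fall exactly on a quantile boundary is an assumption, not something the assignment text specifies.

```python
def bonus_points(rank, num_students, accuracy):
    """Bonus under the assumed reading of the rule.

    rank: position in the tournament, 1 = best (assumption).
    accuracy: classification accuracy as a fraction in [0, 1].
    """
    if accuracy <= 0.70:
        return 0  # accuracy must be above 70% to get any points
    # Integer ceiling of 5 * rank / num_students: 1 for the top 20%, ..., 5 for the last 20%.
    quantile = (5 * rank + num_students - 1) // num_students
    return 6 - quantile


print(bonus_points(rank=2, num_students=10, accuracy=0.85))   # 5 (top 20%)
print(bonus_points(rank=4, num_students=10, accuracy=0.85))   # 4 (top 40%)
print(bonus_points(rank=1, num_students=10, accuracy=0.65))   # 0 (accuracy too low)
```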
This lab does not have an assignment. The slides are here
Jan Čech 2016/04/26 17:07