The main task is to fine-tune a pretrained CNN for a new classification task (transfer learning).
Skills: building a data loader from an image folder, data preprocessing, loading pretrained models, using remote GPU servers, training part of a model. Insights: convolutional filters, error case analysis.
In this lab we start from a model already pretrained on the ImageNet classification dataset (1000 categories and 1.2 million images) and try to adjust it for solving a small-scale but otherwise challenging classification problem.
Now is a good time to start working with the GPU servers; check the How To page for the recommended setup.
Beware: VS Code tends to keep the connection active even after you turn off your computer. As GPU memory is expensive, log in to the server regularly and check whether your processes still occupy some GPUs. You may call

pkill -f ipykernel

to kill these processes.
State-of-the-art (SOTA) pretrained architectures are available in PyTorch. We will use the following models:
import torchvision.models

model1 = torchvision.models.vgg11(weights=torchvision.models.VGG11_Weights.DEFAULT)
model2 = torchvision.models.squeezenet1_0(weights=torchvision.models.SqueezeNet1_0_Weights.DEFAULT)

You can see the structure of the loaded model by calling print(model). You can also open the source defining the network architecture: https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py. Usually the network is defined as a hierarchy of Modules, where each Module is either an elementary layer (e.g. Conv2d, Linear, ReLU) or a container (e.g. Sequential).
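For example, the top-level structure of VGG can be inspected like this (a small sketch; the submodule names follow torchvision's implementation):

import torchvision.models

model = torchvision.models.vgg11(weights=torchvision.models.VGG11_Weights.DEFAULT)
print(model)              # the full hierarchy of Modules
print(model.features)     # convolutional part (a Sequential container)
print(model.classifier)   # fully connected part (a Sequential container)
for name, module in model.named_children():
    print(name, type(module).__name__)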
The data will be placed in /local/temporary/butterflies/ on both servers (for faster access and to avoid multiple copies).
You can also download the dataset (e.g. to use on your computer):
The dataset contains 224×224 color images of 10 butterfly categories. The scientific (Latin) names of the categories are:
01: Danaus plexippus
02: Heliconius charitonius
03: Heliconius erato
04: Junonia coenia
05: Lycaena phlaeas
06: Nymphalis antiopa
07: Papilio cresphontes
08: Pieris rapae
09: Vanessa atalanta
10: Vanessa cardui
This lab has been substantially renewed this year; please let us know about any problems you encounter with the template or the task.
The first task is simply to load the pretrained network, apply it to a test image, and visualize the convolution filters and activations in the first layer. For this task, SqueezeNet is more suitable as it has 7×7 convolution filters in the first layer. There are a couple of technicalities, prepared in the template.
The Sequential container supports slicing, so model.features[0:2] is a small neural network consisting of the first few layers.
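A minimal sketch of this first task, assuming a test image img already loaded as a (3, H, W) float Tensor in [0, 1] (the loading and plotting details are prepared in the template):

import torch
import torchvision.models
import matplotlib.pyplot as plt

model = torchvision.models.squeezenet1_0(weights=torchvision.models.SqueezeNet1_0_Weights.DEFAULT)
model.train(False)

# first-layer 7x7 convolution filters, weight shape (96, 3, 7, 7)
w = model.features[0].weight.data
w = (w - w.min()) / (w.max() - w.min())      # rescale to [0, 1] for display

fig, axes = plt.subplots(8, 12, figsize=(12, 8))
for ax, f in zip(axes.flat, w):
    ax.imshow(f.permute(1, 2, 0))            # show each filter as a 7x7 RGB image
    ax.axis('off')
plt.show()

# activations of the first convolution + ReLU on the test image
with torch.no_grad():
    act = model.features[0:2](img.unsqueeze(0))   # shape (1, 96, H', W')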
To address the classification task, we first need to load the data: create the dataset, split it into training and validation parts, and create the loaders. Fortunately, there are convenient tools for all of these steps. The respective technicalities are prepared in the template.
The training images can be loaded with datasets.ImageFolder:

import torch
from torchvision import datasets, transforms

train_data = datasets.ImageFolder('/local/temporary/butterflies/train', transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_data, batch_size=1, shuffle=True, num_workers=0)
The pretrained networks expect inputs normalized with the ImageNet statistics:

mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]
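A sketch of the corresponding input transform (the exact preprocessing pipeline is prepared in the template):

from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])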
Split the data between the loader used for training (train_loader) and the loader used for validation (val_loader). Use the sampler argument of DataLoader with SubsetRandomSampler. Use random subsets instead of just slicing the dataset: you should not assume that the dataset is randomly shuffled (and in this task it really is not).
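One possible way to create the two loaders (a sketch; the 80/20 split ratio and the batch size are our own choices):

import numpy as np
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler

indices = np.random.permutation(len(train_data))   # random split: the dataset is not shuffled on disk
split = int(0.8 * len(train_data))

train_loader = DataLoader(train_data, batch_size=8,
                          sampler=SubsetRandomSampler(indices[:split]))
val_loader = DataLoader(train_data, batch_size=8,
                        sampler=SubsetRandomSampler(indices[split:]))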
We will first try to learn only the last layer of the network on the new data. That is, we will use the network as a feature extractor and learn a linear classifier on top of it, as if it were a logistic regression model on some features. This task is somewhat simpler with the VGG architecture (SqueezeNet uses a fully convolutional architecture with global pooling at the end).
Move the model to the GPU with model.to(dev). Freeze all pretrained parameters:

for param in model.parameters():
    param.requires_grad = False

Calling model.train(False) will fix the behaviour of batchnorm and dropout layers (if present) to be deterministic and input-independent. The new last layer that you create to replace the original classifier will have requires_grad = True by default, i.e. it will be trainable.
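A sketch of this setup for VGG11 with 10 target classes (in torchvision's VGG the last layer is model.classifier[6]; the variable names are our own):

import torch
import torch.nn as nn
import torchvision.models

dev = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = torchvision.models.vgg11(weights=torchvision.models.VGG11_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False              # freeze all pretrained weights
model.classifier[6] = nn.Linear(4096, 10)    # new last layer, requires_grad=True by default
model = model.to(dev)
model.train(False)                           # deterministic dropout / batchnorm behaviour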
Use the optimizer and nll_loss as proposed in the template. When loading the data, move it to the GPU as well; note that to(dev) is not an in-place operation for Tensors, unlike for Modules.
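A minimal sketch of one training epoch under these conventions (the learning rate and the choice of SGD are our own; the template proposes the exact optimizer and loss):

import torch
import torch.nn.functional as F

# only the parameters of the new last layer are trainable
optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.01)

for x, y in train_loader:
    x, y = x.to(dev), y.to(dev)              # to(dev) returns new Tensors, it is not in-place
    scores = model(x)
    loss = F.nll_loss(F.log_softmax(scores, dim=1), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()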
Use val_loader to evaluate the validation accuracy at the end of each training epoch. Select the model that achieves the best validation accuracy over all of the learning rates and training epochs. Save the best network using torch.save. See the Saving / Loading Tutorial.
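A sketch of the accuracy evaluation and model selection (the helper function and the file name are our own):

import torch

def accuracy(model, loader):
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(dev), y.to(dev)
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total

best_acc = 0.0                               # initialised before the training loop
# ... at the end of each training epoch:
val_acc = accuracy(model, val_loader)
if val_acc > best_acc:
    best_acc = val_acc
    torch.save(model.state_dict(), 'best_model.pt')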
Report the final test classification accuracy of the best model (selected on the validation set). The test set is specified as a separate folder:
test_data = datasets.ImageFolder('/local/temporary/butterflies/test', transform)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=8, shuffle=False, num_workers=0)

Use the same input transform as for training. Do not re-tune the hyperparameters to achieve a better test set performance! The network will probably make a few errors on the test set. For these cases display and report: 1) the input test image, 2) its correct class label, 3) the class labels and network confidence (predictive probabilities) of the top 3 network predictions (classes with the highest predictive probability).
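A sketch of the error analysis (the class names come from test_data.classes, an attribute of ImageFolder):

import torch
import torch.nn.functional as F

model.train(False)
with torch.no_grad():
    for x, y in test_loader:
        p = F.softmax(model(x.to(dev)), dim=1)        # predictive probabilities
        conf, pred = p.topk(3, dim=1)                 # top-3 classes and their confidence
        for i in range(x.shape[0]):
            if pred[i, 0].item() != y[i].item():      # a test error
                print('true class:', test_data.classes[y[i].item()])
                print('top-3 predictions:',
                      [(test_data.classes[c], round(conf[i, j].item(), 3))
                       for j, c in enumerate(pred[i].tolist())])
                # display x[i] as an image, e.g. with matplotlib (undo the normalization first)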
Because we have very limited training / testing data available, it is a good idea to also use data augmentation. Let us select some transforms that can be expected to produce realistic images of the same class. A possible set is sketched below.
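One illustrative choice (our own suggestion, not necessarily the set proposed in the template):

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
])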
See Torchvision transform examples.
Note that transforms inherit from torch.nn.Module and therefore can be used in the same way as layers, or as functions applied to data Tensors (however, not batched). They can also be built into the Dataset by setting the transform argument. They can process either a PIL.Image or a Tensor. For efficiency reasons it is better to use them as functions on Tensors.
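For example (a small sketch, reusing the augment pipeline from above), applying the augmentation per image inside the training loop:

import torch

for x, y in train_loader:
    # transforms are applied per image, not per batch, then re-stacked
    x = torch.stack([augment(xi) for xi in x])
    x, y = x.to(dev), y.to(dev)
    # ... forward / backward pass as before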