The main task is to fine-tune a pretrained CNN for a new classification task (transfer learning).
Skills: data loader from an image folder, data preprocessing, loading pretrained models, remote GPU servers, training part of the model.
Insights: convolutional filters, error case analysis.
In this lab we start from a model already pretrained on the ImageNet classification dataset (1000 categories and 1.2 million images) and try to adjust it for solving a small-scale but otherwise challenging classification problem.
Now is a good time to start working with the GPU servers. Check the How To page. The recommended setup is as follows:
Beware: VS Code tends to keep the server daemon active even after you turn off your computer. As GPU memory is expensive, log in to the server regularly and check whether your processes still occupy some GPUs. You may call pkill -f ipykernel to kill these processes.
SOTA pretrained architectures are available in PyTorch. We will use the following models:
import torchvision.models
model1 = torchvision.models.squeezenet1_0(weights=torchvision.models.SqueezeNet1_0_Weights.DEFAULT)
model2 = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
print(model)
The data are placed in /local/temporary/Datasets/PACS_cartoon and in /local/temporary/Datasets/PACS_cartoon_few_shot on both servers (for faster access and to avoid multiple copies). You can also download the dataset (e.g. to use on your computer):
The PACS_cartoon dataset contains color images of cartoons, 227×227 pixels each, in 7 categories:
01: Dog 02: Elephant 03: Giraffe 04: Guitar 05: Horse 06: House 07: Person
Template
This lab has been substantially renewed this year. Please let us know of any problems you encounter with the template or the task.
The first task is just to load the pretrained network, apply it to a test image, and visualize the convolution filters and activations in the first layer. For this task, SqueezeNet is more suitable as it has 7×7 convolution filters in the first layer. There are a couple of technicalities, which are prepared in the template.
Sequential
model.features[0:2]
To address the classification task, we first need to load in the data: create dataset, split into training and validation, create loaders. Fortunately, there are convenient tools for all the steps. The respective technicalities are prepared in the template.
datasets.ImageFolder
import torch
from torchvision import datasets, transforms
train_data = datasets.ImageFolder('/local/temporary/Datasets/PACS_cartoon/train', transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_data, batch_size=1, shuffle=True, num_workers=0)
mean=[0.485, 0.456, 0.406] std=[0.229, 0.224, 0.225]
train_loader
val_loader
sampler
DataLoader
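One common way to obtain a train_loader and a val_loader from a single dataset is to pass disjoint index sets to the sampler argument of DataLoader. The sketch below assumes a 20% validation fraction and uses a small random TensorDataset as a stand-in for the ImageFolder dataset; the function name make_train_val_loaders is hypothetical and the template may do this differently.

```python
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset


def make_train_val_loaders(dataset, val_fraction=0.2, batch_size=8, seed=0):
    """Split dataset indices at random and wrap each part in its own loader."""
    n = len(dataset)
    generator = torch.Generator().manual_seed(seed)
    perm = torch.randperm(n, generator=generator).tolist()
    n_val = int(val_fraction * n)
    val_idx, train_idx = perm[:n_val], perm[n_val:]
    # shuffle must stay unset when a sampler is given; the sampler
    # itself draws its indices in random order.
    train_loader = DataLoader(dataset, batch_size=batch_size,
                              sampler=SubsetRandomSampler(train_idx))
    val_loader = DataLoader(dataset, batch_size=batch_size,
                            sampler=SubsetRandomSampler(val_idx))
    return train_loader, val_loader


# Stand-in data (assumption): 50 random "images" with labels from 7 classes.
dataset = TensorDataset(torch.rand(50, 3, 8, 8), torch.randint(0, 7, (50,)))
train_loader, val_loader = make_train_val_loaders(dataset)
print(len(train_loader.sampler), len(val_loader.sampler))
```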
We will investigate the benefits of using a pre-trained network, even if the distribution of our task is different from the pre-training (e.g. train on cartoons with a network pre-trained on photographs). First let's check the performance of a model trained on cartoons from scratch:
model = torchvision.models.resnet18(weights=None)
model.to(dev)
optimizer
nll_loss
to(dev)
torch.save
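The pieces named above (optimizer, nll_loss, to(dev), torch.save) could be combined into a training loop along the following lines. This is a sketch under stated assumptions, not the template's code: a tiny linear model and random data stand in for ResNet and the cartoon images, the checkpoint path is a temporary file, and the best model is chosen by training loss only to keep the example short (in the lab, the validation loss or accuracy would be the natural criterion).

```python
import os
import tempfile

import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, optimizer, dev):
    """Run one pass over loader and return the mean loss per sample."""
    model.train()
    total_loss, n = 0.0, 0
    for x, y in loader:
        x, y = x.to(dev), y.to(dev)
        optimizer.zero_grad()
        # The network outputs raw scores; nll_loss expects log-probabilities.
        log_probs = F.log_softmax(model(x), dim=1)
        loss = F.nll_loss(log_probs, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * x.shape[0]
        n += x.shape[0]
    return total_loss / n


dev = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)

# Stand-ins (assumptions): a linear classifier and random data with 7 classes.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 7)).to(dev)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
data = TensorDataset(torch.rand(32, 3, 8, 8), torch.randint(0, 7, (32,)))
loader = DataLoader(data, batch_size=8, shuffle=True)

best_loss = float("inf")
checkpoint_path = os.path.join(tempfile.gettempdir(), "lab_sketch_best.pt")
for epoch in range(3):
    loss = train_one_epoch(model, loader, optimizer, dev)
    if loss < best_loss:
        best_loss = loss
        torch.save(model.state_dict(), checkpoint_path)  # keep the best weights
    print(f"epoch {epoch}: loss {loss:.4f}")
```

The saved weights can later be restored with model.load_state_dict(torch.load(checkpoint_path)).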
test_data = datasets.ImageFolder('/local/temporary/Datasets/PACS_cartoon/test', transform)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=8, shuffle=False, num_workers=0)
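Given such a test loader, the test accuracy and a confusion matrix (useful for error case analysis) might be computed as sketched below. The function name evaluate is hypothetical, and the ConstantModel that always predicts class 0 is only a stand-in that makes the expected numbers easy to verify by hand.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, dev, num_classes=7):
    """Return (accuracy, confusion) where confusion[t, p] counts true t predicted as p."""
    model.eval()
    confusion = torch.zeros(num_classes, num_classes, dtype=torch.long)
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(dev), y.to(dev)
            pred = model(x).argmax(dim=1)
            for t, p in zip(y.tolist(), pred.tolist()):
                confusion[t, p] += 1
    accuracy = confusion.diag().sum().item() / confusion.sum().item()
    return accuracy, confusion


class ConstantModel(torch.nn.Module):
    """Stand-in model (assumption): always gives class 0 the highest score."""

    def forward(self, x):
        scores = torch.zeros(x.shape[0], 7, device=x.device)
        scores[:, 0] = 1.0
        return scores


dev = torch.device("cpu")
labels = torch.tensor([0, 0, 1, 2, 0, 3, 0, 0])
loader = DataLoader(TensorDataset(torch.rand(8, 3, 8, 8), labels), batch_size=4)
accuracy, confusion = evaluate(ConstantModel(), loader, dev)
print(accuracy)  # 5 of the 8 labels are class 0, so this prints 0.625
```

Off-diagonal entries of the confusion matrix point to the class pairs worth inspecting image by image.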
model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
model.to(dev)
for param in model.parameters():
    param.requires_grad = False
model.train(False)
requires_grad = True
In PACS_cartoon_few_shot the training data are very limited. A good practice is to use data augmentations during training. Select transforms that can be expected to make the dataset more diverse. A possible set is
See the PyTorch transform examples. Note that transforms inherit from torch.nn.Module and can therefore be used the same way as layers, or as functions applied to data Tensors (however, not batched). They can also be built into the Dataset by setting the transform argument. They can process a PIL.Image or a Tensor. For efficiency reasons it is better to use them as functions on Tensors.