Fine-tuning a pretrained CNN for a new task.
Skills: creating a dataset from an image folder, data preprocessing, loading pretrained models, working remotely with a GPU server, training part of a model, hyper-parameter search.
In this lab we start from a model pretrained on the ImageNet classification dataset (1000 categories and 1.2 million images) and adapt it to a small-scale but still challenging classification problem.
PyTorch has a tutorial closely related to this assignment: Transfer Learning For Computer Vision Tutorial
Fortunately, many excellent pretrained architectures are available in PyTorch. You can use one of the following models:
import torch
model = torch.hub.load('pytorch/vision:v0.9.0', 'vgg11', pretrained=True)
You might get the 'CERTIFICATE_VERIFY_FAILED' error, meaning that the model could not be downloaded over a secure connection. In this case, use the non-secure workaround:
import torchvision.models
from torchvision.models.vgg import model_urls
# from torchvision.models.squeezenet import model_urls
for k in model_urls.keys():
    model_urls[k] = model_urls[k].replace('https://', 'http://')
model = torchvision.models.vgg11(pretrained=True)
# model = torchvision.models.squeezenet1_0(pretrained=True)
You can see the structure of the loaded model by calling print(model). You can also read the source code that defines the network architecture: https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py
A model is usually defined as a hierarchy of Modules, where each Module is either an elementary layer (e.g. Conv2d, Linear, ReLU) or a container (e.g. Sequential).
print(model)
Download one of the datasets we offer for this task:
All of the datasets contain 224×224-pixel color images in 10 categories.
Here we will practice data loading and preprocessing techniques.
import torch
from torchvision import datasets, transforms
train_data = datasets.ImageFolder('../data/butterflies/train', transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4, shuffle=True, num_workers=0)
transforms.Normalize
train_loader
val_loader
We will first try learning only the last layer of the network on the new data. That is, we will use the network as a feature extractor and learn a linear classifier on top of it, as if it were a logistic regression model on fixed features. We need to do the following:
for param in model.parameters():
    param.requires_grad = False
requires_grad = True
torch.nn.BatchNorm1d
torch.nn.Dropout
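The steps above can be sketched as follows. To keep the example fast and self-contained, it uses a hypothetical TinyNet that only mimics the features/classifier layout of torchvision models; with the pretrained vgg11 the same lines would replace classifier[6]. Putting the model in eval mode here is an assumption of this sketch: it makes Dropout in the frozen part inactive, while gradients still flow to the new layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained network (not a torchvision model)."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Sequential(
            nn.Linear(8, 16), nn.ReLU(), nn.Dropout(), nn.Linear(16, num_classes))

    def forward(self, x):
        x = torch.flatten(self.features(x), 1)
        return self.classifier(x)

model = TinyNet()

# 1. Freeze every existing parameter.
for param in model.parameters():
    param.requires_grad = False

# 2. Replace the last layer; a newly created layer has requires_grad = True.
in_features = model.classifier[-1].in_features
model.classifier[-1] = nn.Linear(in_features, 10)

# 3. Give the optimizer only the parameters that should be trained.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01, momentum=0.9)

# 4. One training step on a random batch (stand-in for a real data loader).
model.eval()  # assumption: frozen part acts as a deterministic feature extractor
frozen_before = model.features[0].weight.clone()
head_bias_before = model.classifier[-1].bias.clone()
x = torch.rand(4, 3, 32, 32)
y = torch.tensor([0, 1, 2, 3])
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```

Only the weight and bias of the new Linear layer are updated; the frozen layers receive no gradients at all, which also saves memory and time in the backward pass.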
Depending on the size of the dataset and how much it differs from ImageNet, one of the following options may give better results than training the last layer only.
model.train()
model.eval()
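When more than the last layer is trained, the mode of the model matters, because layers such as Dropout and BatchNorm behave differently during training and evaluation. In PyTorch, nn.Module.train(mode) takes a boolean, and eval() takes no argument and is shorthand for train(False). The small sketch below, on a hypothetical three-layer network, illustrates the difference.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network containing both mode-dependent layer types.
net = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4), nn.Dropout(p=0.5))
x = torch.rand(8, 4)

# Training mode: Dropout samples a new random mask on every call and
# BatchNorm normalizes with the statistics of the current batch.
net.train()
is_training = net.training
out_a = net(x)
out_b = net(x)
train_outputs_differ = not torch.equal(out_a, out_b)

# Evaluation mode: Dropout is the identity and BatchNorm uses its
# running statistics, so the output is deterministic.
net.eval()
with torch.no_grad():
    out_c = net(x)
    out_d = net(x)
eval_outputs_match = torch.equal(out_c, out_d)

print(is_training, net.training, train_outputs_differ, eval_outputs_match)
```

Note that the mode is independent of requires_grad: eval() does not freeze any parameters, and freezing parameters does not stop BatchNorm from updating its running statistics in training mode.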