Deep Learning (SS2020) computer lab (10p)
In this lab we start from a model pretrained on the ImageNet classification dataset (1000 categories, 1.2 million images) and adapt it to a small-scale but otherwise challenging classification problem.
Pytorch has a tutorial closely related to this assignment: Transfer Learning For Computer Vision Tutorial
Fortunately, many excellent pretrained architectures are available in PyTorch. You will use one of the following models:
```python
import torch
model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg11', pretrained=True)
```
You might get a 'CERTIFICATE_VERIFY_FAILED' error, meaning that a secure connection to download the model could not be established. In that case use the non-secure workaround:
```python
import torchvision.models
from torchvision.models.vgg import model_urls
# from torchvision.models.squeezenet import model_urls

for k in model_urls.keys():
    model_urls[k] = model_urls[k].replace('https://', 'http://')

model = torchvision.models.vgg11(pretrained=True)
# model = torchvision.models.squeezenet1_0(pretrained=True)
```
You can see the structure of the loaded model by calling print(model). You can also open the source defining the network architecture https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py. Usually it is defined as a hierarchy of Modules, where each Module is either an elementary layer (e.g. Conv2d, Linear, ReLU) or a container (e.g. Sequential).
```python
print(model)
```
Download one of the datasets we offer for this task:
All of the datasets contain 224×224 color images in 10 categories.
Here we will practice some data loading and preprocessing techniques.
– Create a dataset and a loader for the training images. We can use the existing dataset interface that loads images from disk:
```python
from torchvision import datasets, transforms

train_data = datasets.ImageFolder('../data/butterflies/train', transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4,
                                           shuffle=True, num_workers=0)
```
– Perform standardization of the data: on the training set compute the mean and standard deviation per color channel, over all pixels and all images. Put these constant values into the code as a preprocessing step, so they are not recomputed every time.
– Add transforms.Normalize with the statistics you found to your dataset constructor. This standardizes (whitens) the input for better-conditioned training and better matches what the pretrained model expects. Apply the same transform to the test dataset as well.
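The per-channel statistics can be computed with a sketch like the following (the helper name `compute_mean_std` is my own; it assumes the loader yields `(images, labels)` batches of shape `(B, 3, H, W)`):

```python
import torch

def compute_mean_std(loader):
    # Accumulate per-channel sums of pixel values and squared values
    # over every image the loader yields.
    n_pixels = 0
    channel_sum = torch.zeros(3)
    channel_sq_sum = torch.zeros(3)
    for images, _ in loader:
        # images has shape (B, 3, H, W); sum over batch, height, width.
        n_pixels += images.numel() // images.shape[1]
        channel_sum += images.sum(dim=(0, 2, 3))
        channel_sq_sum += (images ** 2).sum(dim=(0, 2, 3))
    mean = channel_sum / n_pixels
    # var = E[x^2] - E[x]^2; clamp guards against tiny negative values
    # caused by floating-point round-off.
    std = (channel_sq_sum / n_pixels - mean ** 2).clamp(min=0).sqrt()
    return mean, std
```

Once computed, hard-code the resulting numbers and pass them to `transforms.Normalize(mean=..., std=...)` inside a `transforms.Compose` together with `ToTensor`.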
– From the train dataset create two loaders: the loader used for optimizing the model parameters (train_loader) and the loader used for validation and hyperparameter selection (val_loader). This is similar to Lab 3, using SubsetRandomSampler.
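One possible way to build the two loaders with SubsetRandomSampler (a sketch; the helper name and the 80/20 split fraction are my own choices):

```python
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler

def make_train_val_loaders(dataset, val_fraction=0.2, batch_size=4, seed=0):
    # Shuffle all indices once with a fixed seed, then split them into
    # a validation part and a training part.
    g = torch.Generator().manual_seed(seed)
    indices = torch.randperm(len(dataset), generator=g).tolist()
    n_val = int(len(dataset) * val_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    # Each loader samples only from its own subset of indices.
    train_loader = DataLoader(dataset, batch_size=batch_size,
                              sampler=SubsetRandomSampler(train_idx))
    val_loader = DataLoader(dataset, batch_size=batch_size,
                            sampler=SubsetRandomSampler(val_idx))
    return train_loader, val_loader
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing hyperparameter settings on the validation loader.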
We will first try learning only the last layer of the network on the new data. We use the network as a fixed feature extractor and learn a linear classifier on top of it, as if it were a logistic regression model on precomputed features. We need to do the following:
– Freeze all pretrained parameters so that no gradients are computed for them:

```python
for param in model.parameters():
    param.requires_grad = False
```

– Replace the last Linear layer of the model with a new Linear layer with 10 outputs (one per category). Its freshly created parameters have requires_grad=True and will be the only ones trained.
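Putting the last-layer setup together, a minimal sketch (assuming a torchvision VGG-style model whose `classifier` is a Sequential ending in a Linear layer; the helper name is my own):

```python
import torch
import torch.nn as nn

def make_feature_extractor(model, num_classes=10):
    # Freeze every pretrained parameter: no gradients are computed for
    # them, so only the new head will be trained.
    for param in model.parameters():
        param.requires_grad = False
    # For torchvision's VGG, model.classifier is a Sequential whose last
    # module is the final Linear layer; replace it with a fresh one.
    in_features = model.classifier[-1].in_features
    model.classifier[-1] = nn.Linear(in_features, num_classes)
    return model
```

When constructing the optimizer, pass only the parameters that still require gradients, e.g. `torch.optim.SGD((p for p in model.parameters() if p.requires_grad), lr=1e-3)`.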
Depending on the size of the dataset and how different it is from what is represented in ImageNet, one of the following options may give better results. With larger datasets, we expect them to improve over training only the last layer.
– Fine-tune the whole network: set requires_grad = True for all parameters and continue training from the pretrained weights, typically with a smaller learning rate.
– Choose the mode the network runs in during fine-tuning: model.train() keeps Dropout active and lets BatchNorm update its running statistics, while model.eval() disables Dropout and fixes those statistics (note that eval() takes no argument).
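The two options above can be sketched as follows (the helper name is my own; train()/eval() are the standard nn.Module mode switches):

```python
import torch.nn as nn

def unfreeze_all(model):
    # Fine-tune the whole network: gradients now flow into every layer,
    # not only the newly added classifier head.
    for param in model.parameters():
        param.requires_grad = True
    return model

# Mode switches (neither takes an argument):
# model.train()  -- Dropout active, BatchNorm updates running statistics
# model.eval()   -- Dropout off, BatchNorm uses its stored statistics
```

Remember to rebuild the optimizer after unfreezing, so that it also holds the newly trainable parameters.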