Fine-tuning a pretrained CNN for a new task.
Skills: creating a dataset from an image folder, data preprocessing, loading pretrained models, working remotely with a GPU server, training part of the model, hyper-parameter search.
In this lab we start from a model already pretrained on the ImageNet classification dataset (1000 categories and 1.2 million images) and try to adjust it for solving a small-scale but otherwise challenging classification problem.
Fortunately, many excellent pretrained architectures are available in PyTorch. You can use one of the following models:
import torchvision.models

model = torchvision.models.vgg11(pretrained=True)
You might get a 'CERTIFICATE_VERIFY_FAILED' error, meaning that the model cannot be downloaded over a secure connection. In this case use the non-secure workaround:
import torchvision.models
from torchvision.models.vgg import model_urls
# from torchvision.models.squeezenet import model_urls

for k in model_urls.keys():
    model_urls[k] = model_urls[k].replace('https://', 'http://')

model = torchvision.models.vgg11(pretrained=True)
# model = torchvision.models.squeezenet1_0(pretrained=True)
You can see the structure of the loaded model by calling print(model). You can also open the source defining the network architecture https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py. Usually it is defined as a hierarchy of Modules, where each Module is either an elementary layer (e.g. Conv2d, Linear, ReLU) or a container (e.g. Sequential).
print(model)
Download the dataset we prepared:
The dataset contains 224×224-pixel color images from 10 categories.
Here we will practice data loading and preprocessing techniques.
import torch
from torchvision import datasets, transforms

train_data = datasets.ImageFolder('../data/butterflies/train', transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4, shuffle=True, num_workers=0)
Compute the per-channel mean and standard deviation once and hard-code these constant values into the preprocessing (transforms.Normalize), so that they are not recomputed over again.
Construct both a train_loader and a val_loader in this way, one for the training split and one for the validation split.
We will first try learning only the last layer of the network on the new data, i.e. we will use the network as a feature extractor and learn a linear classifier on top of it, as if it were a logistic regression model on fixed features. We need to do the following:
for param in model.parameters():
    param.requires_grad = False
The parameters of the newly created last layer have requires_grad = True by default, so only they will be updated.
You may also try adding torch.nn.BatchNorm1d and torch.nn.Dropout layers to the new classifier.
Depending on the size of the dataset and on how different it is from ImageNet, the following option may give better results than training the last layer only.
For fine-tuning of the whole model (as described in Part 3), try the following regularization method, which aims at a stochastic smoothing of the loss function:
\begin{align*} &\hat \theta = \theta^t + \varepsilon, \ \ \ \ \varepsilon\sim \mathcal{N}(0,\sigma^2)\\ &\theta^{t+1} = \theta^t - \alpha \frac{d \mathcal{L}(\theta)}{d \theta} \Big|_{\theta = \hat \theta} \end{align*}
Practically it can be implemented as follows. For each training step: save the current parameters $\theta^t$; perturb them with Gaussian noise $\varepsilon\sim\mathcal{N}(0,\sigma^2)$; compute the gradient of the loss at the perturbed point $\hat\theta$; restore the saved parameters and apply the gradient update.
The parameter $\sigma$ needs to be chosen by cross-validation. Fix the learning rate to the one found in Part 3.
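A minimal sketch of one such training step, demonstrated on a small stand-in model so the mechanics stay visible (in the lab, apply the same loop to the fine-tuned CNN; the value of sigma below is a placeholder to be cross-validated):

```python
import torch

model = torch.nn.Linear(4, 2)     # stand-in for the fine-tuned CNN
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()
sigma = 0.01                      # hyper-parameter, chosen by cross-validation

x = torch.randn(8, 4)             # dummy batch
y = torch.randint(0, 2, (8,))

# 1) save the current parameters theta^t
saved = [p.detach().clone() for p in model.parameters()]

# 2) perturb: theta_hat = theta^t + eps, eps ~ N(0, sigma^2)
with torch.no_grad():
    for p in model.parameters():
        p.add_(sigma * torch.randn_like(p))

# 3) gradient of the loss at the perturbed point theta_hat
optimizer.zero_grad()
loss_fn(model(x), y).backward()

# 4) restore theta^t, then take the step with the perturbed gradient
with torch.no_grad():
    for p, s in zip(model.parameters(), saved):
        p.copy_(s)
optimizer.step()
```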
Remember to switch the model between modes: call model.train() before the training steps and model.eval() before computing validation accuracy (this changes the behavior of Dropout and BatchNorm layers).