Your second homework will be Image recognition. For this task, we have created our own dataset, which is based on ImageNet.
The homework will be introduced in the labs in the 5th week; we will try to clear up any doubts in Video.
The dataset consists of 10 classes, with 500 training images, 50 validation images, and 50 testing images per class. Each image has a resolution of 128×128 and three color channels (R, G, B).
The dataset is available on the taylor and cantor servers in the directory /local/temporary/vir/hw02. It can also be downloaded in two formats. The first is a pickle file containing a standard Python dict with three keys: data, an np.uint8 array of shape [Nx128x128x3], where N is the number of images; labels, an array of shape [N]; and filenames.
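A minimal sketch of reading the pickle format, assuming the file follows the dict layout described here. The helper name load_split and the file name are hypothetical, and the type of filenames (a list of strings) is an assumption; the snippet first writes a tiny synthetic file so that it runs without the real dataset.

```python
import os
import pickle
import tempfile

import numpy as np


def load_split(path):
    """Load one dataset split stored as a pickled dict (hypothetical helper)."""
    with open(path, 'rb') as f:
        split = pickle.load(f)
    data = split['data']            # uint8 array of shape [N, 128, 128, 3]
    labels = split['labels']        # array of shape [N]
    filenames = split['filenames']  # assumed here to be a sequence of N names
    # Sanity checks against the documented layout
    assert data.dtype == np.uint8
    assert data.shape[1:] == (128, 128, 3)
    assert labels.shape == (data.shape[0],)
    return data, labels, filenames


# Synthetic stand-in for the real file, only so the sketch is runnable
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_split.pkl')
with open(demo_path, 'wb') as f:
    pickle.dump(
        {
            'data': np.zeros((4, 128, 128, 3), dtype=np.uint8),
            'labels': np.array([0, 1, 2, 3]),
            'filenames': ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg'],
        },
        f,
    )

data, labels, filenames = load_split(demo_path)
print(data.shape, data.dtype, labels.shape, len(filenames))
```

With the real dataset, only the path passed to load_split would change.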
The second format is raw images.
The class mapping is as follows: 0 bird, 1 lizard, 2 snake, 3 spider, 4 dog, 5 cat, 6 butterfly, 7 monkey, 8 fish, 9 fruit.
Design and train a neural network that achieves high accuracy on the unknown test part of the dataset (which comes from the same distribution as the training and validation parts).
Submit a Python module/package that is importable by the name hw_2 and has a function load_model(). The function load_model() needs to return an instance of torch.nn.Module (or a subclass) which accepts a tensor of shape [Bx3x128x128], where B is the batch size, of dtype torch.float32, and returns a tensor of shape [Bx10].
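Before submitting, it may help to check the interface locally. The following is a minimal sketch, assuming the interface stated in this assignment (input of shape [Bx3x128x128] and dtype torch.float32, output of shape [Bx10] with the same dtype and device). The names check_contract and TinyNet are hypothetical; TinyNet only stands in for a real model and is not a recommended architecture.

```python
import torch


def check_contract(model, batch_size=2):
    """Run a random batch through the model and check output shape, dtype and device."""
    assert isinstance(model, torch.nn.Module)
    model = model.eval()
    x = torch.rand(batch_size, 3, 128, 128, dtype=torch.float32)
    with torch.no_grad():
        y = model(x)
    assert y.shape == (batch_size, 10)
    assert y.dtype == x.dtype
    assert y.device == x.device
    return y


class TinyNet(torch.nn.Module):
    """Hypothetical stand-in: global average pooling followed by a linear layer."""

    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(3, 10)

    def forward(self, x):
        # [B, 3, 128, 128] -> [B, 3] -> [B, 10]
        return self.fc(x.mean(dim=(2, 3)))


out = check_contract(TinyNet())
print(out.shape)
```

In practice, the argument would be the object returned by hw_2.load_model() instead of TinyNet().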
This is the only portion of your code that will be checked automatically. However, please also submit all other code that you used for training, so that if you achieve an impossibly high score, we can be amazed and see how you did it.
When loading the weights, call torch.load with map_location="cpu":
# Get the weights and biases
model.state_dict()
# Store them on disk
torch.save(model.state_dict(), 'weights.pts')
# Load the weights from disk into a new model
model = My_Net()
model.load_state_dict(torch.load('weights.pts', map_location="cpu"))
# ^ should print "<All keys matched successfully>"
The simplest submission (which won't earn any points) can be along these lines. Name it hw_2.py and submit it together with the model's weights, stored through model.state_dict() in a file named weights.pth.
import os

import torch


class Model(torch.nn.Module):
    '''This is my super cool, but super dumb module'''

    def __init__(self):
        super().__init__()

    def forward(self, x):
        batch_size = x.shape[0]
        return torch.rand(batch_size, 10, device=x.device)


def load_model():
    # This is the function to be filled. Your returned model needs to be an instance of a subclass of torch.nn.Module
    # The model needs to accept tensors of shape [B, 3, 128, 128], where B is batch_size, which are in a range of [0-1] and type float32
    # It should be possible to pass in cuda tensors (in that case, model.cuda() will be called first).
    # The model will return scores (or probabilities) for each of the 10 classes, i.e. a tensor of shape [B, 10]
    # The resulting tensor should have the same device and dtype as the incoming tensor
    directory = os.path.abspath(os.path.dirname(__file__))

    # The model should be trained in advance; in this function, instantiate the model and load the weights into it:
    model = Model()
    model.load_state_dict(torch.load(os.path.join(directory, 'weights.pth'), map_location='cpu'))

    # For more info on storing and loading weights, see https://pytorch.org/tutorials/beginner/saving_loading_models.html
    return model
In order to participate in the tournament part, your model also has to have a non-empty docstring that briefly summarizes it. We do not want you to spill your secrets; however, the main idea should be evident from the docstring. The only thing checked about the docstring is whether it is there and non-empty. It will be visible to other students. Please do not use diacritics in the docstring.
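The docstring requirement can be checked locally with a few lines. This is only a sketch: the helper docstring_ok and both example classes are hypothetical, how BRUTE performs its check is not specified here, and requiring ASCII-only text is a conservative stand-in for "no diacritics".

```python
def docstring_ok(obj):
    """Return True if obj has its own non-empty, ASCII-only docstring."""
    # __doc__ is read directly: a class without a docstring gets __doc__ = None
    # and does not inherit the docstring of its base class.
    doc = getattr(obj, '__doc__', None)
    return bool(doc and doc.strip()) and doc.isascii()


class Model:
    """Hypothetical example: small CNN with three convolutional blocks."""


class Undocumented:
    pass


print(docstring_ok(Model))         # True
print(docstring_ok(Undocumented))  # False
```

The same call works on a model instance, for example docstring_ok(load_model()), because attribute lookup on an instance finds the docstring of its class.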
The dataset is rather difficult; therefore, in order to get any points, you only need a top-1 accuracy above 40 %. In order to get the full number of points for the individual part of the assignment, you need a top-1 accuracy of 60 %. Anything in between is scaled linearly. The maximum number of points from the individual part of the assignment is 12. The equation is:
$$pts_{individual} = 12\times\text{clip}(\frac{acc - 0.4}{0.6 - 0.4}, 0, 1)$$
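The scoring formula can be evaluated directly. A minimal sketch follows; the function name individual_points and its parameter names are hypothetical, while the thresholds and the maximum of 12 points come from the equation.

```python
def individual_points(acc, low=0.4, high=0.6, max_points=12.0):
    """Points for the individual part, given top-1 accuracy as a fraction in [0, 1]."""
    fraction = (acc - low) / (high - low)
    fraction = min(max(fraction, 0.0), 1.0)  # clip to [0, 1]
    return max_points * fraction


for acc in (0.35, 0.40, 0.50, 0.60, 0.75):
    print(f'acc={acc:.2f} -> {individual_points(acc):.1f} points')
```

For example, an accuracy of 50 % lies halfway between the two thresholds and therefore yields half of the points.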
Every 24 hours after the deadline, you will lose 1 point. However, your score cannot become negative, so the minimum is 0.
Take into account that training a neural network takes a non-trivial amount of time. Do not start working on the homework at the last moment. We recommend allowing at least a full day for work on this homework.
Because there are two separate deadlines for this task, there are also two homeworks in BRUTE. You need to submit your work to both of them. The evaluation script is the same in both of them. However, in the tournament part, BRUTE will always report 0 points for Automatic Evaluation and you will only gain points in the tournament. Do not be alarmed by this behavior; it is expected.
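The late penalty is simple arithmetic. The sketch below assumes that one point is deducted for each completed 24-hour period after the deadline; how BRUTE rounds partial periods is not stated here, so the rounding is an assumption, as is the function name points_after_penalty.

```python
import math


def points_after_penalty(points, hours_late):
    """Apply the late penalty: 1 point per completed 24 hours late (assumed rounding)."""
    if hours_late <= 0:
        return float(points)
    penalty = math.floor(hours_late / 24)
    return max(0.0, points - penalty)  # the score never drops below 0


print(points_after_penalty(12, 0))    # 12.0
print(points_after_penalty(12, 50))   # 10.0
print(points_after_penalty(3, 200))   # 0.0
```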
For the individual part, you may not use additional training data. For the tournament part, the use of additional data is allowed; however, you must follow a couple of rules:
Code along these lines is used for evaluation in BRUTE. Feel free to use it.
#!/usr/bin/env python3
import argparse
import pickle

import numpy as np
import torch
import torch.utils.data as tdata

import hw_2

CLASSES = {
    0: 'bird',
    1: 'lizard',
    2: 'snake',
    3: 'spider',
    4: 'dog',
    5: 'cat',
    6: 'butterfly',
    7: 'monkey',
    8: 'fish',
    9: 'fruit',
}


class Dataset(tdata.Dataset):
    def __init__(self, pkl_name):
        self.pkl_name = pkl_name
        with open(self.pkl_name, 'rb') as f:
            loaded_data = pickle.load(f)
        self.labels = loaded_data['labels']
        self.data = loaded_data['data']

    def __getitem__(self, i):
        return {
            # torch wants labels to be of type LongTensor, in order to compute losses
            'labels': self.labels[i].astype('i8'),
            # First retype to float32 (default dtype for torch),
            # then permute axes (torch expects data in CHW order)
            # Scale input data in your model's forward pass!!!
            'data': self.data[i].astype('f4').transpose((2, 0, 1)),
        }

    def __len__(self):
        return self.labels.shape[0]


def get_prediction_order(prediction, label):
    # prediction has shape [B, 10] (where B is batch size, 10 is number of classes)
    # label has shape [B]
    # both are torch tensors, prediction represents either score or probability of each class.
    # probability is torch.softmax(score, dim=1)
    # either way, the higher the value for each class, the more probable it is according to your model
    # therefore we can sort it according to given probability - and check on which place is the correct label.
    # ideally you want it to be at first place, but for example ImageNet is also evaluated on top-5 error
    # take 5 most confident predictions and only if your label is not in those best predictions, count it as error
    # Since ImageNet dataset has 1000 classes, if your predictions were random, top-5 error should be around 99.5 %
    prediction = prediction.detach()  # detach from computational graph (no grad)
    label = label.detach()

    prediction_sorted = torch.argsort(prediction, 1, True)
    # None as an index creates new dimension of size 1, so that broadcasting works as expected
    finder = label[:, None] == prediction_sorted
    order = torch.nonzero(finder)[:, 1]  # returns a tensor of indices, where finder is True.
    return order


def create_confusion_matrix(num_classes, prediction, label):
    prediction = prediction.detach()
    label = label.detach()
    prediction = torch.argmax(prediction, 1)
    # empty confusion matrix
    cm = torch.zeros((num_classes, num_classes), dtype=torch.long, device=label.device)
    indices = torch.stack((label, prediction))  # stack labels and predictions
    # Find how many cases there are for each combination of (label, prediction)
    new_indices, counts = torch.unique(indices, return_counts=True, dim=1)
    cm[new_indices[0], new_indices[1]] += counts
    return cm


def print_stats(conf_matrix, orders):
    num_classes = conf_matrix.shape[0]
    print('Confusion matrix:')
    print(conf_matrix)
    print('\n---\n')
    print('Precision and recalls:')
    for c in range(num_classes):
        precision = conf_matrix[c, c] / conf_matrix[:, c].sum()
        recall = conf_matrix[c, c] / conf_matrix[c].sum()
        f1 = (2 * precision * recall) / (precision + recall)
        print(
            'Class {cls:10s} ({c}):\tPrecision: {prec:0.5f}\tRecall: {rec:0.5f}\tF1: {f1:0.5f}'.format(
                cls=CLASSES[c], c=c, prec=precision, rec=recall, f1=f1
            )
        )

    print('\n---\n')
    print('Top-n accuracy and error:')
    order_len = len(orders)
    for n in range(num_classes):
        topn = (orders <= n).sum()
        acc = topn / order_len
        err = 1 - acc
        print('Top-{n}:\tAccuracy: {acc:0.5f}\tError: {err:0.5f}'.format(n=(n + 1), acc=acc, err=err))


def evaluate(num_classes, dataset_file, batch_size=32, model=None):
    if model is None:
        model = hw_2.load_model()  # load model, your hw
    if torch.cuda.is_available():
        device = torch.device('cuda')
    else:
        device = torch.device('cpu')
    model = model.to(device)
    model = model.eval()  # switch to eval mode, so that some special layers behave nicely

    dataset = Dataset(dataset_file)
    loader = tdata.DataLoader(dataset, batch_size=batch_size)
    # empty confusion matrix
    confusion_matrix = torch.zeros((num_classes, num_classes), dtype=torch.long, device=device)
    orders = []

    with torch.no_grad():  # disable gradient computation
        for i, batch in enumerate(loader):
            data = batch['data'].to(device)
            labels = batch['labels'].to(device)
            prediction = model(data)
            confusion_matrix += create_confusion_matrix(num_classes, prediction, labels)
            order = get_prediction_order(prediction, labels).cpu().numpy()
            orders.append(order)
            print('Processed {i:02d}th batch'.format(i=(i + 1)))

    print('\n---\n')
    orders = np.concatenate(orders, 0)
    confusion_matrix = confusion_matrix.cpu().numpy()
    print_stats(confusion_matrix, orders)
    return (orders == 0).mean()  # Return top-1 accuracy


if __name__ == '__main__':
    parser = argparse.ArgumentParser('Evaluation demo for HW02')
    parser.add_argument('dataset', type=str)
    parser.add_argument('--batch_size', '-bs', default=32, type=int)
    parser.add_argument('--num_classes', '-nc', default=10, type=int)
    args = parser.parse_args()
    evaluate(args.num_classes, args.dataset, args.batch_size)