HW 3 - Segmentation

The task is deliberately left partially for you to define, with possible loose ends. You are used to school tasks and exams having a single solution so that you can learn one concept at a time. In practice you will not encounter such problems; instead you will face open, unresolved definitions of what needs to be done. The evaluation is constructed with this in mind: we give points not for accuracy metrics, but for a reasonable approach to the problem and for your interpretation of why you did what you did and how you would do it differently. As a result, the homework will seem hard at first, but once you have thought it through, it should become easy.
Fill in or look up your team composition (groups of at most 3) in the sheets.

The task and tools are described in the HW3 materials.
The homework is complete once you have uploaded the video presentation (at most 4 minutes long) to Drive.
Inter-group discussions are encouraged.
The deadline is Thursday 21.12. at 23:59 for everyone.

Training data - teacher model on Taylor

The annotations for the training data should be accessible through the teacher model on the server Taylor, as described in the materials above. You can also use existing open datasets.
If you do not know your password, change it as described in Taylor.
The final output classes from the teacher model are accessible as follows:

name_list = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush', 'banner', 'blanket', 'bridge', 'cardboard', 'counter', 'curtain', 'door-stuff', 'floor-wood', 'flower', 'fruit', 'gravel', 'house', 'light', 'mirror-stuff', 'net', 'pillow', 'platform', 'playingfield', 'railroad', 'river', 'road', 'roof', 'sand', 'sea', 'shelf', 'snow', 'stairs', 'tent', 'towel', 'wall-brick', 'wall-stone', 'wall-tile', 'wall-wood', 'water-other', 'window-blind', 'window-other', 'tree-merged', 'fence-merged', 'ceiling-merged', 'sky-other-merged', 'cabinet-merged', 'table-merged', 'floor-other-merged', 'pavement-merged', 'mountain-merged', 'grass-merged', 'dirt-merged', 'paper-merged', 'food-other-merged', 'building-other-merged', 'rock-merged', 'wall-other-merged', 'rug-merged', 'unsegmented']

clazz = 0 # value in final segmentation
name_list[clazz] # semantic class of the value
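
The lookup above can be applied to a whole segmentation at once to see which classes an image contains. The following is a minimal sketch, assuming the teacher model output can be read as a 2D grid with one class index per pixel; the actual output format is described in the materials and may differ. The function `class_pixel_counts` and the tiny example mask are hypothetical, and `name_list` is shortened here to its first four entries so the snippet runs on its own.

```python
from collections import Counter

# Stand-in for the full name_list above, shortened to its first four entries
# so that this sketch is self-contained.
name_list = ['person', 'bicycle', 'car', 'motorcycle']

# Hypothetical 3x4 segmentation output: one class index per pixel.
segmentation = [
    [0, 0, 2, 2],
    [0, 0, 2, 2],
    [0, 1, 1, 2],
]


def class_pixel_counts(segmentation, names):
    """Return {class name: pixel count} for every class present in the mask."""
    counts = Counter(value for row in segmentation for value in row)
    return {names[value]: n for value, n in sorted(counts.items())}


summary = class_pixel_counts(segmentation, name_list)
print(summary)  # {'person': 5, 'bicycle': 2, 'car': 5}
```

A summary like this is also a quick way to check which classes occur often enough in your data to be worth picking.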

If you find any inconsistencies in the output classes, please let us know.
Passing the teacher model's output off directly as your architecture's output in the final video presentation is forbidden and is, to some degree, detectable. To avoid problems, keep the code and logs from training your architecture; without them you risk losing the majority of the points.

Evaluation and Output

We give points for:

Motivation (1 pt)
- why you picked the specific classes
- why it might be important
Data (3 pts)
- how you acquired the data, with a visualization of what the data and their annotations look like
- in which scenarios you expect it to work
- how you split the data into training and testing sets
Model (3 pts)
- which architecture you used, whether it was pre-trained, and how many classes you segment
- which regularization techniques you used (data augmentation, optimizers, cross-validation …) and why
Training (5 pts)
- show the training process in terms of training, validation and testing losses
- show examples where, in your view, the model failed
- show examples with high loss and with low loss, and describe whether the segmentation was successful
Output (3 pts)
- show a use case: how you would apply the segmentation output (it can be just an indicator that something is present, its localization in the image, etc.)
- create a video sequence of your segmented outputs for a qualitative comparison of what to expect from your model, and report how fast it runs on GPU and CPU
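
The simplest use case named in the Output item, an indicator of whether something is present together with its localization in the image, could look roughly like the sketch below. It assumes your model's output is a 2D grid of class indices that follow `name_list`; the function `locate_class` and the example mask are hypothetical and only illustrate the idea.

```python
def locate_class(segmentation, class_id):
    """Return (present, bounding_box) for class_id in a 2D mask of class indices.

    bounding_box is (row_min, col_min, row_max, col_max), inclusive,
    or None when the class does not occur in the mask.
    """
    rows = [r for r, row in enumerate(segmentation) if class_id in row]
    if not rows:
        return False, None
    cols = [c for row in segmentation
            for c, value in enumerate(row) if value == class_id]
    return True, (min(rows), min(cols), max(rows), max(cols))


# Hypothetical 4x5 output; 0 = 'person', 1 = 'bicycle', 2 = 'car' as in name_list.
segmentation = [
    [0, 0, 0, 0, 0],
    [0, 2, 2, 0, 0],
    [0, 2, 2, 2, 0],
    [0, 0, 0, 0, 0],
]

car = locate_class(segmentation, 2)
bicycle = locate_class(segmentation, 1)
print(car)      # (True, (1, 1, 2, 3))
print(bicycle)  # (False, None)
```

A bounding box is only one possible localization; a centroid or the mask itself may suit your use case better.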

Do not be afraid to add some comedy or irony to the presentation if you are frustrated. It will not cost you points and, if done properly, might increase the attention of other students watching. The only rule is that it must not interfere with your technical statements and must not devalue your work.

If you have any questions, contact us by email.
