This page is located in archive. Go to the latest version of this course pages. Go the latest version of this page.

2013 XEP33SAM -- Understan​ding State of the Art Methods, Algorithms​, and Implementa​tions

Tuesdays 16:15

First meeting 2/4/2013


Marius Muja and David G. Lowe: “Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration”, in International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009 PDF software page

  • Implement approximate k-means algorithm, use approximate NN instead of exact NN
  • Construct k-NN graph: a directed graph, vertex is a feature vector, from each vertex there are k edges to k nearest data vectors

Oxford 5k dataset: image thumbnails, descriptors (128 D vectors, single precision, stored one by one), and corresponing image names (one name per line, i-th name corresponds to i-th descriptors).

The following lines will read the descriptors and the image names in MATLAB:

fid = fopen('imagedesc.dat', 'r');
X = fread(fid, [128,inf], 'single⇒single');
Names = textread('imagenames.txt', '%s');

SIFT dataset: 2M SIFT descriptors are available here. The descriptors are 128D unsigned byte precision, the following Matlab lines will read the descriptors:

fid = fopen('SIFT.dat', 'r');
X = fread(fid, [128,inf], 'uint8⇒uint8');

Use the SIFT dataset for the approximate k-means. Use 32k cluster centers. Compare three different assignments to the nearest cluster (kd-forest, k-means tree, exact assignmet). For all three cases, start from identical inicialization. Compare the final results (after say 15 iterations) in terms of sum of squared distances, that is Σ (X - f(X))^2, where f(X) is the assigned cluster center.

Looking forward to results on your own data too.

Second meeting 30/4/2013 (the date has changed)


Herve Jegou, Matthijs Douze, Cordelia Schmid: “Product quantization for nearest neighbor search”, PAMI 2011. PDF software page

(Do not get confused by the text on the page. The mex version is in the package.)


Oxford 105k dataset: image thumbnails were distributed during the previous meeting,descriptors (128 D vectors, single precision, stored one by one), and corresponing image names (one name per line, i-th name corresponds to i-th descriptors, Oxford 5k image names are given without a directory, remaining filenames contain a directory corresponding to the distributed directory structure).

  • For each image find k-NN, visually verify the quality - script that for a selected image shows the k neighbours, etc.
  • Compare the quality and running time of product quantization and FLANN. Select 1000 images at random and find exact k-NN, for each of the algorithms compute an estimate of its precision.

Third meeting 21/5/2013


Yuri Boykov, Olga Veksler and Ramin Zabih: “Fast Approximate Energy Minimization via Graph Cuts”, PAMI 2001. PDF, (further reading: KZ-PAMI04 and SAUAI11) software page

The code is by Vladimir Kolmogorov, considered the best current implementation. There is a number of wrappers, this one was pointed out by Kostia (he has experience compiling it).


The Oxford 5k dataset. Use the he Oxford 5k dataset for this task. This dataset is the first 5062 images from the Oxford 105k used in the previous task (the images in the top directory). There are 11 landmarks and there is a ground truth provided for each of the landmark. There are three groups of images listed for each landmark: Good and OK - these contain the landmark and should be recognized, and Junk - these partially contain the landmark, but it is difficult to recognize. Images not listed in any of these groups do not contain the landmark.

Construct the k-nn graph on the Oxford 5k dataset (try different values of k). Use the labels (for the Oxford 5k provided in oxford5k_labels.zip in file labels.txt) and experiment with selecting images for the background label (how many images need to be labelled, could a constant probability be used for the background class, etc…). Exploit the k-nn graph to propagate the labels to other images using graph-cut (alpha expansion). Try to tune reasonable unary and binary potentials based on the similarity measure (the distance of the image descriptors).

Evaluate the labeling using the ground truth. Prepare the confusion matrix, show groups of correctly labelled and mislabelled images. The Junk images can be labelled both, as the landmark or as background.

Fourth meeting 13/6/2013

Carsten Rother, Vladimir Kolmogorov, and Andrew Blake‡: “GrabCut” - Interactive Foreground Extraction using Iterated Graph Cuts, SIGGRAPH 2004. PDF, project page


Implement GrabCut using the graph cut (max flow) algorithm by Kolmogorov from previous task. Use some standard images (from the project page) to verify that you are doing the right thing.

Improve results of the previous task. Make sure that your alpha-expansion is working correctly. 1) compare several unary potentials, make and verify labellings based solely on the unary potentials (label is determined by the smallest penalty) 2) design several binary potentials, use alpha-expansion to label the images. Is the labeling better than using just unary potentials? Is Potts model good or are different penalties for different pairs of labels improving the results?

Fifth meeting 1/8/2013


Wei Dong, Moses Charikar, Kai Li : “Efficient K-Nearest Neighbor Graph Construction for Generic Similarity Measures.” In Proceedings of the 20th international conference on World Wide Web (WWW). New York, NY. 2011. PDF software page


Create a k-NN graph on the Oxford 105k dataset and compare the precision and the run-time with the previous methods.

courses/xp33vtp/2013.txt · Last modified: 2018/03/01 14:28 by chumondr