timestamp

fprintf(1,'Started on %s\n', datestr(now));
Started on 28-Mar-2010 04:55:17

load kmeans data

load('data2d.mat');
plot(data(1,:),data(2,:),'g.', 'markersize', 10);
set(gca, 'xtick', [], 'ytick', []); axis image;

kmeans

tic
means=kmeans(4, data)
toc

figure;
plot(data(1,:),data(2,:),'g.','markersize', 10); hold on;
plot(means(1,:),means(2,:),'rx', 'linewidth', 3, 'markersize', 15);
% visualise voronoi cells
h=voronoi(means(1,:), means(2,:)); delete(h(1));
set(gca, 'xtick', [], 'ytick', []); axis image;
01. iteration, overall error: 6800.64111, 0 unused means. Assignement: 0.001sec, update: 0.000sec.
02. iteration, overall error: 3591.86914, 0 unused means. Assignement: 0.000sec, update: 0.000sec.
03. iteration, overall error: 2370.99683, 0 unused means. Assignement: 0.000sec, update: 0.000sec.
04. iteration, overall error: 1976.98657, 0 unused means. Assignement: 0.000sec, update: 0.000sec.
05. iteration, overall error: 1957.85071, 0 unused means. Assignement: 0.000sec, update: 0.000sec.
06. iteration, overall error: 1957.63867, 0 unused means. Assignement: 0.000sec, update: 0.000sec.
07. iteration, overall error: 1957.63867, 0 unused means. Assignement: 0.000sec, update: 0.000sec.
Terminating, improvement is under 0.01%.

means =

   -0.0363    2.9093    2.0114   -4.9374
    0.0647    3.9711   -3.8859   -1.0452

Elapsed time is 0.000528 seconds.
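
The log above alternates an assignment step and an update step and stops once the improvement falls under 0.01%. The following is a minimal sketch of such a loop, not the kmeans.m used in this report; the toy data, the random initialisation and all variable names are assumptions made for illustration.

```matlab
% Hypothetical sketch of a k-means loop; not the kmeans.m called above.
rng(0);                                   % fixed seed so the run is repeatable
X = [randn(2,100), randn(2,100) + 5];     % toy data, 2 x N (assumption)
K = 2;
N = size(X, 2);
perm = randperm(N);
M = X(:, perm(1:K));                      % initialise means from random points
prevErr = inf;
for it = 1:100
    % assignment step: squared distance from every mean to every point
    D = zeros(K, N);
    for k = 1:K
        D(k, :) = sum((X - repmat(M(:, k), 1, N)).^2, 1);
    end
    [d, lbl] = min(D, [], 1);
    err = sum(d);                         % overall error for this iteration
    % update step: each used mean becomes the centroid of its points
    for k = 1:K
        if any(lbl == k)
            M(:, k) = mean(X(:, lbl == k), 2);
        end
    end
    fprintf('%02d. iteration, overall error: %.5f\n', it, err);
    % stop when the relative improvement drops below 0.01%
    if (prevErr - err) < 1e-4 * prevErr
        break;
    end
    prevErr = err;
end
```

Means that receive no points are simply left where they are in this sketch; how the original handles "unused means" is not shown in the report.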

assign points to labels using nearest

tic
[lbls dists] = nearest(means, data);
toc

figure; hold on;
h=voronoi(means(1,:), means(2,:)); delete(h(1));
plot(data(1,lbls==1),data(2,lbls==1),'b.','markersize', 10);
plot(data(1,lbls==2),data(2,lbls==2),'g.','markersize', 10);
plot(data(1,lbls==3),data(2,lbls==3),'r.','markersize', 10);
plot(data(1,lbls==4),data(2,lbls==4),'k.','markersize', 10);
set(gca, 'xtick', [], 'ytick', []); axis image;
Elapsed time is 0.000514 seconds.
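
A nearest-mean assignment of this kind can be sketched as below. This is an assumption about what nearest does, not its actual source: the test points are made up, the variable names carry a _demo suffix to avoid clashing with the report's variables, and the returned distances are squared Euclidean distances, which may differ from the original.

```matlab
% Hypothetical sketch of nearest-mean labelling; not the nearest.m used above.
means_demo = [-0.0363  2.9093  2.0114 -4.9374;
               0.0647  3.9711 -3.8859 -1.0452];   % means printed by kmeans above
pts = [0 3 2 -5;
       0 4 -4 -1];                                % made-up points, one near each mean
K = size(means_demo, 2);
N = size(pts, 2);
D = zeros(K, N);
for k = 1:K
    % squared Euclidean distance from mean k to every point
    D(k, :) = sum((pts - repmat(means_demo(:, k), 1, N)).^2, 1);
end
[dists_demo, lbls_demo] = min(D, [], 1);          % closest mean per point
disp(lbls_demo);                                  % 1 2 3 4
```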

load centers of sshessian detector and SIFT descriptor

load('hess_centers10k.mat');
load('realdata.mat');

create tf database, check consistency

tic
DB=createdb(lbls, 10000);
toc

fprintf(1, 'Lengths of #2: %f, #18: %f, #30: %f\n', full([sum(DB(2,:).^2), sum(DB(18,:).^2), sum(DB(30,:).^2)]));
weights=full(DB(2,find(DB(2,:))));

fprintf(1, 'Weights img #2: '); fprintf(1, '%f ', weights(1:10)); fprintf(1, '\n\n');
Elapsed time is 0.040100 seconds.
Lengths of #2: 1.000000, #18: 1.000000, #30: 1.000000
Weights img #2: 0.017206 0.017206 0.034411 0.034411 0.017206 0.034411 0.034411 0.034411 0.017206 0.034411 
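
The consistency check above expects every database row to have unit squared length, and the printed weights differ by a factor of two, as word counts of 1 and 2 would after normalisation. A tf database with that property could be built roughly as follows. This is a sketch under assumptions: the tiny word lists, the vocabulary size and the names docs, V and DB_demo are invented, and createdb itself may be implemented differently.

```matlab
% Hypothetical sketch of a tf database with unit-length rows; not createdb.m.
docs = {[1 1 2], [2 3], [3 3 3 4]};   % made-up visual-word ids, one cell per image
V = 4;                                % made-up vocabulary size
n = numel(docs);
rows = []; cols = [];
for i = 1:n
    w = docs{i}(:);
    rows = [rows; i * ones(numel(w), 1)];
    cols = [cols; w];
end
C = sparse(rows, cols, 1, n, V);              % repeated (row, col) pairs add up to counts
nrm = full(sqrt(sum(C.^2, 2)));               % L2 norm of each row
DB_demo = spdiags(1 ./ nrm, 0, n, n) * C;     % scale each row to unit length
sqlen = full(sum(DB_demo.^2, 2))';            % squared row lengths, all equal to 1
```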

query with a few random documents

idf=ones(1,10000);
tic
[idxs scores]=query(DB, lbls{2}, idf);
toc

fprintf(1, 'Ordering with query #2: '); fprintf(1, '%d ', idxs(1:10)); fprintf(1, '\n');
fprintf(1, 'Scores with query #2: '); fprintf(1, '%f ', scores(1:10)); fprintf(1, '\n\n');

[idxs scores]=query(DB, lbls{18}, idf);
fprintf(1, 'Ordering with query #18: '); fprintf(1, '%d ', idxs(1:10)); fprintf(1, '\n');
fprintf(1, 'Scores with query #18: '); fprintf(1, '%f ', scores(1:10)); fprintf(1, '\n\n');

[idxs scores]=query(DB, lbls{30}, idf);
fprintf(1, 'Ordering with query #30: '); fprintf(1, '%d ', idxs(1:10)); fprintf(1, '\n');
fprintf(1, 'Scores with query #30: '); fprintf(1, '%f ', scores(1:10)); fprintf(1, '\n\n');
Elapsed time is 0.001871 seconds.
Ordering with query #2: 2 4 3 5 1 7 39 19 16 9 
Scores with query #2: 1.000000 0.414541 0.362490 0.321133 0.300475 0.264582 0.260485 0.239409 0.228193 0.213244 

Ordering with query #18: 18 20 16 17 19 22 37 39 36 24 
Scores with query #18: 1.000000 0.488779 0.487859 0.474571 0.405532 0.300967 0.286715 0.282207 0.279095 0.276393 

Ordering with query #30: 30 26 29 27 28 31 33 13 11 9 
Scores with query #30: 1.000000 0.776173 0.766426 0.765783 0.741209 0.237022 0.208674 0.174830 0.165422 0.132621 
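
Each query image scores 1.000000 against itself, which is what a cosine similarity between unit-length vectors would give. A query of that form can be sketched as below. It is an assumption, not the source of query: the database rows, the query words and the _demo names are made up, and applying idf to the query vector is a guess about the original.

```matlab
% Hypothetical sketch of a cosine-similarity query; not the query.m used above.
DBq = sparse([1 0 0; 0 1 0; 0.6 0.8 0]);          % made-up database, unit-length rows
idf_demo = ones(1, 3);                            % uniform weights, as in the tf case
q_words = [1 1 1 2];                              % made-up visual words of the query
q = accumarray(q_words(:), 1, [3 1])' .* idf_demo;   % term counts times idf
q = q / norm(q);                                  % unit length, like the rows
[scores_demo, idxs_demo] = sort(full(DBq * q'), 'descend');
disp(idxs_demo');                                 % 1 3 2
```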

create tf-idf database, check consistency

idf=getidf(lbls, 10000);
fprintf(1, 'IDF weights of first 10 visual words: '); fprintf(1, '%f ', idf(1:10)); fprintf(1, '\n\n');

tic
DB=createdb_tfidf(lbls, 10000, idf);
toc

fprintf(1, 'Lengths of #2: %f, #18: %f, #30: %f\n', full([sum(DB(2,:).^2), sum(DB(18,:).^2), sum(DB(30,:).^2)]));
weights=full(DB(2,find(DB(2,:))));

fprintf(1, 'Weights img #2: '); fprintf(1, '%f ', weights(1:10)); fprintf(1, '\n\n');
IDF weights of first 10 visual words: 1.491655 2.590267 1.491655 2.590267 2.302585 2.590267 1.742969 1.897120 1.491655 0.000000 

Elapsed time is 0.041250 seconds.
Lengths of #2: 1.000000, #18: 1.000000, #30: 1.000000
Weights img #2: 0.021145 0.019427 0.025055 0.038854 0.016626 0.038854 0.042291 0.021865 0.019427 0.033252 
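
The printed IDF weights are consistent with idf = log(N/n), where N is the number of images and n the number of images containing the word. The check below reproduces them under two inferred assumptions that the report does not state: N = 40, and the document frequencies listed in n_docs. The variable names are hypothetical and getidf may compute the weights differently.

```matlab
% Hypothetical check of the printed IDF weights; N_img and n_docs are inferred.
N_img = 40;                               % assumed number of images
n_docs = [9 3 9 3 4 3 7 6 9 40];          % assumed document frequencies of words 1..10
idf_chk = log(N_img ./ n_docs);           % natural logarithm
fprintf(1, '%f ', idf_chk); fprintf(1, '\n');
```

Under these assumptions a word present in all 40 images gets weight 0, matching the tenth value printed above.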

query with a few random documents

tic
[idxs scores]=query(DB, lbls{2}, idf);
toc
fprintf(1, 'Ordering with query #2: '); fprintf(1, '%d ', idxs(1:10)); fprintf(1, '\n');
fprintf(1, 'Scores with query #2: '); fprintf(1, '%f ', scores(1:10)); fprintf(1, '\n\n');

[idxs scores]=query(DB, lbls{18}, idf);
fprintf(1, 'Ordering with query #18: '); fprintf(1, '%d ', idxs(1:10)); fprintf(1, '\n');
fprintf(1, 'Scores with query #18: '); fprintf(1, '%f ', scores(1:10)); fprintf(1, '\n\n');

[idxs scores]=query(DB, lbls{30}, idf);
fprintf(1, 'Ordering with query #30: '); fprintf(1, '%d ', idxs(1:10)); fprintf(1, '\n');
fprintf(1, 'Scores with query #30: '); fprintf(1, '%f ', scores(1:10)); fprintf(1, '\n\n');
Elapsed time is 0.001227 seconds.
Ordering with query #2: 2 4 3 5 1 19 39 16 7 17 
Scores with query #2: 1.000000 0.347696 0.279267 0.247230 0.222187 0.193157 0.190336 0.187540 0.181372 0.168625 

Ordering with query #18: 18 16 17 20 19 37 22 24 39 36 
Scores with query #18: 1.000000 0.393213 0.374217 0.371489 0.322828 0.214533 0.208463 0.203933 0.203057 0.199907 

Ordering with query #30: 30 26 29 27 28 31 33 13 34 32 
Scores with query #30: 1.000000 0.749451 0.747558 0.745768 0.699909 0.235228 0.197142 0.116511 0.109158 0.108668