Video tracking is a process of object position estimation across video frames.
Simple models for tracking include:
Popular tracking algorithms are: correlation, mean-shift, Kanade-Lucas-Tomasi (KLT) tracking, Kalman tracking, particle filtering, (sequential) linear predictors and so on. In this lab, you will implement and use the KLT tracker.
Your task is to hide a selected object in a video sequence by tracking and blurring it. You select the object in the first frame, and the object is then tracked and blurred throughout the sequence (much like the blurred regions you may know from TV).
We will assume that the whole scene is planar, so the selected object is part of that plane. The task can therefore be accomplished by tracking points in the whole scene, estimating the motion between frames, and blurring the corresponding region.
Homography estimation can be done in the same way as in the second task.
Steps:
track_init_klt.m
track_klt.m
processmpvframe.m
Example of tracking (without blurring) export_billa_xvid.avi
Download the function processMpvVideo(filename,method,options) (part of the demo template), where filename is the name of the video file (e.g. example.avi), method is a tracking method and options is a structure with parameters for the tracking function. The function creates a video sequence with the tracked points shown, writes the output into the folder ./export/ and shows a preview.
The script works well with nearly all Windows versions, but you may encounter problems with codecs. Therefore, we have prepared the input video in several versions. On a Linux system, or if none of the video versions works, use the slightly modified function processmpvvideo_jpeg.m, which reads the video from a sequence of images. You can download or create these images:
mplayer -vo png video.avi
The function processmpvvideo_jpeg.m creates a sequence of images in the folder ./export/. To join the images into a video, you can use:
mencoder mf://*.png -mf w=640:h=480:fps=15:type=png -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=2400 -oac copy -o video.avi
Download the demo template.
The function processMpvVideo calls function data = processMpvInit(img,options), which selects the object for tracking in the first image img of the sequence. options is a structure defining options (see the code). The output is a bounding box:
data.xRect % x coordinates of bounding box corners (anticlockwise)
data.yRect % y coordinates of bounding box corners (anticlockwise)
The function x = track_init_demo(img,options) returns the points x for tracking.
The tracking function is xNew = track_demo(imgPrev, imgNew, xPrev, options), where imgPrev is the previous frame in the sequence, imgNew is the new frame and xPrev are the tracked points in the previous frame. Tracked points are represented as a structure:
x.x % COLUMN vector of x coordinates
x.y % COLUMN vector of y coordinates
x.ID % COLUMN vector of unique point identifiers (an index into the array is not enough because some points will disappear during tracking)
x.data % specific information for the tracking method
The options depend on the method (see the structure fields for the demo). For the KLT tracker, the options will be:
options.klt_window % size of patch W(x,p) in the sense of getPatchSubpix.m
options.klt_stop_count % maximal number of iteration steps
options.klt_stop_treshold % minimal change epsilon^2 for termination (squared for easier computation of the distance)
options.klt_show_steps % 0/1 turning drawing during tracking off/on
The output xNew are the tracked points found in the new frame.
The function processMpvFrame(data,imgPrev,imgNew,xPrev,xNew,options) is called for each image. It uses the tracked points to estimate a homography between frames. Note that processMpvVideo is run by the main script cv08.m, which also sets the options.
KLT minimizes the sum of squared differences of image intensities between windows in subsequent frames. The minimum is found iteratively by the Newton-Raphson method.
We are given a patch template $T(\mathbf{x})$ centered at pixel $\mathbf{x} = [x,y]^T$ in the image frame at time $t$. In the subsequent frame, at time $t+1$, the target moves to a new position described by the coordinate transformation $\mathbf{W}(\mathbf{x;p})=[x+p_x;y+p_y]^T$. The task is to estimate the displacement $\mathbf{p} = [p_x, p_y]^T$ that minimizes the sum of squared differences between the image $I$ at time $t+1$ and the template:
$ (1)\;\;\sum_{\mathbf{x}}(I(\mathbf{W}(\mathbf{x;p})) - T(\mathbf{x}))^2. $
Suppose a current estimate $\mathbf{p}$ is available and we seek the correction $\ \Delta \mathbf{p} $ that improves it. From (1) we get
$ (2)\;\;\sum_{\mathbf{x}}(I(\mathbf{W}(\mathbf{x;p}+\Delta \mathbf{p})) - T(\mathbf{x}))^2. $
We minimize this expression with respect to $\ \Delta \mathbf{p} $. The nonlinear expression (2) is linearized by a first-order Taylor expansion
$ (3)\;\;\sum_{\mathbf{x}}(I(\mathbf{W}(\mathbf{x;p})) + \nabla I\frac{\partial \mathbf{W}}{\partial \mathbf{p}}\Delta \mathbf{p}- T(\mathbf{x}))^2, $
where $ \nabla I = [\frac{\partial I}{\partial x},\frac{\partial I}{\partial y}] $ is the image gradient at $\ \mathbf{W}(\mathbf{x;p}). $ The term $ \frac{\partial \mathbf{W}}{\partial \mathbf{p}} $ is the Jacobian matrix of the coordinate transformation
$(4)\;\; \frac{\partial \mathbf{W}}{\partial \mathbf{p}}= \frac{ \partial[x+p_x;y+p_y]^T}{\partial [p_x,p_y]^T}= \left[\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right].$
The minimum of expression (3) over $\Delta \mathbf{p}$ is
$ (5)\;\;\Delta \mathbf{p}=H^{-1}\sum_{\mathbf{x}}[\nabla I\frac{\partial \mathbf{W}}{\partial \mathbf{p}}]^T [T(\mathbf{x})-I(\mathbf{W}(\mathbf{x;p}))] $
where $ H $ is the approximation of the Hessian matrix used in the Gauss-Newton method. This is in fact a nonlinear regression (nonlinear least squares). Note that the approximation of the Hessian matrix in this case is equal to the autocorrelation matrix (Harris matrix), i.e. a sum of products of the first partial derivatives. This suggests that it is a good idea to track points in the close neighborhood of Harris points.
$ (6)\;\; H = \sum_{\mathbf{x}}[\nabla I\frac{\partial \mathbf{W}}{\partial \mathbf{p}}]^T[\nabla I\frac{\partial \mathbf{W}}{\partial \mathbf{p}}]$
Substituting (4) into (5) and (6) simplifies $ \nabla I\frac{\partial \mathbf{W}}{\partial \mathbf{p}}=\nabla I $. The displacement correction $\Delta \mathbf{p}$ is computed in each iteration and the estimated shift is updated by
$ (7)\;\; \mathbf{p} \leftarrow \mathbf{p} + \Delta \mathbf{p} $
The iterations are terminated by setting the maximum number of iteration steps and/or by convergence condition
$ (8)\;\; ||\Delta \mathbf{p}||\leq \epsilon. $
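For the pure translation model, (5) and (6) can be written out explicitly. The following is only a restatement of the formulas above, with the shorthand $ I_x = \frac{\partial I}{\partial x} $ and $ I_y = \frac{\partial I}{\partial y} $ (evaluated at $\mathbf{W}(\mathbf{x;p})$) introduced here for readability:

$ H = \sum_{\mathbf{x}} \left[\begin{array}{cc} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{array}\right], \;\;\;\; \Delta \mathbf{p} = H^{-1} \sum_{\mathbf{x}} \left[\begin{array}{c} I_x \\ I_y \end{array}\right] [T(\mathbf{x}) - I(\mathbf{W}(\mathbf{x;p}))]. $

The matrix $ H $ is 2x2, so each iteration only requires solving a small linear system; it is well conditioned when the window contains gradients in more than one direction.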
For the KLT algorithm, it is necessary to implement all operations in a sub-pixel precision.
You will need a function for a patch selection patch = getPatchSubpix(img,x,y,win_x,win_y), where the patch is a selection from image img around center x,y with size win_x*2+1 x win_y*2+1. Assume that win_x,win_y are integers, but x,y are real. Function interp2.m can be useful in your implementation. Tip: You will get much faster computation, if you crop the image before using interp2.m.
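The cropping tip can be illustrated with a minimal script-style sketch, assuming bilinear interpolation and a grayscale image; the variable names (crop, xq, yq) and the synthetic ramp image are made up for this example and are not part of the template. The sketch does not handle patches that reach outside the image.

```matlab
% Synthetic test image: a linear ramp, for which bilinear interpolation is exact.
[X, Y] = meshgrid(1:20, 1:15);
img = X + 10 * Y;

x = 7.3;  y = 5.6;        % real-valued patch center
win_x = 2;  win_y = 2;    % integer half-sizes of the window

% Crop an integer region that surely contains the patch, so that interp2
% works on a small array instead of the whole image.
[h, w] = size(img);
x0 = max(floor(x) - win_x, 1);  x1 = min(ceil(x) + win_x, w);
y0 = max(floor(y) - win_y, 1);  y1 = min(ceil(y) + win_y, h);
crop = img(y0:y1, x0:x1);

% Query grid expressed in the coordinates of the cropped image.
[xq, yq] = meshgrid(x + (-win_x:win_x) - x0 + 1, ...
                    y + (-win_y:win_y) - y0 + 1);

% Resulting patch has 2*win_y+1 rows and 2*win_x+1 columns.
patch = interp2(crop, xq, yq, 'linear');
```

The center element patch(win_y+1, win_x+1) corresponds to the image value at (x, y); for the ramp above this is x + 10*y.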
The KLT algorithm can be summarized in a few steps:
For the template $ T(\mathbf{x}) $, i.e. the neighborhood of a Harris point $ \mathbf{x} $ (taken from xPrev) in the previous image, set $ \mathbf{p} = [0 ; 0] $ and iterate:
Set the new position of Harris point $ \mathbf{x} \leftarrow \mathbf{x} + \mathbf{p} $
If the algorithm did not converge in the maximum number of steps, discard the point from further tracking.
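The iteration defined by (5) to (8) can be sketched for a single point as follows. This is a minimal illustration on a synthetic image pair, not the reference solution: the Gaussian test image, the point x0, the window size and the stopping values are all assumptions chosen for the example, and the patches are sampled directly with interp2 rather than through getPatchSubpix.

```matlab
% Synthetic smooth image pair: the content of imgPrev appears in imgNew
% shifted by pTrue.
[X, Y] = meshgrid(1:64, 1:64);
f = @(x, y) exp(-((x - 32).^2 + (y - 30).^2) / (2 * 6^2));
pTrue   = [1.3; -0.8];
imgPrev = f(X, Y);
imgNew  = f(X - pTrue(1), Y - pTrue(2));

x0      = [30; 28];   % tracked point in imgPrev (hypothetical)
win     = 7;          % half-size of the square window
maxIter = 30;         % plays the role of options.klt_stop_count
epsSq   = 1e-8;       % plays the role of options.klt_stop_treshold

[Gx, Gy] = gradient(imgNew);                 % gradients of the new frame
[dx, dy] = meshgrid(-win:win, -win:win);     % window offsets
T = interp2(imgPrev, x0(1) + dx, x0(2) + dy, 'linear');   % template

p = [0; 0];
converged = false;
for step = 1:maxIter
    xq = x0(1) + p(1) + dx;
    yq = x0(2) + p(2) + dy;
    I  = interp2(imgNew, xq, yq, 'linear');  % I(W(x;p))
    gx = interp2(Gx, xq, yq, 'linear');
    gy = interp2(Gy, xq, yq, 'linear');
    E  = I - T;                              % current error
    G  = [gx(:), gy(:)];                     % each row is the gradient at one pixel
    H  = G' * G;                             % equation (6)
    dp = -(H \ (G' * E(:)));                 % equation (5), solved with backslash
    p  = p + dp;                             % equation (7)
    if dp' * dp <= epsSq                     % equation (8), squared
        converged = true;
        break;
    end
end
xNew = x0 + p;   % new position; the point would be discarded if ~converged
```

On this synthetic pair the estimate p ends up close to pTrue; the small remaining difference comes from bilinear interpolation and finite-difference gradients.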
A simplified illustration of the algorithm, demonstrated on car tracking, is shown in the figure below:
Tip: Do not ignore warnings from Matlab, and use the operator \ rather than the function inv().
To easily check your implementation, include a drawing function after each iteration (function showKltStep available here).
if (options.klt_show_steps) showKltStep(step,T,I,E,Gx,Gy,aP); end
% step - serial number of iteration (zero based)
% T - template (patch from imgPrev)
% I - current shifted patch in imgNew
% E - current error (I - T)
% Gx,Gy - gradients
% aP - size of current shift delta P
We will use our function [Hbest,inl]=ransac_h(u,threshold,confidence) from the second task. Create the vector of points u. The IDs are known and you also know that xPrev contains all points from xNew. Keep the setting in fields options.rnsc_threshold and options.rnsc_confidence.
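Matching the points by ID can be sketched as follows. The 4xN layout of u (previous coordinates in rows 1 and 2, new coordinates in rows 3 and 4) is an assumption made for this example; use whatever layout your ransac_h from the second task expects. The point values are made up.

```matlab
% Hypothetical tracked points: two of the four previous points survived.
xPrev.x  = [10; 20; 30; 40];
xPrev.y  = [1; 2; 3; 4];
xPrev.ID = [1; 2; 3; 4];
xNew.x   = [21; 41];
xNew.y   = [2.5; 4.5];
xNew.ID  = [2; 4];

% For every point in xNew, find the position of the same ID in xPrev.
[found, idx] = ismember(xNew.ID, xPrev.ID);

% Correspondences as columns: [xPrev; yPrev; xNew; yNew] (assumed layout).
u = [xPrev.x(idx)'; xPrev.y(idx)'; xNew.x'; xNew.y'];
```

Because xPrev contains all points from xNew, found is true everywhere and idx can be used directly for indexing.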
Discard the points which are homography outliers from further tracking. Advanced algorithms have methods for adding new tracking points, however we will not implement any in this lab.
Knowing the homography (matrix H), you can transform ($x_{n} = H_{best} x_{p}$) the corners of bounding box which outlines the selected object from one frame to the next frame. Blur the interior region with function gaussfilter.m with high enough sigma. Do not blur the image outside of the selection (you can use your knowledge from this course).
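The corner transfer and the masked blur can be sketched as below. This is an illustration under stated assumptions: the homography is a made-up pure translation, the image is random noise, a separable Gaussian built with conv2 stands in for gaussfilter.m (whose interface is not reproduced here), and inpolygon is used to build the mask of the selected region.

```matlab
% Hypothetical homography and bounding box corners from the previous frame.
H     = [1 0 5; 0 1 -3; 0 0 1];
xRect = [10 10 30 30];
yRect = [10 25 25 10];

% Transform the corners in homogeneous coordinates and normalize.
c = H * [xRect; yRect; ones(1, 4)];
xRectNew = c(1, :) ./ c(3, :);
yRectNew = c(2, :) ./ c(3, :);

% Blur the whole frame with a separable Gaussian (stand-in for gaussfilter.m).
img   = rand(40, 50);
sigma = 4;
k = -ceil(3 * sigma):ceil(3 * sigma);
g = exp(-k.^2 / (2 * sigma^2));
g = g / sum(g);
blurred = conv2(g(:), g, img, 'same');

% Copy the blurred pixels only inside the transformed bounding box.
[X, Y] = meshgrid(1:size(img, 2), 1:size(img, 1));
mask = inpolygon(X, Y, xRectNew, yRectNew);
out = img;
out(mask) = blurred(mask);
```

Blurring the full frame and copying through a mask is simple but does more work than necessary; blurring only a cropped region around the box is a possible refinement.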
Join your functions into [dataOut xNewOut] = processMpvFrame(data,imgPrev,imgNew,xPrev,xNew,options). This function returns structure dataOut containing
dataOut.xRect % transformed x-coordinates of bounding box corners (anti-clockwise)
dataOut.yRect % transformed y-coordinates of bounding box corners (anti-clockwise)
dataOut.H % estimated homography
xNewOut
Implement the KLT tracking algorithm, i.e. the estimation of Harris point translations, and test it on the familiar sequence with the promotional leaflet.
getPatchSubpix.m
Submit the file cv08.m; in this script, set the structure options and call the function processMpvVideo.m.
Also include the completed functions track_init_klt.m, track_klt.m and getPatchSubpix.m, together with all non-standard functions you have created and used. Submit the generated video file export_billa_xvid.avi with the blurred selection and a highlighted bounding box, and the automatically generated file homography.mat with all homographies.
Please do not submit the functions processMpvVideo.m or showKltStep.m. They interfere with the automatic evaluation.
For testing, download showkltstep.m for visualization of KLT iteration.
To test your code, you can use a Matlab script and the function 'publish'. Copy klt_test.zip, unpack it and add the files that are requested for submission. Do not unpack it into your working folder, because it contains our versions of processMpvVideo.m and showKltStep.m. Compare your results with ours.
Lucas-Kanade 20 Years On: A Unifying Framework
Predator: A Smart Camera that Learns