====== Introduction to Image Processing with PyTorch ======

Python, numpy and PyTorch will be used for the MPV labs. In case you are not familiar with them, study the following materials: [[https://cw.fel.cvut.cz/wiki/courses/be5b33rpz/labs/01_intro/start|"Intro to numpy"]], [[https://www.analyticsvidhya.com/blog/2019/09/introduction-to-pytorch-from-scratch/|"A Beginner-Friendly Guide to PyTorch and How it Works from Scratch"]], [[https://github.com/wkentaro/pytorch-for-numpy-users|PyTorch for numpy users]], [[https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html|numpy for matlab users]], [[https://cw.fel.cvut.cz/b192/courses/mpv/labs/1_intro/start|Introduction into PyTorch Image Processing]].

To fulfil this assignment, you need to submit these files (all packed in one ''.zip'' file) into the [[https://cw.felk.cvut.cz/brute/|upload system]]:
  * **''imagefiltering.ipynb''** - a notebook for data initialisation, calling of the implemented functions and plotting of their results (for your convenience, will not be checked).
  * **''imagefiltering.py''** - file with the following methods implemented:
    * **''gaussian1d''**, **''gaussian_deriv1d''** - functions for computing the Gaussian function and its first derivative.
    * **''filter2d''** - function for applying a 2D filter kernel to an image tensor.
    * **''gaussian_filter2d''**, **''spatial_gradient_first_order''**, **''spatial_gradient_second_order''** - functions for Gaussian blur and for computation of the 1st and 2nd order image spatial gradients.
    * **''affine''** - function for converting 3 points in the image into an affine transformation matrix.
    * **''extract_affine_patches''** - function for extraction of the patches defined by an affine transform A.

**Use the [[https://gitlab.fel.cvut.cz/mishkdmy/mpv-python-assignment-templates#Assignment Templates|template]] of the assignment.** When preparing a zip file for the upload system, **do not include any directories**; the files have to be in the zip file root.

[[courses:mpv:labs:general_info|Python and PyTorch Development]]

====== Basics of Image Processing in PyTorch ======

  * To make full use of parallel processing in PyTorch, the default choice is to work with 4D tensors of images. A 4D tensor is an array of shape **[BxChxHxW]**, where **B** is the batch size (i.e. the number of images), **Ch** is the number of channels (3 for RGB, 1 for grayscale, etc.), and **H** and **W** are the height and width of the tensor.
  * To convert an image stored as a numpy array (e.g. the result of reading the image with the OpenCV **cv2.imread** function) into such a tensor, one could use the function **[[https://kornia.readthedocs.io/en/latest/utils.html#kornia.utils.image_to_tensor|kornia.utils.image_to_tensor]]** (see the sketch after this list).
  * PyTorch has a powerful autograd engine, which can be used for backpropagating the error to the parameters and arguments. However, in the first part of this course we will not be using it, so one could save computation time and memory by running the functions under [[https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#gradients|torch.no_grad()]]:
<code python>
import torch
import torch.nn.functional as F

with torch.no_grad():
    out = F.conv2d(inp, weight)  # inp: input image tensor, weight: filter kernel ("in" is a reserved word in Python)
</code>
  * PyTorch has two interfaces. One is object-oriented and based on [[https://pytorch.org/docs/stable/nn.html#torch.nn.Module|Modules]], the other is [[https://pytorch.org/docs/stable/nn.functional.html|functional]]. The functional interface is more suitable for this course, although feel free to use Modules if that is more convenient for you.
  * Remember that you can use numpy functions on PyTorch tensors (only in CPU mode). Thus, if you are more familiar with numpy, you can use it for the labs. {{:courses:mpv:labs:1_intro:numpy-pytorch.png?nolink|}}
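A minimal sketch of the numpy-to-tensor conversion mentioned above, assuming an image file ''image.jpg'' on disk (the file name is just a placeholder):

<code python>
import cv2
import kornia

img = cv2.imread("image.jpg")  # numpy array of shape HxWxC, dtype uint8, BGR channel order
# keepdim=False adds the batch dimension, giving the BxChxHxW layout described above
timg = kornia.utils.image_to_tensor(img, keepdim=False).float() / 255.
print(timg.shape)  # torch.Size([1, 3, H, W])
</code>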
  * **Whenever possible, use vectorized operations instead of for-loops.** For-loops are very inefficient in Python, PyTorch and MATLAB, unlike in C++, especially for images. See the example below: {{:courses:mpv:labs:1_intro:forloop-vs-vectorizes.png?nolink|}}

===== Convolution, Image Smoothing and Gradient =====

  * The Gaussian function is often used in image processing as a low-pass filter for noise reduction, or as a windowing function weighting points in a neighbourhood. Implement the function ''gaussian1d(x,sigma)'' that computes the values of a (1D) Gaussian with zero mean and variance $\sigma^2$:\\ $$ G(x) = \frac{1}{\sqrt{2\pi}\sigma}\cdot e^{-\frac{x^2}{2\sigma^2}} $$\\ in the points specified by the vector ''x''.
  * Implement the function ''gaussian_deriv1d(x,sigma)'' that returns the first derivative of the Gaussian\\ $$\frac{d}{dx}G(x) = \frac{d}{dx}\frac{1}{\sqrt{2\pi}\sigma}\cdot e^{-\frac{x^2}{2\sigma^2}} = -\frac{1}{\sqrt{2\pi}\sigma^3}\cdot x\cdot e^{-\frac{x^2}{2\sigma^2}} = -\frac{x}{\sigma^2}G(x)$$\\ in the points specified by the vector ''x''.
  * Get acquainted with the function ''torch.nn.functional.conv2d''. Use the padding mode "replicate" (see [[https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.pad|F.pad]]).
  * The effect of filtering with the Gaussian and its derivatives is best visualized using an impulse (single non-zero pixel) image:
<code python>
import torch
from lab0_reference.imagefiltering import gaussian_filter2d
# imshow_torch is a plotting helper from the assignment template

inp = torch.zeros((1, 1, 32, 32))
inp[..., 15, 15] = 1.
imshow_torch(inp)

sigma = 3.0
out = gaussian_filter2d(inp, sigma)
imshow_torch(out)
</code>
Try to find out the impulse responses of other combinations of the Gaussian and its derivatives.
  * Examples of impulse responses: input image, Gaussian filter, first and second derivatives \\ {{:courses:mpv:labs:1_intro:gauss_deriv.png?800|}}\\
  * Write a function ''filter2d(in,kernel)'' that implements per-channel convolution of the input tensor with the kernel (see the first sketch after this list).
  * Write a function ''gaussian_filter2d(in,sigma)'' that smooths the input image tensor //in// with a Gaussian filter of width ''2*ceil(sigma*3.0)+1'' and variance $\sigma^2$ and returns the smoothed image tensor //out// (e.g. {{:courses:mpv:labs:1_intro:lena512.png?linkonly|Lena}}). Exploit the separability of the Gaussian filter and implement the smoothing as two convolutions with a one-dimensional Gaussian kernel (see the function ''torch.nn.functional.conv2d'' and the second sketch after this list).
  * Modify the function ''gaussian_filter2d'' into a new function ''spatial_gradient_first_order(in,sigma)'' that returns an estimate of the gradient (gx, gy) at each point of the input image //in// (BxChxHxW tensor) after smoothing with a Gaussian of variance $\sigma^2$, as a BxChx2xHxW tensor. Use either the first derivative of the Gaussian, or smoothing followed by symmetric differences, to estimate the gradient.
  * Implement a function ''spatial_gradient_second_order(in,sigma)'' that returns all second derivatives of the input image //in// (BxChxHxW tensor) after smoothing with a Gaussian of variance $\sigma^2$, as a BxChx3xHxW tensor. **Make sure that it outputs zeros for constant inputs.**
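Below is a minimal sketch of the per-channel filtering idea behind ''filter2d'': the same 2D kernel is repeated for every channel and applied with a grouped convolution, so channels are not mixed. The shapes, the box kernel and the padding strategy are assumptions for illustration, not the required implementation:

<code python>
import torch
import torch.nn.functional as F

b, ch, h, w = 2, 3, 32, 32
x = torch.rand(b, ch, h, w)                      # example input tensor
kernel = torch.ones(3, 3) / 9.                   # example 3x3 box kernel

weight = kernel[None, None].repeat(ch, 1, 1, 1)  # shape [Ch, 1, kH, kW]: one copy per channel
x_pad = F.pad(x, (1, 1, 1, 1), mode='replicate') # pad by kernel_size // 2 on each side
y = F.conv2d(x_pad, weight, groups=ch)           # groups=Ch keeps the channels independent
print(y.shape)                                   # torch.Size([2, 3, 32, 32])
</code>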
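And a minimal sketch of the separability trick used in ''gaussian_filter2d'': smoothing with a 2D Gaussian is equivalent to one horizontal and one vertical convolution with a 1D Gaussian kernel. The kernel normalization and padding choices below are assumptions, not part of the specification:

<code python>
import math
import torch
import torch.nn.functional as F

sigma = 3.0
ksize = 2 * math.ceil(sigma * 3.0) + 1
x = torch.arange(ksize, dtype=torch.float) - ksize // 2
kernel = torch.exp(-x ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
kernel = kernel / kernel.sum()                    # normalize so that a constant image stays constant

inp = torch.zeros(1, 1, 32, 32)
inp[..., 15, 15] = 1.                             # impulse image

pad = ksize // 2
out = F.pad(inp, (pad, pad, pad, pad), mode='replicate')
out = F.conv2d(out, kernel.view(1, 1, 1, -1))     # horizontal 1D pass
out = F.conv2d(out, kernel.view(1, 1, -1, 1))     # vertical 1D pass
print(out.shape)                                  # torch.Size([1, 1, 32, 32])
</code>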
==== References ====

[[https://staff.fnwi.uva.nl/r.vandenboomgaard/IPCV20172018/LectureNotes/IP/LocalStructure/GaussianDerivatives.html|Gaussian derivatives]]

===== Geometric Transformations and Interpolation of the Image =====

  * Implement a function ''affine(x1_y1,x2_y2,x3_y3)'' that returns the 3x3 transformation matrix A which maps points in homogeneous coordinates from the canonical coordinate system into the image: (0,0,1)->(x1,y1,1), (1,0,1)->(x2,y2,1), (0,1,1)->(x3,y3,1) (a small numeric sketch is given at the end of this page). \\ {{ https://cw.fel.cvut.cz/old/_media/courses/a4m33mpv/cviceni/1_uvod/affine.png?500 }} \\
  * Write a function ''extract_affine_patches(in,A,ps,ext)'' that warps a patch from the image //in// (BxChxMxN tensor) into the canonical coordinate system. The affine transformation matrix //A// (3x3 elements) is the transformation from the canonical coordinate system into the image, as in the previous task. The parameter //ps// defines the size of the output patch (the length of each side) and //ext// is a real number that defines the extent of the patch in the coordinates of the canonical coordinate system. E.g. ''extract_affine_patches(in,A,41,3.0)'' returns a patch of size 1xChx41x41 pixels that corresponds to the rectangle (-3.0,-3.0)x(3.0,3.0) in the canonical coordinate system. The top left corner of the image has coordinates (0,0). Use bilinear interpolation for the image warping. Check the functionality on this {{ :courses:mpv:labs:1_intro:img1.png?linkonly |image}}. \\ {{ https://cw.fel.cvut.cz/old/_media/courses/a4m33mpv/cviceni/1_uvod/affinetr.png?600 }} \\

==== References ====

[[http://people.ciirc.cvut.cz/~hlavac/TeachPresEn/11ImageProc/18BrightGeomTxEn.pdf|Geometric transformations - review of course Digital image processing]]\\
[[http://www.cs.princeton.edu/courses/archive/fall00/cs426/lectures/transform/transform.pdf|Geometric transformations - hierarchy of transformations, homogeneous coordinates]]

===== Checking Your Results =====

You can check the results of the functions required in this lab using the Jupyter notebook [[https://gitlab.fel.cvut.cz/mishkdmy/mpv-python-assignment-templates/blob/master/assignment_0_3_correspondences_template/imagefiltering.ipynb|imagefiltering.ipynb]].
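Finally, a small numeric sketch for the ''affine'' task from the geometric section above: one standard way to build such a matrix is to place the mapped basis vectors and the mapped origin directly into its columns. The function name ''affine_sketch'' and its argument format are only illustrative; the template function may expect its inputs in a different form:

<code python>
import torch

def affine_sketch(x1_y1, x2_y2, x3_y3):
    """3x3 matrix mapping (0,0,1)->(x1,y1,1), (1,0,1)->(x2,y2,1), (0,1,1)->(x3,y3,1)."""
    (x1, y1), (x2, y2), (x3, y3) = x1_y1, x2_y2, x3_y3
    return torch.tensor([[x2 - x1, x3 - x1, x1],
                         [y2 - y1, y3 - y1, y1],
                         [     0.,      0., 1.]])

A = affine_sketch((100., 50.), (160., 50.), (100., 110.))
print(A @ torch.tensor([0., 0., 1.]))  # tensor([100.,  50.,   1.])
print(A @ torch.tensor([1., 0., 1.]))  # tensor([160.,  50.,   1.])
print(A @ torch.tensor([0., 1., 1.]))  # tensor([100., 110.,   1.])
</code>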