The laboratory machines run Debian with all the tools required for every task already installed. To boot the correct system, select “DCE PXE Menu → DCE Linux (first item)” in the boot menu. Log in with your CTU username and KOS password. Your home directory can also be accessed remotely via ssh: “ssh username@postel.felk.cvut.cz”.
Imagine that you are a developer in a company that develops autonomous driving assistance systems. You are given an algorithm that does not run as fast as your manager wants. The algorithm finds ellipses in a given picture and will be used to detect the wheels of a neighboring car while parking. Your task is to speed up this algorithm so that it runs smoothly.
You will probably need to take the following steps to achieve the desired speedup:
git fetch origin && git diff origin/master > ellipse.diff.txt
You are not allowed to modify the number of iterations or other parameters of the RANSAC algorithm!
Program requirements – if you want to compile the program on your own machine, you will need the OpenCV library (libopencv-dev package) and the Boost library (libboost-all-dev package). If you don't want to install new libraries on your machine, you can connect to our server via ssh (ssh user@postel.felk.cvut.cz) and work remotely.
How can we evaluate the efficiency of our implementation? Run time gives a simple overview of the program. Much more useful for finding the hot spots of a program, however, is information such as the number of executed instructions, cache misses, or memory references attributed to individual lines of code.
The easiest form of program analysis is time measurement, which can be done with the C time library. More precise values can be obtained with high_resolution_clock from the chrono library (C++11) or with the Linux function clock_gettime (man clock_gettime).
http://www.cplusplus.com/reference/ctime/
http://www.cplusplus.com/reference/chrono/
http://man7.org/linux/man-pages/man2/clock_gettime.2.html
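The two timing approaches mentioned above can be sketched as follows. This is a minimal illustration, not part of the assignment code: `work` is a hypothetical stand-in for the code being measured, and the helper names are made up for this example. The sketch uses `std::chrono::steady_clock` rather than `high_resolution_clock`, on the assumption that a clock guaranteed to be monotonic is preferable for measuring intervals.

```cpp
#include <chrono>
#include <cstdint>
#include <ctime>  // clock_gettime, CLOCK_MONOTONIC (POSIX)

// Hypothetical stand-in for the code under test (e.g. one ellipse search).
std::uint64_t work(std::uint64_t n) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) sum += i * i;
    return sum;
}

// Wall-clock time of one call to work(), in milliseconds, using C++11 <chrono>.
double time_with_chrono(std::uint64_t n, std::uint64_t &result) {
    auto start = std::chrono::steady_clock::now();
    result = work(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// The same measurement using the Linux/POSIX function clock_gettime.
double time_with_clock_gettime(std::uint64_t n, std::uint64_t &result) {
    timespec start{}, stop{};
    clock_gettime(CLOCK_MONOTONIC, &start);
    result = work(n);
    clock_gettime(CLOCK_MONOTONIC, &stop);
    return (stop.tv_sec - start.tv_sec) * 1e3 +
           (stop.tv_nsec - start.tv_nsec) / 1e6;
}
```

Call either helper from `main` and print the returned value. Storing the result of `work` in a variable that is used afterwards helps keep the compiler from removing the measured code at higher optimization levels. A single measurement is noisy, so repeating it several times and comparing the values is usually worthwhile.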
GProf is a GNU profiling tool (used together with GCC), which is based on statistical sampling (every 1 ms or 10 ms). It collects the time spent in each function and constructs a call graph. The program has to be compiled with a particular option (-pg), and all libraries that you want to profile have to be linked statically. Running the program then generates the profiling information. Note that the resulting data are not exact. Shared library profiling can be done with sprof (man sprof).
https://sourceware.org/binutils/docs/gprof/
http://man7.org/linux/man-pages/man1/sprof.1.html
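To try the gprof workflow on something small before profiling the ellipse finder, a toy workload with one deliberately cheap and one deliberately expensive function may help. The sketch below is only an illustration; the function names and the file name `demo.cpp` are hypothetical. The commands in the comments assume that `run_workload` is called from a `main` function in the same file.

```cpp
// Assumed workflow (file and binary names are placeholders):
//   g++ -pg -g -O0 demo.cpp -o demo   # -pg adds the profiling instrumentation
//   ./demo                            # writes gmon.out into the current directory
//   gprof ./demo gmon.out > profile.txt
#include <cstddef>
#include <cstdint>
#include <vector>

// Deliberately cheap: a single pass over the data, O(n).
std::uint64_t sum_once(const std::vector<std::uint32_t> &v) {
    std::uint64_t s = 0;
    for (std::uint32_t x : v) s += x;
    return s;
}

// Deliberately expensive: counts ordered pairs (i, j) with v[i] < v[j], O(n^2).
std::uint64_t count_ascending_pairs(const std::vector<std::uint32_t> &v) {
    std::uint64_t c = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j)
            if (v[i] < v[j]) ++c;
    return c;
}

// Workload to call from main(); fills v with 0..n-1 and runs both functions.
std::uint64_t run_workload(std::size_t n) {
    std::vector<std::uint32_t> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<std::uint32_t>(i);
    return sum_once(v) + count_ascending_pairs(v);
}
```

With a sufficiently large `n` (tens of thousands of elements), the flat profile should attribute nearly all samples to `count_ascending_pairs`. Because gprof samples at intervals of milliseconds, very short runs may produce an almost empty profile.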
Cachegrind is part of the Valgrind simulation tool. It uses processor emulation to run the binary and records all executed instructions and memory accesses together with their relationship to source lines and functions in the program. The program may be linked against shared libraries and does not need to be recompiled to be simulated. However, you will probably want to compile with debugging info (-g option) so that events are matched to source lines correctly. In any case, simulation usually takes about 50 times longer than running on real hardware. Profiling data generated by Cachegrind and gprof can be visualized simply by opening the log file in KCachegrind.
http://valgrind.org/docs/manual/cg-manual.html
https://kcachegrind.github.io/
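A classic case where Cachegrind's per-line cache statistics are informative is the traversal order of a matrix. The sketch below is a hypothetical example, not taken from the assignment: both functions compute the same sum over a row-major matrix, but one walks memory sequentially and the other jumps by a whole row at each step. The commands in the comments assume the functions are called from a `main` in a file named `demo.cpp`.

```cpp
// Assumed workflow (file and binary names are placeholders):
//   g++ -g -O0 demo.cpp -o demo
//   valgrind --tool=cachegrind --cache-sim=yes ./demo
//   cg_annotate cachegrind.out.<pid>      # or open the file in KCachegrind
#include <cstddef>
#include <cstdint>
#include <vector>

// Sums a rows x cols matrix stored row-major, visiting memory sequentially.
std::uint64_t sum_row_major(const std::vector<std::uint32_t> &m,
                            std::size_t rows, std::size_t cols) {
    std::uint64_t s = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            s += m[r * cols + c];
    return s;
}

// Same sum, but column by column: consecutive accesses are cols elements apart.
std::uint64_t sum_col_major(const std::vector<std::uint32_t> &m,
                            std::size_t rows, std::size_t cols) {
    std::uint64_t s = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            s += m[r * cols + c];
    return s;
}
```

For a matrix much larger than the simulated caches, the annotated output should show a noticeably higher data read miss count on the inner line of `sum_col_major` than on the corresponding line of `sum_row_major`. The `--cache-sim=yes` flag requests the cache simulation explicitly, since newer Valgrind releases do not enable it by default.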
If you are also interested in the call relationships between functions and the exact event counts spent in calls, you can use Callgrind, which extends Cachegrind with this functionality.
Most modern processors contain performance counters, which can count various hardware events (clock cycles, executed instructions, cache reads/hits/misses, etc.). Linux perf can analyze a program using these counters.
https://perf.wiki.kernel.org/index.php/Main_Page
Moreover, you can use any of the hardware events listed in the reference manual for your processor. For Intel processors – Intel® 64 and IA-32 architectures software developer’s manual: Volume 3B (Chapter 18 and 19) – available from https://software.intel.com/en-us/articles/intel-sdm.
If you get rubbish in perf report, try specifying the event in perf record (example: perf record -e cycles ./program). A call graph is also useful (perf record --call-graph dwarf -e cycles ./program).
A few examples of perf usage: http://www.brendangregg.com/perf.html
By default, using performance counters without sudo rights is not allowed. You can enable access to the performance counters (PMC) for non-sudo users by executing this command:
sudo sh -c 'echo 1 >/proc/sys/kernel/perf_event_paranoid'
Hotspot, the Linux perf GUI for performance analysis, offers a UI around Linux perf. You can download an AppImage from https://github.com/KDAB/hotspot/releases (don't forget to make the file executable with chmod +x file) or build it yourself (https://github.com/KDAB/hotspot).
If you are interested in performance counters, you can use them directly from your C/C++ program (without any external profiling tool). See the perf_event_open manual page, or use a helper library built on the kernel API (libpfm, PAPI toolkit, etc.).
http://man7.org/linux/man-pages/man2/perf_event_open.2.html
http://perfmon2.sourceforge.net
http://icl.cs.utk.edu/papi/index.html
We have no experience with Windows tools; however, a few free tools exist, for example those listed on this page:
https://wiki.qt.io/Profiling_and_Memory_Checking_Tools
MS Visual Studio has a profiler:
https://msdn.microsoft.com/en-us/library/mt210448.aspx
Other Windows profilers:
https://sourceforge.net/projects/lukestackwalker/
http://www.codersnotes.com/sleepy/
Also, Windows alternative to KCacheGrind – QCacheGrind:
https://sourceforge.net/projects/qcachegrindwin/
If you do not want to install any of these tools, you can work remotely on our server via ssh. PuTTY and Xming are your friends.
There are many ways to optimize your programs.
Several tips for optimization: https://people.cs.clemson.edu/~dhouse/courses/405/papers/optimize.pdf
Sample CMakeLists.txt for compilation in various IDEs:
cmake_minimum_required(VERSION 2.8)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -O0 -Wall")
project(find_ellipse)
find_package(OpenCV REQUIRED)
find_package(Boost 1.60 COMPONENTS filesystem REQUIRED)
include_directories(${Boost_INCLUDE_DIR})
aux_source_directory(. SRC_LIST)
add_executable(${PROJECT_NAME} ${SRC_LIST})
target_link_libraries(${PROJECT_NAME} ${OpenCV_LIBS})
target_link_libraries(${PROJECT_NAME} ${Boost_LIBRARIES})
g++ -g -O0 -Wall -std=c++11 -I/usr/include/opencv *.cpp -o find_ellipse -lboost_filesystem -lboost_system -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core