Working on your own machine is probably the most comfortable option right now. You only need to install a few required tools from the repository (perf, hotspot, libopencv, libboost). Mac users can complete this task as well: instead of perf and hotspot, you can use Instruments or CLion for profiling.
Usually, all required tools can be accessed from computers in the lab. This year, to support distance learning, we installed the required software on our server, and there are two ways to work remotely with GUI applications:
Xpra is an open-source multi-platform persistent remote display server and client for forwarding applications and desktop screens. It gives you remote access to individual applications or full desktops. It is available for Windows / Mac OS X / Linux.
First, run xpra_launcher from the command line:
xpra_launcher
Mode: SSH
Server: <username>@ritchie.ciirc.cvut.cz:22
Server Password: <CVUT password>, or empty if you copy your public SSH key to the server first
Then connect to the server. You can save and load your configuration.
After a successful login, click the settings icon in the bottom right corner and then click Move. Move the window so that it is visible and press Default configuration. This should create an applications panel for you.
If our server is overloaded, we will provide remote access to lab computers; however, that is a bit more complicated.
All tools required for all tasks are installed in the Debian system available in the laboratory. To boot the correct system, select “DCE PXE Menu → DCE Linux (first item)” in the boot menu. Log in using your CTU username and KOS password. Your home directory can also be accessed remotely via ssh: “ssh username@postel.felk.cvut.cz”.
Imagine that you are a developer in a company which develops autonomous driving assistance systems. You are given an algorithm which doesn't run as fast as your manager wants. The algorithm finds ellipses in a given picture and will be used to detect the wheels of a neighbouring car while parking. Your task is to speed up this algorithm so that it runs smoothly.
You will probably need to take the following steps to achieve the desired speed-up:
git apply
git diff
git fetch origin && git diff origin/master > ellipse.diff.txt
You are not allowed to modify the number of iterations and other parameters of the RANSAC algorithm!
Program requirements – if you want to compile the program on your own machine, you will need the OpenCV library (libopencv-dev package) and the Boost library (libboost-all-dev package). If you don't want to install new libraries on your machine, you can connect to our server via ssh (ssh user@ritchie.ciirc.cvut.cz) and work remotely.
How can we evaluate the efficiency of our implementation? Run time gives a simple overview of the program. However, to find the hot spots of a program, more detailed information is much more useful, such as the number of executed instructions, cache misses, or memory references attributed to individual lines of code.
The easiest way to analyze program performance is to measure its execution time. There are several ways to do this in a C/C++ program:
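One common option, assuming a C++11 compiler, is std::chrono::steady_clock. The sketch below is only an illustration; the helper name measure_ms is hypothetical and not part of the assignment code.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: returns the wall-clock duration of one call to work(),
// in milliseconds. steady_clock is monotonic, so the result is never negative.
double measure_ms(const std::function<void()>& work) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    work();
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

You would wrap the code under test in a lambda, for example measure_ms([&] { /* call the ellipse detection here */ });. A single measurement is noisy, so it is usually worth repeating the call several times and comparing the results.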
GProf is a GNU profiling tool based on statistical sampling (every 1 ms or 10 ms). It collects the time spent in each function and constructs a call graph. The program must be compiled with a specific option (-pg for GCC), and all libraries you want to profile must be statically linked. When you then run the program, profiling information is generated. Note that the resulting data is not exact. Shared libraries can be profiled with sprof.
Cachegrind is part of the Valgrind simulation tool. It uses processor emulation to run the binary and records all executed instructions and memory accesses, together with their relationship to source lines and functions in the program. The program can be linked against shared libraries and doesn't need to be recompiled to be simulated. However, you probably want to compile with debugging info (-g option) so that events are matched correctly to source code lines. In any case, simulation usually takes about 50 times longer than running on real hardware. Profiling data generated by Cachegrind and gprof can be visualised simply by opening the log file in KCachegrind.
http://valgrind.org/docs/manual/cg-manual.html
https://kcachegrind.github.io/
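To give an idea of what such a per-line report can expose, consider the small sketch below. The function names are hypothetical and are not taken from the assignment code. Both functions compute the same sum over a matrix stored in row-major order, but the second one walks it with a stride of n elements. For matrices larger than the cache, this typically shows up as many more data-cache misses on the inner-loop line.

```cpp
#include <cstddef>
#include <vector>

// Sums an n x n row-major matrix row by row: consecutive memory addresses.
long long sum_row_major(const std::vector<int>& m, std::size_t n) {
    long long sum = 0;
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t c = 0; c < n; ++c)
            sum += m[r * n + c];
    return sum;
}

// Same result, but column by column: each access jumps n elements ahead.
long long sum_col_major(const std::vector<int>& m, std::size_t n) {
    long long sum = 0;
    for (std::size_t c = 0; c < n; ++c)
        for (std::size_t r = 0; r < n; ++r)
            sum += m[r * n + c];
    return sum;
}
```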
If you are also interested in caller/callee relationships and the exact event counts spent in function calls, you can use Callgrind, which extends Cachegrind with this functionality.
Most modern processors have performance counters that can count various hardware events (clock cycles, instructions executed, cache reads/hits/misses, etc.). Linux perf is able to analyze a program based on these counters.
You can also use any hardware events listed in the appropriate reference manual. For Intel processors, see the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3B (Chapters 18 and 19), available at https://software.intel.com/en-us/articles/intel-sdm.
If you get rubbish in your perf report, try specifying the event in the perf record (example: perf record -e cycles ./program). A call graph is also useful: perf record --call-graph dwarf -e cycles ./program.
A few examples of perf usage: http://www.brendangregg.com/perf.html
By default, using performance counters is not allowed without sudo privileges. You can enable access to the performance counters for users without sudo privileges by running this command:
echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid
The perf tool has a GUI called Hotspot that makes it easier to run the recording and to analyze and visualize the data. You can run it from an AppImage package (don't forget to make the file executable with chmod +x file) or build it yourself.
If you are interested in performance counters, you can use them directly from your C/C++ program (without any external profiling tool). See the perf_event_open manual page, or use a helper library built on the kernel API (libpfm, PAPI toolkit, etc.):
http://man7.org/linux/man-pages/man2/perf_event_open.2.html
http://perfmon2.sourceforge.net
http://icl.cs.utk.edu/papi/index.html
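As a rough sketch of the direct approach, the following counts user-space instructions around a piece of code using the perf_event_open system call, following the pattern from its manual page. The helper names sys_perf_event_open and count_instructions are illustrative assumptions, not part of any library. The snippet is Linux-only and simply reports failure when the counter is unavailable (for example because of the perf_event_paranoid setting, or inside a virtual machine).

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>
#include <functional>

// glibc provides no wrapper for this system call, so we invoke it directly.
static long sys_perf_event_open(perf_event_attr* attr, pid_t pid, int cpu,
                                int group_fd, unsigned long flags) {
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

// Hypothetical helper: counts user-space instructions executed by work().
// Returns false (leaving `count` untouched) if the counter cannot be used.
bool count_instructions(const std::function<void()>& work, std::uint64_t& count) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        // start stopped, enable explicitly below
    attr.exclude_kernel = 1;  // count user space only
    attr.exclude_hv = 1;

    // pid = 0, cpu = -1: measure the calling thread on any CPU.
    const int fd = static_cast<int>(sys_perf_event_open(&attr, 0, -1, -1, 0));
    if (fd == -1) return false;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    std::uint64_t value = 0;
    const bool ok = read(fd, &value, sizeof(value)) ==
                    static_cast<ssize_t>(sizeof(value));
    close(fd);
    if (ok) count = value;
    return ok;
}
```

Other events (cycles, cache misses, and so on) can be selected by changing attr.config to another PERF_COUNT_HW_* constant.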
We have no experience with Windows tools; however, a few free tools exist, for example those listed on this page:
https://wiki.qt.io/Profiling_and_Memory_Checking_Tools
MS Visual Studio has a profiler:
https://msdn.microsoft.com/en-us/library/mt210448.aspx
Other Windows profilers:
https://sourceforge.net/projects/lukestackwalker/
http://www.codersnotes.com/sleepy/
There is also a Windows alternative to KCachegrind, QCacheGrind:
https://sourceforge.net/projects/qcachegrindwin/
If you do not want to install any of these tools, you can work remotely on our server via ssh. PuTTY and Xming are your friends.
There are many ways to optimize your programs; you can
Several optimization tips: https://people.cs.clemson.edu/~dhouse/courses/405/papers/optimize.pdf
Sample CMakeLists.txt for compilation in various IDEs:
cmake_minimum_required(VERSION 2.8)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -O0 -Wall")
project(find_ellipse)
find_package(OpenCV REQUIRED)
find_package(Boost 1.60 COMPONENTS filesystem REQUIRED)
include_directories(${Boost_INCLUDE_DIR})
aux_source_directory(. SRC_LIST)
add_executable(${PROJECT_NAME} ${SRC_LIST})
target_link_libraries(${PROJECT_NAME} ${OpenCV_LIBS})
target_link_libraries(${PROJECT_NAME} ${Boost_LIBRARIES})
g++ -g -O0 -Wall -std=c++11 -I/usr/include/opencv *.cpp -o find_ellipse -lboost_filesystem -lboost_system -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core