Profiling in Java

Task assignment

Your task is to speed up a variant of Multi-Objective A* solving Bi-Objective Shortest Path Problem.

The main difference from classical single-objective A* is that instead of maintaining of a single search state per node, the algorithm must maintain a whole pareto-bag of non-dominated search states.

You probably need to do the following steps to run the algorithm:

Download the program from git repository: git clone https://gitlab.fel.cvut.cz/B192_B4M36ESW/multiobjective_shortest_path.git
Open the source code in IDE
Run the program (cz.cvut.fel.esw.shortestpath.Main class)
Do profiling (hints below)
Make changes which will improve the speed (code modifications)
Upload patch file into upload system and pass specified time limit: Patch will be applied using git apply command, therefore, best way to generate patch is the git diff command:
```
git fetch origin && git diff origin/master > mosp.diff.txt
```
(.txt file type because of the upload system restrictions).

Prerequisities

The automatic evaluation runs on Java 11

Profiling tools

IDE

Most of the IDEs now integrates some of the profilers (e.g. IntelliJ IDEA Ultimate from version 2019.3). It is the easiest way to get insight of what is happening during the execution; however, other more specialized tools typically provides more information.

Java VisualVM

The Java VisualVM is a baseline for profiling Java applications. It allows you to use both sampling and instrumentation approaches to profile the application. It allows to measure time spent in functions and see statistics about objects on the heap. It can also generate and analyse heap dumps, monitor garbage collector (and invoke it) and monitor threads.

https://visualvm.github.io/documentation.html

Java Mission Control with Java Flight Recorder

Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data about a running Java application. It is integrated into the Java Virtual Machine (JVM) and causes almost no performance overhead, so it can be used even in heavily loaded production environments. The JFR can record detailed information about various aspects of the application - e.g. object statistics, compilation, detailed GC info, I/O operations, exceptions,etc.

Java Mission Control (JMC) is the usual tool to examine JFR recordings. It can also start and configure JFR recordings.

Flight Recording can be started automatically with VM arguments:

  -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -XX:StartFlightRecording=duration=10s,filename=myrecording.jfr

https://www.oracle.com/technetwork/java/javaseproducts/downloads/jmc7-downloads-5868868.html or https://adoptopenjdk.net/jmc.html

https://docs.oracle.com/javacomponents/index.html

async-profiler

Async-profiler is a low overhead profiling tool for Linux/macOS that uses perf. It has been recently integrated to Linux/macOS version of IntelliJ IDEA Ultimate.

https://github.com/jvm-profiling-tools/async-profiler

JConsole

JConsole is a tool with a basic monitoring functionality - heap size, CPU usage, thread monitor, etc. But it also allows using JMX instrumentation in order to see JVM parameters and in some cases it also allows you to change the parameters.

https://docs.oracle.com/en/java/javase/11/management/using-jconsole.html

There are also some command line tools (jcmd, jhat, jinfo,…) but most of their functionality is covered by the tools described above.

Tips

Try to find the code that is executed a lot and look what look suspicous. Avoid unnecessary object creation e.g. boxing and unboxing of primitive variables (int vs. Integer). Profile also the memory and look what objects are created the most (maybe even analyze heap dump). Is it really necessary to create all those objects?

Try to use various tools and methods, sampling vs instrumentation (in visualvm called profiling). Since their implementation and methodology differ they suffer from different profiling errors; therefore, they could provide different insights into the program.

You can also leverage knowledge about data that are handled by some data structures. You should ask following questions if profiling shows you that some data structure could be the bottleneck: What is the purpose of this structure? Do I know anything about the data that this data structure handles? Can I use different, faster (e.g. less general), data structure that leverages the knowledge about the data?

Table of Contents