Warning
This page is located in archive. Go to the latest version of this course pages.

Read Copy Update

Task assignment

Imagine that you are developing a multi-threaded application which proceeds concurrently many read and write requests on a shared data structure. Your goal is to speed up the application and achieve processing of 50k write operations per second and as much read operations as possible.

Steps:

  1. Download a naive implementation with mutexes from git repository:
    git clone https://gitlab.fel.cvut.cz/matejjoe/list_mutex.git
  2. Compile the program (simply run make in list_mutex directory)
  3. Run the program (./list_mutex <number or reader threads>) and see how slow the program is
    • The program creates one write thread and specified number of reader threads and prints achieved read and write operations per second
    • Since a priority of the mutex cannot be set, it may happen that writes or reads are not performed at all
  4. Replace the mutexes with rwlocks and measure improvement
    • Use lock priority, to guarantee that given amount of writes will be performed
  5. Replace list with liburcu (userspace RCU library) list and measure improvement
    • See hints below
  6. For rwlocks and librcu, measure reads with various number of reader threads and plot the results into a graph (for example from one to twenty readers) – use sufficient hardware (at least 8 cores – total number of threads shouldn't be larger than physical cores) – you can use some of remote servers such as ritchie.ciirc.cvut.cz (CTU login with KOS password). It is sufficient to measure just average values from let say 10 iterations.
  7. Commit the graph and code changes to your tutor (then into upload system). The graph should look roughly like this:

 sample graph

pthread rwlocks

The rwlock is part of pthread library. You could be interested in:

This part should be quite straightforward. In short, replace all pthread_mutex_lock/unlock, and update initialization. Variables (also structures) in C created on the stack (e.g. local variables) are initialized by random values, therefore, make sure that you initialize any created structure before use.

Userspace RCU

The liburcu should work on Linux, Cygwin and MacOS X. You can obtain liburcu from distribution repository or compile yourself. I would recommend you compiling the library because libraries in repositories are usually a bit old. In order to compile and use library do following:

  • Download and compile library
    • wget --no-check-certificate http://www.lttng.org/files/urcu/userspace-rcu-latest-0.10.tar.bz2
    • tar -xjf userspace-rcu-latest-0.10.tar.bz2
    • cd userspace-rcu-0.10.2/
    • ./configure --prefix=/home/me/somefolder/mybuild/output/target
      replace the prefix by something meaningful on your system – this is a place where you will install the library and link with the application afterwards
    • make -j8
    • make install
  • In ./doc/examples/list/ are useful examples
  • Add library into Makefile
    • add -lurcu or other rcu flavour into $LIBS (before -lpthread)
    • add path to include directory -I/home/me/somefolder/mybuild/output/target/include into $INCLUDES
    • add path to library directory -L/home/me/somefolder/mybuild/output/target/lib into $LFLAGS

Which steps should I do?

  1. Get liburcu library
  2. Look into examples folder (doc/examples/list/) to see how can you use the library
  3. Chose user-space rcu flavor (check table with requirements and quick start guide)
  4. Integrate it into the application (you can do it directly in main or in list implementation – I recommend you the latter)
    1. Update esw_node structure – add two elements (struct cds_list_head and struct rcu_head required by rcu implementation)
    2. Replace declaration of list (use CDS_LIST_HEAD macro)
    3. Implement required functions (take inspiration from examples) – you don't have to implement not used operations
    4. Don't forget to put requred library calls – depending on selected flavour, you have to insert some of these: rcu_read_(un)lock, synchronize_rcu, rcu_(un)register_thread, rcu_quiescent_state, (sys_membarrier)
  5. Measure speed-up

Supplementary materials

courses/b4m36esw/labs/lab04.txt · Last modified: 2019/03/29 15:50 by matejjoe