Read Copy Update
Task assignment
Imagine that you are developing a multi-threaded application which proceeds concurrently many read and write requests on a shared data structure. Your goal is to speed up the application and achieve processing of 50k write operations per second and as much read operations as possible.
Steps:
Download a naive implementation with mutexes from git repository:
git clone https://gitlab.fel.cvut.cz/matejjoe/list_mutex.git
Compile the program (simply run make
in list_mutex directory)
Run the program (./list_mutex <number or reader threads>
) and see how slow the program is
The program creates one write thread and specified number of reader threads and prints achieved read and write operations per second
Since a priority of the mutex cannot be set, it may happen that writes or reads are not performed at all
Replace the mutexes with rwlocks and measure improvement
Replace list with liburcu (userspace RCU library) list and measure improvement
For rwlocks and librcu, measure reads with various number of reader threads and plot the results into a graph (for example from one to twenty readers) – use sufficient hardware (at least 8 cores – total number of threads shouldn't be larger than physical cores) – you can use some of remote servers such as ritchie.ciirc.cvut.cz (CTU login with KOS password). It is sufficient to measure just average values from let say 10 iterations.
Commit the graph and code changes to your tutor (then into upload system). The graph should look roughly like this:
pthread rwlocks
The rwlock is part of pthread library. You could be interested in:
pthread_rwlockattr_setkind_np() – reader/writer priority – use PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP and don't forget to init atribute before (pthread_rwlockattr_init)
This part should be quite straightforward. In short, replace all pthread_mutex_lock/unlock, and update initialization. Variables (also structures) in C created on the stack (e.g. local variables) are initialized by random values, therefore, make sure that you initialize any created structure before use.
Userspace RCU
The liburcu should work on Linux, Cygwin and MacOS X. You can obtain liburcu from distribution repository or compile yourself. I would recommend you compiling the library because libraries in repositories are usually a bit old. In order to compile and use library do following:
Download and compile library
wget --no-check-certificate http://www.lttng.org/files/urcu/userspace-rcu-latest-0.10.tar.bz2
tar -xjf userspace-rcu-latest-0.10.tar.bz2
./configure --prefix=/home/me/somefolder/mybuild/output/target
replace the prefix by something meaningful on your system – this is a place where you will install the library and link with the application afterwards
In ./doc/examples/list/ are useful examples
Add library into Makefile
add -lurcu
or other rcu flavour into $LIBS
(before -lpthread
)
add path to include directory -I/home/me/somefolder/mybuild/output/target/include
into $INCLUDES
add path to library directory -L/home/me/somefolder/mybuild/output/target/lib
into $LFLAGS
Which steps should I do?
Get liburcu library
Look into examples folder (doc/examples/list/) to see how can you use the library
Integrate it into the application (you can do it directly in main or in list implementation – I recommend you the latter)
Update esw_node structure – add two elements (struct cds_list_head and struct rcu_head required by rcu implementation)
Replace declaration of list (use CDS_LIST_HEAD macro)
Implement required functions (take inspiration from examples) – you don't have to implement not used operations
Don't forget to put requred library calls – depending on selected flavour, you have to insert some of these: rcu_read_(un)lock, synchronize_rcu, rcu_(un)register_thread, rcu_quiescent_state, (sys_membarrier)
Measure speed-up
Supplementary materials