====== Bonus Tasks for Simulator ======

The tasks are submitted through the personal GIT repositories. Code is committed into the appropriate directory and pushed to the APO course personal GitLab repository, see the [[..:..:documentation:githowto:start|how-to for the work with GIT]]. Code can be tested in the native or the online [[https://comparch.edu.cvut.cz/|QtRvSim]] graphical version.

Follow the course forum for information on when each of the following tasks opens and what its deadline is. You can ask for help on the forum as well.

===== Bubble-sort and Possibly Other Assembler Tasks Submissions =====

Automated evaluation is used to ensure timely feedback to students.

The first task is designed to familiarize you with the basic principles of processor instruction execution and to practice their use (see [[..:..:tutorials:03:start|Tutorial 3]]). The goal of the first task is to implement the [[https://en.wikipedia.org/wiki/Bubble_sort|bubble-sort]] algorithm in the base RISC-V 32-bit instruction set. A file at the location ''work/bubble-sort/bubble-sort.S'' in the personal repository is expected. The implementation must export the ''array_size'' and ''array_start'' symbols with the ''.globl'' directive, and no data or code may be placed in the 200 bytes following the ''array_start'' symbol, so that overwriting this region does not adversely affect the running of the program. The test program loads the test data set into memory starting from the ''array_start'' address and stores the number of elements into the word at address ''array_size''. The program is then run, and when it stops at the ''ebreak'' instruction, the processed data are retrieved from the ''array_start'' address.

The template for the project and its evaluation is provided in the repository [[https://gitlab.fel.cvut.cz/b35apo/stud-support|stud-support]].
See the bubble-sort project template in the directory ([[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/tree/master/seminaries%2Fqtrvsim%2Fbuble-sort|seminaries/qtrvsim/buble-sort]]). It also contains the template source file ''[[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/blob/master/seminaries/qtrvsim/buble-sort/bubble-sort-template.S|work/bubble-sort/bubble-sort.S]]''. Rename the file and add your implementation to it. Copy the file, and possibly other files, under the correct name to the ''work/bubble-sort'' directory of your repository.

All development can be done using the integrated assembler. If you want to test the same way as the automated evaluation does, you need an external riscv32-unknown-elf compiler (by default, it is provided by the riscv64-unknown-elf package when the 32-bit ABI ''-mabi=ilp32'' and ISA ''-march=rv32i'' are selected). With the integrated assembler in the command-line mode, it is not yet possible to refer to the correct addresses by symbols when uploading the data. If you specify the addresses as absolute values, it is also possible to test with the internal assembler

  qtrvsim_cli --dump-cycles --asm bubble-sort.S --load-range 0x12340,array_size.in --load-range 0x12344,array_data.in --dump-range 0x12344,60,array_data.out

Every two minutes, the evaluation automation checks the list of notifications delivered by hooks whenever an individual personal course repository is updated. The code is fetched from each updated repository and evaluated in the same way as by the provided ''Makefile'', but with different data. The result and a possible error specification from each evaluation are then stored in a file whose name is derived from the personal repository login name and published at [[http://pisa-virt.felk.cvut.cz/apo/bubble-sort-ci/|http://pisa-virt.felk.cvut.cz/apo/bubble-sort-ci/]]. The summary rank file can be used to compare the efficiency of the submitted solutions.
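
The memory contract described above can be prototyped in host C before the algorithm is written in assembly. The following is only a sketch under stated assumptions, not the course template: the C variables ''array_size'' and ''array_start'' stand in for the exported assembler labels, ''run_sort'' is a hypothetical name for the code that would run up to ''ebreak'', and signed comparison with ascending order is assumed (check the template for the actual requirement).

```c
#include <stdint.h>

/* Host-side stand-ins for the symbols exported from bubble-sort.S.
   In the assignment these are assembler labels, not C variables. */
uint32_t array_size;        /* number of elements, written by the tester */
int32_t  array_start[50];   /* 50 words = the 200 bytes reserved for test data */

/* Plain bubble sort: after each outer pass the largest remaining
   element has moved to position end - 1. */
void bubble_sort(int32_t *a, uint32_t n)
{
    for (uint32_t end = n; end > 1; end--) {
        int swapped = 0;
        for (uint32_t i = 1; i < end; i++) {
            if (a[i - 1] > a[i]) {          /* assumption: signed, ascending */
                int32_t tmp = a[i - 1];
                a[i - 1] = a[i];
                a[i] = tmp;
                swapped = 1;
            }
        }
        if (!swapped)                       /* already sorted, stop early */
            break;
    }
}

/* Hypothetical equivalent of the program body executed before ebreak:
   read the element count, sort the data in place. */
void run_sort(void)
{
    bubble_sort(array_start, array_size);
}
```

To mimic the tester, fill ''array_start'' with sample values, set ''array_size'', call ''run_sort()'', and read the sorted data back from ''array_start''.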
===== Optimization of Code and Cache Organisation =====

The evaluation system expects the designed sorting algorithm to be implemented in the file ''work/apo-sort/apo-sort.S'' of your personal repository. The symbols ''array_size'' and ''array_start'' have to be exported by the ''.globl'' directive. The space of at least 200 bytes after the ''array_start'' symbol/address can and will be overwritten by the test data, so code or data whose overwriting would cause the algorithm to fail must not be stored there. The evaluation system loads a set of 32-bit integer words starting at ''array_start'' and stores the number of words into the data word at the location defined by the ''array_size'' symbol. The program is then executed, and after it reaches the ''ebreak'' instruction, the data starting at address ''array_start'' are retrieved and evaluated.

The template for the project is located in the ([[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/tree/master/seminaries%2Fqtrvsim%2Fapo-sort|seminaries/qtrvsim/apo-sort]]) directory of the [[https://gitlab.fel.cvut.cz/b35apo/stud-support|stud-support]] repository. There is a template ''[[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/blob/master/seminaries/qtrvsim/apo-sort/apo-sort-template.S|work/apo-sort/apo-sort.S]]'' for the actual code implementation. Copy the file, and possibly the whole directory content, into the correct submission directory ''work/apo-sort'' of your personal repository.

The integrated assembler allows complete development, manual testing, and inspection. The processor can be set to the mode ''No pipeline no cache'' for initial testing. Then enable the cache, tune its parameters, and observe the time required to solve the task. The data cache parameters are set in the file ''[[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/blob/master/seminaries/qtrvsim/apo-sort/d-cache-template.par|work/apo-sort/d-cache-template.par]]''.
The file format follows

  policy,sets,words_in_block,ways,write_method

For example

  lru,1,1,1,wb

In the final evaluation, the maximal allowed cache capacity is limited to 16 32-bit words. The length of the official evaluation dataset to sort is in the range of 24 to 32 words. The initial main memory access latency is set to 10 cycles. A burst latency of 2 cycles is configured for the subsequent consecutive accesses.

If you want to test the project in the same environment as we use for evaluation, you need the riscv32-unknown-elf or riscv64-unknown-elf compiler toolchain. It is not yet possible to reference symbols for data injection and retrieval with the integrated assembler. If the locations of the word count and of the start of the sorted array are taken from the GUI version (and used in place of ''array_size'' and ''array_start'' below), then the CLI version can be invoked even with the integrated assembler

  qtrvsim_cli --dump-cycles --dump-cache-stats --d-cache lru,1,2,2,wb --read-time 10 --write-time 10 --burst-time 2 --asm apo-sort.S --load-range array_size,array_size.in --load-range array_start,array_data.in --dump-range array_start,60,array_data.out

The automatic evaluation system processes the list of received update notifications and stores the evaluation results into a log (build and execution success or errors) and into rank files whose names start with the matching login name. The results, including the overall ranking, are published at [[http://pisa-virt.felk.cvut.cz/apo/apo-sort-ci/|http://pisa-virt.felk.cvut.cz/apo/apo-sort-ci/]].

===== Data Hazards Prevention in Software by Code Adaptation/Optimization for Non-regular RISC-V CPU =====

The goal is to implement an algorithm computing the Fibonacci series for a pipelined processor without a hazard unit. The code template is ([[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/blob/master/seminaries/qtrvsim/fibo-hazards/fibo-hazards-template.S|fibo-hazards]]). The code is submitted to the [[..:..:documentation:githowto:start|personal GIT]] repository.
The current [[http://pisa-virt.felk.cvut.cz/apo/fibo-hazards-ci/rank.txt|list]] shows the successful implementations and the cycles they required. The log files with errors of the individual implementations are at [[http://pisa-virt.felk.cvut.cz/apo/fibo-hazards-ci/|http://pisa-virt.felk.cvut.cz/apo/fibo-hazards-ci/]].

===== Value Output Sent Hexadecimal to Serial Port =====

The template with instructions is [[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/blob/master/seminaries/qtrvsim/print-hex-to-uart/print-hex-to-uart-template.S|print-hex-to-uart]]. The evaluation results are at [[http://pisa-virt.felk.cvut.cz/apo/print-hex-to-uart-ci/|http://pisa-virt.felk.cvut.cz/apo/print-hex-to-uart-ci/]].

The input is a randomly generated number at the ''input_val'' location, and the task is to convert it into a series of ASCII characters sent to the serial port. The solution can even be tuned to run in constant time, independent of the randomly assigned value, but such optimization is not required.

===== Simple "Calculator" Implementation as C Language RISC-V Program =====

The goal is to implement a simple "calculator" as a C language program. The code template with instructions is [[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/blob/master/seminaries/qtrvsim/uart-calc-add/uart-calc-add-template.c|seminaries/qtrvsim/uart-calc-add/uart-calc-add-template.c]]. The successful results with the cycles spent are available in the file ''rank.txt'' at the location [[http://pisa-virt.felk.cvut.cz/apo/uart-calc-add-ci/|http://pisa-virt.felk.cvut.cz/apo/uart-calc-add-ci/]].

The switch to the higher-level C language reflects the course moving toward similar work: controlling peripherals from a program running on the MZ_APO Xilinx Zynq based educational kits (semestral work). Executing a C language program requires at least a minimal startup code sequence, even in the simplest bare-metal environment.
The RISC-V ABI for C code requires at least that the global pointer (''gp'') register is set, see the provided file ''crt0local.S'' located in the template directory. The provided ''Makefile'' builds the program, invokes the command-line version of the simulator, and tests the program. A C language compiler for the bare-metal (without operating system) RISC-V environment/CPU is required. You can test your code in the lab or remotely on the Postel server. Installing the complete riscv64-unknown-elf GCC toolchain is easy on modern GNU/Linux distributions as well.
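
Because a C compiler is available at this point, the digit conversion needed by the hexadecimal serial-port output task can also be prototyped in plain C on a host machine before it is moved to the simulator. The sketch below is illustrative only: ''format_hex32'' is a hypothetical helper name, and the output format (eight lowercase digits, most significant first, no prefix or newline) is an assumption, so follow the task template for the required format. The loop always runs eight iterations and uses a table lookup instead of a value-dependent branch, which is one way to approach the constant-time variant mentioned in that task.

```c
#include <stdint.h>

/* Convert a 32-bit value into 8 hexadecimal ASCII digits,
   most significant digit first, followed by a terminating NUL.
   The output buffer must hold at least 9 bytes. */
void format_hex32(uint32_t value, char out[9])
{
    static const char digits[] = "0123456789abcdef";

    for (int i = 0; i < 8; i++) {
        out[i] = digits[(value >> 28) & 0xf];  /* take the top nibble */
        value <<= 4;                           /* move the next nibble up */
    }
    out[8] = '\0';
}
```

In the simulator, each character would be written to the serial port instead of a buffer; the port access itself is described in the task template and is not modelled here.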