===== Bonus: Peripherals mapped into the memory address space =====

For those interested who might otherwise get bored during the lab.

The [[https://github.com/cvut/QtRvSim/|QtRvSim]] simulator also offers several simple peripherals that are mapped into the memory address space.

The first peripheral is a simple serial port ([[https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter|UART]]) connected to the terminal window. In QtRvSim it is mapped both at address 0xffff0000 and at address 0xffffc000. On MIPS the latter is reachable by absolute addressing in LW and SW instructions relative to the zero register, as the disassembly further below shows; RISC-V load and store offsets are only 12 bits wide, so the RISC-V example below first loads a base address into a register.

^ Address ^ Symbolic name ^ Bits ^ Description ^
| 0xffffc000 | SERP_RX_ST_REG |  | Status register of the receiver of characters from the terminal |
|  |  | 0 | When 1, SERP_RX_DATA_REG holds a newly received character |
|  |  | 1 | Setting to 1 enables the receive interrupt ([[https://github.com/cvut/QtMips/blob/master/README.md#interrupts-and-coprocessor-0-support|details]]) |
| 0xffffc004 | SERP_RX_DATA_REG | 7 .. 0 | ASCII code of the received character |
| 0xffffc008 | SERP_TX_ST_REG |  | Status register of the transmitter of characters to the terminal |
|  |  | 0 | When 1, the transmitter is ready for a character to be written |
|  |  | 1 | Setting to 1 enables the transmit interrupt ([[https://github.com/cvut/QtMips/blob/master/README.md#interrupts-and-coprocessor-0-support|details]]) |
| 0xffffc00c | SERP_TX_DATA_REG | 7 .. 0 | ASCII code of the character to transmit |

Another peripheral emulates interaction with real device elements. Its register and bit layout exactly matches the simplified peripheral of rotary knobs and LED indicators that is used for input and output on the [[..:..:documentation:mz_apo:start|MicroZed APO]] development kits, which will be used in the semester projects.

^ Address ^ Symbolic name ^ Bits ^ Description ^
| 0xffffc104 | SPILED_REG_LED_LINE | 31 .. 0 | Word displayed in binary, decimal and hexadecimal |
| 0xffffc110 | SPILED_REG_LED_RGB1 | 23 .. 0 | RGB values written to the PWM registers of RGB LED 1 |
|  |  | 23 .. 16 | Red component R |
|  |  | 15 .. 8 | Green component G |
|  |  | 7 .. 0 | Blue component B |
| 0xffffc114 | SPILED_REG_LED_RGB2 | 23 .. 0 | RGB values written to the PWM registers of RGB LED 2 |
|  |  | 23 .. 16 | Red component R |
|  |  | 15 .. 8 | Green component G |
|  |  | 7 .. 0 | Blue component B |
| 0xffffc124 | SPILED_REG_KNOBS_8BIT | 31 .. 0 | Filtered knob values as 8-bit numbers |
|  |  | 7 .. 0 | Position of the blue knob B |
|  |  | 15 .. 8 | Position of the green knob G |
|  |  | 23 .. 16 | Position of the red knob R |

==== Example of working with peripherals ====

  .globl _start
  .option norelax
  .text
  _start:
      li   x8, 0xffffc100     // base address of the memory-mapped I/O area
  loop:
      lw   x9, 0x24(x8)       // load packed knob values into x9 (from address 0xffffc124)
      // unpack x9 into the individual knob values
      andi x12, x9, 0xFF      // x12 <- blue knob, bits 7..0
      srli x1, x9, 8          // x11 <- green knob, bits 15..8
      andi x11, x1, 0xFF
      srli x1, x9, 16         // x10 <- red knob, bits 23..16
      andi x10, x1, 0xFF
      sw   x9, 0x10(x8)       // write packed knobs to RGB LED 1 (address 0xffffc110)
      sw   x9, 4(x8)          // and also to the LED line word box of QtRvSim (address 0xffffc104)
      // bitwise negation of the packed knob value
      addi x1, x0, -1         // x1 = 0xffffffff
      xor  x1, x1, x9         // XOR with all ones negates every bit of x9
      sw   x1, 0x14(x8)       // write the negated value to RGB LED 2 (address 0xffffc114)
      beq  x0, x0, loop       // repeat the read/write cycle forever
      nop

==== Analysis of the compilation result ====

A simple program that reads the positions of the rotary knobs and converts the value to a color and to text output can be found on the laboratory computers in the directory ''/opt/apo/binrep/qtrvsim_binrep''. The archive can also be downloaded as a [[https://gitlab.fel.cvut.cz/b35apo/stud-support/-/archive/master/stud-support-master.zip?path=seminaries/binrep/qtrvsim_binrep|ZIP]].
You can also use the templates from the repository https://gitlab.fel.cvut.cz/b35apo/stud-support in the directory ''seminaries/binrep''.

The source code is compiled by the following sequence of commands:

  riscv64-unknown-elf-gcc -D__ASSEMBLY__ -ggdb -mabi=ilp32 -march=rv32i -fno-lto -c crt0local.S -o crt0local.o
  riscv64-unknown-elf-gcc -ggdb -Os -Wall -mabi=ilp32 -march=rv32i -fno-lto -c qtrvsim_binrep.c -o qtrvsim_binrep.o
  riscv64-unknown-elf-gcc -ggdb -nostartfiles -nostdlib -static -mabi=ilp32 -march=rv32i -fno-lto crt0local.o qtrvsim_binrep.o -lgcc -o qtrvsim_binrep

Alternatively, the program can be compiled for RISC-V with the C library [[https://github.com/picolibc/picolibc|picolibc]]:

  riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 --specs=/opt/picolibc/lib/riscv64-unknown-elf/specs/picolibc.specs /opt/apo/binrep/qtrvsim_binrep/qtrvsim_binrep.c -o qtrvsim_binrep

The following is an example of the contents of an ELF binary converted to text form, for the case of compilation for the MIPS architecture, produced by the command

  mips-elf-objdump --source -M no-aliases,reg-names=numeric qtmips_binrep

and supplemented with comments.

  qtmips_binrep:     file format elf32-bigmips
  Disassembly of section .text:
  00400018 <main>:
  /*
   * The main entry into example program
   */
  int main(int argc, char *argv[])
  {
    400018:  27bdffe8  addiu  $29,$29,-24

Allocate space on the stack for the stack frame of the main() function.

    40001c:  afbf0014  sw     $31,20($29)

Save the previous value of the return address register to the stack.

    while (1) {
       uint32_t rgb_knobs_value;
       unsigned int uint_val;
       rgb_knobs_value = *(volatile uint32_t*)(mem_base + SPILED_REG_KNOBS_8BIT_o);
    400020:  8c04c124  lw     $4,-16092($0)

Read the value from the address given by the sum of SPILED_REG_BASE and the peripheral register offset SPILED_REG_KNOBS_8BIT_o. LW is the load-word instruction. The address is formed as the sum of register $0 (fixed zero) and -16092, which in hexadecimal is 0xffffc124, i.e. the sum of 0xffffc100 and 0x24. The value read is stored in register $4.

    400024:  00000000  sll    $0,$0,0x0

One NOP instruction to ensure that the load finishes before the loaded value is used.

    400028:  00041027  nor    $2,$0,$4

Compute the bitwise complement (~) of the value in register $4 and store it in register $2.

       *(volatile uint32_t*)(mem_base + SPILED_REG_LED_LINE_o) = rgb_knobs_value;
    40002c:  ac04c104  sw     $4,-16124($0)

Store the RGB knob values from register $4 to the LED line register, which is shown in binary, decimal and hexadecimal on the QtMips target.
Address 0xffffc104.

       *(volatile uint32_t*)(mem_base + SPILED_REG_LED_RGB1_o) = rgb_knobs_value;
    400030:  ac04c110  sw     $4,-16112($0)

Store the RGB knob values to the components controlling the color and brightness of RGB LED 1. Address 0xffffc110.

       *(volatile uint32_t*)(mem_base + SPILED_REG_LED_RGB2_o) = ~rgb_knobs_value;
    400034:  ac02c114  sw     $2,-16108($0)

Store the complement of the RGB knob values to the components controlling the color and brightness of RGB LED 2. Address 0xffffc114.

       /* Assign value read from knobs to the basic signed and unsigned types */
       uint_val = rgb_knobs_value;

No instruction is needed here: the value read already resides in register 4, which corresponds to the first argument register a0.

       /* Print values */
       serp_send_hex(uint_val);
    400038:  0c100028  jal    4000a0 <serp_send_hex>
    40003c:  00000000  sll    $0,$0,0x0

Call the function that sends the hexadecimal value to the serial port. The one instruction after JAL is executed in its delay slot; the PC value pointing after that instruction (0x400040) is stored to register 31, the return address register.

       serp_tx_byte('\n');
    400040:  0c100020  jal    400080 <serp_tx_byte>
    400044:  2404000a  addiu  $4,$0,10

Call the routine that sends the newline character to the serial port. The ASCII value of '\n' is set in the argument register a0 in the delay slot of JAL. JAL is decoded and, in parallel, the instruction addiu $4,$0,10 is executed; then the PC value pointing to address 0x400048 after the delay slot is stored to the return address register, and the next instruction is fetched from the JAL target address, the start of the function serp_tx_byte.

    400048:  1000fff5  beqz   $0,400020
    40004c:  00000000  sll    $0,$0,0x0

Branch back to the start of the loop that reads the value from the knobs.

  00400050 <_start>:
        la     $gp, _gp
    400050:  3c1c0041  lui    $28,0x41
    400054:  279c90e0  addiu  $28,$28,-28448

Load the global data base pointer into register 28 (gp). The symbol _gp is provided by the linker.

        addi   $a0, $zero, 0
    400058:  20040000  addi   $4,$0,0

Set register a0 (the first argument of main) to zero; argc is equal to zero.
        addi   $a1, $zero, 0
    40005c:  20050000  addi   $5,$0,0

Set register a1 (the second argument of main) to zero; argv is equal to NULL.

        jal    main
    400060:  0c100006  jal    400018 <main>
        nop
    400064:  00000000  sll    $0,$0,0x0

Call the main function. The return address is stored in the ra ($31) register.

  00400068 <quit>:
  quit:
        addi   $a0, $zero, 0
    400068:  20040000  addi   $4,$0,0

If the main function returns, set the exit value to 0.

        addi   $v0, $zero, 4001  /* SYS_exit */
    40006c:  20020fa1  addi   $2,$0,4001

Set the system call number to the code representing the exit() syscall.

    400070:  0000000c  syscall

Call the system.

  00400074 <loop>:
  loop:
        break
    400074:  0000000d  break

If there is no operating system, try to stop execution by raising a debugging exception.

        beq    $zero, $zero, loop
    400078:  1000fffe  beqz   $0,400074 <loop>
        nop
    40007c:  00000000  sll    $0,$0,0x0

If even this does not stop execution, make the CPU spin in a busy loop.

  void serp_tx_byte(int data)
  {
  00400080 <serp_tx_byte>:
        while (!(serp_read_reg(SERIAL_PORT_BASE, SERP_TX_ST_REG_o) &
                  SERP_TX_ST_REG_READY_m));
    400080:  8c02c008  lw     $2,-16376($0)
    400084:  00000000  sll    $0,$0,0x0

Read the serial port transmit status register, address 0xffffc008.

        while (!(serp_read_reg(SERIAL_PORT_BASE, SERP_TX_ST_REG_o) &
    400088:  30420001  andi   $2,$2,0x1
    40008c:  1040fffc  beqz   $2,400080 <serp_tx_byte>
    400090:  00000000  sll    $0,$0,0x0

Repeat the read until the UART is ready to accept a character, i.e. until bit 0 is not zero. A NOP fills the delay slot.

        *(volatile uint32_t *)(base + reg) = val;
    400094:  ac04c00c  sw     $4,-16372($0)

Write the value from register 4 (the first argument a0) to address 0xffffc00c (SERP_TX_DATA_REG_o), the serial port transmit data register.
  }
    400098:  03e00008  jr     $31
    40009c:  00000000  sll    $0,$0,0x0

Jump (return) back to continue in the caller; the address of the next instruction to fetch is read from the return address register 31 (ra).

  void serp_send_hex(unsigned int val)
  {
  004000a0 <serp_send_hex>:
    4000a0:  27bdffe8  addiu  $29,$29,-24

Allocate space on the stack for the stack frame of the routine.

    4000a4:  00802825  or     $5,$4,$0

Copy the value of the first argument register 4 (a0) to register 5.

      for (i = 8; i > 0; i--) {
    4000a8:  24030008  addiu  $3,$0,8

Set register 3 to the value 8.

    4000ac:  afbf0014  sw     $31,20($29)

Save the previous value of the return address register to the stack.

          char c = (val >> 28) & 0xf;
    4000b0:  00051702  srl    $2,$5,0x1c

Shift the value in register 5 right by 28 bits and store the result in register 2.

    4000b4:  304600ff  andi   $6,$2,0xff

Redundant operation that limits the value to the range of the character type variable; the result is stored in register 6.

          if (c < 10 )
    4000b8:  2c42000a  sltiu  $2,$2,10

Set register 2 to one if the value is smaller than 10.

              c += 'A' - 10;
    4000bc:  10400002  beqz   $2,4000c8
    4000c0:  24c40037  addiu  $4,$6,55

If the value is larger than or equal to 10 (register 2 is 0, false), add the value 55, that is ('A' - 10) = (0x41 - 0xa) = 0x37 = 55, to register 6 and store the result in register 4. The addition sits in the branch delay slot, so it is executed even when the arm before else is taken, but then its result is immediately overwritten by the next instruction.

              c += '0';
    4000c4:  24c40030  addiu  $4,$6,48

Add the value 0x30 = 48 = '0' to the value in register 6 and store the result in register 4, the first argument a0.

          serp_tx_byte(c);
    4000c8:  0c100020  jal    400080 <serp_tx_byte>
    4000cc:  2463ffff  addiu  $3,$3,-1

Call the subroutine that sends the byte to the serial port. The loop control variable (i) is decremented in the delay slot.

      for (i = 8; i > 0; i--) {
    4000d0:  1460fff7  bnez   $3,4000b0
    4000d4:  00052900  sll    $5,$5,0x4

The final condition of the for loop, converted to a do {} while() loop. If not all 8 characters have been sent, loop again. The value in register 5 is shifted left by 4 bit positions in the delay slot.
The compiler does not store the values of local variables on the stack, and it does not even move them to callee-saved registers (which would require saving their previous values in the function stack frame). The compiler can use this optimization because it knows the register usage of the called function serp_tx_byte().

  }
    4000d8:  8fbf0014  lw     $31,20($29)
    4000dc:  00000000  sll    $0,$0,0x0

Restore the return address register to the value it had at the start of the function.

    4000e0:  03e00008  jr     $31
    4000e4:  27bd0018  addiu  $29,$29,24

Return to the caller. The instruction in the delay slot of the jump register instruction restores the stack pointer and thereby frees the function frame.