The seminar starts with an introduction to the classroom. Then the basic concepts of data representation in computers are refreshed, together with their use in, and connection to, languages at different levels of abstraction down to machine code. Keep in mind that this is only an overview of what the following seminars cover and, for students of the KyR program, of what is taught in parallel in the course B3B36PRG - C Programming. During the first half of the semester, C is used only as a general notation for algorithms (most constructs are understandable as equivalents of, for example, Python code); you should gain enough knowledge of C in the B3B36PRG course for real programming work in the second half of the semester.
Python is a dynamically typed, higher-level language. Running a program requires an interpreter (or a runtime environment with partial translation at run time), which is most often written in C.
Test the following simple Python example. To create and edit the source file sum2vars.py, use one of the installed editors (geany, vim, emacs, qtcreator, clion, …). For those without a personal preference, it is advisable to start with geany.
sum2vars.py
#!/usr/bin/python3

var_a = 0x1234
var_b = 0x2222
var_c = var_a + var_b

print('sum %d + %d -> %d'%(var_a, var_b, var_c))
print('sum 0x%x + 0x%x -> 0x%x'%(var_a, var_b, var_c))
The program can be passed as a parameter to be run in the interpreter
python3 sum2vars.py
If you mark the file as executable
chmod +x sum2vars.py
it can be run “directly”, even though neither the processor nor the operating system can execute such a file as machine code. The first line (the shebang, #!) ensures that the file is not executed directly; instead, the interpreter named on that line is started and the file is passed to it as a parameter (on Linux, this is handled by the kernel's program loader when the shell asks it to execute the file).
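The shebang mechanism described above can be sketched in a few lines of Python. This is a simplified illustration only, not the real loader: the function name shebang_command is hypothetical, and the handling of an optional interpreter argument is reduced to a single whitespace split.

```python
def shebang_command(script_text, script_path):
    """Return the argument list a shebang-aware launcher would run, or None."""
    first_line = script_text.split("\n", 1)[0]
    if not first_line.startswith("#!"):
        return None  # no shebang: the file is not handed to an interpreter
    # Text after "#!" names the interpreter, optionally followed by one argument.
    interpreter = first_line[2:].strip().split(None, 1)
    # The script path is appended as the last parameter of the interpreter.
    return interpreter + [script_path]


example = "#!/usr/bin/python3\nvar_a = 0x1234\n"
print(shebang_command(example, "./sum2vars.py"))
# ['/usr/bin/python3', './sum2vars.py']
```

In other words, starting ./sum2vars.py ends up running the same command as typing python3 with the script as its parameter, only with the interpreter path taken from the file itself.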
A minor pitfall, and at the same time a critical habit that increases safety against attacks by other users (try to figure out why), is that a command given without a path is searched for only in the directories listed in the PATH environment variable, and on reasonably designed and managed systems that list does not contain the current directory. Therefore, we have to specify the path to start the program; in this case it is the current directory, represented by the dot '.'.
./sum2vars.py
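Whether the current directory is part of the search list can be checked with a short Python sketch. The helper name current_dir_in_path is an assumption made for this illustration; it relies on the POSIX convention that PATH entries are separated by ':' and that an empty entry or '.' stands for the current directory.

```python
import os


def current_dir_in_path(path_value):
    """True if a PATH-style string makes the shell search the current directory."""
    # POSIX separates entries with ":"; an empty entry or "." means the current directory.
    return any(entry in ("", ".") for entry in path_value.split(":"))


print(current_dir_in_path("/usr/local/bin:/usr/bin:/bin"))  # False
print(current_dir_in_path(".:/usr/bin:/bin"))               # True
# The result for the PATH of the running session depends on the system.
print(current_dir_in_path(os.environ.get("PATH", "/usr/bin")))
```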
C is a language in which the programmer strictly defines the data types. The binding of a type identifier to the actual data representation may differ between architectures. The signed integer type (int), for example, stands for an integer encoding whose range is at least −32,767 to +32,767 and which best suits the target processor. On most of today's processor architectures, int ranges from −2,147,483,648 to +2,147,483,647 ($-2^{31}$ to $2^{31}-1$), that is, the value is stored in 32 bits.
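The ranges quoted above follow directly from the number of bits used for a two's complement value. A minimal Python sketch (the function name signed_range is chosen only for this illustration):

```python
def signed_range(bits):
    """Smallest and largest value of a two's complement integer with the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


for bits in (16, 32, 64):
    low, high = signed_range(bits)
    print(f"{bits}-bit: {low} to {high}")
# 16-bit: -32768 to 32767
# 32-bit: -2147483648 to 2147483647
# 64-bit: -9223372036854775808 to 9223372036854775807
```

Note that the 16-bit two's complement range reaches −32,768, one below the −32,767 minimum that the C standard guarantees for int.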
It is usually necessary to compile the program before execution (there are alternatives, such as CERN ROOT Cling) into a binary form that the operating system can load into memory, where the processor carries out the operations given by the translated binary machine instructions.
Rewrite the Python program listed above in C and store it in the file sum2vars.c. The main() function has to be defined because the C language standard (ISO/IEC 9899 - latest free) reserves this name for the entry point of a user program. To access the printf() function, the header file stdio.h has to be included.
sum2vars.c
#include <stdio.h>

int var_a = 0x1234;
int var_b = 0x2222;
int var_c = 0x3333;

int main()
{
    var_c = var_a + var_b;
    printf("sum %d + %d -> %d\n", var_a, var_b, var_c);
    printf("sum 0x%x + 0x%x -> 0x%x\n", var_a, var_b, var_c);
    return 0;
}
The program can be compiled into binary form by the GNU C compiler (manual).
gcc -Wall sum2vars.c
The default name of the output executable file is a.out. The program can be run by invoking it with its full or relative path name, ./a.out. The desired name of the binary file can be specified on the command line with the -o switch, and debug information is inserted with -ggdb. You can also ask the compiler to optimize the program for minimal size with the -Os switch.
gcc -Wall -Os -ggdb -o sum2vars sum2vars.c
The content of the executable file, which is in the ELF file format, can be examined with the objdump tool.
objdump -S sum2vars
Translation of the algorithm to individual machine instructions can be explored online as well
Godbolt Compiler Explorer
The algorithm described by the C source is translated into a sequence of machine instructions operating on precisely specified data types, such that the external effect of the whole program execution (or, more precisely, of the execution between sequence points) is equivalent to the written algorithm. The compiler output is stored in the form of the selected machine instructions, but without a decision on their final location in memory. A native translation of the source (a translation for the system on which the compiler is currently running) is obtained by the command
gcc -Wall -Os -S -o sum2vars.s sum2vars.c
A full understanding of how to write assembler programs for the x86_64 architecture running under the GNU/Linux operating system is demanding, so we chose the MIPS architecture for teaching processor architectures (reasons), and we will use it in its most limited and simplified form in the initial seminars. To observe the internal state and to visualize the principle of processor operation, we will not use processor boards with real chips implementing the architecture, but a graphical simulator created specifically for the needs and goals of our course - QtMips (materials and video (in Czech only for now) from its introduction at LinuxDays 2019).
The following examples are presented mainly to give an overview of what the subject and the semester work will focus on. We will return to writing algorithms in machine instructions, in depth and with a full description, in the third lecture and seminar.
For the first approximation, we will compile the code with a MIPS cross compiler.
mips-linux-gnu-gcc -ggdb -static -Os -o sum2vars-mips sum2vars.c
The program can be run in our laboratory on GNU/Linux systems with x86_64 architecture, because binary files for MIPS are automatically interpreted by the QEMU emulator in user-space emulation mode.
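A MIPS binary can be told apart from a native one by the first bytes of its ELF header, which is the kind of information that typically allows such an automatic hand-off to an emulator. The sketch below only illustrates that idea and does not describe how the laboratory machines are configured. The function name elf_machine and the hand-built example headers are assumptions; the field offsets and the machine codes 8 (MIPS) and 62 (x86-64) come from the ELF specification.

```python
import struct

EM_NAMES = {8: "MIPS", 62: "x86-64"}  # e_machine codes from the ELF specification


def elf_machine(header):
    """Return (word size in bits, byte order, machine name) from the first 20 bytes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    order = {1: "<", 2: ">"}[header[5]]  # EI_DATA: 1 = little endian, 2 = big endian
    (machine,) = struct.unpack_from(order + "H", header, 18)  # e_machine at offset 18
    return bits, "little" if order == "<" else "big", EM_NAMES.get(machine, hex(machine))


# Hand-built headers for illustration: magic, class, data, version, padding, e_type, e_machine.
mips_header = b"\x7fELF" + bytes([1, 2, 1]) + bytes(9) + struct.pack(">HH", 2, 8)
x86_header = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 62)
print(elf_machine(mips_header))  # (32, 'big', 'MIPS')
print(elf_machine(x86_header))   # (64, 'little', 'x86-64')
```

To inspect a real file, the first 20 bytes read from it in binary mode can be passed to the function in the same way.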
Manual translation to assembler (file sum2vars.S) may look like this (for simplicity without calling the print function)
sum2vars.S
.globl main
.globl _start
.text

_start:
main:
    lw   $4, var_a($0)
    lw   $5, var_b($0)
    add  $6, $4, $5
    sw   $6, var_c($0)
    addi $2, $0, 0
    jr   $ra
    nop

.data
var_a:  .word 0x1234;
var_b:  .word 0x2222;
var_c:  .word 0x3333;
We can translate the program
mips-elf-gcc -o sum2vars sum2vars.S
load into simulator
and run step by step.
The program can also be compiled directly in the simulator. To prevent the simulator from loading an external program, start it in the mode without loading an ELF file
Select File → New source
This time, the precise (fixed) location of the variables can be specified (directive .org 0x2000).
_start:
main:
    lw   $4, var_a($0)
    lw   $5, var_b($0)
    add  $6, $4, $5
    sw   $6, var_c($0)
    addi $2, $0, 0
    jr   $ra
    nop

.data
.org 0x2000
var_a:  .word 0x1234;
var_b:  .word 0x2222;
var_c:  .word 0x3333;

#pragma qtmips show registers
#pragma qtmips show memory
#pragma qtmips focus memory var_a
#pragma qtmips tab core
Directives for quickly locating the variable var_a in the data memory view are added as well.
Compile by choosing Machine→Compile source
and we can step through the code
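What to expect while stepping can be previewed with a very small Python model of the four instructions used in the example. It is a sketch under simplifying assumptions, not a description of how QtMips works: memory is a dictionary of whole words, the variables are assumed to lie at 0x2000, 0x2004 and 0x2008 (consecutive 4-byte words after .org 0x2000), overflow simply wraps to 32 bits, and jr and nop are left out.

```python
SYMBOLS = {"var_a": 0x2000, "var_b": 0x2004, "var_c": 0x2008}

PROGRAM = [
    ("lw", 4, "var_a", 0),   # lw   $4, var_a($0)
    ("lw", 5, "var_b", 0),   # lw   $5, var_b($0)
    ("add", 6, 4, 5),        # add  $6, $4, $5
    ("sw", 6, "var_c", 0),   # sw   $6, var_c($0)
    ("addi", 2, 0, 0),       # addi $2, $0, 0
]


def step(instr, regs, memory):
    """Execute one instruction of the simplified model."""
    op, a, b, c = instr
    if op == "lw":
        regs[a] = memory[SYMBOLS[b] + regs[c]]       # load word from memory
    elif op == "sw":
        memory[SYMBOLS[b] + regs[c]] = regs[a]       # store word to memory
    elif op == "add":
        regs[a] = (regs[b] + regs[c]) & 0xFFFFFFFF   # register + register
    elif op == "addi":
        regs[a] = (regs[b] + c) & 0xFFFFFFFF         # register + immediate
    regs[0] = 0  # register $0 always reads as zero


regs = [0] * 32
memory = {0x2000: 0x1234, 0x2004: 0x2222, 0x2008: 0x3333}
for instr in PROGRAM:
    step(instr, regs, memory)
    print(instr[0], "-> $4=%#x $5=%#x $6=%#x var_c=%#x"
          % (regs[4], regs[5], regs[6], memory[0x2008]))
```

After the sw step, var_c changes from its initial 0x3333 to 0x3456, which is the value to look for in the data memory view of the simulator.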
Additionally, 0x1234 can be replaced by 0x12345678 to observe in the dump how the value is stored in individual bytes or 16-bit half-words. Use the Word, Half-word and Byte choices in the data memory view, and try changing the width of the view/dump.
An example of how to visualize the representation of numbers from Python, especially for those who do not yet know even the basics of the C language
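The byte and half-word views of 0x12345678 can be predicted with Python's int.to_bytes. The sketch prints both byte orders, because which one appears in the dump depends on the endianness of the simulated machine; the helper name layout is an assumption made for this illustration.

```python
def layout(value, order):
    """Bytes and 16-bit half-words of a 32-bit value, listed by increasing address."""
    raw = value.to_bytes(4, order)
    bytes_hex = [f"{b:02x}" for b in raw]
    halves_hex = [f"{int.from_bytes(raw[i:i + 2], order):04x}" for i in (0, 2)]
    return bytes_hex, halves_hex


for order in ("big", "little"):
    print(order, *layout(0x12345678, order))
# big ['12', '34', '56', '78'] ['1234', '5678']
# little ['78', '56', '34', '12'] ['5678', '1234']
```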
#!/usr/bin/python3

import struct

a = 0x1234567
b = -12345678
c = a + b

buf = struct.pack('<IiI', a, b, c)
print(["{0:02x}".format(b) for b in buf])

(u32_a, u32_b, u8_c0, u8_c1, u8_c2, u8_c3) = struct.unpack('<IIBBBB', buf)
print(u32_a, u32_b, u8_c0, u8_c1, u8_c2, u8_c3)

(s32_a, s32_b, s8_c0, s8_c1, s8_c2, s8_c3) = struct.unpack('<iibbbb', buf)
print(s32_a, s32_b, s8_c0, s8_c1, s8_c2, s8_c3)
Learn more about storing variables in a specified data format representation in the documentation of the Python module struct — Interpret bytes as packed binary data.
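The same reinterpretation of 32 bits as a signed or an unsigned number can also be written with plain integer arithmetic, which may help when reading the struct output. This is a minimal sketch; the helper names to_unsigned32 and to_signed32 are chosen only for this illustration.

```python
import struct


def to_unsigned32(value):
    """Bit pattern of a 32-bit two's complement value, read as an unsigned number."""
    return value & 0xFFFFFFFF


def to_signed32(value):
    """Read a 32-bit pattern as a signed two's complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


b = -12345678
print(hex(to_unsigned32(b)))     # 0xff439eb2
print(to_signed32(0xFF439EB2))   # -12345678

# Cross-check against struct: pack as signed, unpack the same bytes as unsigned.
(via_struct,) = struct.unpack('<I', struct.pack('<i', b))
print(via_struct == to_unsigned32(b))  # True
```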
Classroom KN:E-2 is equipped with computers running a network installation of Debian GNU/Linux Bullseye
When the computer is turned on, the BIOS PXE network boot option loads the PXElinux boot loader and its configuration. It lets you select
DCE Linux Bullseye (Debian)
Choosing the second option loads the GNU/Linux kernel image and the initial RAM disk from the network using the TFTP protocol. When the kernel starts, it mounts the root filesystem from the NFS server, but only in read-only mode. To keep local changes while the computer runs, a file system that stores them temporarily in memory and in a swap file is layered over the basic directory structure. This is done by the Overlayfs module (AUFS in the past). The Kerberos system is used to verify credentials; the password is verified against the main CTU identity server. After a successful login, the volume with the user account, to which the user has read and write rights, is attached to the station's directory structure via NFS.
More information about the solution can be found on the Wiki How to create a diskless machine running GNU/Linux. There are also the DiskLess Debian/GNU Linux slides from the presentation of our solution at the Install Fest conference/event.
Computer access and logins are authenticated against the central CTU Kerberos server. The main CTU password is used to access the computers.
The data in your home directories are available in rooms KN:E-2, KN:E-23, KN:E-24 and KN:E-s109, and are also accessible on the server postel.felk.cvut.cz via the SSH protocol. You can use the SCP/SFTP protocols to access the data. On Linux you can mount your home directory even on your home computer using the sshfs utility (for example sshfs jmeno@postel.felk.cvut.cz: /mnt/tmp, where jmeno stands for your user name).
sshfs jmeno@postel.felk.cvut.cz: /mnt/tmp
fusermount -u /mnt/tmp
ssh -X jmeno@postel.felk.cvut.cz
Remark: The name was chosen not only as a hint at convenient access from the comfort of home (postel is Czech for bed), but primarily as a reminder of one of the key people of the Internet - Jon Postel.
In case of problems with the computers or the network, please contact Ing. Ales Kapica or other people from the IT 13135 group.