2. Data representation in memory and real numbers

for lecturer: exercise 2

Exercise outline

storage of basic data types (integer numbers) in computer memory, program in C
integer representation, addition, subtraction, multiplication, division
real number representation (in IEEE 754)

What should I review before the first exercise

binary representation and hexadecimal numbers
C language syntax
terms little and big endian
two's complement code
logic operations (and, or, invert, rotation, …)
IEEE 754 representation of real numbers
the program from the last class (make sure you understand it)

What we shall do in the first exercise

The exercise will be based on the following C code. It prints the representation of numbers in computer memory. We shall modify the code during the class.

/* Simple program to examine how different data types are encoded in memory */
 
#include <stdio.h>
 
/*
 * The macro determines size of given variable and then
 * prints individual bytes of the value representation
 */
#define PRINT_MEM(a) print_mem((unsigned char*)&(a), sizeof(a))
 
void print_mem(unsigned char *ptr, int size) {
  int i;
  printf("address = 0x%016lx\n", (long unsigned int)ptr);
 
  for (i = 0; i < size; i++) {
    printf("0x%02x ", *(ptr+i)); // == printf("0x%02x ", ptr[i]);
  }
 
  printf("\n");
}
 
int main() {
  /* try for more types: long, float, double, pointer */
  unsigned int unsig = 5;
  int sig = -5;
 
  /* Read GNU C Library manual for conversion syntax for other types */
  /* https://www.gnu.org/software/libc/manual/html_node/Formatted-Output.html */
  printf("value = %u\n", unsig);
  PRINT_MEM(unsig);
 
  printf("\nvalue = %d\n", sig);
  PRINT_MEM(sig);
 
  return 0;
}

The source code print_binrep.c can be copied from /opt/apo/binrep/print_binrep on computers in the laboratory.

To compile the program:

gcc -Wall -pedantic -o print_binrep ./print_binrep.c

Alternatively, the GNU make program can be used; it processes the rules and dependencies given in the associated Makefile.

Use Make Program to Build C Code

Make is a build automation tool which builds targets (e.g. an executable program) from specified components and sources. When only one source file needs to be compiled to build the executable, invoking the compiler manually, or using a shell script that runs a short sequence of commands, is adequate each time a build is required. Larger projects, however, require processing many files, often of different types: for example, source code in C and assembler, documentation in XML, and perhaps something else. Translating each file separately, each with a different tool, and then combining the results by hand is tedious and error-prone. That is why the make program was invented. Its basic idea is that it describes how to produce specified file types and determines which files are needed to “make” another file.

The make program works by reading the Makefile, in which the build rules are written, and it invokes compilers, or more precisely system shell command lines, as needed.

A Makefile consists of four types of lines:

description of dependencies (explicit or by pattern rules),
commands to execute to generate specified target,
declaration of variables,
comments.

The simplest possible Makefile for compiling a C language hello world program follows.

hello: hello.c

It consists of a single line with an explicit dependency, which specifies that the source file hello.c is required to produce the program hello (a name without an extension is expected to be an executable program). The list of filenames, or more generic targets, is given on the left of the colon; it depends on the result of processing the components on the right of the colon (again files or generic targets).

The make program includes a set of predefined (implicit) rules which specify what should be called in the above case. If the implicit rules do not exist or we need to override them, the Makefile will contain:

hello: hello.c
	gcc -Wall -o hello hello.c

The second line specifies the command that generates the target named in the dependency line. Each command line has to start with a tab character. Each time make is invoked and the file hello is missing or is older than hello.c, the command is executed.

Implicit rules tell how to generate one type of file from another type. These rules are mostly defined using variables. If we want to change the implicit rule just a bit, we don't need to define it again, but just change the values of the variables that are used in the rules.

CFLAGS = -g -Wall
CC = m68k-elf-gcc
hello: hello.c

The CFLAGS variable specifies the switches which should be passed to the C compiler in addition to the filenames. The actual C compiler executable/shell command is specified by the CC variable [3].

For larger projects, it is common to extend the Makefile with additional targets:

all: hello
 
clean:
	rm -f hello
 
hello: hello.c
	gcc -Wall -g -o hello hello.c
 
.PHONY: all clean

The all target is specified first and so becomes the default target, which controls execution when make is invoked without parameters. The clean target is the usual name for a target whose invocation removes all final and intermediate products of compilation. The special .PHONY target is used to denote targets which are not expected to generate a file with the corresponding name but serve only to invoke other targets and file builds.

Much more could be written about the make program. It can be used to build even very complex projects, such as the Linux kernel itself. There is usually no need to write complex Makefiles by hand: tools such as CMake or autoconf, and some IDEs, generate Makefiles automatically (Meson serves the same purpose but generates Ninja build files by default).

Python Language Alternative

#!/usr/bin/python3
 
import struct
 
a = 1234.56789
b = -11.1111111111111111111111
c = a + b
 
buf = struct.pack('<f', c)
 
print ('C float LE:' + ' '.join(["{0:02x}".format(b) for b in buf]))
print ('C float LE:' + str(struct.unpack('<f', buf)[0]))
 
buf = struct.pack('>f', c)
 
print ('C float BE:' + ' '.join(["{0:02x}".format(b) for b in buf]))
print ('C float BE:' + str(struct.unpack('>f', buf)[0]))
 
buf = struct.pack('<d', c)
 
print ('C double LE:' + ' '.join(["{0:02x}".format(b) for b in buf]))
print ('C double LE:' + str(struct.unpack('<d', buf)[0]))
 
buf = struct.pack('>d', c)
 
print ('C double BE:' + ' '.join(["{0:02x}".format(b) for b in buf]))
print ('C double BE:' + str(struct.unpack('>d', buf)[0]))

See the documentation of the Python struct module (interpret bytes as packed binary data) for more details.

Tasks

Compile and execute the program above
- Explain the output of the program
- Alter the code to print the representation of other data types (float, char, double, long, etc…)
- Alter the code to print the numbers between -16…15 and their representation.
- Modify the program to add and subtract integer numbers and print arguments and results (in “normal” form and computer representation).
- Try to add and subtract negative and positive numbers. Try to cause integer overflow and underflow.
Integer addition and subtraction in two's complement representation
- add and subtract two integer numbers. For example 5+(-6) and 5-(-6).
- repeat the operations with different numbers and check your results with the computer program from the first exercise.
- When can underflow and overflow happen? How can we detect that it has occurred?
Integer multiplication
- multiply two integers, for example 7*6.
- is there any difference when multiplying negative numbers? (e.g. -7*6, (-7)*(-6), etc…)
- show how to speed up the multiplier (use many adders instead of repeatedly using one).
Integer division
- divide integers 42/7, 43/7
- does the algorithm change when we use negative numbers?
Real number representation (IEEE 754)
- binary representation of real numbers (float - 32bit, double - 64bit)
- show the binary representation of -0.75. Check your result with the program from the last exercise.
- find the decimal value of the float with binary representation 0xC0A00000 in IEEE 754.
- explain how to add the numbers 9.999*10^1 and 1.610*10^(-1) in decimal representation. Assume that it is possible to store only 4 digits in the mantissa and 2 digits in the exponent.
  - Hint: 1) align the numbers 2) add 3) normalize 4) round.
- in binary representation add 0.5 and -0.4375
- in decimal representation show multiplication of 1.110*10^10 * 9.200*10^(-5)
- in binary representation multiply 0.5 and -0.4375

Useful links

https://en.cppreference.com/w/c - overview and reference of C language
http://www.gnu.org/software/libc/manual/html_node/Formatted-Output.html - printf function documentation. printf is used for formatted output.
Basic command-line operations for Unix and GNU/Linux systems on Bootlin
Description of the IEEE 754 number representation on Wikipedia
