4. Hierarchical memory model II - cache, additional class

4. Hierarchical memory model II - cache, additional class

for lecturer: tutorial 4

Content of the class

Aim of today's class is to deepen knowledge about cache, to clarify and to show example of time and spatial locality principle. Try to optimize cache parameters.

Example A

Consider following example:

#define s0 $16
#define s1 $17
#define s2 $18

.globl    array
.data
.align    2

array:
.word    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
.text
.globl start
.ent start

start:

la   s0, array      // load address of the first value in array to register s0
addi s1, $0, 3      // Number of cycle repetitions (number of iterations)
nop                 // Why nop here? Try to remove it. Will it have any influence on hit rate?

loop:
    beq s1,$0, end
    lw s2, 0(s0)      // Read 0th value in array
    lw s2, 4(s0)      // Read 1st value in array (4/4 = 1)

    lw s2, 36(s0)     // Read 9th value (36/4=9)
    lw s2, 40(s0)     // Read 10th value (40/4=10)
   
    addi s1, s1, -1
    j loop

end:
nop  

end_loop:           //Final infinite loop
    j end_loop

Now consider instruction and data cache with size of 8 words (1 word = 4B) each. The cache is word-aligned – it means that address of the first byte in word is divisible by 4. Replacement policy is LRU (lease recently used). Fill-in following table for instruction and data cache. Study in details execution of the above program and study accesses into memory and cache.

Data cache:

	Directly mapped			N-way set associative	Fully asociative
Size/Block size/Blocks in sets:	8/1/1	8/2/1	8/8/1	8/2/2	8/4/2	8/2/4	8/1/8
Hit Count:
Miss Count:
Hit Rate:

Instruction cache:

	Directly mapped			N-way set associative	Fully asociative
Size/Block size/Blocks in sets:	8/1/1	8/2/1	8/8/1	8/2/2	8/4/2	8/2/4	8/1/8
Hit Count:
Miss Count:
Hit Rate:

Questions

What are your conclusions? Which cache has the best hit rate?
Is the fully associative cache always the best?
Does the hit rate depend on executed program or is it parameter of the cache?
Is it possible to have data in cache that will never be accessed? (Data we are not working with, instructions which are never executed,, etc…)
Is there optimal value for Block size parameter? Is there any rule for the value of Block size?
Find all information stored in the cache.
Find type of instruction and data cache use your computer? Find parameters - size, block size, blocks in set.

Directly mapped cache have one block in set. Fully associative cache has as many ways as it has blocks.

Example B

Take the program from the last class (insertion sort) and cache with size of 8 words. Is there any replacement policy which you expect to perform better that others? Consider both data and instruction cache. Verify your conclusions with simulation.

Questions

Try the best parameters from Example A. Do these the parameters achieve the best hit ratio here as well?
Do you know another replacement policies different from those in MIPS Simulator? Try to find any on the internet or in the lectures.
Which write policy you expect to have the lowest impact on memory bus. (Transmits the lowest size of data).

Table of Contents

4. Hierarchical memory model II - cache, additional class

Content of the class

Example A

Questions

Example B

Questions