====== 4. Hierarchical memory model II - cache, additional class ======
* for lecturer: [[..:..:..:internal:tutorials:04:start|tutorial 4]]
===== Content of the class =====
Aim of today's class is to deepen knowledge about cache, to clarify and to show example of time and spatial locality principle. Try to optimize cache parameters.
===== Example A =====
Consider following example:
#define s0 $16
#define s1 $17
#define s2 $18
.globl array
.data
.align 2
array:
.word 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
.text
.globl start
.ent start
start:
la s0, array // load address of the first value in array to register s0
addi s1, $0, 3 // Number of cycle repetitions (number of iterations)
nop // Why nop here? Try to remove it. Will it have any influence on hit rate?
loop:
beq s1,$0, end
lw s2, 0(s0) // Read 0th value in array
lw s2, 4(s0) // Read 1st value in array (4/4 = 1)
lw s2, 36(s0) // Read 9th value (36/4=9)
lw s2, 40(s0) // Read 10th value (40/4=10)
addi s1, s1, -1
j loop
end:
nop
end_loop: //Final infinite loop
j end_loop
Now consider instruction and data cache with **size of 8 words** (1 word = 4B) **each**. The cache is word-aligned -- it means that address of the first byte in word is divisible by 4. Replacement policy is LRU (lease recently used). Fill-in following table for instruction and data cache. **Study in details execution of the above program and study accesses into memory and cache**.
** Data cache: **
^ ^ Directly mapped ^^^ N-way set associative ^ Fully asociative ^^^
^ Size/Block size/Blocks in sets: ^ 8/1/1 ^ 8/2/1 ^ 8/8/1 ^ 8/2/2 ^ 8/4/2 ^ 8/2/4 ^ 8/1/8 ^
| Hit Count: | | | | | | | |
| Miss Count: | | | | | | | |
| Hit Rate: | | | | | | | |
\\
** Instruction cache: **
^ ^ Directly mapped ^^^ N-way set associative ^ Fully asociative ^^^^
^ Size/Block size/Blocks in sets: ^ 8/1/1 ^ 8/2/1 ^ 8/8/1 ^ 8/2/2 ^ 8/4/2 ^ 8/2/4 ^ 8/1/8 ^
| Hit Count: | | | | | | | |
| Miss Count: | | | | | | | |
| Hit Rate: | | | | | | | |
\\
===== Questions =====
* What are your conclusions? Which cache has the best hit rate?
* Is the fully associative cache always the best?
* Does the hit rate depend on executed program or is it parameter of the cache?
* Is it possible to have data in cache that will never be accessed? (Data we are not working with, instructions which are never executed,, etc...)
* Is there optimal value for **Block size** parameter? Is there any rule for the value of **Block size**?
* Find all information stored in the cache.
* Find type of instruction and data cache use your computer? Find parameters - size, block size, blocks in set.
Directly mapped cache have one block in set. Fully associative cache has as many ways as it has blocks.
===== Example B =====
Take the program from the last class (insertion sort) and cache with size of 8 words. Is there any replacement policy which you expect to perform better that others? Consider both data and instruction cache. Verify your conclusions with simulation.
===== Questions =====
* Try the best parameters from Example A. Do these the parameters achieve the best hit ratio here as well?
* Do you know another replacement policies different from those in MIPS Simulator? Try to find any on the internet or in the lectures.
* Which write policy you expect to have the lowest impact on memory bus. (Transmits the lowest size of data).