
Advanced Computer Architecture - Lecture 28: Memory hierarchy design


CS 704
Advanced Computer Architecture

Lecture 28
Memory Hierarchy Design
(Cache Design and Policies)

Prof. Dr. M. Ashraf Chughtai


Today’s Topics
Recap: Cache Addressing Techniques
Placement and Replacement Policies
Cache Write Strategy
Cache Performance Enhancement
Summary

MAC/VU-Advanced
Computer Architecture

Lecture 28 Memory Hierarchy (4)

2


Recap: Block Size Trade-off
Impact of block size on cache performance, and categories of cache design
The trade-off of block size versus the miss rate, miss penalty, and average access time, the basic CPU performance metrics


Recap: Block Size Trade-off
– A larger block size reduces the miss rate, but if the block size is too big relative to the cache size, the miss rate goes up again; and
– The miss penalty goes up as the block size increases; and
– Combining these two parameters gives the third parameter, the Average Access Time
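Combining the miss rate and miss penalty into the average access time follows the standard relation AMAT = hit time + miss rate × miss penalty. A minimal sketch (the example numbers are illustrative assumptions, not figures from the lecture):

```python
def avg_access_time(hit_time, miss_rate, miss_penalty):
    # AMAT = hit time + miss rate * miss penalty (all times in cycles)
    return hit_time + miss_rate * miss_penalty

# Assumed example: 1-cycle hit, 5% miss rate, 40-cycle miss penalty
print(avg_access_time(1, 0.05, 40))  # 3.0 cycles
```

A larger block lowers `miss_rate` but raises `miss_penalty`, so the product, and hence the AMAT, has a minimum at some intermediate block size.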



Recap: Cache Organizations
Based on the block placement policy, we studied three cache organizations.



Recap: Cache Organizations
– Direct Mapped, where each block has only one place it can appear in the cache, which gives rise to conflict misses
– Fully Associative Mapped, where any block of the main memory can be placed anywhere in the cache; and
– Set Associative Mapped, which allows a block to be placed in a set of places in the cache


Memory Hierarchy Designer’s Concerns
Block placement: Where can a block be placed in the upper level?
Block identification: How is a block found if it is in the upper level?
Block replacement: Which block should be replaced on a miss?
Write strategy: What happens on a write?



Block Placement Policy
Fully Associative: a block can be placed anywhere in the upper level (cache)
E.g., block 12 from the main memory can be placed at block 2, block 6, or any of the 8 block locations in the cache


Block Placement Policy
Set Associative: a block can be placed anywhere within one set in the upper level (cache)
The set number in the upper level is given by:
(Block No.) MOD (Number of sets)
E.g., an 8-block, 2-way set associative mapped cache has 4 sets [0-3] of two blocks each; therefore block 12 or block 16 of main memory can go anywhere in set #0, as (12 MOD 4 = 0) and (16 MOD 4 = 0)
Similarly, block 14 can be placed at either of the 2 locations in set #2, as (14 MOD 4 = 2)



Block Placement Policy
Direct Mapped (1-way associative): a block can be placed at only one specific location in the upper level (cache)
The location in the cache is given by:
(Block No.) MOD (Number of cache blocks)
E.g., block 12 or block 20 can be placed only at location 4 in a cache of 8 blocks, as:
12 MOD 8 = 4
20 MOD 8 = 4
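The three placement rules above reduce to simple modulo arithmetic. A minimal sketch reproducing the slide examples (function names are illustrative):

```python
def direct_mapped_slot(block_no, num_cache_blocks):
    # Direct mapped: exactly one legal cache location per memory block
    return block_no % num_cache_blocks

def set_assoc_set(block_no, num_sets):
    # Set associative: the block may go in any way of this one set
    return block_no % num_sets

# Direct mapped, 8-block cache: blocks 12 and 20 collide at location 4
print(direct_mapped_slot(12, 8))  # 4
print(direct_mapped_slot(20, 8))  # 4

# 2-way set associative, 8 blocks -> 4 sets
print(set_assoc_set(12, 4))  # 0
print(set_assoc_set(16, 4))  # 0
print(set_assoc_set(14, 4))  # 2
```

Fully associative is the degenerate case with a single set, so every block maps to set 0 and may occupy any cache location.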


Block Identification
How is a block found if it is in the upper level? Tag/Block
A TAG is associated with each block frame
The TAG gives the block address
All possible TAG locations where a block may be placed are checked in parallel
A valid bit is used to identify whether the block contains correct data
– No need to check the index or block offset


Block Identification: Direct Mapped
Lower level (main) memory: 4 GB, addressed with a 32-bit address
[Figure: the 32-bit address is split into a 22-bit Cache Tag (the upper bits), a 5-bit Cache Index (bits 9-5, e.g. 0x00), and a 5-bit Byte Select (bits 4-0, e.g. 0x00). Each cache entry holds a Valid bit, the 22-bit Cache Tag, and a 32-byte data block (Byte 0 ... Byte 31); the cache has 32 such entries (0-31).]
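The tag/index/byte-select split shown in the direct-mapped figure can be computed with a few bit operations. A minimal sketch assuming the 22-bit tag / 5-bit index / 5-bit byte-select layout above (the helper name is illustrative):

```python
def split_address(addr, index_bits=5, offset_bits=5):
    # Byte select: the lowest offset_bits bits pick a byte within the block
    offset = addr & ((1 << offset_bits) - 1)
    # Cache index: the next index_bits bits pick the cache entry
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    # Cache tag: all remaining upper bits (22 here) are stored and compared
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

print(split_address(0x00000000))  # (0, 0, 0)
tag, index, offset = split_address(0x12345678)
```

On a lookup, `index` selects the entry, and a hit requires the valid bit to be set and the stored tag to equal `tag`; `offset` never participates in the comparison, which is why the slide notes there is no need to check the index or block offset.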



Block Identification
[Figure: here the 32-bit address is split into a 23-bit Cache Tag, a 4-bit Cache Index (bits 8-5), and a 5-bit Byte Select (bits 4-0). Each entry again holds a Valid bit, the Cache Tag, and a 32-byte data block (Byte 0 ... Byte 31), with 16 index positions (0-15).]


Block Replacement Policy
In case of a cache miss, a new block needs to be brought in
If the existing block locations, as defined by the block placement policy, are all filled,
then an existing block has to be evicted based on
– the cache mapping; and
– some block replacement policy


Block Replacement Policy
For the Direct Mapped cache, block replacement is very simple, as a block can be placed at only one location, given by:
(Block No.) MOD (Number of cache blocks)
There are three commonly used replacement schemes for Fully and Set Associative mapped caches
These policies are:


Block Replacement Policy
Random: replace any block
– It is the simplest and easiest to implement
– The candidates for replacement are selected at random
– Some designers use pseudo-random block numbers



Block Replacement Policy
Least Recently Used (LRU): replace the block that has either never been used or was used longest ago
– It reduces the chances of throwing out information that may be needed soon
– Here, the access time and the number of times a block is accessed are recorded
– The block replaced is the one that has not been used for the longest time
– E.g., if the blocks are accessed in the sequence 0, 2, 3, 0, 4, 3, 0, 1, 8, 0, the victim to replace is block 2


Block Replacement Policy
First-in, First-out (FIFO): the block first placed in the cache is thrown out first; e.g., if the blocks are accessed in the sequence 2, 3, 4, 5, 3, 4,
then to bring a new block into the cache, block 2 will be thrown out, as it is the oldest block in the sequence
FIFO is used as an approximation to LRU, since exact LRU can be complicated to compute
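The victim selection described in the LRU and FIFO examples above can be illustrated with a short sketch (hypothetical helper names; each function returns the block each policy would evict next):

```python
from collections import OrderedDict

def lru_victim(access_sequence):
    # Move a block to the back on every access;
    # the front is then the least recently used block
    order = OrderedDict()
    for b in access_sequence:
        order.pop(b, None)
        order[b] = True
    return next(iter(order))

def fifo_victim(access_sequence):
    # Only the first insertion matters; re-accesses do not change the order
    seen = []
    for b in access_sequence:
        if b not in seen:
            seen.append(b)
    return seen[0]

print(lru_victim([0, 2, 3, 0, 4, 3, 0, 1, 8, 0]))  # 2 (slide's LRU example)
print(fifo_victim([2, 3, 4, 5, 3, 4]))             # 2 (slide's FIFO example)
```

The two policies happen to agree on these short sequences; they diverge when an old block keeps being re-accessed, since LRU then protects it while FIFO still evicts it first.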


Block Replacement Policy: Conclusion
Miss rates for LRU vs. Random replacement at three associativities:

Size    | 2-way LRU | 2-way Random | 4-way LRU | 4-way Random | 8-way LRU | 8-way Random
16 KB   | 5.2%      | 5.7%         | 4.7%      | 5.3%         | 4.4%      | 5.0%
64 KB   | 1.9%      | 2.0%         | 1.5%      | 1.7%         | 1.4%      | 1.5%
256 KB  | 1.15%     | 1.17%        | 1.13%     | 1.13%        | 1.12%     | 1.12%



Write Strategy
A cache block must not be overwritten unless main memory is up to date
Multiple CPUs may have individual caches
I/O may address main memory directly
Memory is accessed for both read and write purposes


Write Strategy .. Cont’d
Instruction cache accesses are all reads
Instruction issue dominates the cache traffic, as writes are typically only about 10% of cache accesses
Furthermore, for data caches, writes are 10%-20% of the overall memory accesses


Write Strategy .. Cont’d
In order to optimize cache performance, following Amdahl’s law, we make the common case fast
Fortunately, the common case, i.e., the cache read, is easy to make fast:
– Reads can be optimized by performing the tag checking and the data transfer in parallel
Thus, the cache read performance is good


Write Strategy .. Cont’d
However, in the case of a cache write, modification of the cache contents cannot begin until the tag is checked for an address hit
Therefore, the cache write cannot begin in parallel with the tag checking
Another complication is that the processor specifies the size of the write, which is usually only a portion of the block
Therefore, writes need special consideration


Write Strategy .. Cont’d
– Write back: The information is written only to the block in the cache. The modified cache block is written to main memory only when it is replaced
– Write through: The information is written both to the block in the cache and to the block in the lower-level memory
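The difference between the two policies shows up in how many lower-level writes result from repeated writes to one cached block. A minimal sketch (not a full cache model; the functions just count lower-level write traffic for a block that stays resident until one final eviction):

```python
def write_through_traffic(num_writes_to_block):
    # Write through: every write goes to both the cache and the lower level
    return num_writes_to_block

def write_back_traffic(num_writes_to_block):
    # Write back: all writes hit only the cache; the dirty block is written
    # to the lower level once, when it is eventually replaced
    return 1 if num_writes_to_block > 0 else 0

print(write_through_traffic(100))  # 100 lower-level writes
print(write_back_traffic(100))     # 1 lower-level write (on eviction)
```

This is why the next slide notes that write back reduces memory bandwidth demand: 100 repeated writes collapse into a single block writeback, at the cost of tracking a dirty bit per block.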




Write Strategy: Pros and Cons of Each
Write Back:
– No write to the lower level for repeated writes to the cache
– A dirty bit is commonly used to indicate whether the cache block is modified (dirty) or not modified (clean)
– Reduces the memory-bandwidth requirements, and hence reduces the memory power requirements

