CS 704
Advanced Computer Architecture
Lecture 26
Memory Hierarchy Design
(Concept of Caching and Principle of Locality)
Prof. Dr. M. Ashraf Chughtai
Today’s Topics
Recap: Storage trends and memory hierarchy
Concept of Cache Memory
Principle of Locality
Cache Addressing Techniques
RAM vs. Cache Transaction
Summary
MAC/VU-Advanced
Computer Architecture
Lecture 26 Memory Hierarchy (2)
Recap: Storage Devices
Design features of semiconductor
memories
SRAM
DRAM
magnetic disk storage
Recap: Speed and Cost per byte
– DRAM is slower but cheaper than SRAM
   it serves as the main memory of the processor, holding a
   moderately large amount of data and instructions
– Disk storage is the slowest and cheapest
   it serves as secondary storage, holding the bulk of the
   data and instructions
Recap: CPU-Memory Access-Time
– The gap between the speed of the processor and the speed of
DRAM and disk is increasing very fast with time, much faster
than the gap between the processor and SRAM
CPU-Memory Gap … Cont’d
[Figure: disk seek time, DRAM access time, SRAM access time and CPU cycle time, in ns (logarithmic axis, 1 to 100,000,000), plotted against year, 1980 to 2000]
Memory Hierarchy Principles
The speed of DRAM and CPU
complement each other
Organize memory in a hierarchy, based on:
– the Concept of Caching; and
– the Principle of Locality
1: Concept of Caching
A cache is a staging area or temporary place used to:
– store a frequently used subset of the data
or instructions from the relatively
cheaper, larger and slower memory; and
– avoid having to go to the main
memory every time this information is
needed
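The staging-area idea can be sketched in a few lines of C. This is only an illustrative toy, not a description of real cache hardware: the names `toy_cache` and `cached_read`, the sizes `MEM_WORDS` and `CACHE_SLOTS`, and the modulo placement rule are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS   1024  /* size of the simulated slow memory (assumption) */
#define CACHE_SLOTS 8     /* size of the small staging area (assumption) */

typedef struct {
    int    valid[CACHE_SLOTS];  /* 1 if the slot holds a copy */
    size_t addr[CACHE_SLOTS];   /* which memory word the slot holds */
    int    data[CACHE_SLOTS];   /* the cached copy of that word */
    long   hits, misses;        /* bookkeeping for the experiment */
} toy_cache;

/* Read one word through the cache; go to the slow memory only on a miss. */
int cached_read(toy_cache *c, const int *mem, size_t a)
{
    assert(a < MEM_WORDS);
    size_t slot = a % CACHE_SLOTS;  /* simple placement rule (assumption) */

    if (c->valid[slot] && c->addr[slot] == a) {
        c->hits++;                  /* served from the staging area */
        return c->data[slot];
    }
    c->misses++;                    /* fetch from slow memory, keep a copy */
    c->valid[slot] = 1;
    c->addr[slot]  = a;
    c->data[slot]  = mem[a];
    return c->data[slot];
}
```

Starting from a zero-initialized `toy_cache`, reading the same four words ten times in a row goes to the slow memory only on the first pass; every later read is a hit.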
Caching and Memory Hierarchy
A memory device of a different type is
used at each level k of the hierarchy
– The faster, smaller device at level k
serves as a cache for the larger,
slower device at level k+1
– Programs tend to access the
data or instructions at level k more
often than they access the data at
level k+1
Caching and Memory Hierarchy
– Storage at level k+1 can therefore be slower,
but larger and cheaper per bit
Net effect: a large pool of memory that costs
about as much as the cheap storage at the
highest-numbered level (near the bottom of the hierarchy),
but serves data or instructions at the rate
of the fast storage at the lowest-numbered level
(near the top of the hierarchy)
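A rough way to see why the large pool can appear almost as fast as the small one is to weight the two latencies by how often each level is hit. The sketch below is a simplified two-level model, not part of the lecture: the function name `average_access_cycles` and the assumption that a hit costs exactly `t_fast` cycles and a miss exactly `t_slow` cycles are introduced here for illustration only.

```c
#include <assert.h>

/* Average access time, in cycles, of a two-level hierarchy.
   Simplifying assumption: a hit at level k costs t_fast cycles,
   a miss costs t_slow cycles (the latency of level k+1). */
double average_access_cycles(double hit_rate, double t_fast, double t_slow)
{
    assert(hit_rate >= 0.0 && hit_rate <= 1.0);
    return hit_rate * t_fast + (1.0 - hit_rate) * t_slow;
}
```

For example, if 99% of accesses hit a 1-cycle level and the rest go to a 100-cycle level, the average under this model is about 1.99 cycles, far closer to the fast level than to the slow one.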
Examples of Caching in the Hierarchy
Cache Type            What Cached            Where Cached          Latency (cycles)   Managed By
Registers             4-byte word            CPU registers         0                  Compiler
TLB                   Address translations   On-Chip TLB           0                  Hardware
L1 cache              32-byte block          On-Chip L1            1                  Hardware
L2 cache              32-byte block          Off-Chip L2           10                 Hardware
Virtual Memory        4-KB page              Main memory           100                Hardware + OS
Buffer cache          Parts of files         Main memory           100                OS
Network buffer cache  Parts of files         Local disk            10,000,000         AFS/NFS client
Browser cache         Web pages              Local disk            10,000,000         Web browser
Web cache             Web pages              Remote server disks   1,000,000,000      Web proxy server
2: Principle of Locality
Programs access a relatively small
portion of the address space at any
instant of time
E.g., we all have a lot of friends, but at
any given time most of us can only
keep in touch with a small group of
them
Principle of Locality
[Figure: a bookshelf with sections labelled Physics, Chemistry, Civil Engg, Electrical Engg., Computers, Electronics and Literature]
We select 4 books, 2 each of Electronics and
Computers, and place them on
a small table for fast access
Types of Locality
Temporal
Spatial
Temporal locality is locality in time:
if an item is referenced, it will
tend to be referenced again soon.
Types of Locality
Spatial locality
It is the locality in space. It says if an
item is referenced, items whose
addresses are close by tend to be
referenced soon
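Spatial locality is easy to see when the same two-dimensional array is traversed in two different orders. The sketch below is an illustration added here, not taken from the lecture; the names `sum_row_major` and `sum_col_major` and the sizes `ROWS` and `COLS` are assumptions. C stores arrays row by row, so the first loop touches adjacent addresses while the second jumps `COLS` elements between consecutive accesses.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 64  /* array dimensions (assumption) */
#define COLS 64

/* Row-major order: consecutive accesses touch adjacent addresses,
   so the traversal has good spatial locality. */
long sum_row_major(int a[ROWS][COLS])
{
    assert(a != NULL);
    long sum = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Column-major order: consecutive accesses are COLS elements apart,
   so the traversal has poor spatial locality. */
long sum_col_major(int a[ROWS][COLS])
{
    assert(a != NULL);
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}
```

Both functions return the same sum; they differ only in the order in which addresses are referenced, which is what a memory hierarchy is sensitive to.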
A well-written program tends to reuse data
and instructions which are:
– either near those it has used recently
(spatial locality)
– or were recently referenced
themselves (temporal locality)
Principle of Locality
– Spatial locality: Items with nearby
addresses (i.e., nearby in space) should be
located at the same level, as they
tend to be referenced close together
in time
– Temporal locality: Recently
referenced items (i.e., referenced
close in time) should be placed at the same
memory level, as they are likely to be
referenced again in the near future
Locality Example: Program
sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];
return sum;
Locality Example
sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];
return sum;
Spatial Locality:
All the array elements a[i] (the data) are
referenced in succession, one per loop
iteration, so the array elements should be
located at the same level
All the instructions of the loop are
referenced repeatedly in sequence and
should therefore be located at the same level
Locality Example
sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];
return sum;
Temporal Locality
The data item sum is referenced in each
iteration; i.e., recently referenced data is
referenced again in each iteration
The instructions of the loop, sum += a[i], are
executed again each time the program cycles through the loop
Based on Locality Principle
How does the Memory Hierarchy work?
– The memory hierarchy keeps the
more recently accessed data items
closer to the processor, because
chances are the processor will
access them again soon
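One common way to keep recently accessed items close is to evict the least recently used item when the fast level is full. The lecture does not name a replacement policy, so the sketch below is only one possible illustration: the names `lru_level` and `lru_access` and the capacity `WAYS` are assumptions for this example.

```c
#include <assert.h>

#define WAYS 4  /* capacity of the toy fast level (assumption) */

typedef struct {
    long tag[WAYS];    /* addresses currently held */
    long stamp[WAYS];  /* time of last use; 0 means the entry is empty */
    long clock;        /* counts accesses */
    long hits, misses;
} lru_level;

/* Access one address; on a miss, replace the least recently used entry. */
void lru_access(lru_level *l, long addr)
{
    assert(l != 0);
    int victim = 0;
    l->clock++;
    for (int i = 0; i < WAYS; i++) {
        if (l->stamp[i] != 0 && l->tag[i] == addr) {
            l->stamp[i] = l->clock;  /* mark as most recently used */
            l->hits++;
            return;
        }
        if (l->stamp[i] < l->stamp[victim])
            victim = i;              /* oldest (or empty) entry so far */
    }
    l->misses++;
    l->tag[victim]   = addr;         /* bring the item closer */
    l->stamp[victim] = l->clock;
}
```

Starting from a zero-initialized `lru_level`, an address that was touched recently is found again on its next access, while an address that has not been used for a while is the first to be pushed out.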
Based on Locality Principle
How does the Memory Hierarchy work?
NOT ONLY do we move the item
that has just been accessed
closer to the processor, but we
ALSO move the data items that
are adjacent to it
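The effect of moving adjacent items together can be estimated with a small counting experiment. The sketch below is a toy model added for illustration: the function name `sequential_scan_misses` and the idea of holding only one block at a time are assumptions, chosen to keep the example short.

```c
#include <assert.h>

/* Count the misses for a sequential scan of n words when every miss
   brings in a whole block of block_words adjacent words.
   Toy model: only the most recently fetched block is held. */
long sequential_scan_misses(long n, long block_words)
{
    assert(block_words > 0);
    long misses = 0;
    long cached_block = -1;  /* block currently held; -1 means none */

    for (long i = 0; i < n; i++) {
        long block = i / block_words;  /* block containing word i */
        if (block != cached_block) {
            misses++;                  /* fetch word i and its neighbours */
            cached_block = block;
        }
    }
    return misses;
}
```

Scanning 64 words one word at a time misses 64 times in this model, while fetching 8 adjacent words per miss reduces that to 8 misses, because the neighbours are already in place when the loop reaches them.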
Hierarchy List
Level 0   Register File   Datapath
Level 1   L1              Cache on Chip
Level 2   L2              External Cache
Level 3   Main memory     System Board DRAM
Level 4   Disk cache      Disk drive
Level 5   Disk            Magnetic disk
Level 6   Optical         CDs etc., bulk storage
Level 7   Tape            Huge, cheapest storage
Intel Processor Cache
80386
– no on-chip cache
80486
– 8 KB on-chip cache, using 16-byte lines
Pentium (all versions)
– two on-chip L1 caches
– one for data and one for instructions
Pentium 4
– L1 caches: two, 8 KB each
– L2 cache: 256 KB, feeding both L1 caches
Cache Devices
A cache device is a small SRAM which is
made directly accessible to the processor;
and
DRAM, which is accessible by the cache as
well as by the user or programmer, is
placed at the next higher level as the Main Memory
Larger storage, such as disk, is placed further away,
beyond the main memory