CS703 Advanced Operating Systems
By Mr. Farhan Zaidi
Lecture No. 25
Overview of today’s lecture
Segmentation
Combined Segmentation and paging
Efficient translations and caching
Translation Lookaside Buffer (TLB)
Segmentation
Paging
mitigates various memory allocation complexities (e.g.,
fragmentation)
view an address space as a linear array of bytes
divide it into pages of equal size (e.g., 4KB)
use a page table to map virtual pages to physical page frames
page (logical) => page frame (physical)
Segmentation
partition an address space into logical units
stack, code, heap, subroutines, …
a virtual address is <segment #, offset>
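The paging bullets above can be sketched in a few lines. This is only an illustration, assuming 4KB pages and a made-up page table; the names PAGE_SIZE, page_table and translate_paged are hypothetical and not from the lecture.

```python
PAGE_SIZE = 4096  # 4KB pages, as in the example above

# Hypothetical page table: virtual page number -> physical page frame number.
page_table = {0: 5, 1: 2, 2: 7}


def translate_paged(vaddr):
    """Split a linear virtual address into (page, offset) and map page -> frame."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError(f"page fault: virtual page {vpn} is not mapped")
    return page_table[vpn] * PAGE_SIZE + offset


# Virtual address 4100 is page 1, offset 4; page 1 lives in frame 2.
print(translate_paged(4100))  # 8196
```

Every page has the same size, so the split is plain arithmetic. With segmentation the address already arrives as a (segment #, offset) pair and each segment can have a different length.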
What’s the point?
More “logical”
without segmentation, a linker takes a bunch of
independent modules that call each other and organizes
them into one linear address space
the modules really are independent; segmentation treats
them as such
Facilitates sharing and reuse
a segment is a natural unit of sharing – a subroutine or
function
A natural extension of variable-sized partitions
variable-sized partition = 1 segment/process
segmentation = many segments/process
Hardware support
Segment table
multiple base/limit pairs, one per segment
segments named by segment #, used as index
into table
a virtual address is <segment #, offset>
offset of virtual address added to base address of
segment to yield physical address
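The base/limit lookup described above could look roughly like this. It is a software sketch of what the hardware does, under assumed values: the table contents and the names Segment, ProtectionFault and translate_segmented are made up for illustration.

```python
from collections import namedtuple

# One base/limit pair per segment (values below are made up).
Segment = namedtuple("Segment", ["base", "limit"])


class ProtectionFault(Exception):
    """Raised when an access falls outside its segment."""


segment_table = [
    Segment(base=0x4000, limit=0x1000),  # segment 0
    Segment(base=0x9000, limit=0x0400),  # segment 1
]


def translate_segmented(seg_num, offset):
    """Translate <segment #, offset> to a physical address."""
    if not 0 <= seg_num < len(segment_table):
        raise ProtectionFault(f"no such segment: {seg_num}")
    seg = segment_table[seg_num]          # segment # indexes the table
    if offset >= seg.limit:               # offset is checked against the limit
        raise ProtectionFault(f"offset {offset:#x} exceeds limit {seg.limit:#x}")
    return seg.base + offset              # base + offset = physical address


print(hex(translate_segmented(1, 0x10)))  # 0x9010
```

An offset of 0x400 or more in segment 1 would raise ProtectionFault instead of returning an address.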
Segment lookups
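[Figure: the segment # of the virtual address indexes the segment table to fetch a base/limit pair. The offset is compared against the limit: if it is within the limit (yes), base + offset gives the physical address inside that segment's region of physical memory (segments 0 to 4 shown); if not (no), a protection fault is raised.]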
Segmentation pros & cons
+ efficient for sparse address spaces
+ easy to share whole segments (for example, code segment)
Sharing requires a protection mode in each segment table entry. For
example, a code segment would be read-only (only execution and
loads are allowed). Data and stack segments would be read-write
(stores allowed).
- complex memory allocation
  Still need first fit, best fit, etc., and re-shuffling to coalesce free
  fragments, if no single free space is big enough for a new
  segment.
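To see why allocation is the weak point, here is a minimal first-fit sketch. The free list contents and the name first_fit are assumptions for illustration; a real allocator would also handle freeing and coalescing.

```python
def first_fit(free_list, size):
    """Carve `size` bytes out of the first hole that is big enough.

    free_list is a list of (start, length) holes, modified in place.
    Returns the start address of the new segment, or None if no
    single hole is large enough.
    """
    for i, (start, length) in enumerate(free_list):
        if length >= size:
            if length == size:
                del free_list[i]                      # hole used up exactly
            else:
                free_list[i] = (start + size, length - size)  # shrink the hole
            return start
    return None


holes = [(0, 100), (300, 200), (700, 150)]   # made-up free fragments
a = first_fit(holes, 120)                    # fits in the hole at 300
b = first_fit(holes, 250)                    # no single hole is big enough
total_free = sum(length for _, length in holes)
print(a, b, total_free)  # 300 None 330
```

The second request fails even though 330 bytes are free in total, which is exactly the case where segments must be re-shuffled to coalesce the fragments.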
Linux:
1 kernel code segment, 1 kernel data segment
1 user code segment, 1 user data segment
N task state segments (store registers on context
switch)
1 “local descriptor table” segment (not really used)
all of these segments are paged
Segmentation with paging translation
Segmentation with paging translation: Pros & Cons
+ only need to allocate as many page table entries as we need.
In other words, sparse address spaces are easy.
+ easy memory allocation
+ share at segment or page level
- one pointer per page (typically 4KB - 16KB pages today)
- page tables need to be contiguous
- two lookups per memory reference
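The two lookups can be made concrete with a small sketch. It assumes each segment table entry points to that segment's own page table, stored here as a Python list of frame numbers; the table contents and the name translate_seg_paged are hypothetical.

```python
PAGE_SIZE = 4096

# Hypothetical segment table: segment # -> that segment's page table.
# Each page table is a contiguous list with one entry per page actually
# in the segment, so a short segment costs only a few entries.
seg_table = {
    0: [8, 3],       # code: 2 pages
    1: [11],         # heap: 1 page
    3: [6, 7, 1],    # stack: 3 pages
}


def translate_seg_paged(seg_num, seg_offset):
    """Translate <segment #, offset within segment> to a physical address."""
    page_tbl = seg_table.get(seg_num)        # lookup 1: segment table
    if page_tbl is None:
        raise LookupError(f"invalid segment {seg_num}")
    vpn, offset = divmod(seg_offset, PAGE_SIZE)
    if vpn >= len(page_tbl):                 # page table length acts as the limit
        raise LookupError(f"page {vpn} is beyond the end of segment {seg_num}")
    frame = page_tbl[vpn]                    # lookup 2: per-segment page table
    return frame * PAGE_SIZE + offset


print(translate_seg_paged(0, 4096 + 5))  # 12293
```

Physical frames can be handed out anywhere in memory, so allocation stays easy, and sharing a segment amounts to pointing two segment table entries at the same page table.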
Integrating VM and Cache
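[Figure: the CPU issues a virtual address (VA); translation produces a physical address (PA), which is looked up in the cache. On a hit the cache returns the data to the CPU; on a miss the access goes to main memory.]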
Most Caches “Physically Addressed”
Accessed by physical addresses
Allows multiple processes to have blocks in cache at same time
Allows multiple processes to share pages
Cache doesn’t need to be concerned with protection issues
Access rights checked as part of address translation
Perform Address Translation Before Cache Lookup
But this could involve a memory access itself (of the PTE)
Of course, page table entries can also become cached
Caching review
Cache: a copy that can be accessed more quickly than the original.
The idea is to make the frequent case efficient; the infrequent path
doesn't matter as much. Caching is a fundamental concept used in lots
of places in computer systems. It underlies many of the
techniques that are used today to make computers go fast: we can
cache translations, memory locations, pages, file blocks, file
names, network routes, authorizations for security systems, etc.
Generic Issues in Caching
Cache hit: item is in the cache
Cache miss: item is not in the cache, have to do full operation
Effective access time = P(hit) * cost of hit + P(miss) * cost of miss
1. How do you find whether item is in the cache (whether there
is a cache hit)?
2. If it is not in cache (cache miss), how do you choose what to
replace from cache to make room?
3. Consistency -- how do you keep cache copy consistent with
real version?
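The effective access time formula above is easy to try out. The function name and the numbers are made up for illustration, and P(miss) is taken to be 1 - P(hit).

```python
def effective_access_time(p_hit, hit_cost, miss_cost):
    """P(hit) * cost of hit + P(miss) * cost of miss, with P(miss) = 1 - P(hit)."""
    return p_hit * hit_cost + (1 - p_hit) * miss_cost


# Made-up costs: a hit takes 4 time units, a miss takes 100.
print(effective_access_time(0.75, 4, 100))  # 28.0
```

Even with three hits out of four, the misses dominate the average here, which is why the hit rate matters so much.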
Speeding up Translation with a TLB
“Translation Lookaside Buffer” (TLB)
Small hardware cache in MMU
Maps virtual page numbers to physical page numbers
Contains complete page table entries for small number of pages
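[Figure: the CPU sends a virtual address (VA) to the TLB lookup. On a TLB hit the physical address (PA) goes straight to the cache; on a TLB miss the full translation is performed first. The cache then returns the data on a hit, or the access goes to main memory on a miss.]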
Translation Buffer, Translation Lookaside Buffer
Hardware table of frequently used translations, to avoid having to
go through the page table lookup in the common case. Typically
on chip, so access time is 2-5ns, instead of 30-100ns for main
memory.
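A toy model shows how a small table of translations absorbs most lookups. Everything here is an assumption for illustration: a 4-entry TLB modeled as a dict, FIFO eviction, a made-up page table, and the names tlb, stats and translate_with_tlb.

```python
PAGE_SIZE = 4096
TLB_ENTRIES = 4

# Hypothetical page table: virtual page number -> physical frame number.
page_table = {vpn: vpn + 100 for vpn in range(64)}

tlb = {}                       # small cache of recent translations
stats = {"hit": 0, "miss": 0}


def translate_with_tlb(vaddr):
    """Translate vaddr, consulting the TLB before the page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:
        stats["hit"] += 1                  # common case: no page table access
    else:
        stats["miss"] += 1
        if len(tlb) >= TLB_ENTRIES:
            tlb.pop(next(iter(tlb)))       # evict the oldest entry (FIFO, a simplification)
        tlb[vpn] = page_table[vpn]         # slow path: full page table lookup
    return tlb[vpn] * PAGE_SIZE + offset


# Walk two pages one word at a time: only the first touch of each page misses.
for addr in range(0, 2 * PAGE_SIZE, 4):
    translate_with_tlb(addr)
print(stats)  # {'hit': 2046, 'miss': 2}
```

Programs tend to reuse the same few pages, so even a tiny TLB gets a very high hit rate on a pattern like this.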
How do we tell if needed translation is in TLB?
1. Search table in sequential order
2. Direct mapped: restrict each virtual page to use specific slot
in TLB
For example, use upper bits of virtual page number to index TLB.
Compare against lower bits of virtual page number to check for
match.
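What if two pages conflict for the same TLB slot? Ex: program
counter and stack.
One approach: pick a hash function to minimize conflicts.
What if we use low-order bits as index into TLB? (Neighboring pages
spread across slots, but distant regions can still collide.)
What if we use high-order bits as index into TLB? (All pages of one
region, such as the code, land in the same slot.)
Thus, use a selection of high-order and low-order bits as index.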
3. Set associativity: arrange TLB (or cache) as N separate
banks. Do simultaneous lookup in each bank. In this case, it is
called an "N-way set associative cache".
More set associativity, less chance of thrashing. Translations
can be stored or replaced in any bank.
4. Fully associative: translation can be stored anywhere in
TLB, so check all entries in the TLB in parallel.
Direct mapped TLB
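The lookup schemes above can be compared with one small model in which a direct mapped TLB is simply the 1-way case. This is a sketch under assumptions: the index is taken from the low-order bits of the virtual page number (one of the choices raised above), the oldest entry in a set is evicted, and the class name, count_misses and the frame numbers are made up.

```python
class SetAssociativeTLB:
    """N-way set associative TLB; ways=1 gives a direct mapped TLB."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [[] for _ in range(num_sets)]   # each entry: (tag, frame)

    def lookup(self, vpn):
        index = vpn % self.num_sets      # low-order bits pick the set
        tag = vpn // self.num_sets       # remaining bits are compared
        for entry_tag, frame in self.sets[index]:
            if entry_tag == tag:
                return frame
        return None

    def insert(self, vpn, frame):
        index = vpn % self.num_sets
        tag = vpn // self.num_sets
        entries = self.sets[index]
        if len(entries) >= self.ways:
            entries.pop(0)               # evict the oldest entry in this set
        entries.append((tag, frame))


def count_misses(tlb, trace):
    """Run a trace of virtual page numbers through the TLB and count misses."""
    misses = 0
    for vpn in trace:
        if tlb.lookup(vpn) is None:
            misses += 1
            tlb.insert(vpn, vpn + 100)   # made-up frame number
    return misses


trace = [0, 8] * 4                       # two pages that map to the same slot
direct_misses = count_misses(SetAssociativeTLB(num_sets=8, ways=1), trace)
assoc_misses = count_misses(SetAssociativeTLB(num_sets=4, ways=2), trace)
print(direct_misses, assoc_misses)  # 8 2
```

Both configurations hold 8 translations, but in the direct mapped one pages 0 and 8 keep evicting each other (thrashing), while the 2-way version keeps both after the first two misses.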