CS703 – Advanced
Operating Systems
By Mr. Farhan Zaidi
Lecture No. 9
Overview of today’s lecture
Shared variable analysis in multi-threaded
programs
Concurrency and synchronization
Critical sections
Solutions to the critical section problem
Concurrency examples
Re-cap of lecture
Shared Variables in Threaded C Programs
Question: Which variables in a threaded C program are shared
variables?
The answer is not as simple as “global variables are
shared” and “stack variables are private”.
Requires answers to the following questions:
What is the memory model for threads?
How are variables mapped to memory instances?
How many threads reference each of these instances?
Threads Memory Model
Conceptual model:
Each thread runs in the context of a process.
Each thread has its own separate thread context:
thread ID, stack, stack pointer, program counter, condition
codes, and general-purpose registers.
All threads share the remaining process context:
code, data, heap, and shared library segments of the
process virtual address space;
open files and installed handlers.
Operationally, this model is not strictly enforced:
while register values are truly separate and protected,
any thread can read and write the stack of any other thread.
The mismatch between the conceptual and operational model is a source of
confusion and errors.
What resources are shared?
Local variables are not shared
they refer to data on the stack, and each thread has its own stack
never pass/share/store a pointer to a local variable on another
thread's stack!
Global variables are shared
stored in the static data segment, accessible by any thread
Dynamic objects are shared
stored in the heap, shared if you can name them
in C, you can conjure up the pointer
e.g., void *x = (void *) 0xDEADBEEF;
in Java, strong typing prevents this
references must be passed explicitly
Synchronization
Threads cooperate in multithreaded programs
to share resources, access shared data structures
e.g., threads accessing a memory cache in a web server
also, to coordinate their execution
e.g., a disk reader thread hands off blocks to a network writer
thread through a circular buffer
For correctness, we have to control this cooperation
We control cooperation using synchronization
must assume threads interleave executions arbitrarily and at
different rates
scheduling is not under application writers’ control
enables us to restrict the interleaving of executions
Note: this also applies to processes, not just threads
It also applies across machines in a distributed system
Example of Threads Accessing
Another Thread’s Stack
char **ptr;
/* global */
int main()
{
int i;
pthread_t tid;
char *msgs[2] = {
"Hello from foo",
"Hello from bar"
};
ptr = msgs;
for (i = 0; i < 2; i++)
Pthread_create(&tid,
NULL,
thread,
(void *)i);
Pthread_exit(NULL);
}
/* thread routine */
void *thread(void *vargp)
{
int myid = (int)vargp;
static int svar = 0;
printf("[%d]: %s (svar=%d)\n",
myid, ptr[myid], ++svar);
}
Peer threads access main thread’s stack
indirectly through global ptr variable
Mapping Variables to Mem. Instances
Global var: 1 instance (ptr [data])
Local automatic vars: 1 instance each (i.m, msgs.m)
char **ptr;
/* global */
int main()
{
int i;
pthread_t tid;
char *msgs[2] = {
"Hello from foo",
"Hello from bar"
};
ptr = msgs;
for (i = 0; i < 2; i++)
Pthread_create(&tid,
NULL,
thread,
(void *)i);
Pthread_exit(NULL);
}
Local automatic var: 2 instances
(myid.p0 [peer thread 0's stack], myid.p1 [peer thread 1's stack])
/* thread routine */
void *thread(void *vargp)
{
int myid = (int)vargp;
static int svar = 0;
printf("[%d]: %s (svar=%d)\n",
myid, ptr[myid], ++svar);
}
Local static var: 1 instance (svar [data])
Shared Variable Analysis
Which variables are shared?

Variable     Referenced by    Referenced by     Referenced by
instance     main thread?     peer thread 0?    peer thread 1?
ptr          yes              yes               yes
svar         no               yes               yes
i.m          yes              no                no
msgs.m       yes              yes               yes
myid.p0      no               yes               no
myid.p1      no               no                yes
A variable x is shared iff multiple threads reference at least one
instance of x. Thus:
ptr, svar, and msgs are shared.
i and myid are NOT shared.
badcnt.c: An Improperly Synchronized
Threaded Program
unsigned int cnt = 0; /* shared */
#define NITERS 100000000
int main() {
pthread_t tid1, tid2;
Pthread_create(&tid1, NULL,
count, NULL);
Pthread_create(&tid2, NULL,
count, NULL);
Pthread_join(tid1, NULL);
Pthread_join(tid2, NULL);
if (cnt != (unsigned)NITERS*2)
printf("BOOM! cnt=%u\n",
cnt);
else
printf("OK cnt=%u\n",
cnt);
}
/* thread routine */
void *count(void *arg) {
int i;
for (i = 0; i < NITERS; i++)
cnt++;
return NULL;
}
linux> ./badcnt
BOOM! cnt=198841183
linux> ./badcnt
BOOM! cnt=198261801
linux> ./badcnt
BOOM! cnt=198269672
cnt should be
equal to 200,000,000.
What went wrong?!
Assembly Code for Counter Loop
C code for counter loop:

for (i = 0; i < NITERS; i++)
    cnt++;

Corresponding asm code (gcc -O0 -fforce-mem):

.L9:
    movl -4(%ebp),%eax       # Head (Hi)
    cmpl $99999999,%eax
    jle .L12
    jmp .L10
.L12:
    movl cnt,%eax            # Load cnt (Li)
    leal 1(%eax),%edx        # Update cnt (Ui)
    movl %edx,cnt            # Store cnt (Si)
.L11:
    movl -4(%ebp),%eax       # Tail (Ti)
    leal 1(%eax),%edx
    movl %edx,-4(%ebp)
    jmp .L9
.L10:
Concurrent Execution
Key idea: In general, any sequentially consistent interleaving is
possible, but some are incorrect!
Ii denotes that thread i executes instruction I
%eaxi is the contents of %eax in thread i’s context
i (thread)   instr_i   %eax1   %eax2   cnt
1            H1        -       -       0
1            L1        0       -       0
1            U1        1       -       0
1            S1        1       -       1
2            H2        -       -       1
2            L2        -       1       1
2            U2        -       2       1
2            S2        -       2       2
2            T2        -       2       2
1            T1        1       -       2

OK: cnt == 2
Concurrent Execution (cont)
Incorrect ordering: two threads increment the counter, but the
result is 1 instead of 2.
i (thread)   instr_i   %eax1   %eax2   cnt
1            H1        -       -       0
1            L1        0       -       0
1            U1        1       -       0
2            H2        -       -       0
2            L2        -       0       0
1            S1        1       -       1
1            T1        1       -       1
2            U2        -       1       1
2            S2        -       1       1
2            T2        -       1       1

Oops! cnt == 1
Concurrent Execution (cont)
How about this ordering?
i (thread)   instr_i   %eax1   %eax2   cnt
1            H1
1            L1
2            H2
2            L2
2            U2
2            S2
1            U1
1            S1
1            T1
2            T2

(work out %eax1, %eax2, and cnt at each step)
We can clarify our understanding of concurrent
execution with the help of the progress graph
Progress Graphs
A progress graph depicts the discrete execution state space of
concurrent threads.
Each axis corresponds to the sequential order of instructions
in a thread.
Each point corresponds to a possible execution state (Inst1, Inst2).
E.g., (L1, S2) denotes the state where thread 1 has completed L1
and thread 2 has completed S2.

[Figure: progress graph with thread 1's instructions H1, L1, U1, S1, T1
on the horizontal axis, thread 2's instructions H2, L2, U2, S2, T2 on
the vertical axis, and the state (L1, S2) marked.]
Trajectories in Progress Graphs
A trajectory is a sequence of legal state transitions that describes
one possible concurrent execution of the threads.

Example: H1, L1, U1, H2, L2, S1, T1, U2, S2, T2

[Figure: this trajectory drawn as a path through the progress graph,
stepping right for each thread 1 instruction and up for each thread 2
instruction.]
Critical Sections and Unsafe Regions
L, U, and S form a critical section with respect to the shared
variable cnt.
Instructions in critical sections (wrt some shared variable) should
not be interleaved.
Sets of states where such interleaving occurs form unsafe regions.

[Figure: progress graph with the critical section wrt cnt (L, U, S)
marked along both axes; the rectangle where the two critical sections
overlap is the unsafe region.]
Safe and Unsafe Trajectories
Def: A trajectory is safe iff it doesn't touch any part of an unsafe
region.
Claim: A trajectory is correct (wrt cnt) iff it is safe.

[Figure: progress graph showing a safe trajectory that skirts the
unsafe region and an unsafe trajectory that passes through it.]
The classic example
Suppose we have to implement a function to withdraw money from a
bank account:
int withdraw(account, amount) {
int balance = get_balance(account);
balance -= amount;
put_balance(account, balance);
return balance;
}
Now suppose a husband and wife share a bank account with a balance
of $100.00
what happens if both go to separate ATMs and simultaneously
withdraw $10.00 from the account?
Represent the situation by creating a separate thread for each person to
do the withdrawals
have both threads run on the same bank mainframe, each executing its
own activation of withdraw:

Thread 1:                                Thread 2:
int withdraw(account, amount) {          int withdraw(account, amount) {
  int balance = get_balance(account);      int balance = get_balance(account);
  balance -= amount;                       balance -= amount;
  put_balance(account, balance);           put_balance(account, balance);
  return balance;                          return balance;
}                                        }
Interleaved schedules
The problem is that the execution of the two threads can be interleaved,
assuming preemptive scheduling. Execution sequence as seen by the CPU:

balance = get_balance(account);      (thread 1)
balance -= amount;
                                     <-- context switch
balance = get_balance(account);      (thread 2)
balance -= amount;
put_balance(account, balance);
                                     <-- context switch
put_balance(account, balance);       (thread 1)

What's the account balance after this sequence?
Who's happy, the bank or you?
How often is this unfortunate sequence likely to occur?
Race conditions and concurrency
Atomic operation: operation always runs to completion, or
not at all. Indivisible, can't be stopped in the middle.
On most machines, memory references and assignments
(loads and stores) of aligned words are atomic.
Many instructions are not atomic. For example, on most 32-bit
architectures, double precision floating point store is not atomic;
it involves two separate memory operations.
The crux of the problem
The problem is that two concurrent threads (or processes)
access a shared resource (account) without any synchronization
creates a race condition
output is non-deterministic, depends on timing
We need mechanisms for controlling access to shared resources
in the face of concurrency
so we can reason about the operation of programs
essentially, re-introducing determinism
Synchronization is necessary for any shared data structure
buffers, queues, lists, hash tables, scalars, …