Operating Systems
Principles and Practice

Anderson and Dahlin
v. 0.22


Operating Systems: Principles and Practice
Version 0.22
Base revision e8814fe, Fri Jan 13 14:51:02 2012 -0600.
Copyright © 2011-2012 by Thomas Anderson and Michael Dahlin, all rights reserved.


Contents

Preface

1 Introduction
  1.1 What is an operating system?
  1.2 Evaluation Criteria
  1.3 A brief history of operating systems

I Kernels and Processes

2 The Kernel Abstraction
  2.1 The process concept
  2.2 Dual-mode operation
  2.3 Safe control transfer
  2.4 Case Study: Booting an operating system kernel
  2.5 Case Study: Virtual machines
  2.6 Conclusion and future directions

3 The Programming Interface
  3.1 Process management
  3.2 Input/output
  3.3 Case Study: Implementing a shell
  3.4 Case Study: Interprocess communication
  3.5 Operating system structure
  3.6 Conclusion and future directions

II Concurrency

4 Concurrency and Threads
  4.1 Threads: Abstraction and interface
  4.2 Simple API and example
  4.3 Thread internals
  4.4 Implementation details
  4.5 Asynchronous I/O and event-driven programming
  4.6 Conclusion and future directions

5 Synchronizing Access to Shared Objects
  5.1 Challenges
  5.2 Shared objects and synchronization variables
  5.3 Lock: Mutual Exclusion
  5.4 Condition variables: Waiting for a change
  5.5 Implementing synchronization objects
  5.6 Designing and implementing shared objects
  5.7 Conclusions

6 Advanced Synchronization
  6.1 Multi-object synchronization
  6.2 Deadlock
  6.3 Alternative approaches to synchronization
  6.4 Conclusion

7 Scheduling
  7.1 Uniprocessor scheduling
  7.2 Multiprocessor scheduling
  7.3 Energy-aware scheduling
  7.4 Real-time scheduling
  7.5 Queuing theory
  7.6 Case Study: data center servers

III Memory Management

8 Address Translation
  8.1 Address translation concept
  8.2 Segmentation and Paging
  8.3 Efficient address translation
  8.4 Software address translation
  8.5 Conclusions and future directions

9 Caching and Virtual Memory
  9.1 Cache concept: when it works and when it doesn’t
  9.2 Hardware cache management
  9.3 Memory mapped files and virtual memory
  9.4 Conclusions and future directions

10 Applications of Memory Management
  10.1 Zero copy input/output
  10.2 Copy on write
  10.3 Process checkpointing
  10.4 Recoverable memory
  10.5 Information flow control
  10.6 External pagers
  10.7 Virtual machine address translation
  10.8 Conclusions and future directions

IV Persistent Storage

11 File Systems: Introduction and Overview
  11.1 The file system abstraction
  11.2 API
  11.3 Software layers
  11.4 Conclusions and future directions

12 Storage Devices
  12.1 Magnetic disk
  12.2 Flash storage
  12.3 Conclusions and future directions

13 Files and Directories
  13.1 Accessing files: API and caching
  13.2 Files: Placing and finding data
  13.3 Directories: Naming data
  13.4 Putting it all together: File access in FFS
  13.5 Alternatives to file systems

14 Reliable Storage
  14.1 Transactions: Atomic updates
  14.2 Error detection and correction
  14.3 Conclusion and future directions

V Index

Index


Preface
Why We’re Writing This Book
There has been a huge amount of innovation in both the principles and practice
of operating systems over the past two decades. The pace of innovation in
operating systems has, if anything, increased over the past few years, with the
introduction of the iOS and Android operating systems for smartphones, the
shift to multicore computers, and the advent of cloud computing.
Yet many operating systems textbooks treat the field as if it is static — that
almost everything we need to cover in our classes was invented in the 60’s and
70’s. No! We strongly believe that students both need to, and can, understand
modern operating systems concepts and modern implementation techniques.
At Texas and Washington, we have been teaching the topics covered in this
textbook for years, winning awards for our teaching. The approach in this
book is the same one we use in organizing our own courses: that it is essential
for students to learn both principles and practice, that is, both concepts and
implementation, rather than either alone.
Although this book focuses on operating systems, we believe the concepts
and principles are important for anyone getting a degree in computer science or
computer engineering. The core ideas in operating systems — protection, concurrency, virtualization, resource allocation, and reliable storage — are widely
used throughout computer science. Anyone trying to build resilient, secure,
flexible computer systems needs to have a deep grounding in these topics and
to be able to apply these concepts in a variety of settings. This is especially
true in a modern world where nearly everything a user does is distributed, and
nearly every computer is multi-core. Operating systems concepts are popping
up in many different areas; even web browsers and cloud computing platforms
have become mini-operating systems in their own right.
Precisely because operating systems concepts are among the most difficult
in all of computer science, it is also important to ground students in how these
ideas are applied in practice in real operating systems of today. In this book, we
give students both concepts and working code. We have designed the book to
support and be complemented with a rigorous operating systems course project,
such as Nachos, Pintos, JOS, or Linux. Our treatment, however, is general —
it is not our intent to completely explain any particular operating system or
course project.
Because the concepts in this textbook are so fundamental to much of the
practice of modern computer science, we believe a rigorous operating systems
course should be taken early in an undergraduate’s course of study. For many
students, an operating systems class is the ticket to an internship and eventually to a full-time position. We have designed this textbook assuming only
that students have taken a class on data structures and one on basic machine
structures. In particular, we have designed our book to interface well if students
have used the Bryant and O’Hallaron textbook on machine structures. Since
some schools only get through the first half of Bryant and O’Hallaron in their
machine structures course, our textbook reviews and covers in much more depth
the material from the second half of that book.


An Overview of the Content
The textbook is organized to allow each instructor to choose an appropriate
level of depth for each topic. Each chapter begins at a conceptual level, with
implementation details and the more advanced material towards the end. A
more conceptual course will skip the back parts of several of the chapters; a
more advanced or more implementation-oriented course will need to go into
chapters in more depth. No single semester course is likely to be able to cover
every topic we have included, but we think it is a good thing for students to
come away from an operating systems course with an appreciation that there is
still a lot for them to learn.
Because students learn more by needing to solve problems, we have integrated some homework questions into the body of each chapter, to provide
students a way of judging whether they understood the material covered to
that point. A more complete set of sample assignments is given at the end of
each chapter.
The book is divided into five parts: an introduction (Chapter 1), kernels and
processes (Chapters 2-3), concurrency, synchronization and scheduling (Chapters 4-7), memory management (Chapters 8-10), and persistent storage (Chapters 11-13).
The goal of chapter 1 is to introduce the recurring themes found in the later
chapters. We define some common terms, and we provide a bit of the history
of the development of operating systems.
Chapter 2 covers kernel-based process protection — the concept and implementation of executing a user program with restricted privileges. The concept
of protected execution and safe transfer across privilege levels is a key concept
to most modern computer systems, given the increasing salience of computer
security issues. For a quick introduction to the concepts, students need only
read through 2.3.2; the chapter then dives into the mechanics of system calls,
exceptions and interrupts in some detail. Some instructors launch directly into
concurrency, and cover kernels and kernel protection afterwards, as a lead-in to
address spaces and virtual memory. While our textbook can be used that way,
we have found that students benefit from a basic understanding of the role of
operating systems in executing user programs, before introducing concurrency.
Chapter 3 is intended as an impedance match for students of differing backgrounds. Depending on student background, it can be skipped or covered in
depth. The chapter covers the operating system from a programmer’s perspective: process creation and management, device-independent input/output,
interprocess communication, and network sockets. Our goal is that students be
able to understand at a detailed level what happens between a user clicking on
a link in a web browser, and that request being transferred through the operating system kernel on each machine to the web server running at user-level,
and back again. The second half of Chapter 3 dives into the organization of the
operating system itself — how device drivers and the hardware abstraction layer
work in a modern operating system; the difference between a monolithic and a
microkernel operating system; and how policy and mechanism can be separated
in modern operating systems.
Chapter 4 motivates and explains the concept of threads. Because of the
increasing importance of concurrent programming, and its integration with Java,
many students will have been introduced to multi-threaded programming in
an earlier class. This is a bit dangerous, as testing will not expose students
to the errors they are making in concurrent programming. Thus, the goal of
this chapter is to provide a solid conceptual framework for understanding the
semantics of concurrency, as well as how concurrent threads are implemented in
both the operating system kernel and in user-level libraries. Instructors needing
to go more quickly can omit Sections 4.4 and 4.5.
Chapter 5 discusses the synchronization of multi-threaded programs, a central part of all operating systems and increasingly important in many other
contexts. Our approach is to describe one effective method for structuring
concurrent programs (monitors), rather than to cover in depth every proposed
mechanism. In our view, it is important for students to master one methodology,
and monitors are a particularly robust and simple one, capable of implementing most concurrent programs efficiently. The implementation of synchronization
primitives is covered in Section 5.5; this can be skipped without compromising
student understanding.
Chapter 6 discusses advanced topics in concurrency, including deadlock, synchronization across multiple objects, and advanced synchronization techniques
like read-copy-update (RCU). This material is important for students to know,
but most semester-long operating systems courses will only be able to briefly
touch upon these issues.
Chapter 7 covers the concepts of resource allocation in the specific context of
processor scheduling. After a quick tour through the tradeoffs between response
time and throughput for uniprocessor scheduling, the chapter covers a set of
more advanced topics in affinity and gang scheduling, power-aware and deadline
scheduling, as well as server scheduling, basic queueing theory and overload
management.
Chapter 8 explains hardware and software address translation mechanisms.
The first part of the chapter covers how to provide flexible memory management through multilevel segmentation and paging. Section 8.3 then considers
how hardware makes flexible memory management efficient through translation
lookaside buffers and virtually addressed caches, and how these are kept consistent as the operating system changes the addresses assigned to each process.
We conclude with a discussion of modern software-based protection mechanisms
such as those found in Android.
Chapter 9 covers caching and virtual memory. Caches are of course central
to many different types of computer systems. Most students will have seen the
concept of a cache in an earlier class on machine structures, so our goal here is
to cover the theory and implementation of caches: when they work and when
they don’t, and how they are implemented in hardware and software. While
it might seem that we could skip virtual memory, many systems today provide
programmers the abstraction of memory-mapped files, and these rely on the
same mechanisms as in traditional virtual memory.
Chapter 10 discusses advanced topics in memory management. Address
translation hardware and software can be used for a number of different features in modern operating systems, such as zero copy I/O, copy on write, process checkpointing, and recoverable virtual memory. As this is more advanced
material, it can be skipped for time.
Chapter 11 sketches the characteristics of storage hardware, specifically block
storage devices such as magnetic disks and flash memory. The last two decades
have seen rapid change in storage technology affecting both application programmers and operating systems designers; this chapter provides a snapshot for
students, as a building block for the next two chapters. Classes in which students have taken a computer architecture course that covers these topics may
choose to skip this chapter.
Chapter 12 uses file systems as a case study of how complex data structures
can be organized on block storage devices to achieve flexibility and performance.
Chapter 13 explains the concept and implementation of reliable storage, using file systems as a concrete example. Starting with the ad hoc techniques in
UNIX fsck for implementing a reliable file system, the chapter explains checkpointing and write ahead logging as alternate implementation strategies for
building reliable storage, and it discusses how redundancy, such as checksums
and replication, is used to improve reliability and availability.
We are contemplating adding several chapters on networking and distributed
operating systems topics, but we are still considering what topics we can reasonably cover. We will be developing this material over the coming months.


Chapter 1

Introduction

“Everything I need to know I learned in kindergarten.” – Robert Fulghum


How do we construct reliable, portable, efficient and secure computer systems? An essential component is the computer’s operating system — the software that manages a computer’s resources.
First, the bad news: operating systems concepts are among the most complex
topics in computer science. A modern general-purpose operating system can run
to over 50 million lines of code, or in other words, more than a thousand times as
long as this textbook. New operating systems are being written all the time. If
you are reading this textbook on an e-book reader, tablet, or smartphone, there
is an operating system managing the device. Since we will not be able to cover
everything, our focus will be on the essential concepts for building computer
systems, ones that every computer scientist should know.
Now the good news: operating systems concepts are also among the most
accessible topics in computer science. Most of the topics in this book will seem
familiar to you — if you have ever tried to do two things at once, or picked
the wrong line at a grocery store, or tried to keep a roommate or sibling from
messing with your things, or succeeded at pulling off an April Fool’s joke. Each
of these has an analogue in operating systems, and it is this familiarity that
gives us hope that we can explain how operating systems do their work in a
single textbook. All we will assume of the reader is a basic understanding of
the operation of a computer and the ability to read pseudo-code.
We believe that understanding how operating systems work is essential for
any student interested in building modern computer systems. Of course, everyone who uses a computer or a smartphone or even a modern toaster uses an
operating system, so understanding the function of an operating system is useful
to most computer scientists. Our goal in this book is to go much deeper than
that, to explain the technologies used inside operating systems, technologies
many of us rely on every day without realizing it.

Figure 1.1: The operation of a web server. [Diagram labels: Client, Server, x.html; steps: 1. Get x.html, 2. Read, 3. Data, 4. Data.]
Software engineers often encounter challenges similar to those faced by operating systems when building other complex systems, and they use many of
the same technologies and design patterns. Whether your goal is to work on
the internals of an operating system kernel, or to build the next generation of
software for cloud computing, secure web browsers, game consoles, graphical
user interfaces, media players, databases, or multicore software, the concepts
and abstractions needed for reliable, portable, efficient and secure software are
much the same. In our experience, the best way to learn these concepts is to
study how they are used in operating systems, but we hope you will apply these
concepts to a much broader range of computer systems.
To get started, consider the web server in Figure 1.1. Its behavior is amazingly simple: it receives a packet containing the name of the web page from
the network. The web server decodes the packet, fetches the file from disk, and
sends the contents back over the network to the user.
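The three steps just described (decode the request, fetch the file, return its contents) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request format `GET /name`, the function name `handle_request`, the `doc_root` parameter, and the error strings are all hypothetical, and a real server would speak HTTP over a network socket rather than pass strings around.

```python
import os
import tempfile


def handle_request(request_line: str, doc_root: str) -> bytes:
    """Decode a request like 'GET /x.html', read the file, return its bytes."""
    method, _, path = request_line.strip().partition(" ")
    if method != "GET" or not path.startswith("/"):
        return b"ERROR: bad request"
    # Resolve the requested name inside doc_root and refuse anything outside it.
    root = os.path.realpath(doc_root)
    full_path = os.path.realpath(os.path.join(root, path.lstrip("/")))
    if os.path.commonpath([root, full_path]) != root:
        return b"ERROR: forbidden"
    try:
        with open(full_path, "rb") as f:  # the OS locates and reads the file
            return f.read()
    except OSError:
        return b"ERROR: not found"


def demo() -> bytes:
    """Create a throwaway document root holding x.html and request it."""
    with tempfile.TemporaryDirectory() as doc_root:
        with open(os.path.join(doc_root, "x.html"), "wb") as f:
            f.write(b"<html>hello</html>")
        return handle_request("GET /x.html", doc_root)
```

Even this toy version leans on the operating system at every step: `open` and `read` hide where the bytes live on disk, and a real server would rely on the kernel's networking code to receive the request and send the reply.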
Part of an operating system’s job is to make it easy to write applications
like web servers. But if we dig a bit deeper, this simple story quickly raises as
many questions as it answers:
• Many web requests involve both data and computation. For example,
the Google home page presents a simple text box, but each search query
entered in that box consults databases spread over literally thousands of
machines. To keep their software manageable, web servers often invoke
helper applications, e.g., to manage the actual search function. These
helper applications need to communicate with the main web server for
this to work. How does the operating system enable multiple applications
to communicate with each other?
• What if two users (or a million) try to request a web page from the server
at the same time? A simple approach might be to handle each request in
turn. If any individual request takes a long time, however, this approach
would mean that everyone else would need to wait for it to complete. A
faster, but more complex, solution is to multitask: to juggle the handling of
multiple requests at once. Multitasking is especially important on modern
multicore computers, as it provides a way to keep many processors busy.
How does the operating system enable applications to do multiple things
at once?
• For better performance, the web server might want to keep a copy, sometimes called a cache, of recently requested pages, so that the next user to
request the same page can be returned the results from the cache, rather
than starting the request from scratch. This requires the application to
synchronize access to the cache’s data structures by the thousands of web
requests being handled at the same time. How does the operating system
support application synchronization to shared data?
• To customize and animate the user experience, it is common for web
servers to send clients scripting code, along with the contents of the web
page. But this means that clicking on a link can cause someone else’s
code to run on your computer. How does the client operating system
protect itself from being compromised by a computer virus surreptitiously
embedded into the scripting code?
• Suppose the web site administrator uses an editor to update the web page.
The web server needs to be able to read the file that the editor wrote; how
does the operating system store the bytes on disk so that later on the web
server can find and read them?
• Taking this a step further, the administrator probably wants to be able to
make a consistent set of changes to the web site, so that embedded links
are not left dangling, even temporarily. How can the operating system
enable users to make a set of changes to a web site, so that requests either
see the old pages or the new pages, but not a mishmash of the two?
• What happens when the client browser and the web server run at different
speeds? If the server tries to send the web page to the client faster than
the client can draw the page, where are the contents of the file stored in
the meantime? Can the operating system decouple the client and server
so that each can run at its own speed, without slowing the other down?
• As demand on the web server grows, the administrator is likely to want
to move to more powerful hardware, with more memory, more processors,
faster network devices, and faster disks. To take advantage of this new
hardware, does the web server need to be re-written from scratch, or can it
be written in a hardware-independent fashion? What about the operating
system — does it need to be re-written for every new piece of hardware?

We could go on, but you get the idea. This book will help you understand
the answers to these questions, and more.
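As a small taste of one of these questions, the cache scenario above can be sketched with a lock that serializes access to the shared data structure. This is a minimal sketch, not the approach the later chapters develop: the class name `PageCache`, the `load_page` callback, and the `load_count` counter are hypothetical names chosen for illustration, and Python's `threading.Lock` stands in for whatever synchronization support the system provides.

```python
import threading


class PageCache:
    """A tiny in-memory cache shared by many request-handling threads."""

    def __init__(self, load_page):
        self._load_page = load_page    # hypothetical function: name -> contents
        self._pages = {}
        self._lock = threading.Lock()  # guards _pages and load_count
        self.load_count = 0

    def get(self, name):
        # Holding the lock for the whole lookup keeps the example simple:
        # only one thread at a time may inspect or update the dictionary.
        with self._lock:
            if name not in self._pages:
                self._pages[name] = self._load_page(name)
                self.load_count += 1
            return self._pages[name]


def demo(num_threads=8):
    """Have several threads request the same page; return results and loads."""
    cache = PageCache(lambda name: "contents of " + name)
    results = []
    results_lock = threading.Lock()

    def worker():
        page = cache.get("x.html")
        with results_lock:
            results.append(page)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, cache.load_count
```

Because every lookup happens while the lock is held, the page is loaded once no matter how many threads ask for it at the same moment. Holding the lock during the load is the simplest correct choice here, at the cost of making other requests wait.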
Goals of this chapter
The rest of this chapter discusses three topics in detail:
• OS Definition. What is an operating system and what does it do?
• OS Challenges. How should we evaluate operating systems, and what
are some of the tradeoffs their designers face?
• OS Past, Present and Future. What is the history of operating systems, and what new functionality are we likely to see in future operating
systems?

1.1 What is an operating system?

Definition: operating system

An operating system is the layer of software that manages a computer’s resources
for its users and their applications. Operating systems run in a wide range of
computer systems. Sometimes they are invisible to the end user, controlling
embedded devices such as toasters, gaming systems, and the many computers
inside modern automobiles and airplanes. Operating systems are also an essential component of more general-purpose systems such as smartphones, desktop
computers, and servers.
Our discussion will focus on general-purpose operating systems, because the
technologies they need are a superset of the technologies needed for embedded systems. Increasingly though, technologies developed for general-purpose
computing are migrating into the embedded sphere. For example, early mobile phones had simple operating systems to manage the hardware and to run
a handful of primitive applications. Today, smartphones — phones capable of
running independent third party applications — are the fastest growing part
of the mobile phone business. These new devices require much more complete
operating systems, with sophisticated resource management, multi-tasking, security and failure isolation.
Likewise, automobiles are increasingly software controlled, raising a host of
operating system issues. Can anyone write software for your car? What if
the software fails while you are driving down the highway? How might the
operating system of your car be designed to prevent a computer virus from
hijacking control of your car’s computers? Although this might seem far-fetched,
researchers recently demonstrated that they could remotely turn off a car’s
braking system through a computer virus introduced into the car’s computers
through a hacked car radio. A goal of this book is to explain how to build more
reliable and secure computer systems in a variety of contexts.

Figure 1.2: A general-purpose operating system. [Diagram: three applications (APP) run on top of the operating system, which runs on top of the hardware.]
An Expanded View of an Operating System
Figure 1.3 shows the structure of a general-purpose operating system, as an expansion on the simple view presented in Figure 1.2. At the lowest level, the hardware provides the processor, memory, and a set of devices for providing the user interface, storing
data and communicating with the outside world. The hardware also provides primitives that the operating system can use to provide fault isolation and synchronization.
The operating system runs as the lowest layer of software on the computer, with a
device-specific layer that interfaces to the myriad hardware devices, and a set of device-independent services provided to applications. Since the operating system needs to be
able to prevent malicious and buggy applications from affecting other applications or the
operating system itself, much of the operating system runs in a separate execution environment protected from application code. A portion of the operating system can also
run as a library linked into each application. In turn, applications run in an execution
context provided by the operating system. The application context is much more than
a simple abstraction on top of hardware devices: applications execute in a virtual environment that is more constrained (to prevent harm), more powerful (to mask hardware limitations), and more useful (via common services) than the underlying hardware.

Figure 1.3: A general-purpose operating system: expanded view. [Diagram labels, top to bottom: Users; applications (APP) and the System Library, in user mode; the Kernel-user Interface (Abstract virtual machine); the OS in kernel mode, with File System, Virtual Memory, TCP/IP networking, and CPU scheduling; the Hardware Abstraction Layer; Hardware-Specific Software and Device Drivers; Hardware, with CPU, Graphics Processor, Address Translation, and Network.]

For general-purpose systems, users interact with applications, applications
execute in an environment provided by the operating system, and the operating
system mediates access to the underlying hardware (Figure 1.2, and expanded
in Figure 1.3). What do we need from an operating system to be able to run a
group of programs? Operating systems have three roles:
• Operating systems play referee: they manage shared resources between
different applications running on the same physical machine. For example,
an operating system can stop one program and start another. Operating
systems isolate different applications from each other, so that if there is
a bug in one application, it does not corrupt other applications running
on the same machine. The operating system must protect itself and other
applications from malicious computer viruses. And since the applications
are sharing physical resources, the operating system needs to decide which
applications get which resources.
• Operating systems play illusionist: they provide an abstraction of physical hardware to simplify application design. To write a “hello world”
program, you do not need (or want!) to think about how much physical
memory the system has, or how many other programs might be sharing
the computer’s resources. Instead, operating systems provide the illusion
of a nearly infinite memory, as an abstraction on top of a limited amount
of physical memory. Likewise, operating systems provide the illusion that
each program has the computer’s processors entirely to itself. Obviously,
the reality is quite different! These illusions enable applications to be
written independently of the amount of physical memory on the system
or the physical number of processors. Because applications are written to
a higher level of abstraction, the operating system is free to change the
amount of resources assigned to each application as applications start and
stop.
• Operating systems provide glue: a set of common services between applications. An important benefit of common services is to facilitate sharing
between applications, so that, for example, cut and paste works uniformly
across the system and a file written by one application can be read by
another. Many operating systems provide a common set of user interface
routines to help applications provide a common “look and feel.” Perhaps
most importantly, operating systems provide a layer separating applications from hardware input and output devices, so that applications can
be written independently of which specific keyboard, mouse or disk drive
is being used on a particular computer.
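The glue role, in particular the layer separating applications from specific devices, can be illustrated with a short Python sketch. It assumes a system where Python's `os.write` and `os.pipe` are available; the helper name `send` and the choice of a temporary file and a pipe as the two destinations are hypothetical, picked only to show one routine working unchanged against two different kinds of object.

```python
import os
import tempfile


def send(fd, data):
    """Write all of data to an open file descriptor, whatever it refers to."""
    while data:
        written = os.write(fd, data)  # may write fewer bytes than requested
        data = data[written:]


def demo():
    message = b"hello, world\n"

    # Destination 1: a regular file on disk.
    fd, path = tempfile.mkstemp()
    try:
        send(fd, message)
        os.close(fd)
        with open(path, "rb") as f:
            from_file = f.read()
    finally:
        os.remove(path)

    # Destination 2: a pipe, an in-kernel channel between two endpoints.
    read_end, write_end = os.pipe()
    send(write_end, message)
    os.close(write_end)
    from_pipe = os.read(read_end, len(message))
    os.close(read_end)

    return from_file, from_pipe
```

The point of the sketch is that `send` never asks what kind of object sits behind the descriptor. The operating system supplies the common interface, so the same application code works for both.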
We next discuss these three roles in a bit more detail.

1.1.1 Resource sharing: Operating system as referee

Sharing is central to most uses of computers. Right now, my laptop is running
a browser, podcast library, text editor, email program, document viewer, and
newspaper. The operating system must somehow keep all of these activities separate, yet allow each the full capacity of the machine if the others aren’t running.
At a minimum, when one program stops running, the operating system should
let me run another. Better, the operating system should allow multiple applications to run at the same time, as when I read email while I am downloading
a security patch to the system software.
Even individual applications can be designed to do multiple things at once.
For instance, a web server will be more responsive to its users if it can handle
multiple requests at the same time rather than waiting for each to complete
before the next one starts running. The same holds for the browser — it is
more responsive if it can start drawing a page while the rest of the page is
still being transferred. On multiprocessors, the computation inside a parallel
application can be split into separate units that can be run independently for
faster execution. The operating system itself is an example of software written
to be able to do multiple things at once. As we will describe later, the operating
system is a customer of its own abstractions.
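The responsiveness argument for the web server can be made concrete with a small Python sketch that serves the same two requests first in turn and then concurrently. The request names, the made-up service times, and the function names `handle`, `serve_in_turn`, and `serve_concurrently` are assumptions for illustration; `time.sleep` stands in for time spent waiting on disk or network.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def handle(request):
    """Pretend to serve a request; 'cost' is a made-up service time in seconds."""
    name, cost = request
    time.sleep(cost)  # stands in for disk or network waiting
    return name


def serve_in_turn(requests):
    """Handle each request to completion before starting the next."""
    return [handle(r) for r in requests]


def serve_concurrently(requests):
    """Start all requests at once; return names in the order they finish."""
    finished = []
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        futures = [pool.submit(handle, r) for r in requests]
        for future in as_completed(futures):
            finished.append(future.result())
    return finished
```

With a slow request queued ahead of a fast one, for example `[("slow", 0.5), ("fast", 0.0)]`, serving in turn makes the fast request wait behind the slow one, while the concurrent version lets it finish first.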
Sharing raises several challenges for an operating system:
• Resource Allocation. The operating system must keep all of the simultaneous activities separate, allocating resources to each as appropriate. A
computer usually has only a few processors and a finite amount of memory,
network bandwidth, and disk space. When there are multiple tasks to do
at the same time, how should the operating system choose how many resources to give to each? Seemingly trivial differences in how resources are
allocated can have a large impact on user-perceived performance. As we
will see later, if the operating system gives too little memory to a program,
it will not only slow down that particular program, it can dramatically
hurt the performance of the entire machine.
As another example, what should happen if an application executes an
infinite loop:

while (true) {
    ;  // empty statement: the loop spins forever without doing any work
}

If programs ran directly on the raw hardware, this code fragment would
lock up the computer, making it completely non-responsive to user input.
With resource multiplexing provided by the operating system, the specific
application might lock up, but other programs can proceed unimpeded.
Additionally, the user can ask the operating system to force the looping
program to exit.

Definition: fault isolation

• Isolation. An error in one application should not disrupt other applications, or even the operating system itself. This is called fault isolation.
Anyone who has taken an introductory computer science class knows the
value of an operating system that can protect itself and other applications
from programmer bugs. Debugging would be vastly harder if an error
in one program could corrupt data structures in other applications. Likewise, downloading and installing a screen saver or other application should
not crash other unrelated programs, nor should it be a way for a malicious attacker to surreptitiously install a computer virus on the system.
Nor should one user be able to access or change another’s data without
permission.
Fault isolation requires restricting the behavior of applications to less than
the full power of the underlying hardware. Given access to the full capability of the hardware, any application downloaded off the web, or any script
embedded in a web page, would have complete control of the machine.
Thus, it would be able to install spyware into the operating system to
log every keystroke you type, or record the password to every website you
visit. Without fault isolation provided by the operating system, any bug
in any program might cause the disk to become irretrievably corrupted.
Erroneous or malignant applications would cause all sorts of havoc.
• Communication. The flip side of isolation is the need for communication
between different applications and between different users. For example, a
web site may be implemented by a cooperating set of applications: one to
select advertisements, another to cache recent results, yet another to fetch
and merge data from disk, and several more to cooperatively scan the web
for new content to index. For this to work, the various programs need
to be able to communicate with one another. If the operating system
is designed to prevent bugs and malicious users and applications from
affecting other users and their applications, how does the operating system
support communication to share results? In setting up boundaries, an
operating system must also allow for those boundaries to be crossed in
carefully controlled ways as the need arises.
In its role as a referee, an operating system is somewhat akin to a
government, or perhaps a particularly patient kindergarten teacher, balancing
needs, separating conflicts, and facilitating sharing. One user should not be
able to hog all of the system’s resources or to access or corrupt another user’s
files without permission; a buggy application should not be able to crash the
operating system or other unrelated applications; and yet applications also need
to be able to work together. Enforcing and balancing these concerns is the role
of the operating system.
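
The infinite loop discussed under resource allocation can be tried safely from an ordinary program. The following is a minimal Python sketch, not a description of how any particular operating system implements multiplexing: it starts the looping program as a separate process, shows that the parent keeps making progress, and then asks the operating system to force the looping process to exit. The function name `run_and_stop_looping_program` and the `grace_seconds` parameter are hypothetical names chosen for this illustration.

```python
import subprocess
import sys
import time


def run_and_stop_looping_program(grace_seconds=0.2):
    """Start a child process that loops forever, then force it to exit.

    Returns (parent_kept_running, child_exit_code).
    """
    # The child is a separate process running the infinite loop from the text.
    child = subprocess.Popen([sys.executable, "-c", "while True: pass"])

    # The parent keeps making progress even though the child never yields.
    start = time.monotonic()
    work_done = 0
    while time.monotonic() - start < grace_seconds:
        work_done += 1

    still_looping = child.poll() is None  # None means "has not exited yet"
    child.kill()                          # ask the OS to force the child to exit
    exit_code = child.wait()              # collect the child's exit status
    return work_done > 0 and still_looping, exit_code
```

The parent never cooperates with the child; it relies entirely on the operating system to share the processor between the two and to terminate the child on request.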

Exercises
Take a moment to speculate. We will provide answers to these questions
throughout the rest of the book, but given what you know now, how would you
answer them? Before there were operating systems, someone needed to develop
solutions, without being able to look them up! How would you have designed
the first operating system?



1. Suppose a computer system and all of its applications are completely bug
free. Suppose further that everyone in the world is completely honest and
trustworthy. In other words, we do not need to consider fault isolation.
a. How should the operating system allocate time on the processor?
Should it give all of the processor to each application until it no
longer needs it? If there are multiple tasks ready to go at the same
time, should it schedule the task with the least amount of work to
do or the one with the most? Justify your answer.
b. How should the operating system allocate physical memory between
applications? What should happen if the applications do not
all fit in memory at the same time?
c. How should the operating system allocate its disk space? Should the
first user to ask be able to grab all of the free space? What would
the likely outcome be for that policy?
2. Now suppose the computer system needs to support fault isolation. What
hardware and/or operating system support do you think would be needed to accomplish this goal?
a. For protecting an application’s data structures in memory from being
corrupted by other applications?
b. For protecting one user’s disk files from being accessed or corrupted
by another user?
c. For protecting the network from a virus trying to use your computer
to send spam?

3. How should an operating system support communication between applications?
a. Through the file system?
b. Through messages passed between applications?
c. Through regions of memory shared between the applications?
d. All of the above? None of the above?

1.1.2 Mask hardware limitations: Operating system as illusionist

A second important role of operating systems is to mask the restrictions inherent
in computer hardware. Hardware is necessarily limited by physical constraints:
a computer has only a limited number of processors and a limited amount
of physical memory, network bandwidth, and disk. Further, since the operating
system must decide how to split the fixed set of resources among the various
applications running at each moment, a particular application will have different amounts of resources from time to time, even when running on the same
hardware. While a few applications might be designed to take advantage of a
computer’s specific hardware configuration and their specific resource assignment, most programmers want to use a higher level of abstraction.

Figure 1.4: An operating system virtual machine. (Diagram: applications run on guest operating systems, which in turn run on top of the operating system and the hardware.)


Definition: virtualization

We have just discussed one example of this: a uniprocessor can run only one
program at a time, yet most operating systems allow multiple applications to
appear to the user to be running at the same time. The operating system does so
through a concept called virtualization. Virtualization provides an application
with the illusion of resources that are not physically present. For example, the
operating system can present to each application the abstraction that it has an
entire processor dedicated to it, even though at a physical level there may be only
a single processor shared among all the applications running on the computer.
With the right hardware and operating system support, most physical resources
can be virtualized: examples include the processor, memory, screen space, disk,
and the network. Even the type of processor can be virtualized, to allow the
same, unmodified application to be run on a smartphone, tablet, and laptop
computer.
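
To make the illusion of a dedicated processor concrete, here is a toy simulation in Python. It is only a sketch under simplifying assumptions: each "application" is a generator that voluntarily pauses at every `yield`, whereas a real operating system switches between programs without their cooperation. The names `task` and `round_robin` are hypothetical and chosen for this illustration.

```python
from collections import deque


def task(name, steps):
    """A toy 'application': each yield marks a point where it can be paused."""
    for i in range(steps):
        yield f"{name}:{i}"


def round_robin(tasks):
    """Interleave tasks on one simulated processor, one step per turn."""
    ready = deque(tasks)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))  # run the task for one time slice
            ready.append(current)        # not finished: back of the queue
        except StopIteration:
            pass                         # finished: drop it from the queue
    return trace


trace = round_robin([task("A", 3), task("B", 2)])
```

Each task is written as if it had the processor to itself; only the scheduler knows that a single processor is being shared.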

Definition: virtual machine
Definition: guest operating system

Pushing this a step further, some operating systems virtualize the entire
computer, to run the operating system as an application running on top of
another operating system (see Figure 1.4). This is called creating a virtual
machine. The operating system running in the virtual machine, called the guest
operating system, thinks it is running on a real, physical machine, but this is
an illusion presented by the true operating system running underneath. One
reason for the operating system to provide a virtual machine is for application
portability. If a program only runs on an old version of an operating system,
then we can still run the program on a new system running a virtual machine.
The virtual machine hosts the application on the old operating system, running
on top of the new operating system. Another reason for virtual machines is as
an aid in debugging. If an operating system can be run as an application, then
the operating system developers can set breakpoints, stop, and single step their
code just as they would an application.

Definition: atomic

In addition to virtualization, operating systems mask many other limitations inherent in physical hardware, by providing applications with the illusion
of hardware capabilities that are not physically present. For example, on a computer with multiple processors sharing memory, each processor can update only
a single memory location at a time. The memory system in hardware ensures
that any updates to the same memory word are atomic, that is, the value stored
in memory is the last value stored by one of the processors, not a mixture of the
updates of the different processors. Atomicity at the level of a memory word
is preserved in hardware even if more than one processor attempts to write to
memory at exactly the same time. While this might seem sufficient, applications (and the operating system itself) need to be able to update larger data
structures, ones spread over many memory locations. What happens when two
processors attempt to update the same data structure at roughly the same time?
As we’ll discuss later, the results can be quite unexpected and quite different
from what would have happened had each of the processors updated the data
structure in turn. Ideally, the programmer would like to have the abstraction
of an atomic update to the entire data structure, not just to a single memory
word. As we will discuss, the illusion of atomic updates to data structures is
provided by the operating system using some specialized mechanisms provided
in hardware.
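
As a preview of what an atomic update to a whole data structure looks like to the programmer, consider the following Python sketch. It uses the lock from Python's standard `threading` module; the class `AccountPair` and the function `run_transfers` are hypothetical names invented for this illustration, and the sketch is not meant to show how locks themselves are implemented.

```python
import threading


class AccountPair:
    """Two balances whose sum must never appear to change."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        self._lock = threading.Lock()

    def transfer(self, amount):
        # Both fields change inside one critical section, so no other thread
        # using this lock can observe the intermediate state.
        with self._lock:
            self.a -= amount
            self.b += amount

    def total(self):
        with self._lock:
            return self.a + self.b


def run_transfers(pair, threads=4, per_thread=1000):
    """Have several threads update the same structure at the same time."""
    def worker():
        for _ in range(per_thread):
            pair.transfer(1)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return pair
```

The update touches two memory locations, yet every thread that goes through the lock sees either the state before a transfer or the state after it, never a mixture.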
Persistent block storage devices, such as magnetic disk or flash memory, provide
another example. At a physical level, these systems support block writes to
storage, where the size of the block depends on physical device characteristics.
If the computer crashes in the middle of a block write, it could leave the disk
in an unknown state, with neither the old nor the new value stored at that
location. Of course, applications need to be able to store data on disk that is
variable in size, possibly spanning multiple disk blocks. And users want their
data to be preserved even — or especially — if there is a machine failure while
the disk is being updated.
We will discuss techniques that the operating system uses to accomplish
these and other illusions. In each of these cases, the operating system provides
a more convenient and flexible programming abstraction than what is provided
by the underlying hardware.
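
One widely used application-level idiom for the storage example above is to write the new version of a file to a temporary file and then rename it over the old one. The Python sketch below illustrates that idiom; it is not the technique this book develops later, and the function name `atomic_write` is hypothetical. It assumes a POSIX-style file system, where a rename within one file system is atomic. Full durability would also require flushing the containing directory, which this sketch omits.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) in one rename step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version to a temporary file in the same directory, so the
    # final rename does not cross file systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new bytes to the storage device
        os.replace(tmp_path, path)  # switch from the old version to the new one
    except BaseException:
        # On any failure, leave the old file untouched and clean up.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader of `path` sees either the complete old contents or the complete new contents, even though the data may span many disk blocks.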



Exercises
Take a moment to speculate; to build the systems we use today, someone
needed to answer these questions. Consider how you might answer them, before
seeing how others solved these puzzles.
4. How would you design combined hardware and software support to provide
the illusion of a nearly infinite virtual memory on a limited amount of
physical memory?
5. How would you design a system to run an entire operating system as an
application running on top of another operating system?
6. How would you design a system to update complex data structures on
disk in a consistent fashion despite machine crashes?

1.1.3 Common services: Operating system as glue

Operating systems also play a third role: providing a set of common, standard
services to applications to simplify and regularize their design. We saw an
example of this with the web server outlined at the beginning of this chapter.
The operating system hides the specifics of how the network and disk devices
work, providing a simpler abstraction to applications based on receiving and
sending reliable streams of bytes, and reading and writing named files. This
allows the web server to focus on its core task of decoding incoming requests
and filling them, rather than on the formatting of data into individual network
packets and disk blocks.
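
The two abstractions just mentioned, named files and reliable streams of bytes, can be seen in a few lines of Python. This is only a sketch: a local socket pair stands in for a real network connection, and the names `serve_file`, `read_all`, and `demo` are hypothetical. Note that the code never mentions packets or disk blocks.

```python
import os
import socket
import tempfile


def serve_file(conn, path):
    """Send the contents of a named file over a connected stream socket."""
    with open(path, "rb") as f:    # named file: no disk blocks in sight
        conn.sendall(f.read())     # byte stream: no packets in sight
    conn.shutdown(socket.SHUT_WR)  # signal end of stream to the reader


def read_all(conn):
    """Read from a stream socket until the sender closes its side."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo():
    """Write a small file, 'serve' it, and return what the client received."""
    server_end, client_end = socket.socketpair()
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "index.html")
        with open(path, "wb") as f:
            f.write(b"<h1>hello</h1>")
        serve_file(server_end, path)
        body = read_all(client_end)
    server_end.close()
    client_end.close()
    return body
```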
An important reason for the operating system to provide common services,
rather than leaving it up to each application, is to facilitate sharing between
applications. The web server needs to be able to read the file that the text editor
wrote. If applications are to share files, they need to be stored in a standard
format, with a standard system for managing file directories. Likewise, most
operating systems provide a standard way for applications to pass messages,
and to share memory, to facilitate sharing.

The choice of which services an operating system should provide is often a
matter of judgment. For example, computers can come configured with a blizzard of different devices: different graphics co-processors and pixel formats, different network interfaces (WiFi, Ethernet, and Bluetooth), different disk drives
(SCSI, IDE), different device interfaces (USB, Firewire), and different sensors
(GPS, accelerometers), not to mention different versions of each of those standards. Most applications will be able to ignore these differences, using only
a generic interface provided by the operating system. For other applications,
such as a database, the specific disk drive may matter quite a bit. For those
applications that can operate at a higher level of abstraction, the operating
system serves as an interoperability layer, so that both applications, and the
devices themselves, can be independently evolved without requiring simultaneous changes to the other side.
Another standard service in most modern operating systems is the graphical
user interface library. Both Microsoft’s and Apple’s operating systems provide
a set of standard user interface widgets. This facilitates a common “look and
feel” to users, so that frequent operations such as pull down menus and “cut”
and “paste” are handled consistently across applications.
Most of the code in an operating system implements these common services. However, much of the complexity of operating systems is due to resource
sharing and masking hardware limits. Because the common service code is built
on the abstractions provided by the other two operating system roles, this book
will focus primarily on those two topics.

1.1.4 Operating system design patterns

The challenges that operating systems address are not unique; they apply
to many different computer domains. Many complex software systems have
multiple users, run programs written by third party developers, and/or need
to coordinate many simultaneous activities. These pose questions of resource
allocation, fault isolation, communication, abstractions of physical hardware,
and how to provide a useful set of common services for software developers.
Not only are the challenges the same, but often the solutions are as well: these
systems use many of the design patterns and techniques described in this book.
For now, we focus on the challenges these systems have in common with
operating systems:
• Cloud computing (Figure 1.5) is a model of computing where large-scale
applications are run on shared computing and storage infrastructure in
data centers, instead of on the user’s own desktop computer. A similar
approach is to run compute-intensive applications in the idle cycles of
remote desktop computers. In both cases, many of the same issues arise
as in operating systems, in terms of sharing, abstraction, and common
services.
– Referee. How are resources allocated between competing applications running in the cloud? How are buggy or malicious applications
prevented from disrupting other applications?
– Illusionist. The computing resources in the cloud are continually
evolving; what abstractions are provided to isolate application developers from changes in the underlying hardware?

