
Lecture Operating systems: A concept-based approach (2/e): Chapter 19 - Dhananjay M. Dhamdhere


PROPRIETARY MATERIAL. © 2007 The McGraw-Hill Companies, Inc. All rights reserved. No part of this PowerPoint slide may be displayed, reproduced or distributed
in any form or by any means, without the prior written permission of the publisher, or used beyond the limited distribution to teachers and educators permitted by McGraw-Hill
for their individual course preparation. If you are a student using this PowerPoint slide, you are using it without permission.

Chapter 19: 
Distributed File Systems

Dhamdhere: Operating Systems: A Concept-Based Approach, 2 ed

Copyright © 2008


Design issues in Distributed File Systems



Transparency of a file system
– A user need not know the location of a file in the system
* Location transparency: The name of a file should not reveal its location
 Provides user convenience
* Location independence: The file system should be able to change the location of a file without having to change its name
 Enables the file system to optimize its own performance



Design issues in Distributed File Systems



Fault tolerance
– Two techniques are used to ensure that a fault does not disrupt the operation of a file system
* Journaling may be used to ensure consistency of metadata
* A stateless file server design eliminates the need to maintain consistency of metadata



Design issues in Distributed File Systems




Performance
– File system performance has two aspects
* High efficiency
 File caching boosts efficiency by reducing network traffic
* Scalability
 Response time should not degrade as system size grows
 Special scalability techniques are employed
» Clusters of computers
» Distributed locking techniques



Basics of file processing in a DFS






A user or process that accesses a file is called a client

When the client opens a file, the DFS finds the file's location during name resolution
The DFS then sets up the arrangement involving the client and file server agents
This arrangement is analogous to a remote procedure call (RPC)



Transparency



A DFS may use the following arrangement
– Each file is assigned a globally unique file id
* The file id is stored in the directory entry of the file

– The DFS uses a separate data structure to hold (file id, location) pairs
* If the DFS changes a file’s location, it updates the file’s entry in this data structure
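
The arrangement above can be sketched in a few lines of Python. This is only an illustrative model, not code from the book: the class `Dfs`, its attributes `directory` and `location_table`, and the node names are all hypothetical. It shows that migrating a file touches only the (file id, location) table, so the file's name and id stay unchanged.

```python
import itertools


class Dfs:
    """Toy model: directory entries hold file ids; a separate table holds locations."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.directory = {}        # path name -> globally unique file id
        self.location_table = {}   # file id -> node currently storing the file

    def create(self, path, node):
        file_id = next(self._next_id)
        self.directory[path] = file_id
        self.location_table[file_id] = node
        return file_id

    def resolve(self, path):
        """Name resolution: path -> file id -> current location."""
        return self.location_table[self.directory[path]]

    def migrate(self, path, new_node):
        """Move a file; only the location table changes, never the name or id."""
        self.location_table[self.directory[path]] = new_node


dfs = Dfs()
fid = dfs.create("/home/alice/report.txt", "node-A")   # hypothetical path and node
before = dfs.resolve("/home/alice/report.txt")
dfs.migrate("/home/alice/report.txt", "node-B")
after = dfs.resolve("/home/alice/report.txt")
```

Because the path maps to a file id rather than to a node, the name reveals nothing about location (location transparency) and survives migration (location independence).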



File sharing semantics



Specify how file updates become visible to concurrent users
– Unix semantics
* Use single-image mutable files
 Updates made by one client are immediately visible to others

– Session semantics
* Employ multiple-image mutable files
 Clients in a session use a single-image mutable file
» Their updates are visible to one another immediately but are visible to other clients only when the session ends

– Transaction semantics
* File processing performed by a client is implemented as an atomic transaction, so only one client can access a file at any time
* Provides reliability when faults occur
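
The difference between Unix and session semantics can be made concrete with a small Python sketch. All class names here are hypothetical, and the rule that a closing session simply overwrites the shared file is an assumption made for illustration; the semantics themselves do not fix that choice.

```python
class UnixSemanticsFile:
    """Single image: every client reads and writes the same copy."""

    def __init__(self, data=""):
        self.data = data

    def write(self, text):
        self.data += text

    def read(self):
        return self.data


class SessionSemanticsFile:
    """Multiple images: each session works on its own image until it closes."""

    def __init__(self, data=""):
        self.data = data

    def open_session(self):
        return Session(self)


class Session:
    def __init__(self, file):
        self._file = file
        self.image = file.data          # image taken when the session starts

    def write(self, text):
        self.image += text

    def read(self):
        return self.image

    def close(self):
        # Assumption: the closing session's image replaces the file contents.
        self._file.data = self.image


u = UnixSemanticsFile("a")
u.write("b")
unix_view = u.read()                    # any other client sees the update at once

f = SessionSemanticsFile("a")
s1, s2 = f.open_session(), f.open_session()
s1.write("b")
seen_by_s2_before_close = s2.read()     # s2 still sees its own image
s1.close()
seen_by_new_session = f.open_session().read()
```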


Session semantics

• Only clients in the same node can form a session
• A new session is started in a node if no session is in progress in it, or if some client of a previous session has updated the file and performed a close operation
• The semantics do not guarantee portability because they do not specify which file version should be accessed by a new client



Fault tolerance



File system reliability has many facets. A file should be
– Robust
* A file must survive faults in a guaranteed manner

 Techniques such as disk mirroring ensure robustness

– Recoverable
* It should be possible to restore it to an earlier state when a fault
occurs

– Available
* It should be accessible despite faults
 Availability is typically ensured while opening a file
 Ensuring it during file processing would require a complex
arrangement (see quorum algorithms in Chapter 18)


Fault tolerance techniques



Three key techniques are employed for fault tolerance
– Cached directories
* A remote directory is cached during name resolution
 Helps in tolerating failures of intermediate nodes in future name resolutions

– File replication
* Quorum algorithms can be used to control read/write accesses; however, they are expensive
 Instead, only the primary copy of a file may be updated, and the updates are then propagated to the other copies

– Stateless file server
* Does not maintain any information about file processing activities in memory
 Hence immune to consistency problems when faults arise
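
The primary copy approach to file replication mentioned above might look roughly like the following Python sketch. The names `Replica` and `PrimaryCopyFile` are hypothetical, and the policy for failed backups (they simply miss the update) is a simplifying assumption rather than something the slide specifies.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = ""
        self.up = True


class PrimaryCopyFile:
    """Writes go to the primary only; the primary pushes them to the backups."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = backups

    def write(self, data):
        if not self.primary.up:
            raise RuntimeError("primary copy unavailable")
        self.primary.data = data
        for replica in self.backups:
            if replica.up:              # assumption: a failed backup misses the update
                replica.data = data

    def read(self):
        # Any live copy can serve a read; a backup that missed updates may be stale.
        for replica in [self.primary, *self.backups]:
            if replica.up:
                return replica.data
        raise RuntimeError("no copy available")


p, b1, b2 = Replica("p"), Replica("b1"), Replica("b2")
f = PrimaryCopyFile(p, [b1, b2])
f.write("v1")
p.up = False                            # the primary fails after propagating
read_after_primary_failure = f.read()
```

Unlike a quorum scheme, a write here contacts no voting set, which is what makes it cheaper; the price is that writes stop while the primary is down.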


Stateless file servers



A conventional file system performs file processing in a
stateful manner
– It maintains information about the activity in memory
* For example, file control block, file map table, file buffers




– Loss of this information disrupts file processing activities

Stateless file server
– Does not maintain any information about the activity in memory
* Extra work may have to be performed at every file operation to obtain relevant information such as the file map table (FMT)
 To avoid this, the file server may return some information to the client process, which provides it with the next operation
 Hint: The server maintains some of the information and uses it if available; otherwise, it acts as a stateless server
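
A minimal Python sketch of the stateless idea follows. The classes and the request format (file id, offset, length) are assumptions chosen for illustration. The server returns the next offset to the client, which supplies it with the following request, so a server restart loses nothing the client needs.

```python
class StatelessFileServer:
    """Keeps no per-client state: every request is self-describing."""

    def __init__(self, files):
        self._files = files             # file id -> bytes (stands in for the disk)

    def read(self, file_id, offset, length):
        data = self._files[file_id][offset:offset + length]
        # Return the next offset so the client can supply it at the next operation.
        return data, offset + len(data)


class Client:
    """The client, not the server, remembers the position in the file."""

    def __init__(self, server, file_id):
        self.server, self.file_id, self.offset = server, file_id, 0

    def read(self, length):
        data, self.offset = self.server.read(self.file_id, self.offset, length)
        return data


disk = {7: b"hello world"}
client = Client(StatelessFileServer(disk), file_id=7)
first = client.read(5)
# Simulate a server crash and restart: a fresh server object with no memory of the client.
client.server = StatelessFileServer(disk)
second = client.read(6)
```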



Performance techniques of distributed file systems



Three classes of techniques
– Efficiency of file access
* Multi-threaded file server
 Threads can service file access requests concurrently

* Hint-based file server
 Hybrid of stateful and stateless file server
 Uses a hint if available, else functions as a stateless server

– File caching
* Reduces network traffic by caching portions of a file (see next slide)

– Scalability
* Clusters of nodes using high speed LANs are used
 Processes in a cluster rarely need to access outside files
 Hence network traffic does not grow with number of clusters


A schematic of file caching






The cache manager in a client node traps file accesses whose data exists in the cache
Cache validation traffic is needed to ensure validity of cached data

Client-initiated validation is performed by the cache manager at the client site
In server-initiated validation, the server keeps track of copies of data in the caches and invalidates the copies when the data is updated
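
Client-initiated validation could be sketched as below. The use of per-file version numbers is an assumption (the slide does not say how validity is checked), and the names `Server` and `CacheManager` are hypothetical. Each read costs one small validation message, and the full data is transferred only when the cached copy is stale.

```python
class Server:
    def __init__(self):
        self.data = {}       # file id -> contents
        self.version = {}    # file id -> version number, bumped on every update

    def write(self, fid, contents):
        self.data[fid] = contents
        self.version[fid] = self.version.get(fid, 0) + 1

    def get_version(self, fid):          # small validation message
        return self.version[fid]

    def fetch(self, fid):                # full data transfer
        return self.data[fid], self.version[fid]


class CacheManager:
    def __init__(self, server):
        self.server = server
        self.cache = {}      # file id -> (contents, version)
        self.fetches = 0     # counts full transfers, to show the traffic saved

    def read(self, fid):
        cached = self.cache.get(fid)
        if cached is not None and cached[1] == self.server.get_version(fid):
            return cached[0]             # cached copy is still valid
        self.cache[fid] = self.server.fetch(fid)
        self.fetches += 1
        return self.cache[fid][0]


server = Server()
server.write("f", "v1")
cm = CacheManager(server)
r1 = cm.read("f")
r2 = cm.read("f")                        # served from the cache after validation
server.write("f", "v2")
r3 = cm.read("f")                        # stale copy detected, data fetched again
```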



Sun Network File System



Provides a stateless file server for files on a network
– Virtual file system (VFS) layer
* It uses the mount protocol and creates system-wide unique file ids
* Export list of a node identifies a local directory and which remote
nodes can access it
* Permits cascaded mounting of file systems, i.e., mounting over a
mounted file system

– Network file system (NFS) layer
* Uses a directory names cache to perform path name resolution
* Uses the NFS protocol to provide access to a remote file using RPC
 NFS employs file caching to provide high performance




Architecture of the Sun network file system (NFS)

• The VFS interface invokes either a local file system or NFS
• NFS uses the RPC protocol to implement file operations
• The VFS interface acts as both a client and a server


Andrew and Coda file systems




Features
– Scalable performance is obtained through use of clusters and
caching of whole files on nodes within a cluster
– Files of a user exist in one volume
* Volumes can be mounted and migrated within the system

– File sharing semantics
* File is cached in units called chunks; chunk size is adapted
* Server-initiated cache validation using callbacks
 Node caching a file F has a callback on F
 Callback is broken when F is updated

– Features of Coda
* Provides replication using read-one-write-all policy
* Supports disconnected mode of operation
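
The callback mechanism listed under file sharing semantics can be illustrated with the following Python sketch. It is a simplified model with hypothetical names (`CallbackServer`, `Node`), and it assumes an update reaches the server immediately on write, which glosses over when a real client would actually send its changes.

```python
class CallbackServer:
    def __init__(self):
        self.files = {}
        self.callbacks = {}          # file name -> set of nodes holding a callback

    def fetch(self, node, name):
        # The node caching the file now has a callback on it.
        self.callbacks.setdefault(name, set()).add(node)
        return self.files[name]

    def store(self, writer, name, contents):
        self.files[name] = contents
        # Break the callback of every other node caching the file.
        for node in self.callbacks.get(name, set()) - {writer}:
            node.break_callback(name)
        self.callbacks[name] = {writer}


class Node:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:   # no valid callback: fetch the file again
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def write(self, name, contents):
        self.cache[name] = contents
        self.server.store(self, name, contents)

    def break_callback(self, name):
        self.cache.pop(name, None)   # cached copy can no longer be trusted


server = CallbackServer()
server.files["F"] = "old"
a, b = Node(server), Node(server)
a.read("F")
b.read("F")
b.write("F", "new")
a_has_cached_copy = "F" in a.cache
a_rereads = a.read("F")
```

A node holding an unbroken callback can use its cached copy without contacting the server at all, which is where the scheme saves validation traffic.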


General parallel file system (GPFS)



File system for clusters operating under Linux

– Uses data striping; uses large block size to reduce seek overhead and small block size for high transfer rate for small files
– Uses locking to maintain consistency of file data
* Lock granularity is adjusted to trade off overhead against concurrency
 Implemented using a lock token with an adjustable byte range
– Uses partitioned free disk space map to provide concurrency
– Each node writes a journal for recovery
– A failure may partition the cluster
* Only majority partition can continue file processing activities
 Prevents file inconsistency due to concurrent updates
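
The trade-off behind adjustable byte-range lock tokens can be shown with a small Python sketch. This is not GPFS's actual token protocol: the class `ByteRangeLockManager` is hypothetical, and it simply refuses a conflicting request instead of negotiating with the current holder.

```python
class ByteRangeLockManager:
    """Grants a byte-range token only if it overlaps no token held by another node."""

    def __init__(self):
        self.tokens = []     # list of (node, start, end), with end exclusive

    def acquire(self, node, start, end):
        for holder, s, e in self.tokens:
            overlaps = start < e and s < end
            if holder != node and overlaps:
                return False             # conflicting token held by another node
        self.tokens.append((node, start, end))
        return True

    def release(self, node):
        self.tokens = [t for t in self.tokens if t[0] != node]


locks = ByteRangeLockManager()
whole_file = locks.acquire("n1", 0, 1 << 20)   # coarse range: one request, no concurrency
blocked = locks.acquire("n2", 4096, 8192)
locks.release("n1")
fine_a = locks.acquire("n1", 0, 4096)          # fine ranges: disjoint writers proceed together
fine_b = locks.acquire("n2", 4096, 8192)
```

A wide range means few lock requests but serializes the nodes; narrow ranges cost more requests but let nodes update different parts of a file concurrently.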


Windows



Features for replication and distribution
– Remote differential compression (RDC)
* Reduces the replication and file coherence traffic between servers
 Replication is performed using the notion of a replication group
 The RDC protocol synchronizes copies of a replicated folder by transmitting only the changes made in it

– DFS namespaces provide a virtual tree of folders that can be accessed by a client located in any node
* The system administrator associates a list of servers with a folder
* This list is given to a client that wishes to access the folder
 The client contacts the servers in the order of the list to access the folder
 This scheme permits a hot standby arrangement
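
The ordered server list lends itself to a short Python sketch of the hot standby behavior. The function `open_folder`, the server names, and the `is_reachable` probe are all hypothetical; this does not use any Windows API.

```python
def open_folder(server_list, is_reachable):
    """Return the first server in the list that responds."""
    for server in server_list:
        if is_reachable(server):
            return server
    raise ConnectionError("no server in the list is reachable")


servers = ["server1", "server2"]          # order chosen by the administrator
down = {"server1"}                        # simulate a failure of the first server
chosen = open_folder(servers, lambda s: s not in down)
chosen_when_all_up = open_folder(servers, lambda s: True)
```

The second server is contacted only when the first one fails, which is what makes it a hot standby.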