
A User Gets an “Insufficient Disk Space” Message When Adding Files to a Volume
The insufficient disk space message (see Figure 5.107) is to be expected for any of your users who are over their quota limit. Usually this is a good thing, because it means that disk quotas are working. The only way around it is to increase your users’ quota limit or to stop denying space to users who exceed it. If this is happening unexpectedly, verify that your users’ limits are set correctly. A common error is forgetting to change the quota measurement from KB to MB: you may think that your users have 150MB of available space when they actually have only 150KB.
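If you prefer the command line, quota settings can be checked and corrected with the fsutil utility. The following is a minimal sketch; the volume letter, byte values, and account name are placeholders rather than values from this scenario:

    rem Show the quota state, default limits, and per-user usage on a volume.
    fsutil quota query C:

    rem Set a 150MB limit with a 130MB warning threshold for one user.
    rem fsutil takes bytes: threshold first, then limit (157286400 = 150MB).
    fsutil quota modify C: 136314880 157286400 CORP\jsmith

Because the limits are entered in bytes, reading them back with the query command is a quick way to catch the KB-versus-MB mistake described above.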
Troubleshooting Remote Storage
Remember when you are troubleshooting Remote Storage that you are writing data to backup media. This is going to be slower than writing to disk. This is not to say that your performance should be terrible; just be realistic with your expectations. Here are some common Remote Storage troubleshooting issues:
Figure 5.106 Cleaning Up Disk Quotas
Figure 5.107 Exceeding Your Quota Limit

■ Remote Storage will not install.
■ Remote Storage is not finding a valid media type.
■ Files can no longer be recalled from Remote Storage.
Remote Storage Will Not Install
Remote Storage is not installed by default. You must add it through Control Panel | Add or Remove Programs. You must have administrative rights on the machine on which you are installing Remote Storage; without administrative rights, setup will not continue.
Remote Storage Is Not Finding a Valid Media Type
During initial setup, Remote Storage searches for an available media type. If Remote Storage does not find one on your machine, either you have not waited long enough for the search to finish or you do not have a compatible library.


Files Can No Longer Be Recalled from Remote Storage
Remote Storage has a runaway recall limit that stops files from being recalled from storage more than a specified number of times in a row. It is possible that an application is making too many recalls. Once this threshold is crossed, further recalls are denied. If the recalls are legitimate, you can increase the runaway recall limit. If they are not, you need to terminate the application making the requests.
Troubleshooting RAID
When troubleshooting RAID volumes, you must first troubleshoot the disks themselves, so always start with the basic disk and dynamic disk checklists. However, there are times when the problem is with the RAID volume itself and not the underlying disk. This section covers the following:

■ Mirrored or RAID-5 volume’s status is Data Not Redundant.
■ Mirrored or RAID-5 volume’s status is Failed Redundancy.
■ Mirrored or RAID-5 volume’s status is Stale Data.
Mirrored or RAID-5 Volume’s Status is Data Not Redundant
A Data Not Redundant status indicates that your volume is not intact. This is caused by moving disks from one machine to another without moving all the disks in the volume. Wait to import your disks until you have all the disks in the volume physically connected to the server. When you then import them, Windows will see them as a complete volume and retain their configuration.
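The import can also be scripted with diskpart once every member disk is attached. This is a minimal sketch, assuming the foreign group shows up starting at disk 1 (save as import.txt and run with diskpart /s import.txt):

    rem import.txt - import a foreign dynamic disk group.
    rescan
    list disk
    rem Selecting any one disk in the foreign group is enough; the
    rem import command brings in the whole group. Disk 1 is a placeholder.
    select disk 1
    import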
Mirrored or RAID-5 Volume’s Status is Failed Redundancy
Failed Redundancy, as shown in Figures 5.108 and 5.109, occurs when one of the disks in a fault-tolerant volume fails. Your volume will continue to work, but it is no longer fault tolerant. If another disk fails, you will lose all the data on that volume. You should repair the failed disk as quickly as possible.
Your mirrored volume will need to be re-created after you replace the disk. Right-click the defective disk and select Remove Mirror. Then right-click the working disk, select Add Mirror, and choose the new disk as the mirror. To repair a RAID-5 volume, install the replacement disk, then right-click the volume and choose Repair RAID-5 Volume.
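Re-creating the mirror can also be done from a diskpart script. A sketch, under the assumption that volume 2 is the surviving simple volume and disk 4 is the replacement disk (the Remove Mirror step above has already been performed):

    rem Give focus to the surviving simple volume, then mirror it
    rem onto the replacement disk. Both numbers are placeholders.
    select volume 2
    add disk=4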
Mirrored or RAID-5 Volume’s Status is Stale Data
Stale data occurs when a volume’s fault-tolerant information is not completely up to date. This happens in a mirrored volume if something has been written to the primary disk but, for whatever reason, hasn’t made it to the mirror disk yet. It occurs in a RAID-5 volume when the parity information isn’t up to date.
If you try to move a volume while it contains stale information, you will get a status of Stale Data when you try to import the disk. Move the disk back to the machine it was originally in and rescan that machine for new disks. After all the disks are discovered, wait until their status is Online and Healthy before you try to move them again.
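The rescan does not require the Disk Management console; the equivalent diskpart commands are:

    rem Rescan for newly attached disks, then confirm that every disk
    rem and volume reports Online and Healthy before moving them again.
    rescan
    list disk
    list volume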
Figure 5.108 Recovering a Failed Mirrored Volume
Figure 5.109 Recovering a Failed RAID-5 Volume
Implementing Windows Cluster Services and Network Load Balancing
In this chapter:

■ Making Server Clustering Part of Your High-Availability Plan
■ Making Network Load Balancing Part of Your High-Availability Plan
Introduction
Fault tolerance generally involves redundancy; for example, in the case of disk fault tolerance, multiple disks are used. The ultimate in fault tolerance is the use of multiple servers, configured to take over for one another in case of failure or to share the processing load. Windows Server 2003 provides network administrators with two powerful tools to enhance fault tolerance and high availability: server clustering (only in the Enterprise and Datacenter Editions) and Network Load Balancing (included in all editions).

This chapter looks first at server clustering and shows you how to make clustering services part of your enterprise-level organization’s high-availability plan. We’ll start by introducing you to the terminology and concepts involved in understanding clustering. You’ll learn about cluster nodes, cluster groups, failover and failback, name resolution as it pertains to cluster services, and how server clustering works. We’ll discuss three cluster models: single-node, single quorum device, and majority node set. Then we’ll talk about cluster deployment options, including N-node failover pairs, hot standby server/N+1, failover ring, and random. You’ll learn about cluster administration, and we’ll show you how to use the Cluster Administrator tool as well as command-line tools.
Next, we’ll discuss best practices for deploying server clusters. You’ll learn about hardware issues, especially those related to network interface controllers, storage devices, power-saving features, and general compatibility issues. We’ll discuss cluster network configuration, and you’ll learn about multiple interconnections and node-to-node communications. We’ll talk about the importance of binding order, adapter settings, and TCP/IP settings. We’ll also discuss the default cluster group. Next, we’ll move on to the subject of security for server clusters. This includes physical security, public/mixed networks, private
networks, secure remote administration of cluster nodes, security issues involving the cluster service account, and how to limit client access. We’ll also talk about how to secure data in a cluster, how to secure disk resources, and how to secure cluster configuration log files.
The next section addresses how to make Network Load Balancing (NLB) part of your high-availability plan. We’ll introduce you to NLB concepts such as hosts/default host, load weight, traffic distribution, convergence, and heartbeats. You’ll learn about how NLB works and the relationship of NLB to clustering. We’ll show you how to manage NLB clusters using the NLB Manager tool, remote-management tools, and command-line tools. We’ll also discuss NLB error detection and handling. Next, we’ll move on to monitoring NLB using the NLB Monitor Microsoft Management Console (MMC) snap-in or the Windows Load Balancing Service (WLBS) cluster control utility. We’ll discuss best practices for implementing and managing NLB, including issues such as multiple network adapters, protocols and IP addressing, and NLB Manager logging. Finally, we’ll address NLB security.
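As a preview of that WLBS utility, the sketch below shows a typical maintenance cycle on an NLB host; the commands are standard, but treat the sequence as illustrative rather than a recommended procedure:

    rem Show the state of the local host and the cluster's convergence status.
    wlbs query

    rem Finish serving existing connections, then leave the cluster.
    wlbs drainstop

    rem Rejoin the cluster when maintenance is complete.
    wlbs start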
Making Server Clustering Part of Your High-Availability Plan
Certain circumstances require an application to be available more than standard hardware can guarantee. Databases and mail servers often have this need. Using server clustering, it is possible to have more than one server ready to run critical applications. Server clustering also provides the capability to manage the operation of the application automatically, so that if one server experiences a failure, another server automatically takes over and keeps the application running. Server clustering is a critical component of a high-availability plan. We’ll discuss high-availability strategies in the next chapter.
The basic idea of server clustering has been around for many years on other computing platforms. Microsoft initially released its server cluster technology as part of Windows NT 4.0 Enterprise Edition. It supported two nodes and a limited number of applications. Server clustering was further refined with the release of the Windows 2000 Advanced Server and Datacenter Server Editions. Server clusters were simpler to create, and more applications were available. In addition, some publishers began to make their applications “cluster-aware,” so that their applications installed and operated more easily on a server cluster. Now, with the release of Windows Server 2003, we see another level of improvement in server clustering technology. Server clusters now support larger node counts and more robust configurations, and they are easier to create and manage. Features that were available only in the Datacenter Edition of Windows 2000 have now been made available in the Enterprise Edition of Windows Server 2003.
Terminology and Concepts
Although the term has been used already, a more formal definition of a server cluster is needed. For our purposes, a server cluster is a group of independent servers that work together to increase application availability to client systems and appear to clients under one common name. The independent servers that make up a server cluster are individually called nodes. Nodes in a server cluster monitor each other’s status through a communication mechanism called a heartbeat. The heartbeat is a series of messages that allow the server cluster nodes to detect communication failures and, if necessary,
perform a failover operation. A failover is the process by which resources are stopped on one node
and started on another.
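Failovers can be observed and triggered from the command line with the cluster.exe utility. A minimal sketch, where "Cluster Group" is the built-in default group and NODE2 is a placeholder node name:

    rem List all groups and the node that currently owns each one.
    cluster group

    rem Manually fail the default group over to another node.
    cluster group "Cluster Group" /moveto:NODE2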
Cluster Nodes
A server cluster node is an independent server. This server must be running Windows 2000 Advanced Server, Windows 2000 Datacenter Server, Windows Server 2003 Enterprise Edition, or Windows Server 2003 Datacenter Edition. The two editions of Windows Server 2003 cannot be used in the same server cluster, but either can exist in a server cluster with a Windows 2000 Advanced Server node. Since Windows Server 2003 Datacenter Edition is available only through original equipment manufacturers (OEMs), this chapter deals with server clusters constructed with the Enterprise Edition of Windows Server 2003 unless specifically stated otherwise.
A server cluster node should be a robust system. When designing your server cluster, do not overlook applying fault-tolerant concepts to the individual nodes. Using individual fault-tolerant components to build fault-tolerant nodes, which in turn build fault-tolerant server clusters, can be described as “fault tolerance in depth.” This approach will increase overall reliability and make your life easier.
A server cluster consists of anywhere between one and eight nodes. These nodes do not necessarily need to have identical configurations, although that is a frequent design element. Each node in a server cluster can be configured to have a primary role that is different from those of the other nodes in the server cluster. This allows better overall utilization of the server cluster, because each node can actively provide services. A node is connected to one or more storage devices, which contain disks that house information about the server cluster. Each node also contains one or more separate network interfaces that provide client communications and support heartbeat communications.
Cluster Groups
The smallest unit of service that a server cluster can provide is a resource. A resource is a physical or logical component that can be managed on an individual basis and can be independently activated or deactivated (called bringing the resource online or offline). A resource can be owned by only one node at a time.
There are several predefined (called “standard”) types of resources known to Windows Server 2003. Each type is used for a specific purpose. The following are some of the most common standard resource types:


■ Physical Disk Represents and manages disks present on a shared cluster storage device. Can be partitioned like a regular disk. Can be assigned a drive letter or used as an NTFS mounted drive.
■ IP Address Manages an IP address.
■ Network Name Manages a unique NetBIOS name on the network, separate from the NetBIOS name of the node on which the resource is running.
■ Generic Service Manages a Windows operating system service as a cluster resource. Helps ensure that the service operates in one place at one time.
■ Generic Script Manages a script as a cluster resource (new to Windows Server 2003).
■ File Share Creates and manages a Windows file share as a cluster resource.
Other standard resource types allow you to manage clustered print servers, Dynamic Host Configuration Protocol (DHCP) servers, Windows Internet Name Service (WINS) servers, and generic noncluster-aware applications. (It is also possible to create new resource types through the use of dynamic link library files.)
Individual resources are combined to form cluster groups. A cluster group is a collection of server resources that defines the relationships of the resources within the group to each other and defines the unit of failover, so that if one resource moves between nodes, all resources in the group also move. As with individual resources, a cluster group can be owned by only one node at a time. To use an analogy from chemistry, resources are atoms and groups are compounds. The cluster group is the primary unit of administration in a server cluster. Similar or interdependent resources are combined into the same group. A resource cannot be dependent on another resource that is not in the same cluster group. Most cluster groups are designed around either an application or a storage unit. It is in this way that individual applications or disks in a server cluster are controlled independently of other applications or disks.
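To make the atoms-and-compounds analogy concrete, here is a hedged cluster.exe sketch that assembles a small group from individual resources. The group name, IP address, and network name are hypothetical:

    rem Create an empty group, then populate it with resources.
    cluster group "FileSrv" /create

    cluster res "FileSrv IP" /create /group:"FileSrv" /type:"IP Address"
    cluster res "FileSrv IP" /priv Address=10.0.0.50 SubnetMask=255.255.255.0

    cluster res "FileSrv Name" /create /group:"FileSrv" /type:"Network Name"
    cluster res "FileSrv Name" /priv Name=FILESRV
    rem The network name cannot come online until its IP address is online.
    cluster res "FileSrv Name" /adddep:"FileSrv IP"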
Failover and Failback
If a resource on a node fails, the cluster service will first attempt to reactivate the resource on the same node. If it is unable to do so, the cluster service will move the cluster group to another node in the server cluster. This process is called a failover. A failover can be triggered manually by the administrator or automatically by a node failure. A failover can involve multiple nodes, if the server cluster is configured this way, and each group can have different failover policies defined.
A failback is the corollary of a failover. When the original node that hosted the failed-over resource(s) comes back online, the cluster service can return the cluster group to operation on the original node. This failback policy can be defined individually for each cluster group or disabled entirely. Failback is usually performed at times of low utilization to avoid impacting clients, and it can be set to follow specific schedules.
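Failback policy is stored as a set of group properties and can be adjusted with cluster.exe. A sketch, assuming a group named "FileSrv" that should fail back only between 2:00 A.M. and 4:00 A.M.:

    rem AutoFailbackType: 0 = prevent failback, 1 = allow failback.
    rem The window hours (0-23) confine failback to off-peak times.
    cluster group "FileSrv" /prop AutoFailbackType=1
    cluster group "FileSrv" /prop FailbackWindowStart=2 FailbackWindowEnd=4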
Cluster Services and Name Resolution
A server cluster appears to clients as one common name, regardless of the number of nodes in the server cluster. It is for this reason that the server cluster name must be unique on your network. Ensure that the server cluster name is different from the names of other server clusters, domain names, servers, and workstations on your network. The server cluster will register its name with the WINS and DNS servers configured on the node running the default cluster group.
Individual applications that run on a server cluster can (and should) be configured to run in separate cluster groups. These applications must also have unique names on the network and will also automatically register with WINS and DNS. Do not use static WINS entries for your resources; doing so will prevent an update to the registered WINS address in the event of a failover.
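Registration can be verified with ordinary name-resolution tools. A quick sketch, where SQLCLUSTER is a placeholder virtual server name:

    rem Confirm that the DNS registration points at the cluster IP address.
    nslookup SQLCLUSTER

    rem Confirm that the NetBIOS name is registered as well.
    nbtstat -a SQLCLUSTER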
How Clustering Works
Each node in a server cluster is connected to one or more storage devices. These storage devices contain one or more disks. If the server cluster contains two nodes, you can use either a SCSI or a Fibre Channel interface to the storage devices. For server clusters of three or more nodes,
Fibre Channel is recommended. If you are using a 64-bit edition of Windows Server 2003, Fibre
Channel is the required interface, regardless of the number of nodes.

Fibre Channel has many benefits over SCSI. Fibre Channel is faster and easily expands beyond
two nodes. Fibre Channel cabling is simpler, and Fibre Channel automatically configures itself.
However, Fibre Channel is also more expensive than SCSI, requires more components, and can be
more complicated to design and manage.
On any server cluster, there is something called the quorum resource. The quorum resource is used to determine the state of the server cluster. The node that controls the quorum resource controls the server cluster, and only one node at a time can own the quorum resource. This prevents a situation called split-brain, which occurs when more than one node believes it controls the server cluster and behaves accordingly. Split-brain was a problem that occurred early in the development of server cluster technologies; the introduction of the quorum resource solved it.
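You can check which resource is currently acting as the quorum with cluster.exe:

    rem Display the quorum resource, its log path, and the maximum log size.
    cluster /quorum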
Cluster Models
There are three basic server cluster design models available to choose from: single node, single quorum device, and majority node set. Each is designed to fit a specific set of circumstances. Before you begin designing your server cluster, make sure you have a thorough understanding of these models.
Single Node
A single-node server cluster model is primarily used for development and testing purposes. As its name implies, it consists of one node. An external disk resource may or may not be present. If an external disk resource is not present, the local disk is configured as the cluster storage device, and the server cluster configuration is kept there.
Failover is not possible with this server cluster model, because there is only one node. However, as with any server cluster model, it is possible to create multiple virtual servers. (A virtual server is a cluster group that contains its own dedicated IP address, network name, and services, and is indistinguishable from other servers from a client’s perspective.) Figure 6.1 illustrates the structure of a single-node server cluster.
Figure 6.1 Single Node Server Cluster
If a resource fails, the cluster service will attempt to automatically restart any applications and dependent resources. This can be useful for applications that do not have built-in restart capabilities but would benefit from them.
Some applications that are designed for use on server clusters will not work on a single-node cluster model; Microsoft SQL Server and Microsoft Exchange Server are two examples. Applications like these require the use of one of the other two server cluster models.
Single Quorum Device
The single quorum device server cluster model is the most common and will likely continue to be the most heavily used. It has been around since Microsoft first introduced its server clustering technology.
This type of server cluster contains two or more nodes, and each node is connected to the cluster storage devices. There is a single quorum device (a physical disk) that resides on the cluster storage device, and a single copy of the cluster configuration and operational state is stored on that quorum resource.
Each node in the server cluster can be configured to run different applications or to act simply as a hot-standby device waiting for a failover to occur. Figure 6.2 illustrates the structure of a single quorum device server cluster with two nodes.
Majority Node Set
The majority node set (MNS) model is new in Windows Server 2003. Each node in the server cluster may or may not be connected to a shared cluster storage device. Each node maintains its own copy of the server cluster configuration data, and the cluster service is responsible for ensuring that this configuration data remains consistent across all nodes. Synchronization of quorum data occurs over Server Message Block (SMB) file shares; this communication is unencrypted. Figure 6.3 illustrates the structure of the MNS model.
Figure 6.2 Single Quorum Device Server Cluster
This model is normally used as part of an OEM pre-designed or pre-configured solution. It has the ability to support geographically dispersed server clusters. When used in geographically dispersed configurations, network latency becomes an issue: you must ensure that the round-trip network latency is no more than 500 milliseconds (ms), or you will experience availability problems.
The behavior of an MNS server cluster differs from that of a single quorum device server cluster. In a single quorum device server cluster, one node can fail and the server cluster can still function. This is not necessarily the case in an MNS cluster. To avoid split-brain, a majority of the nodes must be active and available for the server cluster to function. In essence, this means that 50 percent plus 1 of the nodes must be operational at all times for the server cluster to remain operational. Table 6.1 illustrates this relationship.
Table 6.1 Majority Node Set Server Cluster Failure Tolerance

Number of Nodes in    Maximum Node Failures before    Nodes Required to Continue
MNS Server Cluster    Complete Cluster Failure        Cluster Operations
1                     0                               1
2                     0                               2
3                     1                               2
4                     1                               3
5                     2                               3
6                     2                               4
7                     3                               4
8                     3                               5
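The table follows directly from the 50-percent-plus-1 rule. A small cmd sketch of the arithmetic (integer division rounds down; 5 nodes is just an example):

    rem For an MNS cluster, compute the majority required to stay
    rem running and the number of failures that can be absorbed.
    set /a nodes=5
    set /a required=nodes/2+1
    set /a tolerated=nodes-required
    echo Required: %required%  Tolerated failures: %tolerated%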
Figure 6.3 A Majority Node Set Server Cluster