Contents
Overview
Lesson: Identifying the Features of a Highly Available Web Infrastructure
Lesson: Calculating System Availability
Lesson: Supporting a Highly Available Web Infrastructure

Module 1: Introduction to Designing a Highly Available Web Infrastructure



Information in this document, including URL and other Internet Web site references, is subject to
change without notice. Unless otherwise noted, the example companies, organizations, products,
domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious,
and no association with any real company, organization, product, domain name, e-mail address,
logo, person, place or event is intended or should be inferred. Complying with all applicable
copyright laws is the responsibility of the user. Without limiting the rights under copyright, no
part of this document may be reproduced, stored in or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or
otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.

© 2001 Microsoft Corporation. All rights reserved.

Microsoft, MS-DOS, Windows, Windows NT, Active Directory, BackOffice, FrontPage, Outlook,
PowerPoint, Visio, Visual Studio, Win32, and Windows Media are either registered trademarks or
trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their
respective owners.




Instructor Notes
This module provides the students with an overview of the features of a highly
available Web infrastructure, how to calculate availability, and what processes
they can use to support a highly available Web infrastructure.
As a Web infrastructure designer, the student will be required to position
mission-critical applications and services in their highly available Web
infrastructure and identify the requirements for each of these applications and
services. The student must ensure availability if an application or service fails
by identifying the possible single points of failure in the infrastructure.
After completing this module, students will be able to:
• Identify the characteristics of a highly available Web infrastructure.
• Calculate the availability of a Web infrastructure.
• Analyze the solution architecture.

Presentation: 135 minutes
Practice: 15 minutes
Discussion: 30 minutes

Required materials
To teach this module, you need the following materials:
• Microsoft® PowerPoint® file 2088A_01.ppt
• Delivery Guide
• Trainer Materials compact disc

Preparation tasks
To prepare for this module:
• Read all of the materials for this module.
• Complete the practice and discussion.


How to Teach This Module
You will notice a new instructional design and strategy for this module and the
rest of the modules in this course. The modules were designed by analyzing the
job functions of a Web infrastructure designer, which then became the module
titles. For each job function, the applicable tasks were then identified. Each task
is a lesson in the module.
At the end of each lesson (task), there is a guidelines page for the task that
identifies the action steps that the students must address to complete the task.
To further assist the student in applying what they have learned, there are best
practices where appropriate for each topic on the guidelines page.
You may find it useful to tell the students that for each lesson the guidelines
page is what they will work toward. Each of the action steps on the guidelines
page will have supporting content on the topic pages if this is required. You
may find that some of the action steps will not require additional information
because it is assumed that the student already has the prerequisite knowledge
for that step.
Ensure that the students understand that each lesson is a critical task in the
design process and that, at the end of the module, they will complete a lab that helps
to tie all of the lessons (tasks) together. This understanding will help the student
to stay focused during instruction.
The instructional strategy for this course is to introduce the students to the
concepts of a highly available Web infrastructure, how they assess the
availability of their Web infrastructure, and the processes that they need to
incorporate into their Web design.
Lesson: Identifying the Features of a Highly Available Web Infrastructure
The overview page for this lesson introduces the characteristics of a highly
available Web infrastructure. It also provides a brief introduction to how the
student can make the front-end and back-end systems more highly available.
Their ultimate goal is to identify and eliminate single points of failure.
The topic pages for this lesson and the appropriate instructional strategies are
listed as follows:

A highly available Web infrastructure
This page provides a high-level introduction to how the student can increase
availability and where they should concentrate their efforts. The content is
self-explanatory and does not require that you spend a lot of time on each subtopic.
The overall goal of the page is to identify the following features of a highly
available Web infrastructure and how they can improve availability:
• Increasing availability
• Networking infrastructure
• Security mechanisms
• Management and operations

Load-balancing technologies
This page provides an overview of load-balancing technologies. The students
should already be familiar with the concepts (prerequisite knowledge) and you
do not need to spend more time than it takes to read the slide and add a brief
comment. To prepare for this page, you should have taken Course 2087A,
Implementing Microsoft® Windows® 2000 Clustering, or be familiar with:
• Software load balancing
• Hardware load balancing
• Round robin DNS

Clustering technologies
This page provides an overview of clustering technologies. The students are
expected to know this content. You should have taken Course 2087A,
Implementing Microsoft Windows 2000 Clustering, or be familiar with:
• Component load balancing
• Microsoft Cluster service
• Network Load Balancing

Overview of Microsoft products
You should be familiar with all of the Microsoft products that are part of a
highly available Web infrastructure, including:
• Microsoft Application Center 2000
• Microsoft Windows 2000 Datacenter Server
• Microsoft Internet Information Services 5.0
• Microsoft Internet Security and Acceleration Server
• Microsoft SQL Server 2000

Guidelines for identifying the components of a highly available Web infrastructure
The guidelines page provides the students with the action steps that they must
address before they can identify the features of a highly available Web
infrastructure. You should review the action steps with the students and ensure
that they understand how these steps map to the task.
Lesson: Calculating System Availability
The overview page for this lesson introduces the concept of calculating
availability. The students will need to understand how to calculate the mean
time between failures, mean time to repair, and system availability.
The topic pages for this lesson and the appropriate instructional strategies are
listed as follows:

A highly available system
This page defines a highly available system in terms of a level of nines
(availability class). The content on this page is self-explanatory and is primarily
prerequisite knowledge on:
• Service level agreements
• Availability classes

Single points of failure
Most of the content on this page is a review for the students. Of special interest
is a reference table that lists the recommended Microsoft clustering solution
used to eliminate single points of failure.
• Common points of failure


Hardware Reliability
This page introduces the importance of predicting hardware reliability. The
slide graphic identifies the three phases of a hardware component’s life cycle.
You must understand and adequately explain the dynamics of the three phases
of a hardware component’s life cycle.
• Mean time between failures

Calculate mean time to repair
This page provides formulas for calculating the mean time to repair (MTTR), or
the length of time it would take to repair a failed component. You need to
explain the importance of MTTR so that the students will understand how they
can apply this knowledge when they calculate system availability.
• Calculating MTTR

Calculate system availability
Students will learn how to calculate overall system availability. After students
have determined the mean time between failures and the mean time to repair,
they can then determine system availability by using the formula provided.
Some examples are included that require only a brief explanation.
• Calculating availability
• Calculate maximum availability

Guidelines for calculating system availability
The guidelines page provides the students with the action steps that they must
address before they can calculate availability. Each action step includes a single
best practice that the student should consider. Review the action steps with the
students and ensure that they understand how these steps relate to the task.
Practice: Calculating Availability
Students are required to read two scenarios and then use the formulas from the
lesson to calculate availability. They should be able to complete this exercise
with very little guidance.
Lesson: Supporting a Highly Available Web Infrastructure
The overview page for this lesson introduces students to the components of a
highly available Web infrastructure, which include program design, hardware,
delivery mechanisms, and sustaining engineering.
The topic pages for this lesson and the appropriate instructional strategies are
listed as follows:

Microsoft Operations Framework
This page gives the students a high-level overview of Microsoft Operations
Framework (MOF). It identifies the three main sources of risk that the students
must analyze and prioritize in their Web design. Review each of the subtopics
and highlight the bullet points for each.
• People, process, and technology
• Process model
• Team model
• Technology component
• Risk analysis model


Application architecture fundamentals
You can use the slide to provide a brief overview of the content. The student
should already be very familiar with the concepts presented on this page.
• Development strategies
• Windows DNA
• Two-tier model
• Three-tier model
• N-tier model

Infrastructure processes
This page addresses the common causes for unplanned downtime and some best
practices for maintaining high availability. You need to encourage the student
to develop an infrastructure process for ensuring that their Web infrastructure
will always be highly available.

Change management process
This is the only section that has a high-level overview of the change
management process. Review the steps of the change management process that
are outlined on the page. You might want to ask the students what process they
use in their organizations and how it compares to what is suggested here.
• Change management process

Highly available solution documentation
This page provides an overview of the types of documentation that are
recommended as part of the design process. Review the types of
documentation, and if time permits, ask the students what types of
documentation they are currently using and why.

Security policies and procedures
This page provides an overview of the types of security policies and procedures
that the student should consider while designing a highly available Web
infrastructure.
• Admitting personnel
• Admitting hardware

Technology considerations
This page provides a high-level overview of the technology considerations and
some best practices for each.
• Hardware
• Software
• Testing

Facility considerations
This page provides a high-level overview of facility considerations. You can
use the slide to discuss all of the major points. Be sure to review the best
practices with the students.
• Facilities management
• Assessing single points of failure
• Maintaining physical security
• Best practices


The guidelines page provides the students with the action steps that they must
address before they can create a design that supports a highly available Web
infrastructure. Review the action steps with the students and ensure that they
understand how these steps relate to the task. Tell the students that these steps
were not placed in any specific order. They can reorder them as required to
meet their particular design. Emphasize to the students the importance of
addressing all of these requirements.
Discussion
Allow thirty minutes for the discussion at the end of this lesson. Give the
students approximately 5 minutes to read the scenario, 5 minutes to answer the
five questions, and the remaining 20 minutes to discuss the answers as a class.
Divide the class into design teams of three or more people. Try to learn who the
more and less experienced students in the class are so that you can create evenly
skilled teams.
Let the students know that each team will present and justify their answers to
the rest of the class.

Guidelines for calculating system availability



Overview
• Identifying the Features of a Highly Available Web Infrastructure
• Calculating System Availability
• Supporting a Highly Available Web Infrastructure

Introduction
As the Web infrastructure designer, you will be required to position
mission-critical applications and services in your highly available Web infrastructure
and identify the requirements for each of these applications and services. You
must also identify the possible single points of failure in the infrastructure so
that you can ensure availability if an application or service fails.
As you design your Web infrastructure solution, you need to consider the
following questions:
• What are the specific requirements for your clustering solution? Determine
whether you need to coordinate with other hosts or nodes in a cluster or
provide data backup or content replication.
• What are your software compatibility issues? Determine whether the
clustering technology can support your Web applications, and whether the
applications are stateless or maintain client-side state.
• What are your hardware compatibility issues? Determine whether the
hardware components that you want to use are on the Hardware
Compatibility List (HCL).
• What are the expected changes in size and performance requirements for
your applications and services? Using business requirements, anticipate the
projected growth for your Web infrastructure solution.
• What is the impact of downtime to the organization? Determine the potential
impact to the organization if the system cannot maintain a specific level of
availability. Impacts to the business can include a decline in customer
satisfaction, loss of competitiveness, or increase in costs.


Physical and logical networks
While you are designing your network infrastructure, you use hardware data to
document your infrastructure’s physical structure; you then use the physical
structure to document the positioning of services and the configuration of the
protocols that you will use in your network. You also need to document the
organization of your logical network, name and address resolution methods, and
the positioning and configuration of services that you use.
This course teaches principles that are used to design a highly available Web
infrastructure solution by using Microsoft applications, services, and
technologies. It also provides guidelines that you can use as you design your
physical and logical networks. The lesson practices and module labs allow you
to apply what you have learned. In addition, you can use the worksheet at the
end of each module as a job aid.

Note
Planning strategies for cluster backup and recovery and for site disaster
recovery are beyond the scope of this course.

Module objectives
After completing this module, you will be able to:
• Identify the features and components of a highly available Web
infrastructure.
• Calculate the availability and reliability of a highly available Web
infrastructure.
• Identify the processes that are required to support a highly available Web
infrastructure.


Lesson: Identifying the Features of a Highly Available Web Infrastructure
• A Highly Available Web Infrastructure
• Load-Balancing Technologies
• Microsoft Clustering Technologies
• Guidelines for Identifying the Features of a Highly Available Web Infrastructure

Introduction
A highly available Web infrastructure is a Web infrastructure that ensures high
performance and fault tolerance for mission-critical Web applications and
services.
You can make the front-end systems on the Web tier highly available through
the use of multiple hosts in a cluster configuration, with the cluster presenting a
single Internet Protocol (IP) address to its clients. You can use Network Load
Balancing to distribute the load across the hosts.
The servers on your back-end systems must maintain high availability for
databases. You can make the back-end systems highly available through the use
of failover clustering for each data partition.
To ensure high availability, you must be able to identify and eliminate single
points of failure in your Web infrastructure solution.

Lesson objectives
After completing this lesson, you will be able to:
• Identify the components, security mechanisms, and management and
operations features of a highly available Web infrastructure.
• Identify load-balancing technologies that you can use when designing a
highly available Web infrastructure.
• Identify Microsoft clustering technologies that you can use when designing
a highly available Web infrastructure.
• Identify the features of a highly available Web infrastructure.


A Highly Available Web Infrastructure
[Figure: Example network diagram of a highly available Web infrastructure. It shows redundant connections to two ISPs (ISP1 and ISP2) through paired routers, firewalls, an ISA Server Network Load Balancing cluster (ISA01, ISA02), two load-balanced IIS Web clusters (IIS01/IIS02 and IIS03/IIS04), BizTalk servers, Active Directory and DNS servers, SQL Server failover clusters with a warm standby server, a storage area network with a tape system, VPN access from the corporate network, and management servers, all separated into VLAN subnets 11 through 23.]
Introduction
Availability for a Web infrastructure is a measure of fault tolerance for a
computer, cluster, or system, and its applications. This measure takes into
account both the mean time between failures (MTBF) and the mean time to
repair (MTTR), and it may include downtime for both planned and unplanned
events.
The graphic in the slide gives an example of a highly available Web
infrastructure that includes the following products and technologies:
• Microsoft® Internet Security and Acceleration (ISA) Server 2000 is used to
provide firewall capabilities.
• Network Load Balancing and Server Clusters are used to provide a greater
level of reliability.
• Failover networking components are used to provide redundant paths.
• Management devices are used to administer the entire Web infrastructure.

Increasing availability
The primary technique for increasing the availability of a site is to add
redundant components. You can use these redundant components to create
multiple communications paths, multiple servers that offer the same service,
and standby servers that take over in the event of a server failure.
For example, to create complete redundancy in a large site, you would use
multiple Web clusters, and you would configure the back-end servers as failover
clusters by using Microsoft Cluster service. In addition, you would ensure that
your site had connections to multiple Internet service providers (ISPs) and a
separate management network.

Networking infrastructure
Your networking infrastructure and the connectivity of your site to the Internet
must be available continuously. It is recommended that your design include
multiple connections to the Internet by using multiple ISPs. For the highest
availability, you also need to consider using multiple power feeds and
redundant uninterruptible power supplies.
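The relationship between MTBF, MTTR, and redundancy described above can be illustrated with a short calculation. This is a minimal sketch, not the course's own worksheet: it assumes the commonly used steady-state relation availability = MTBF / (MTBF + MTTR), and it assumes that redundant components fail independently of one another. The function names and the sample figures are hypothetical.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a component is up.

    Assumes the common relation A = MTBF / (MTBF + MTTR).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` identical components operating in parallel.

    Assumes independent failures: the group is unavailable only when
    every copy is down at the same time.
    """
    return 1.0 - (1.0 - single) ** copies


# Hypothetical figures: a server that fails on average every 1,000 hours
# and takes 10 hours to repair.
one_server = availability(1000.0, 10.0)
two_servers = redundant_availability(one_server, 2)

print(f"one server:  {one_server:.4%}")
print(f"two servers: {two_servers:.4%}")
```

Under these assumptions, adding a second server moves the result from roughly two nines to roughly four nines, which is why redundancy is the first technique to consider. Real components rarely fail fully independently (shared power, shared switches), so a figure computed this way is an upper bound.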


Security mechanisms
Security involves managing risks by providing adequate protection for the
confidentiality, privacy, integrity, and availability of information. You can meet
these objectives by using security mechanisms and services, such as encryption,
authentication, authorization, accountability, and administration. However,
because security mechanisms are never perfect, you also need to use detection
mechanisms (monitoring and auditing) that generate alarms or other
notifications when an external attack occurs.
Site security is not an add-on feature. You must plan security in advance and
base it on your assessment of risks and the costs to implement protection. The
security domain model is a valuable tool to ensure that adequate, cost-effective
security is implemented throughout the site.

Management and operations
Integrated management and deployment tools allow smooth and rapid growth of
Web sites. The management system itself must be highly available to ensure
continuous operations. A well-designed management operations solution will:
• Separate the management network from service networks for high
availability and increased security.
• Distribute management network components to:
  • Eliminate or reduce performance bottlenecks.
  • Eliminate single points of failure.
  • Allow independent scaling.
  • Increase availability of the management system.
• Employ Microsoft tools and products where possible to achieve greater
performance due to tight integration with the underlying platform.
• Automate tasks where possible.
• Monitor everything to improve infrastructure performance and identify
problems before they occur.


Load-Balancing Technologies
• Software load balancing
• Hardware load balancing
• Round robin DNS

Introduction
Load balancing is a technology that you can use to design a highly available
Web infrastructure. Three primary methods of load balancing are:
• Load-balancing software solutions.
• Load-balancing hardware solutions, such as Cisco LocalDirector, F5
Networks BIG-IP, and Alteon WebSystems ACEdirector.
• Round robin Domain Name System (DNS).

Software load balancing
Many software-based load-balancing products employ dispatcher-based models
for load balancing. These central dispatching models, whether implemented by
network address translation (NAT) or other methods, such as Hypertext
Transfer Protocol (HTTP) redirects, introduce an overhead that limits
throughput, restricts performance, and introduces a single point of failure. Also,
the cluster’s throughput is limited by the speed and processing power of the
dispatch server.
Newer software load-balancing models, such as Microsoft Network Load
Balancing, use a fully distributed software load-balancing model to avoid
dependency on a single server or node. Unlike dispatcher-based load balancers,
where performance is limited by the central server, Network Load Balancing
performance scales with the speed of the cluster member hosts and local area
network (LAN). Because of this scalability, the load-balancing software itself
does not become a bottleneck as LAN and host speeds increase.
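The difference between a central dispatcher and a fully distributed model can be sketched in a few lines. The example below is an illustration of the general idea only and is not the actual Network Load Balancing algorithm: it assumes every host sees every incoming request and applies the same deterministic hash to decide, on its own, whether to accept it. The function names, host names, and client address are hypothetical.

```python
import hashlib
from typing import Sequence


def owner(client_ip: str, live_hosts: Sequence[str]) -> str:
    """Return the host that should handle requests from this client.

    Every host evaluates the same hash over the same membership list,
    so all hosts reach the same answer without a central dispatcher.
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(live_hosts)
    # Sort so the result does not depend on the order of the list.
    return sorted(live_hosts)[index]


def should_accept(me: str, client_ip: str, live_hosts: Sequence[str]) -> bool:
    """Decision that each host makes locally for each incoming request."""
    return owner(client_ip, live_hosts) == me


hosts = ["IIS01", "IIS02", "IIS03"]   # hypothetical cluster members
client = "203.0.113.7"                # documentation-range address

accepting = [h for h in hosts if should_accept(h, client, hosts)]
print(accepting)  # exactly one host accepts the request

# If that host fails, the survivors recompute ownership among themselves.
survivors = [h for h in hosts if h != accepting[0]]
print(owner(client, survivors))
```

Because no single node makes the routing decision, there is no dispatcher to overload or to fail. The cost is that the hosts must agree on which members are alive, which is why the cluster also has to monitor its member hosts.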


Hardware load balancing
Load-balancing hardware can redirect Transmission Control Protocol/Internet
Protocol (TCP/IP) requests to multiple servers in a server farm, providing a
highly scalable, interoperable solution that is also very reliable. These
load-balancing devices sit between the connection to the Internet and the Web farm.
All requests come to the device by using a common IP address, and then the
device forwards each request to a different Web server based on various
algorithms implemented in the device. Hardware load-balancing products have
many of the same disadvantages found with software-based load-balancing
products that use the central dispatching model.

Round robin DNS
Round robin DNS allows a pool of servers to appear as a single host to the
clients. In reality, client requests are directed alternately to all servers in the
pool, and the traffic, therefore, is distributed across the servers.
Round robin DNS does not function effectively as a high-availability solution.
In the event of a server failure, round robin DNS continues to route requests to
the failed server until the server is manually removed from DNS, and even then
many users must wait for DNS to time out their connection before being able to
successfully access the target Web site.
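The weakness of round robin DNS described above can be shown with a small simulation. This is a simplified sketch that models only the rotation of answers, not a real DNS server or resolver caching; the addresses are from a documentation-only range and the failed server is chosen arbitrarily.

```python
from itertools import cycle


def round_robin(addresses):
    """Hand out addresses in strict rotation, with no health checking."""
    return cycle(addresses)


pool = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]  # hypothetical server pool
failed = {"192.0.2.11"}  # this server is down, but DNS has no way to know

rotation = round_robin(pool)
answers = [next(rotation) for _ in range(9)]

# Requests keep being directed to the failed server until an administrator
# removes its record from DNS.
misdirected = [ip for ip in answers if ip in failed]
print(f"{len(misdirected)} of {len(answers)} clients were sent to a failed server")
```

With three servers and one failure, one request in three goes to a server that cannot answer. Removing the record helps only new lookups, because clients and intermediate resolvers may continue to use the answer they already have.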


Microsoft Clustering Technologies
[Figure: Clients connect through Network Load Balancing to IIS Web servers or other IP-based services. Component Load Balancing (COM+) distributes requests to COM+ application servers that run COM+ components. Cluster service supports the data servers (SQL Server, Exchange Server, file).]

Introduction
Microsoft clustering technologies are appropriate when you must provide high
availability solutions. For example, you would consider using a clustering
technology for an Internet server-based infrastructure supporting
mission-critical applications, such as financial transactions, database access, and other
key functions that must run 24 hours a day, 7 days a week.
Implementing a clustering technology makes it possible for you to share a
computing load over several computer systems. If any hardware or software
component in the system fails, the client will not lose access to the service or
application.
Network Load Balancing and Microsoft Cluster service are two of the
clustering technologies available in Microsoft Windows® 2000 Advanced
Server and Microsoft Windows 2000 Datacenter Server. They provide high
availability and high scalability to IP-based applications, such as Web servers.
Microsoft Application Center 2000 extends the use of Network Load Balancing
to Microsoft Windows 2000 Server (rather than Windows 2000 Advanced
Server).
Component load balancing is a feature of Application Center 2000. This feature
distributes the workload across multiple servers, called COM+ (Component
Object Model) Application Servers, which run a site’s business logic
components. Component load balancing allows support for routing services,
which dynamically distribute instantiation requests for COM+ objects across
multiple servers based on the current performance of the individual servers in a
COM+ cluster.
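The idea of routing each instantiation request according to current server performance, as described for component load balancing, can be sketched as follows. This is an illustration of performance-based routing in general and does not reproduce how Application Center 2000 measures or ranks servers; the function name, server names, and timings are hypothetical.

```python
def pick_server(response_times_ms: dict) -> str:
    """Return the server with the lowest recently measured response time."""
    return min(response_times_ms, key=response_times_ms.get)


# Hypothetical recent measurements for three COM+ application servers.
recent = {"COMPLUS01": 42.0, "COMPLUS02": 17.5, "COMPLUS03": 88.1}

print(pick_server(recent))  # COMPLUS02
```

Because the measurements are refreshed continually, a server that slows down or stops responding naturally receives fewer new requests, which is what distinguishes this approach from a fixed rotation.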


Network Load Balancing
The Network Load Balancing service enables organizations to cluster up to 32
servers running Windows 2000 Advanced Server or Microsoft Windows 2000
Datacenter Server to distribute incoming traffic across the cluster while also
monitoring member hosts and the network. Network Load Balancing is also
supported by Application Center 2000, but the number of hosts in a cluster is
limited to 12.

Microsoft Cluster service
Cluster service is a feature of Windows 2000 Advanced Server and
Windows 2000 Datacenter Server that allows two or four independent servers,
referred to as nodes, to be managed as a single logical entity. The objective of
Cluster service is to provide high levels of availability and scalability for
applications and data.


Guidelines for Identifying the Features of a Highly Available Web
Infrastructure
!
Identify the Web solution strategies that will
meet the business requirements
!
Identify the appropriate Microsoft products and
technologies based on n-tier architecture

*****************************
ILLEGAL FOR NON
-
TRAINER USE
******************************
To ensure that your Web design meets your business requirements, you must
identify the features and components of a highly available Web infrastructure.
You will use these features and components to ensure that your applications and
services are always available when your customers need them.
Before you can identify the components of a highly available Web
infrastructure, you need to apply the following guidelines:
!
Identify the Web solution strategies that will meet your business
requirements.

!
Identify the appropriate Microsoft products and technologies based on n-tier
architecture.



Lesson: Calculating System Availability
!
A Highly Available System
!
Single Points of Failure
!
Hardware Reliability
!
Calculate Mean Time to Repair
!
Calculate System Availability
!
Guidelines for Calculating Availability

You can calculate the availability for your Web solution by understanding the
mean time between failures (MTBF) and the mean time to repair (MTTR) for
each of the hardware components that make up your Web solution. Together, all
of the components in your Web solution determine the level of availability that
your clients perceive.
When designing a highly available solution, you can use the following methods:
!
Determine your required level of availability and the hardware MTBF. After
you have collected this data, you can calculate the required MTTR to meet
your target.
!
Determine the actual MTBF and MTTR for the hardware components in
your Web solution. You can then use this data to calculate the availability
that might be achieved.
!
Availability levels for components in your Web solution can vary. You may
need to consider a service level agreement (SLA) that covers only those
components where your Web design requires the fault tolerance that
multiple vendors provide.
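
The second method above can be sketched in a few lines of Python. This is a minimal sketch, not part of the original course material: it assumes the common steady-state relationship Availability = MTBF / (MTBF + MTTR), which is the same relationship that the MTTR formula in this lesson rearranges. The function name `availability` and the sample values are illustrative.

```python
def availability(mtbf_minutes: float, mttr_minutes: float) -> float:
    """Return availability as the fraction of time a component is up.

    Assumes the steady-state relationship MTBF / (MTBF + MTTR).
    Both arguments must use the same unit (minutes here).
    """
    if mtbf_minutes <= 0 or mttr_minutes < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_minutes / (mtbf_minutes + mttr_minutes)


# Illustrative values: a failure every six months on average,
# repaired in a little over 26 minutes.
print(f"{availability(262_800, 26.28):.4%}")
```

Measured MTBF and MTTR figures for your own hardware would replace the sample values.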

After completing this lesson, you will be able to:
!
Identify the appropriate availability class for your highly available system.
!
Identify the single points of failure in your Web infrastructure.
!
Calculate the mean time between failures (MTBF).
!
Calculate the mean time to repair (MTTR).
!
Calculate system availability.
!
Calculate the level of availability of a highly available Web infrastructure.



A Highly Available System
[Figure: availability, expressed as a number of nines, for four management
levels: Managed, Well-Managed Nodes, Well-Managed Packs and Clones, and
Well-Managed Geoplex. Figure labels: masks some hardware failures; masks
hardware failures and operations tasks (for example, software upgrades); masks
some software failures; masks site failures (for example, power, network,
fire); masks some operations failures.]

A highly available system is one that provides a service whenever you want to
use it, based on a defined standard. You might expect your computer system to
be available 24 hours a day, seven days a week, 365 days a year; it can
never stop working. For a system to be highly available, it must be made up of
the most reliable subsystems that you can obtain, and it must be fault tolerant.
A highly available system is:
!
Reliable.
!
Perceived as always available to critical users.
!
Reserved for the company’s most critical business systems.
!
A data center treated as a critical component of the company.

Many organizations advertise that they provide highly available solutions, and
they typically guarantee the operation of their systems to a specified level. They
provide a guarantee to their customers when the customers purchase an SLA for
their system. This SLA can provide for both corrective and proactive
maintenance.


The following table identifies four availability classes and the corresponding
availability measurements, annual downtime, and methods for achieving each
class.
Availability   Availability   Annual
class          measurement    downtime      Achieving availability class

Two nines      99%            3.7 days      You can easily maintain 99% availability with
                                            no specific effort on your part with Microsoft
                                            Windows 2000 and Microsoft SQL Server 2000.

Three nines    99.9%          8.8 hours     Achieving 99.9% availability typically involves
                                            better hardware that includes some redundancy,
                                            such as redundant array of independent disks
                                            (RAID) on the disks.

Four nines     99.99%         53 minutes    Achieving 99.99% availability requires
                                            technology such as replication, log shipping,
                                            or failover clustering and the requisite
                                            hardware and infrastructure.

Five nines     99.999%        5.3 minutes   Achieving 99.999% availability can be very
                                            difficult. Everything that you did for 99.99%
                                            availability applies, but you will also need
                                            both more expensive hardware, such as split
                                            mirror storage area network (SAN) solutions and
                                            geo-clusters, and substantial infrastructure
                                            improvements, such as redundant power grids and
                                            multiple backup generators. In addition, you
                                            will need to back up standby copies of the
                                            database.
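
The annual downtime figures in the table follow directly from the availability percentage. The short Python sketch below reproduces them; it assumes a 365-day year (525,600 minutes), and the names `MINUTES_PER_YEAR` and `annual_downtime_minutes` are illustrative rather than taken from the course material.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the downtime per year, in minutes, that an availability level allows."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for percent in (99.0, 99.9, 99.99, 99.999):
    minutes = annual_downtime_minutes(percent)
    print(f"{percent}%: {minutes:.2f} minutes "
          f"({minutes / 60:.2f} hours, {minutes / 1440:.2f} days)")
```

The results (about 3.65 days, 8.76 hours, 52.56 minutes, and 5.26 minutes) match the rounded values in the table.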




Single Points of Failure
!
Network hub
!
Network router
!
Power outage
!
Server connection
!
Disks and disk controllers
!
Other server hardware such as CPU or memory
!
Server software
!
Wide area network links

A single point of failure is any component in your environment that would
block data or applications if it failed. Single points of failure can be hardware,
software, or external dependencies, such as power supplied by a utility
company or dedicated wide area network (WAN) lines.
The following table lists common points of failure in a server environment and
describes whether you can protect the point of failure by using a Microsoft
clustering solution or a third-party solution.

                              Clustering
Failure point                 solution     Other solutions

Network hub                   N/A          Redundant networks

Network router                N/A          Open Shortest Path First (OSPF)

Power outage                  N/A          Uninterruptible power supply (UPS), power
                                           generator, and multiple power grids

Server connection             Failover     N/A

Disk                          Failover     Hardware or software RAID to ensure against
                                           the loss of specific data on a specific
                                           computer and to provide uninterrupted service.

Disk controller               Failover     Multiple controllers with external small
                                           computer system interface (SCSI) or Fibre
                                           Channel switches to move disk drives from a
                                           failed controller. This method can require
                                           additional software and application support.

Other server hardware such    Failover     Spare components such as motherboards and
as CPU or memory                           SCSI controllers (any spare components need
                                           to exactly match the original components).

Server software such as the   Failover     N/A
operating system or
specific applications

Wide area network (WAN)       N/A          Redundant links that provide secondary access
links such as routers and                  to remote connections.
dedicated lines




Hardware Reliability
[Figure: Hardware Reliability. Failure rate plotted against time across three
phases: burn-in, normal aging, and failure mode.]

To ensure that your Web solution is highly available, you need to predict the
reliability for each of your hardware components. You can use the reliability
numbers that hardware manufacturers publish for their hardware components.
When a hardware manufacturer designs a product, for example a computer
motherboard, it determines a failure rate for each component by testing a large
number of motherboards. The aggregate number of run hours without failure is
used to determine the MTBF. By using the reliability figures of all of the
components, connections, computer cards, and other parts, the hardware
manufacturer can calculate the probability that the motherboard will fail within
a given time period.
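
As an illustration of turning an MTBF figure into a probability of failure within a period, the Python sketch below uses the exponential reliability model. This model is an assumption introduced here, not something the course material specifies: it treats the failure rate as constant (1 / MTBF), so it ignores early-life and wear-out effects. The function name `failure_probability` is illustrative.

```python
import math


def failure_probability(mtbf_hours: float, period_hours: float) -> float:
    """Estimate the probability of at least one failure within period_hours.

    Assumes a constant failure rate of 1 / MTBF (exponential model).
    """
    if mtbf_hours <= 0 or period_hours < 0:
        raise ValueError("MTBF must be positive and the period must not be negative")
    return 1.0 - math.exp(-period_hours / mtbf_hours)


# Illustrative: a component with an MTBF of 1000 hours, observed for 100 hours.
print(f"{failure_probability(1000, 100):.1%}")
```

Under this assumption, a component that runs for a period equal to its MTBF has roughly a 63 percent chance of having failed, so MTBF should not be read as a guaranteed lifetime.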
If you know the MTBF for a device in your Web infrastructure, you can estimate
how likely that device is to fail within a given period, although you cannot
predict the exact time of failure. The graphic in the slide above identifies
the three phases in a component’s life cycle.
!
Burn in or early life of the component. Failure rates are typically high
during this phase.
!
Normal aging. Failure rates begin to drop off rapidly and components
seldom fail during this phase.
!
Failure mode. The component ages and the failure rates are predicted to
increase rapidly during this phase.



Calculate Mean Time to Repair
Calculation
!
MTTR = (MTBF / Availability) - MTBF
Example
!
If MTBF = 262,800 minutes (6 months)
!
If your availability target is 99.99%, then
!
MTTR = (262,800 / 0.9999) - 262,800
!
MTTR = 26.28 minutes

When your system fails, you need to predict how long you can take to repair it
and still meet the availability requirements defined for your Web solution. After
your system is repaired, calculating the repair time can be more difficult
because you now have both new and old components in the system. For
example, you have just replaced a failed disk drive that had an MTBF of 1000
hours. The new disk drive has a high probability of working for the next 1000
hours, but the old disk drives, which have already run for 1000 hours, might be
approaching failure.
In a Web solution, your maintenance strategy must include planned downtime
to replace components such as fans, disk drives, or any component with a short
MTBF. For example, if one disk drive in an array fails and you simply replace
the failed drive, you might expect that others in the array may fail soon, because
they all have the same MTBF and could fail in a similar timeframe. Possible
maintenance strategies are:
!
Replace all of the disk drives when one drive in the array fails. This strategy
results in downtime caused by failure.
!
Replace all of the disk drives before they approach their predicted
failure time. This strategy is recommended because it results in planned
downtime rather than downtime caused by failure.

If you know both the MTBF and your targeted availability, you can calculate
the maximum MTTR tolerable by using the following formula:
MTTR = (MTBF / Availability) - MTBF
For example, if your data center takes an average of six months to fail (MTBF =
six months or 262,800 minutes) and your targeted availability is 99.99 percent
(four nines), then your MTTR is 26.28 minutes. The equation is:
MTTR = (262,800 / 0.9999) – 262,800 = 26.28 minutes
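
The calculation above can be checked with a short Python sketch. The function name `max_mttr` is illustrative; the formula is the one given in this lesson, with availability expressed as a fraction (0.9999 rather than 99.99).

```python
def max_mttr(mtbf_minutes: float, target_availability: float) -> float:
    """Return the largest MTTR, in minutes, that still meets the availability target.

    Implements MTTR = (MTBF / Availability) - MTBF.
    """
    if mtbf_minutes <= 0:
        raise ValueError("MTBF must be positive")
    if not 0 < target_availability <= 1:
        raise ValueError("availability must be a fraction between 0 and 1")
    return mtbf_minutes / target_availability - mtbf_minutes


# Six months between failures, four nines target.
print(round(max_mttr(262_800, 0.9999), 2))  # 26.28
```

Raising the target to five nines (0.99999) with the same MTBF leaves under three minutes for repair, which shows how quickly the repair budget shrinks as the target rises.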