Corporate Headquarters
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134-1706
USA
Tel: 408 526-4000
800 553-NETS (6387)
Fax: 408 526-4100
Data Center: SAN Extension for Business Continuance
Solutions Reference Network Design
Version 1.2
THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS MANUAL ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL
STATEMENTS, INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE ACCURATE BUT ARE PRESENTED WITHOUT
WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY PRODUCTS.
THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT ARE SET FORTH IN THE INFORMATION PACKET THAT
SHIPPED WITH THE PRODUCT AND ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE SOFTWARE LICENSE
OR LIMITED WARRANTY, CONTACT YOUR CISCO REPRESENTATIVE FOR A COPY.
The Cisco implementation of TCP header compression is an adaptation of a program developed by the University of California, Berkeley (UCB) as part of UCB’s public
domain version of the UNIX operating system. All rights reserved. Copyright © 1981, Regents of the University of California.
NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND SOFTWARE OF THESE SUPPLIERS ARE PROVIDED “AS IS” WITH
ALL FAULTS. CISCO AND THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING, WITHOUT
LIMITATION, THOSE OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF
DEALING, USAGE, OR TRADE PRACTICE.
IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING,
WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THIS MANUAL, EVEN IF CISCO
OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Data Center: SAN Extension for Business Continuance
Copyright © 2004, Cisco Systems, Inc.
All rights reserved.
CCIP, CCSP, the Cisco Arrow logo, the Cisco Powered Network mark, Cisco Unity, Follow Me Browsing, FormShare, and StackWise are trademarks of Cisco Systems, Inc.;
Changing the Way We Work, Live, Play, and Learn, and iQuick Study are service marks of Cisco Systems, Inc.; and Aironet, ASIST, BPX, Catalyst, CCDA, CCDP, CCIE, CCNA,
CCNP, Cisco, the Cisco Certified Internetwork Expert logo, Cisco IOS, the Cisco IOS logo, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo,
Empowering the Internet Generation, Enterprise/Solver, EtherChannel, EtherSwitch, Fast Step, GigaStack, Internet Quotient, IOS, IP/TV, iQ Expertise, the iQ logo, iQ Net
Readiness Scorecard, LightStream, MGX, MICA, the Networkers logo, Networking Academy, Network Registrar, Packet, PIX, Post-Routing, Pre-Routing, RateMUX, Registrar,
ScriptShare, SlideCast, SMARTnet, StrataView Plus, Stratm, SwitchProbe, TeleRouter, The Fastest Way to Increase Your Internet Quotient, TransPath, and VCO are registered
trademarks of Cisco Systems, Inc. and/or its affiliates in the U.S. and certain other countries.
All other trademarks mentioned in this document or Web site are the property of their respective owners. The use of the word partner does not imply a partnership relationship
between Cisco and any other company. (0304R)
CONTENTS

Preface
    Document Organization
    Document Conventions

CHAPTER 1  SAN Extension for Business Continuance: Overview
    Overview
    Replication and Mirroring
    Synchronous Replication
    Network Latency
    Asynchronous Replication
    SAN Extension Transport Overview
    Dark Fiber
    CWDM
    DWDM (Dense Wave Division Multiplexing)
    SONET/SDH
    FCIP
    Flow Control over Long Distance
    SAN Extension Solutions
    Metro Optical SAN Extension with the ONS15530/ONS15540 and MDS 9000
    Optical Protection Options
    Scaling the Infrastructure
    Extending the Distance
    Campus SAN Extension using CWDM and MDS 9000
    Protection
    Scaling
    Distance Capabilities
    Metro/Regional – ONS15454
    Synchronous Replication using FCIP on the MDS 9000 IP Services Module
    MDS 9000 FCIP for Synchronous Data Replication
    MDS 9000 FCIP for Asynchronous Data Replication
    FCIP for SAN Extension with PA-FC-1G Port Adapter for 7200/7400
    FCIP Compression and Encryption Solutions
    Encryption for Data Privacy
    Compression for Increased Throughput Over WAN links
    Summary

CHAPTER 2  Designing FCIP SAN Extension for Cisco SAN Environments
    Overview
    Why Use FCIP?
    Solution Topology
    Required Components
    Using Compression
    Compression and Available Bandwidth
    Compression Modes and Rates
    Using Write Acceleration
    Write Acceleration Overview
    Write Exchanges with More Than Two Round Trips
    Expected Performance Gains When Using Write Acceleration
    When to Use Write Acceleration
    Considerations and Guidelines for Using Write Acceleration

CHAPTER 3  Implementing FCIP SAN Extension in Cisco SAN Environments
    Implementation Details
    FCIP Configuration Overview
    Configuring the Gigabit Ethernet Interface, FCIP Profile, and FCIP Interface
    Configuring the Gigabit Ethernet Interface
    Setting the MTU Size
    Automatic Path MTU Discovery
    Manually Determining Path MTU Using Extended Ping
    Configuring the FCIP Profile
    FCIP Interface Configuration
    Fibre Channel Configuration Considerations
    Fibre Channel Receive Buffer Configuration Guidelines
    Monitoring for Frame Expiry
    Using Shared FCIP Links
    Impact of WAN Packet Loss on FCIP Throughput
    FCIP Tunnel Latency
    Configuring Compression
    Recommended Topologies
    Single FCIP Link
    Switch-A Configuration
    Switch-B Configuration
    Single FCIP Link with Multiple VSANs
    Switch-A Configuration
    Switch-B Configuration
    High Availability Replication Design Using Multiple FCIP Links and Port Channels
    High Availability Protection Levels
    Port Channel Load Balancing
    FCIP Link and Profile Configuration
    Rate Limiting FCIP Links and Port Channel Members
    Switch-A Configuration
    Switch-B Configuration
    Switch-C Configuration
    Switch-D Configuration

CHAPTER 4  FCIP SAN Extension In Legacy Environments
    Overview
    Performance and Configuration Considerations
    Cisco IOS Requirements
    7200 VXR Series Router PCI Bus Considerations
    7200 VXR Series Router CPU Performance Considerations
    7200 VXR Series Router Gigabit Ethernet Considerations
    Software Configuration
    TCP Maximum Window Size (MWS)
    Background Traffic
    Interface MTU Size
    Fibre Channel Configuration
    Fibre Channel Timers
    BB Credits
    Fibre Channel Frame Size
    Fibre Channel Switch Interoperability
    Solution Topologies

CHAPTER 5  SAN Extension with Compression and Encryption in WAN Environments
    FCIP Compression and Encryption Overview
    Compression Overview
    Selecting a Compression Solution
    Determining the Compression Ratio
    Encryption Overview
    IPSec Network Security
    IPSec Encapsulating Security Protocol (ESP)
    IPSec Authentication Header (AH)
    IPSec Modes of Operation
    DES/3DES Encryption
    AES Encryption
    Cisco Encryption Solutions
    Combined Compression and Encryption FCIP Storage Extension Solutions
    Compression and Encryption Design Considerations
    IPS-8 Compression Performance Options
    Configuring SA-VAM, SA-VAM2, IPSec VPN Services Module Compression and Encryption
    Configuring IPS-8 FCIP and Compression
    Compression and Encryption Configuration Examples
    SA-VAM / PA-FC-1G
    SA-VAM2 / IPS-8
    SA-VAM / SA-VAM2 High Availability
    IPS-8 SANOS v1.3 Compression
    IPS-8 High Availability
    IPS-8 / IPSec VPN Services Module
    IPSec VPN Services Module High Availability Options
    Conclusion
Preface
The purpose of this document is to provide product knowledge and design guidance about how to extend
the capacity of Storage Area Networks (SAN) using Cisco products.
This document is intended for network design architects and support engineers who are responsible for
planning, designing, and implementing storage area networks (SAN) for business continuance and
disaster recovery.
Document Organization
This document contains the following chapters:
Chapter or Appendix, followed by its Description:

Chapter 1, “SAN Extension for Business Continuance: Overview”
    Provides an overview of SAN extension for business continuance.

Chapter 2, “Designing FCIP SAN Extension for Cisco SAN Environments”
    Provides background information on the FCIP protocol and describes performance
    considerations involved when designing a specific implementation.

Chapter 3, “Implementing FCIP SAN Extension in Cisco SAN Environments”
    Describes design scenarios for implementing FCIP SAN Extension and examines the tuning
    and configuration instructions required for optimum operation.

Chapter 4, “FCIP SAN Extension In Legacy Environments”
    Provides product knowledge and design guidance about how to extend the capacity of a
    Storage Area Network (SAN) using the Cisco PA-FC-1G PAM.

Chapter 5, “SAN Extension with Compression and Encryption in WAN Environments”
    Provides product knowledge and design guidance for compressing and encrypting Fibre
    Channel over IP (FCIP) traffic using the following Cisco products:
    • Cisco SA-VAM and SA-VAM2 service modules for the 7200 VXR Series Routers
    • IPS-8 IP Storage Services Module for the MDS 9216 and 9500 Fibre Channel switches
      and directors
    • IPSec VPN Services Module for the Catalyst 6500 switch and 7600 router

Appendix A, “Glossary”
    A glossary of terms often used when describing encryption and compression technology.
Document Conventions
This guide uses the following conventions to convey instructions and information:
Table 1 Document Conventions
Convention              Description
boldface font           Commands and keywords.
italic font             Variables for which you supply values.
[ ]                     Keywords or arguments that appear within square brackets are optional.
{x | y | z}             A choice of required keywords appears in braces separated by vertical
                        bars. You must select one.
screen font             Examples of information displayed on the screen.
boldface screen font    Examples of information you must enter.
< >                     Nonprinting characters, for example passwords, appear in angle brackets.
[ ]                     Default responses to system prompts appear in square brackets.
CHAPTER 1
SAN Extension for Business Continuance: Overview
This chapter provides product knowledge and design guidance about how to extend the capacity of
Storage Area Networks (SAN) using Cisco products. It includes the following topics:
• Overview
• Replication and Mirroring
• SAN Extension Transport Overview
• SAN Extension Solutions
• Summary
Overview
Storage Area Network (SAN) Extension is one of the enabling technologies for enterprise business
continuance. An extended SAN increases the geographic distance allowed for SAN storage operations,
in particular for data replication and copy operations. By replicating or copying data to an alternate site,
an enterprise can protect its data in the event of disaster at the primary site.
You should consider the following three factors when determining your business continuance
requirements:
• Recovery time objective (RTO)
• Recovery point objective (RPO)
• Disaster radius
The RTO and RPO provide measurable targets for business continuance and disaster recovery purposes.
They also present a firm basis from which to design the underlying SAN extension network.
Recovery time is how long the enterprise systems take to resume operation after a disaster. The RTO is
the longest recovery time that your organization can tolerate. This is shown below in Figure 1-1.
Figure 1-1 Recovery Time and Recovery Time Objective
The technologies and processes used in a network largely determine how well an organization achieves
its RTO. For example, an extended cluster offers better recovery capabilities than manual migration or
tape-based restoration.
The recovery point is how current or fresh the data is after a disaster. The RPO is the maximum data
loss that your organization can tolerate after a disaster. This is shown below in Figure 1-2.
Figure 1-2 Recovery Point and Recovery Point Objective
As with the RTO, the systems, processes and technologies employed will determine if you achieve your
RPO. Additionally, the distance between the data centers, and how well your applications tolerate
network latency, determine whether zero RPO is possible. The following recovery technologies support
increasingly lower RPO at increasingly higher cost:
• Tape backup and restore
• Periodic replication and backups
• Asynchronous replication
• Synchronous replication
The disaster radius refers to how extensive the disaster is. Earthquakes, floods, fires, hurricanes,
cyclones, and attacks all have varying probabilities and disaster radii according to the geographic
region in which they occur. The key issue is that the backup site must not be within the radius of possible
threats. For example, an earthquake could destroy both the primary and secondary data centers if they
were geographically separated but lay on the same major fault line. Many enterprises adopt a multi-hop
strategy to minimize their exposure to this situation: two data centers located within metro distances,
and a third disaster recovery site located out of the region. This is shown below in Figure 1-3.
[Figure 1-1 labels: "Disaster occurs" at time t1; "Systems Inoperative" during the recovery time; "Systems Recovered and Operational" at time t2.]
[Figure 1-2 labels: "Last point at which data is in a valid state" (the recovery point) at time t0; "Disaster occurs" at time t1; the interval between them is the "Period over which data is 'lost'".]
Figure 1-3 Disaster Radius
Replication and Mirroring
Data replication and mirroring both have two primary objectives:
• Get the data to a recovery site and maximize the currency to address the RPO.
• Enable rapid restoration.
If the data is already on disk and the secondary data center has standby hosts, then restoration can be
quite rapid.
Two data replication methods are in popular use today:
• Host-based mirroring
• Disk-based replication
Figure 1-4 Host-Based Mirroring – High-Level View
[Figure 1-3 labels: Primary Data Center; Secondary Data Center; DR Site; distance rings Local (1-2 km), Metro (< 50 km), and Regional.]
[Figure 1-4 labels: Local Data Center; Remote Data Center; writes mirrored by the host to the mirrored volume.]
Figure 1-5 Disk-Based Replication – High-Level View
Host-based mirroring is a synchronous mirroring method where the host itself is responsible for
duplicating writes to the storage arrays. However, because the two storage arrays are identical copies,
reads are only performed against one array. This is shown in Figure 1-4.
Reads may be completed in a round-robin or preferred-array fashion. Host-based mirroring imposes an
overhead upon the host because the host must perform the mirroring function. Although host-based
mirroring can be used across multiple sites, it is not as robust or flexible over distance as
storage-array-based solutions.
Disk-based replication uses modern intelligent storage arrays, which can be equipped with data
replication software. Replication is performed transparently to the host without additional processing
overhead (see Figure 1-5). Replication products are proprietary and vendor specific, so matched arrays
from the same manufacturer must be used. Each manufacturer has a variety of replication products that
can be categorized as either synchronous or asynchronous. These two main types of replication are
discussed in more detail in the following paragraphs.
Synchronous Replication
Synchronous replication is a zero data loss (or zero RPO) data replication method. All data is written to
the local storage array and the remote array before the I/O is considered complete or acknowledged to
the host.
An example of a synchronously replicated write is shown in Figure 1-6. Some key points to note about
synchronous replication include the following:
• The SCSI Write exchange is composed of three phases:
  1. Command
  2. Data Transfer
  3. Status
• The I/O is acknowledged to the host only after the acknowledgement is received from the remote
  storage array. The elapsed time until that acknowledgement is the disk I/O service time.
• The replicated Write exchange requires two round trips between the two storage arrays. The impact
  of these two round trips varies according to the distance. The longer the distance, the higher the
  latency and the higher the overall disk I/O service time.
• Specific vendor implementations may vary.
[Figure 1-5 labels: Local Data Center; Remote Data Center; writes replicated to the remote data center.]
Figure 1-6 Synchronous Replication – Protocol Detail
Disk I/O service time increases as distance increases because the I/O must be completed at the remote
storage array. Higher disk I/O service times negatively impact application performance. Enterprises
employing synchronous replication must balance the need for zero data loss, good application
performance, and adequate distance between data centers to exceed the disaster radius. Replication
between data centers in close proximity involves lower latency and better application performance but
there may be a higher likelihood of failure from a single disaster.
Network Latency
When using an optical network¹, the additional network latency due to the speed of light through fiber is
approximately 5µs per kilometer (8µs per mile). Each round trip crosses the link twice, so at two round
trips per write the additional service time accumulates at 20µs per kilometer. For example, at 50km the
additional service time is 1000µs or 1ms (see Figure 1-7).
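The arithmetic above can be checked with a short sketch. It assumes only the two figures given in the text (5µs per kilometer one way, two round trips per replicated write); the function and constant names are illustrative and do not come from any Cisco tool.

```python
US_PER_KM = 5.0  # one-way propagation delay through fiber, from the text above


def added_service_time_us(distance_km, round_trips=2, us_per_km=US_PER_KM):
    """Extra disk I/O service time (microseconds) that distance adds to one replicated write."""
    one_way_us = distance_km * us_per_km
    # Each round trip crosses the link twice.
    return 2 * round_trips * one_way_us


if __name__ == "__main__":
    for km in (10, 50, 100, 200):
        us = added_service_time_us(km)
        print(f"{km:>4} km: {us:7.0f} us ({us / 1000:.1f} ms)")
```

This covers propagation delay only. Array processing time and any vendor-specific exchange that needs more than two round trips would add to the result.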
Figure 1-7 Network Latency for Synchronous Replication
[Figure 1-6 labels: Host, Local Data Center, Remote Data Center. Sequence on each leg: SCSI Write, Transfer Ready, FCP Data (2kB frames), SCSI Status=good. Two round trips per write between the arrays; the response from the local controller (initiator) is returned after the remote (target) responds; the disk I/O service time is marked at the host.]
1. The different types of optical networks are introduced later in this paper.
[Figure 1-7 labels: 50km (30mi) over optical link; four one-way crossings of 250µs each; 4 x 250µs = 1ms additional latency at 50km.]
The maximum distance varies from enterprise to enterprise and from application to application, and
depends on a number of factors that include the following:
• Application type: Database applications are latency sensitive because incomplete I/Os will lock
  portions of the database. However, because every database is different in structure, the impact will
  vary.
• Transaction rate: Applications with a low transaction rate may not be as impacted by latency as
  those requiring a higher rate.
• Enterprise tolerance: Requirements for application performance vary depending on the enterprise
  and the purpose of the application.
Asynchronous Replication
Asynchronous replication is a real-time replication method in which the data is replicated to the remote
array some time after the I/O is acknowledged as complete to the host. This means application
performance is not impacted, as the replication component is not factored into the I/O service time of
the host. The enterprise can therefore locate the remote array virtually any distance away from the
primary data center without impact. However, since data is replicated at some point after local
acknowledgement, the storage arrays are slightly out-of-step—the remote array is behind the local array.
If disaster strikes the local array at the primary data center, some data loss will result.
Each manufacturer of intelligent storage arrays has one or more proprietary asynchronous replication
products and implementations. The amount of data loss is dependent upon the method chosen, the
application used, and the processes used within the enterprise.
An example showing one implementation of asynchronous replication is shown below in Figure 1-8.
Key points to consider about asynchronous replication include the following:
• Status is returned to the host after the data is written locally (to the cache).
• The replication to the remote site occurs at some point afterwards.
The replication method in Figure 1-8 replicates individual I/Os in order. Other manufacturers
replicate dirty tracks or periodically replicate delta sets. These latter methods are
geared towards lower bandwidth utilization.
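The ordered-write behavior described above can be illustrated with a toy model. It is a sketch only, not any vendor's implementation: the class and method names are hypothetical, and the model ignores caching, bandwidth, and failure handling. It shows why writes that were acknowledged locally but not yet replicated are the ones lost if the primary site fails.

```python
from collections import deque


class AsyncReplicator:
    """Toy model: writes are acknowledged locally, then replicated later in order."""

    def __init__(self):
        self.local = []         # writes committed at the primary array
        self.remote = []        # writes that have reached the remote array
        self.pending = deque()  # acknowledged to the host, not yet replicated

    def write(self, block):
        """Commit locally and return status immediately, before replication."""
        self.local.append(block)
        self.pending.append(block)
        return "good"

    def replicate(self, count=1):
        """Ship up to `count` pending writes to the remote array, oldest first."""
        for _ in range(min(count, len(self.pending))):
            self.remote.append(self.pending.popleft())

    def lost_on_disaster(self):
        """Writes the remote array would be missing if the primary failed now."""
        return list(self.pending)


if __name__ == "__main__":
    array = AsyncReplicator()
    for block in ("w1", "w2", "w3", "w4"):
        array.write(block)
    array.replicate(3)
    print("remote has:", array.remote)
    print("lost if disaster strikes now:", array.lost_on_disaster())
```

Because the queue drains oldest first, the remote copy is always a consistent prefix of the local write sequence, which is the property that ordered replication is meant to preserve.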
Figure 1-8 Asynchronous Replication – Ordered Writing
SAN Extension Transport Overview
The network choices for SAN Extension can be split into two categories:
• Optical, including dark fiber, coarse wave division multiplexing (CWDM), dense wave division
  multiplexing (DWDM), and SONET/Synchronous Digital Hierarchy (SDH)
• IP transport, including Fibre Channel over IP (FCIP)
Each transport has different characteristics and capabilities, making some applicable to synchronous
replication and others applicable only to asynchronous replication. A SAN Extension transport overview
is shown below in Figure 1-9.
[Figure 1-8 labels: Host, Local Data Center, Remote Data Center. Sequence on each leg: SCSI Write, Transfer Ready, FCP Data (2kB frames), SCSI Status=good. The disk I/O service time is marked at the host.]
Figure 1-9 Replication Methods with Optical and IP Transports
Dark Fiber
Dark fiber is a viable method for SAN extension over data center or campus distances. The maximum
attainable distance is a function of the optical characteristics (transmit power and receive sensitivity)
of the Small Form-factor Pluggable (SFP) or Gigabit Interface Converter (GBIC), the number of fiber
joins, and the attenuation of the fiber. Lower-cost multimode fiber with 850nm SX SFPs/GBICs is used
in and around data center rooms. Single-mode fiber with 1310nm or 1550nm SFPs/GBICs is used over
longer distances.
Protection against failure is provided by the switches at the end points. This is also known as client
protection. Diverse optical links and paths must be employed to ensure high availability. Port channels
in conjunction with VSAN trunking provide added fabric stability across path failures.
Because fiber operates at the physical layer, frames incur no latency other than that due to the speed
of light through fiber, making any fiber-based solution ideal for the low-latency requirements of
synchronous replication. Latency through fiber is around 5µs per kilometer.
CWDM
CWDM takes dark fiber SAN extension one step further. CWDM allows up to eight 1Gbps or 2Gbps
channels (or colors) to share a single fiber pair. Each channel uses a different colored SFP or GBIC.
These channels are networked with a variety of wavelength-specific add-drop multiplexers to enable an
assortment of ring or point-to-point topologies. The CWDM wavelengths are not amplifiable and so are
limited in distance according to the number of joins and drops. A typical CWDM SFP has a 30dB power
budget, so it can reach up to ~90 km in a point-to-point topology or around 40 km in a ring topology.
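How a power budget translates into reach can be sketched as a simple link-budget calculation. Only the 30dB budget comes from the text. The fiber attenuation, fixed losses (multiplexers, add-drop passes, connectors, splices), and safety margin used below are placeholder assumptions for illustration, not Cisco specifications, and should be replaced with values from the relevant data sheets and fiber measurements.

```python
def max_reach_km(power_budget_db, fiber_loss_db_per_km,
                 fixed_losses_db=0.0, margin_db=0.0):
    """Estimate unamplified reach: the budget left after fixed losses, spent on fiber."""
    usable_db = power_budget_db - fixed_losses_db - margin_db
    if usable_db <= 0:
        return 0.0
    return usable_db / fiber_loss_db_per_km


if __name__ == "__main__":
    BUDGET_DB = 30.0         # CWDM SFP power budget quoted in the text
    FIBER_DB_PER_KM = 0.25   # assumed single-mode attenuation (placeholder)

    # Hypothetical fixed losses: a ring passes through more add-drop
    # multiplexers than a point-to-point link, so less budget is left for fiber.
    scenarios = {
        "point-to-point (assumed 7.5 dB fixed loss)": 7.5,
        "ring (assumed 20 dB fixed loss)": 20.0,
    }
    for name, fixed_db in scenarios.items():
        reach = max_reach_km(BUDGET_DB, FIBER_DB_PER_KM, fixed_losses_db=fixed_db)
        print(f"{name}: about {reach:.0f} km")
```

The point of the sketch is the shape of the trade-off: every additional join or drop consumes budget that would otherwise be available for distance.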
As with dark fiber, protection against failure is provided by the switches at the end points. Diverse
optical paths must be employed to ensure high availability and port channels are recommended to
maintain fabric stability through path failures.
As with dark fiber, CWDM links do not add any additional latency other than that incurred by the speed
of light through fiber. As such, CWDM links are ideally suited to the low-latency requirements of
synchronous replication applications.
[Figure 1-9 labels: horizontal axis "Increasing Distance" (Data Center, Campus, Metro, Regional, National); rows for Dark Fiber, CWDM, DWDM, SONET/SDH, and FCIP (MDS9000 IPS-8, SN5428-2, 7200/7400 with PA-FC-1G, 7200 with VAM and PA-FC-1G), each annotated with the replication modes (synchronous or asynchronous) and link rates it supports.]
DWDM (Dense Wave Division Multiplexing)
DWDM enables up to 32 channels (lambdas) to share a single fiber pair. Each of these channels can
operate at up to 10 Gbps. In contrast with the passive multiplexing of CWDM, DWDM platforms
such as the ONS15530, 15540, and 15454 are intelligent, offering a variety of protection schemes to
guard against failures in the fiber plant. The Cisco ONS15530 and 15540 can also aggregate and
transport the IBM protocol sets of GDPS (Sysplex Timer and Coupling Links), ESCON, and FICON,
in addition to Fibre Channel, Gigabit Ethernet, and 10 Gigabit Ethernet, making them ideally
suited for metro data center deployments.
The ONS15530 and 15540 platforms offer a number of options for wavelength and service protection.
These can be used in isolation or in concert with client protection through port channeling on the MDS
9000.
As with other pure Fibre Channel over optical solutions, DWDM links only incur latency from the speed
of light through fiber. DWDM is thus ideally suited and the most popular choice for enterprise
deployment of synchronous replication between metro data centers.
SONET/SDH
SONET/SDH is the predominant basis for most service provider networks. SONET is the North
American standard, while SDH is the standard elsewhere in the world. SONET/SDH can transport a
number of optical and electrical WAN interface types. In North America these are T1, DS3, STS-3, and
so forth. In the rest of the world slightly different interface types, such as E1, E3, STM-1, and STM-4, are used.
SONET/SDH is also deployed in some enterprise networks, often in addition to DWDM. The Cisco
ONS15454 platform offers SONET/SDH client connection options, in addition to Gigabit Ethernet and
Fibre Channel.
The SL linecard offers a Fibre Channel over SONET/SDH transport. From the standpoint of the Fibre
Channel switch, the connection looks like any other optical link. However, it differs from other optical
solutions in that it spoofs Receiver Ready (R_RDY) frames to extend the distance capabilities of the
Fibre Channel transport. All Fibre Channel over optical links use Buffer to Buffer Credits (BB_Credits)
to control the flow of data between switches. R_RDY frames control the BB_Credits. As the distance
increases, so must the number of BB_Credits. The spoofing capability of the SL linecard extends this
distance capability to 2,800 km at 1 Gbps.
The ONS15454 offers a number of SONET/SDH protection options, in addition to client-level protection
through Fibre Channel switches. Port channels in conjunction with VSAN trunking are recommended
where multiple links are used.
Just as with other optical solutions, FC over SONET/SDH is suitable for synchronous replication
deployments subject to application performance constraints. Latency through the FC over SONET/SDH
network is only negligibly higher than other optical networks of the same distance as each frame is
serialized in and out of the FC over SONET/SDH network. The latency is 10µs per maximum-sized FC
frame at 2 Gbps.
FCIP
FCIP fills the void where optical networks are unavailable, unnecessary, too expensive, or where the
distance between data centers is so great that BB_Credit limitations would cripple throughput. FCIP is
a product of the IETF IP Storage Working Group. In conjunction with the NCITS T.11 FC-BB-2
standard, it defines the extension of a Fibre Channel fabric over an IP network. FCIP is supported in the
following three Cisco product lines:
• MDS 9000 IP Services Module (IPS-8). Each IPS-8 card has eight Gigabit Ethernet interfaces, each
  supporting up to three FCIP point-to-point links for a total of 24 FCIP links per card.
• SN5428-2 Storage Router. The SN5428-2 has two Gigabit Ethernet interfaces and eight 1 or 2Gbps
  Fibre Channel ports, making it an ideal choice for smaller enterprises or DR site connectivity to a
  central MDS 9000 IP Services Module.
• PA-FC-1G Port Adapter for the Cisco 7200 and 7400. The PA-FC-1G is an ideal single-chassis
  solution for connecting a Fibre Channel SAN to a WAN. With the addition of an SA VPN adapter
  module (VAM), 3DES encryption and LZS compression can be added in a single-chassis solution.
As with the optical solutions, FCIP extends a Fibre Channel fabric over distance with a single FSPF
routing domain. Just like Fibre Channel links on the MDS 9000, FCIP links can be port channeled for
greater resilience and throughput. VSAN trunks allow for SAN scalability.
SAN Extension through FCIP does incur greater latency than a purely optical network of similar
distance. This extra latency comes from store-and-forward delays, queuing, serialization time at
intermediate switches and routers, and segmentation and reassembly of Fibre Channel frames (when
normal 1500-byte MTUs are used). For this reason, FCIP extended SANs are typically not employed for
synchronous replication.
The MDS 9000 with the IP Services Module, however, can sustain wire-rate performance with very low
delays through the switch, so it can realistically be used for synchronous replication where IP network
bandwidth is guaranteed (such as in a dedicated Metro Ethernet link).
A summary of SAN extension alternatives for various applications and distances is shown below in
Table 1-1.
Table 1-1 SAN Extension Alternatives
Transports compared: Dark Fiber, CWDM, DWDM, SONET/SDH, FCIP

Distance Capabilities
    Dark Fiber: <500m (multimode); <100km (single mode) if using CWDM SFPs
    CWDM: <~90 km point-to-point, or <~40 km for ring topology
    DWDM: ~200 km
    SONET/SDH: ~2800 km at 2 Gbps with BB_Credit support on SL line card (ONS15454)
    FCIP: Limited by TCP Max Window Size:
        MDS 9000: 32 MB (25,000 km at 1 Gbps)
        SN5428-2: 2 MB (10,000km at OC-3)
        PA-FC-1G: 512 KB (2500km at OC-3)

Cisco Products
    Dark Fiber: MDS 9000 at each end with appropriate SFPs for distance and fiber type
    CWDM: MDS 9000 at each end with CWDM SFPs; CWDM OADMs and Muxes
    DWDM: ONS15530, ONS15540, ONS15454 MSTP
    SONET/SDH: ONS15454 MSPP (SONET/SDH)
    FCIP: MDS 9000 IP Services Module; PA-FC-1G for 7200 and 7400; SN5428-2
Flow Control over Long Distance
All data networks employ flow control to prevent data overruns in intermediate and end devices. Fibre
Channel networks use Buffer to Buffer Credits (BB_Credits) on a hop-by-hop basis with Class 3 storage
traffic. Senders are permitted to send up to the negotiated number of frames (equal to the BB_Credit
value) to the receiver before waiting for Receiver Ready (R_RDY) responses to return from the receiver
and replenish the sender's BB_Credits. As distance increases, so does latency, and so does the number
of BB_Credits required to maintain the flow of data.
FCIP links are slightly different. Although they carry Fibre Channel traffic, they do not use, and are
therefore not constrained by, BB_Credits. Instead, TCP flow control is used. TCP provides end-to-end
sliding-window flow control. The ability to keep the pipe full or maintain the data flow is
governed by the window size. The current congestion window is the amount of TCP data the sender
allows outstanding at any one time. The TCP sender dynamically changes the congestion window in an
attempt to maintain traffic flow without causing an overflow at intermediate points. In a typical TCP
stack, the sender will begin in an exponentially increasing slow start phase and then enter a linearly
increasing congestion avoidance phase.
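The two phases of a typical TCP stack described above can be sketched with a simplified model. This is a hedged illustration only: it counts the window in segments, assumes no packet loss, and uses a hypothetical function name and threshold parameter (`ssthresh`) that do not come from this guide.

```python
def congestion_window_trace(rounds, initial_cwnd=1, ssthresh=64):
    """Return the congestion window (in segments) at the start of each round trip.

    Simplified loss-free model: the window doubles every round trip while it
    is below ssthresh (slow start), then grows by one segment per round trip
    (congestion avoidance).
    """
    cwnd = initial_cwnd
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # exponential slow start
        else:
            cwnd += 1                        # linear congestion avoidance
    return trace


print(congestion_window_trace(10, initial_cwnd=1, ssthresh=16))
```

On a long, high-bandwidth path the window needed to fill the pipe is large, so the number of round trips spent in these phases directly affects how quickly a link reaches full throughput.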
The TCP implementation on the MDS 9000 IP Services Module differs slightly from a typical TCP
implementation. It employs a traffic shaping function that, during the first round trip period after an
idle period, sends traffic at a rate equivalent to the minimum available bandwidth of the path. As a result,
it can ramp up more quickly and recover from retransmissions more effectively than typical TCP implementations.
Table 1-1 SAN Extension Alternatives (continued)

| Transport | Dark Fiber | CWDM | DWDM | SONET/SDH | FCIP |
|---|---|---|---|---|---|
| Fibre Channel connectivity | 1 or 2 Gbps | 1 or 2 Gbps | 1 or 2 Gbps | 1 or 2 Gbps connection; bandwidth dependent upon selected subrate | MDS 9000: 1 or 2 Gbps FC. 1 Gbps for FCIP link. SN5428-2: 1 or 2 Gbps FC. 1 Gbps FCIP link. PA-FC-1G: 1 Gbps FC. 1 Gbps actual FCIP bandwidth dependent upon available WAN bandwidth |
| Client protection options | Client protection. Augment with port channels and VSAN trunking | Client protection. Augment with port channels and VSAN trunking | Client protection. Augment with port channels and VSAN trunking | Client protection. Augment with port channels and VSAN trunking | Client protection. Augment with port channels and VSAN trunking |
| Network protection options | None | None | Splitter-based protection; Y-cable protection | UPSR/SNCP, 2F&4F BLSR/MSSP Ring, PPNM, 1+1 APS/MSP | VRRP on MDS 9000. IGP/EGP routing convergence. HSRP/VRRP on routers |
Each FCIP link on the MDS 9000 IP Services Module is capable of a TCP maximum window size
(MWS) of 32 MB. This MWS enables the FCIP link to maintain a traffic flow of 1 Gbps over a distance
of up to 25,000 km, which is in theory enough to link any two points on earth. The SN5428-2 has a TCP
MWS of 2 MB, enough to run 10,000 km at OC-3 or STM1 rates. The PA-FC-1G has a TCP MWS of
512 KB, enough to keep an OC-3/STM1 link full at 2500 km or a DS3 link full at 8500 km.
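The relationship between window size and distance can be approximated as follows: a link stays full only while the window covers one round trip of data at the link rate. The sketch below is an idealized estimate that assumes 5 µs of propagation delay per kilometer in each direction and ignores protocol overhead and device delays. The line rates for OC-3 and DS3 and the function name are assumptions supplied for the example.

```python
ONE_WAY_DELAY_US_PER_KM = 5.0   # propagation delay in fiber, as used in this guide


def max_distance_km(window_bytes, link_bps):
    """Longest one-way distance at which the window still covers a full round trip."""
    round_trip_s = window_bytes * 8 / link_bps            # time to drain one window
    return round_trip_s / (2 * ONE_WAY_DELAY_US_PER_KM * 1e-6)


MB = 2 ** 20
KB = 2 ** 10
OC3_BPS = 155.52e6   # assumed OC-3/STM-1 line rate
DS3_BPS = 44.736e6   # assumed DS3 line rate

for name, window, rate in [
    ("MDS 9000 at 1 Gbps", 32 * MB, 1e9),
    ("SN5428-2 at OC-3", 2 * MB, OC3_BPS),
    ("PA-FC-1G at OC-3", 512 * KB, OC3_BPS),
    ("PA-FC-1G at DS3", 512 * KB, DS3_BPS),
]:
    print(f"{name}: about {max_distance_km(window, rate):,.0f} km")
```

Because the model ignores overhead, its results come out somewhat above the rounded distances quoted in the text, which should be treated as the planning figures.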
SAN Extension Solutions
Metro Optical SAN Extension with the ONS15530/ONS15540 and MDS 9000
Metro optical networks, such as DWDM, provide the lowest-latency network options for synchronous
data replication. The Cisco ONS15540 and ONS15530 platforms can each multiplex up to 32 individual
channels (or lambdas) per fiber. Each channel can be Gigabit Ethernet, SONET/SDH, ATM, Fibre
Channel, FICON or ESCON. The ONS15530 can aggregate up to 10 ESCON links onto a single lambda.
Optical Protection Options
The ONS15530 and ONS15540 platforms provide for three types of 1+1 network protection as follows:
• Facility Protection: Also known as splitter-based protection, this switches traffic to an alternate
  protected fiber upon detection of a fiber cut or signal degradation.
• Line Card Protection: Also known as Y-cable protection, this involves splitting and sending the signal
  down two paths simultaneously. The receiver receives the signal on both paths but selects the
  alternate only upon detecting a failure in the primary path.
• Switch Fabric Based Protection: Also known as client protection, this involves the client sending
  signals down separate or redundant paths or fibers.
Switch fabric-based protection is the preferred protection scheme for Fibre Channel-based storage
traffic. When used in conjunction with the port channeling feature of the Cisco MDS 9000, single fiber
cuts in DWDM rings will not alter the topology of the Fibre Channel switch fabric. This prevents
possible fabric and application disruption from Fibre Channel state changes and route reconvergence. In
Figure 1-9, the two members of the port channel would each be mapped over a different optical path: one
over the East path and the other over the West path.
Scaling the Infrastructure
VSANs enable enterprises to maintain separation between SAN fabrics without necessarily requiring
additional fabric switch hardware. In Figure 1-10, additional storage arrays can be added on existing or
new SAN fabrics without addition to the ONS15530/ONS15540 or MDS 9000 infrastructure. Additional
links can be added to the MDS 9000 port channel, and lambdas to the ONS155x0 platforms, as traffic
throughput requirements increase.
Extending the Distance
As mentioned earlier, Fibre Channel SANs use a hop-by-hop credit based mechanism called BB_Credits
to control the flow of data. As distance increases, so does network latency, at a rate of 5µs per kilometer,
so more BB_Credits are required in order to sustain adequate traffic flow. Fibre Channel line cards in
many storage arrays have limited BB_Credits, so Fibre Channel directors such as the MDS 9000, which
have sufficient BB_Credits, are needed to extend the fabric beyond a few kilometers. The MDS
9000 16-port Fibre Channel line card supports up to 255 BB_Credits per port, allowing DWDM metro
optical fabric extension over 200 km without BB_Credit starvation and the resulting performance
degradation.
At the maximum Fibre Channel frame size of 2148 bytes, one BB_Credit is consumed every two
kilometers at 1 Gbps and one BB_Credit per kilometer at 2 Gbps. Replication traffic typically has a
smaller average Fibre Channel frame size of 1600-1900 bytes, so more frames are in flight over the same
distance and more credits are needed. A general guide for allocating BB_Credits to interfaces is as follows:
• 1.25 BB_Credits for every 2 kilometers at 1 Gbps
• 1.25 BB_Credits for every 1 kilometer at 2 Gbps
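The dependence of BB_Credit consumption on distance, speed, and frame size can be checked with a short calculation: the sender needs enough credits to cover every frame it can transmit during one round trip. The sketch below is an approximation under stated assumptions, namely 8b/10b encoding, signaling rates of 1.0625 and 2.125 Gbaud, and no switch or end-device delay. These assumptions and the function name are not taken from this guide.

```python
import math

ONE_WAY_DELAY_US_PER_KM = 5.0                 # latency per kilometer cited in this guide
LINE_RATE_BAUD = {1: 1.0625e9, 2: 2.125e9}    # assumed Fibre Channel signaling rates
BITS_ON_WIRE_PER_BYTE = 10                    # assumed 8b/10b encoding


def bb_credits_needed(distance_km, speed_gbps, frame_bytes=2148):
    """Credits needed so the sender never stalls waiting for R_RDY."""
    frame_time_us = frame_bytes * BITS_ON_WIRE_PER_BYTE / LINE_RATE_BAUD[speed_gbps] * 1e6
    round_trip_us = 2 * distance_km * ONE_WAY_DELAY_US_PER_KM
    return math.ceil(round_trip_us / frame_time_us)


print(bb_credits_needed(100, 1))                     # full-size frames at 1 Gbps
print(bb_credits_needed(100, 2))                     # full-size frames at 2 Gbps
print(bb_credits_needed(100, 2, frame_bytes=1750))   # mid-range replication frame size
print(bb_credits_needed(200, 2, frame_bytes=1750))   # compare with 255 credits per port
```

Smaller frames take less time to transmit, so more of them fit into a round trip and the credit requirement per kilometer rises accordingly.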
Figure 1-10 Synchronous Replication Over ONS155x0 DWDM Metro Optical
Campus SAN Extension using CWDM and MDS 9000
CWDM provides a lower-cost and lower-density alternative to DWDM for Gigabit Ethernet and Fibre
Channel connectivity over a data center, campus, or metropolitan area. CWDM allows multiplexing of
up to eight 1 or 2 Gbps channels over a fiber pair. The CWDM wavelengths (lambdas) are spaced 20 nm
apart, from 1470 nm to 1610 nm. In contrast to DWDM, the individual CWDM channels originate at the
client switch through the use of special CWDM SFPs or GBICs. Each CWDM SFP or GBIC is tuned to
a specific CWDM channel (color) and multiplexed through a passive optical add-drop multiplexer
(OADM).
[Figure 1-10 diagram: Data Center A and Data Center B each contain intelligent storage arrays attached by Fibre Channel to Cisco MDS 9000 Fibre Channel directors, which connect through port channels to Cisco ONS 15530 or ONS 15540 platforms on a metro optical ring.]
Protection
As CWDM OADMs are passive, protection from fiber cuts and failures is provided by the fabric
switches. This is similar in principle to switch fabric-based protection (client protection) with DWDM.
An example of a CWDM SAN extension implementation employing client protection is shown in
Figure 1-11.
Figure 1-11 Synchronous Replication with Point-to-Point CWDM using MDS 9000 Port Channels and
Client Protection
Two fiber pairs over diverse paths are required to protect the SAN from single fiber cuts. A port channel
from each MDS 9000 is defined consisting of 4 x 2 Gbps Fibre Channel ports. The port channel members
are split over the two CWDM fiber pairs—two Fibre Channel ports each. In this way, a single fiber break
or MUX failure will only affect half a port channel, and will be transparent to the topology. As a result,
it will not disrupt the SAN with state change notifications or FSPF reconvergence.
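The effect of splitting port channel members across diverse fiber pairs can be illustrated with a small model. This is only a sketch: the port names, the path labels, and the function name are hypothetical, and the model captures bandwidth only, not switch behavior.

```python
def surviving_capacity_gbps(members, failed_path):
    """Return the port channel capacity left after one fiber path fails.

    members maps each member port name to (fiber_path, speed_gbps).
    """
    return sum(speed for path, speed in members.values() if path != failed_path)


# Hypothetical port names; 4 x 2 Gbps members split over two diverse fiber pairs.
port_channel = {
    "fc1/1": ("pair-A", 2),
    "fc1/2": ("pair-A", 2),
    "fc2/1": ("pair-B", 2),
    "fc2/2": ("pair-B", 2),
}

remaining = surviving_capacity_gbps(port_channel, failed_path="pair-A")
print(remaining)        # capacity left on the logical ISL, in Gbps
print(remaining > 0)    # the logical ISL is still up
```

As long as at least one member survives, the logical ISL remains in place at reduced bandwidth, which is why the failure is not visible as a topology change.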
Scaling
The Cisco CWDM product set includes eight CWDM SFPs and GBICs in addition to a variety of MUXes
and OADMs. These can be used to construct a variety of point-to-point and ring topologies for Gigabit
Ethernet and Fibre Channel applications. Each CWDM fiber pair can accommodate up to eight CWDM
channels. A single fiber can accommodate four channels by carrying odd-numbered channels in one
direction and even-numbered channels in the other direction.
MDS 9000 port channels allow up to 16 CWDM channels to be bundled (two fiber pairs with 8 channels
apiece) into one 32 Gbps logical ISL. VSAN trunking permits transparent sharing of any or all of these
port channels between isolated SAN fabrics.
[Figure 1-11 diagram: Data Center A and Data Center B each contain intelligent storage arrays attached by Fibre Channel to Cisco MDS 9000 Fibre Channel directors. A 4 x 2 Gbps port channel at each site is split across two CWDM MUX-4 units, one per fiber pair.]
Distance Capabilities
Unlike DWDM lambdas, CWDM lambdas or channels cannot be amplified (except for the single lambda
around the 1550 nm wavelength). This limits the distance capabilities of CWDM to what the optical power
of the transmitter and the sensitivity of the receiver in the CWDM SFPs allow. Cisco CWDM SFPs have
the following optical power budget characteristics:
• At 1 Gbps: 30 dB power budget
• At 2 Gbps: 29 dB power budget
Fiber connections, distance, and the OADMs all contribute to attenuation, thereby shortening the
effective distance capabilities. Taking this into account, a typical point-to-point CWDM connection can
extend to around 100km. In a ring topology, this can drop down to around 40km, depending upon the
number of fiber joins and OADMs.
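How attenuation consumes the power budget can be sketched with a simple link-budget calculation. The fiber loss of 0.25 dB per kilometer, the passive losses for muxes, OADMs, and joins, and the safety margin used below are illustrative assumptions, not Cisco specifications; actual values should come from the component data sheets and a fiber survey.

```python
def max_span_km(power_budget_db, fiber_loss_db_per_km=0.25,
                passive_loss_db=0.0, margin_db=0.0):
    """Distance at which accumulated attenuation uses up the power budget."""
    usable_db = power_budget_db - passive_loss_db - margin_db
    return max(usable_db, 0.0) / fiber_loss_db_per_km


# Assumed point-to-point losses: two muxes at 2.5 dB each plus a 2 dB margin.
print(max_span_km(30, passive_loss_db=5.0, margin_db=2.0))
# Assumed ring with more OADMs and fiber joins in the path.
print(max_span_km(30, passive_loss_db=18.0, margin_db=2.0))
```

The calculation shows why rings reach shorter distances than point-to-point links: every additional OADM or fiber join subtracts from the budget left for the fiber itself.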
As the CWDM infrastructure transports Fibre Channel, proper BB_Credit configuration applies. The
same BB_Credit rules apply for CWDM as for DWDM. Based upon an average Fibre Channel
frame size of 1600-1900 bytes, a general guide for allocating BB_Credits to interfaces is as follows:
• 1.25 BB_Credits for every 2 kilometers at 1 Gbps
• 1.25 BB_Credits for every 1 kilometer at 2 Gbps
Metro/Regional – ONS15454
Leasing dark fiber for enterprise CWDM or DWDM networks may not be possible or appropriate in
many situations. In these cases, Fibre Channel attachment to a Service Provider SONET/SDH network
may be an option. The SL Series line card for the Cisco ONS15454 SONET/SDH MSPP provides 4-port
1 or 2 Gbps Fibre Channel or FICON connectivity over a SONET/SDH transport.
The low latency nature of SONET/SDH means this type of solution is viable for synchronous replication
over metro or short regional distances. The typical expanse of service provider SONET/SDH networks
means this is also an appropriate solution for asynchronous replication or copy operations over longer
distances.
A possible deployment scenario between two data centers is shown below in Figure 1-12.
In this example, a 1 or 2 Gbps Fibre Channel port from each MDS 9000 is connected to two ports on the
SL Line Card on the ONS15454 in the service provider network. Each of these links may also be trunking
one or more VSANs, depending upon the enterprise requirement. Extra links can be added as required
in a standalone arrangement to a possible DR site, or added to the existing links in a logical port channel.
Optical path protection from fiber cuts or failures is built into the ONS15454 SONET/SDH
infrastructure. Single fiber cuts will force the rings to wrap, thereby interrupting service for less than
50 ms.
Figure 1-12 Replication with Fibre Channel Over ONS15454 SONET/SDH Infrastructure
Synchronous Replication using FCIP on the MDS 9000 IP Services Module
FCIP (Fibre Channel over IP) is an alternative transport for SAN extension in the following situations:
• Dark fiber for DWDM or CWDM is unavailable, too costly, or distance is beyond Metro Optical
  range
• Service Provider does not offer native Fibre Channel transport service
• The enterprise wishes to share a common IP network infrastructure
FCIP offers the most flexibility of all transports. It can be used for synchronous data replication over a
short metropolitan distance, as well as longer distances for semi-synchronous and asynchronous data
replication applications. The underlying TCP/IP network lets FCIP leverage a wealth of IP services
unavailable to alternative transports, including the following:
• Data compression using the VAM or VAM2 Port Adapter for the 7200, or the VPNSM for the
  Catalyst 6500
• IPSec encryption and authentication using the VAM or VAM2 Port Adapters for the Cisco 7200 and
  7400 routers
• Quality of Service (QoS)
• Protection, including VRRP at the MDS 9000 FCIP endpoints, and VRRP, HSRP, IP routing, and
  Ethernet switching features to route or switch around failures
• Consolidated IP enterprise network with voice, video, data, and storage traffic
[Figure 1-12 diagram: Data Center A and Data Center B each contain intelligent storage arrays attached by Fibre Channel to Cisco MDS 9000 Fibre Channel directors. Each site connects by Fibre Channel to an ONS 15454 with a 4-port SL line card on an enterprise or service provider SONET/SDH network.]
MDS 9000 FCIP for Synchronous Data Replication
An example of a dual data center deployment using FCIP from the IP Services Module of the MDS 9000
is shown below in Figure 1-13. In this example, a Gigabit Ethernet service is assumed from a
SONET/SDH, transparent LAN, or Metro Ethernet service provider.
As synchronous replication is sensitive to network latency, the following points should be considered
when using FCIP for this purpose:
• The FCIP end points, intermediate routers, and some switches store and forward packets or frames,
  adding latency.
• For shared networks, resource contention should be kept to a minimum. Low-latency queuing
  methodologies must be used and adequate bandwidth provisioned.
Figure 1-13 Replication Using FCIP Transport Over SONET/SDH, Metro Ethernet or Transparent LAN
Service
MDS 9000 FCIP for Asynchronous Data Replication
The latency sensitivity of synchronous replication makes it impracticable over long distances.
Asynchronous replication can be deployed at almost any distance without affecting application
performance (provided that adequate network bandwidth is available for ongoing replication).
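The distance sensitivity of synchronous replication can be made concrete with a rough estimate of the latency added to each replicated write. The sketch below uses the 5 µs per kilometer propagation delay cited earlier in this guide. The assumption of two round trips per write, the optional device delay parameter, and the function name are illustrative and depend on the replication protocol in use.

```python
ONE_WAY_DELAY_US_PER_KM = 5.0   # propagation delay cited earlier in this guide


def added_write_latency_ms(distance_km, round_trips_per_write=2, device_delay_us=0.0):
    """Extra latency per synchronously replicated write, in milliseconds."""
    propagation_us = 2 * distance_km * ONE_WAY_DELAY_US_PER_KM * round_trips_per_write
    return (propagation_us + device_delay_us) / 1000.0


for km in (10, 100, 1000, 5000):
    print(km, "km:", added_write_latency_ms(km), "ms")
```

With synchronous replication the application waits for this added time on every write, whereas asynchronous replication acknowledges the write locally and absorbs the delay in the background.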
[Figure 1-13 diagram: Data Center A and Data Center B each contain intelligent storage arrays attached by Fibre Channel to Cisco MDS 9000 Fibre Channel directors. Port channels of Gigabit Ethernet links connect each site to a SONET/SDH network, Metro Ethernet, or Transparent LAN service.]