CHAPTER 26 • HIGH AVAILABILITY
Configuring the Primary Instance
To implement managed recovery, you will need to configure the primary instance to
archive redo logs to the standby host. If you plan to use manual recovery, you do not
need to make any configuration changes to the primary instance.
To configure the primary instance to archive redo logs to the standby host, you
will need to set up the init.ora file for the primary instance with information about
where to archive these logs. You do not need to shut down the primary instance to
enable these parameters. Instead, you can manually enable them using the ALTER
SYSTEM command, as described in the “Enabling Parameters for the Primary
Instance” section later in this chapter.
The archive destination is specified by using the LOG_ARCHIVE_DEST_n parame-
ter, where n is an integer from 1 to 5. Up to five destinations may be defined, but one
destination must be a local device. When setting these parameters, the keyword
LOCATION specifies a valid path for the local archive destination, and the keyword
SERVICE specifies a service name referenced in the tnsnames.ora file. The SERVICE
keyword must be specified for all standby archive destinations, whether local or
remote. Here is an example of using the LOG_ARCHIVE_DEST_n parameters in the
init.ora file of the primary instance:
log_archive_dest_1 = 'location=/u02/arch/PRMRY'
log_archive_dest_2 = 'service=stdby'
There are several options that you can set with the LOG_ARCHIVE_DEST_n param-
eter, as described in the following sections.
Setting a Mandatory or Optional Destination
You can specify whether a destination is mandatory or optional by using the
MANDATORY or OPTIONAL keyword, respectively. Oracle recommends that you
specify the local archived redo log destination as MANDATORY. The following is an
example of using the MANDATORY and OPTIONAL keywords:
log_archive_dest_1 = 'location=/u02/arch/PRMRY MANDATORY'
log_archive_dest_2 = 'service=stdby OPTIONAL'


Specifying Access after a Failed Write
By default, Oracle will not attempt to access a destination following an error. You can
specify that Oracle should attempt to access an archived redo log destination again
after a failed write using the REOPEN keyword. The REOPEN keyword specifies the
number of seconds Oracle will wait before the archiver process should attempt to
access a failed destination again. The default value of the REOPEN keyword when
specified without qualification is 300 seconds. (If you do not specify the REOPEN
keyword, the default value is 0.) You can override the default by using REOPEN=n, where
n is the number of seconds to wait. The following is an example of using the REOPEN
keyword:
log_archive_dest_2 = 'service=stdby OPTIONAL REOPEN=60'
Copyright ©2002 SYBEX, Inc., Alameda, CA
www.sybex.com
Setting the Minimum Successful Write Destinations
The LOG_ARCHIVE_MIN_SUCCEED_DEST parameter specifies the minimum number
of destinations where the archiver process must successfully write archived redo logs
before the source redo log is available for new writes. When archived redo logs are
written, the count of successful writes to all MANDATORY destinations and OPTIONAL
nonstandby destinations is measured to satisfy the setting of the LOG_ARCHIVE_
MIN_SUCCEED_DEST parameter. The default setting for LOG_ARCHIVE_MIN_
SUCCEED_DEST is 1, and valid values are 1 through 5.
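The interaction between the binding keywords and LOG_ARCHIVE_MIN_SUCCEED_DEST can be checked on a running instance. The following is a minimal sketch, assuming that at least two destinations are already defined and that the V$ARCHIVE_DEST view in your release exposes the DEST_ID, BINDING, TARGET, STATUS, and DESTINATION columns; verify the column names against your data dictionary before relying on them.

```sql
-- Require two successful archive writes before an online redo log can be reused
ALTER SYSTEM SET log_archive_min_succeed_dest = 2;

-- Show how each destination is bound (MANDATORY or OPTIONAL) and its current state
SELECT dest_id, binding, target, status, destination
FROM v$archive_dest
ORDER BY dest_id;
```

Keeping the minimum no higher than the number of destinations that count toward it avoids a setting that can never be satisfied.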
Enabling or Deferring a Destination
It is possible to define archived redo log destinations without enabling those destina-
tions. Whether or not an archived redo log destination is enabled is specified using
the LOG_ARCHIVE_DEST_STATE_n parameter, where n is an integer from 1 to 5, cor-
responding to the respective LOG_ARCHIVE_DEST_n parameter. Valid values are
ENABLE or DEFER. ENABLE, the default setting, tells Oracle to archive to the defined
destination. The DEFER setting allows you to define the destination without actually
archiving to that destination. The deferred destination can later be enabled dynamically
using the ALTER SESSION or ALTER SYSTEM command. The following example
illustrates how to define an archived redo log destination without enabling it:
log_archive_dest_state_2 = DEFER
log_archive_dest_2 = 'service=stdby'
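A deferred destination can be switched on later without restarting the instance. As a sketch, assuming destination 2 was defined as in the example above and that V$ARCHIVE_DEST is available with the STATUS and ERROR columns:

```sql
-- Start archiving to the previously deferred destination
ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE;

-- Confirm the state change and look for any error reported for the destination
SELECT dest_id, status, error
FROM v$archive_dest
WHERE dest_id = 2;
```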
Configuring the Standby Instance
You can create the initialization parameter file for the standby instance by copying
the initialization parameter file from the primary instance. With a few exceptions, the
initialization parameters for the primary instance and the standby instance should
have the same settings.
Setting Standby Initialization Parameters
The following parameters are particularly important when you are configuring the
standby instance:
• The COMPATIBLE parameter settings should be identical for the primary and
standby instances.
• The DB_NAME parameter settings should also be identical in both files.
• The CONTROL_FILES parameter for the standby instance should be set to the
fully qualified name of the standby control file.
USING A STANDBY DATABASE
Oracle8i Distributed
Database
PART
IV
• The LOG_ARCHIVE_DEST_n parameter (described earlier in the “Configuring the Primary
Instance” section) specifies the location of the archived redo logs for manual
recovery. This parameter should be set whether you are configuring managed or
manual recovery.
• The optional parameter LOG_ARCHIVE_TRACE causes an audit trail of archived
redo logs received from the primary database to be written to a trace file.
• In a managed recovery environment, the STANDBY_ARCHIVE_DEST parameter
sets the location to write the archived redo logs received from the primary data-
base. Set LOG_ARCHIVE_DEST_n and STANDBY_ARCHIVE_DEST to identical
values for easier maintenance.
Setting Parameters for a Standby on the Same Host as the Primary
When the primary and standby databases reside on the same host, several other
standby initialization parameters are important. These include the DB_FILE_NAME_
CONVERT, LOG_FILE_NAME_CONVERT, and LOCK_NAME_SPACE parameters.
The DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT parameters in
the standby initialization parameter file enable automatic name conversion of
datafiles and redo log files, respectively. Ideally, the directory structures in the
primary and standby databases should be identical. However, if they are not (because
the primary and standby databases reside on the same host or for some other reason),
the filenames recorded in the standby control file must be changed to the new paths,
and these parameters make that change for you.
The DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT parameters each
specify two strings. The first string is that portion of the path structure of the primary
database to be converted. The second string is the new path structure to replace the
structure specified by the first string. Here’s an example of the DB_FILE_NAME_
CONVERT parameter set in the standby database initialization parameter file:
db_file_name_convert = /u01/oradata/PRMRY, /u01/oradata/STDBY
The LOCK_NAME_SPACE parameter also must be set for the standby instance
when the primary and standby databases reside on the same host, as follows:
lock_name_space = stdby
If this parameter is not set for the two instances, you will receive an ORA-1102 error.

Mounting the Standby Database
When you are ready to start the standby database, follow these steps:
1. Connect to Oracle from SQL*Plus:
$ sqlplus /NOLOG
SQL> CONNECT INTERNAL
2. Start the instance without mounting the database:
SQL> STARTUP NOMOUNT
3. Mount the database as a standby database:
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
Renaming Files on the Standby Site
If you are unable to correctly rename all of the datafiles and redo log files for the
standby database using conversion parameters, you will need to manually rename
these files before starting recovery. For example, suppose that the DB_FILE_NAME_
CONVERT parameter was set to convert the path /u01/oradata/PRMRY to /u01/
oradata/STDBY. However, a datafile was erroneously created following the path
/u01/app/oracle/admin/PRMRY, and you are unable to take the downtime to correct
this error. You can still move this datafile on the standby site to the correct path, but
you will need to manually rename this datafile in the standby control file.
To manually rename a datafile, with the standby database mounted, use the ALTER
DATABASE RENAME FILE command, as follows:
SQL> ALTER DATABASE RENAME FILE
2> '/u01/app/oracle/admin/PRMRY/temp01.dbf' TO
3> '/u01/oradata/STDBY/temp01.dbf';
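To find the files that still need renaming, you can list the names that the conversion parameters left unchanged. The queries below are a sketch that assumes the standby database is mounted and that every standby file is expected under /u01/oradata/STDBY, as in the example above; adjust the pattern to your own layout.

```sql
-- Datafiles recorded in the standby control file outside the expected directory
SELECT file#, name
FROM v$datafile
WHERE name NOT LIKE '/u01/oradata/STDBY/%';

-- Online redo log members outside the expected directory
SELECT group#, member
FROM v$logfile
WHERE member NOT LIKE '/u01/oradata/STDBY/%';
```

Any file returned by these queries is a candidate for the ALTER DATABASE RENAME FILE command.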
Enabling Parameters for the Primary Instance
If you are implementing managed recovery, you have configured the initialization
parameter file of the primary instance to archive redo logs to the standby site (as
described in the “Configuring the Primary Instance” section earlier in this chapter).
However, if you have not shut down and restarted the instance, these new settings
have not taken effect. You can enable these new parameter settings dynamically with-
out shutting down the instance by using the ALTER SYSTEM command.
For example, suppose that you set the LOG_ARCHIVE_DEST_2 parameter as follows:
log_archive_dest_2 = 'service=stdby'
You can now enable this parameter with the following command:
SQL> ALTER SYSTEM SET log_archive_dest_2 = 'service=stdby';
Enabling Recovery
Your next step is to start the recovery process of the standby database. How you start
the recovery depends on the type of recovery you have chosen for your environment.
Enabling Manual Recovery
If you chose to use the manual recovery process, you can initiate recovery by issuing
the following command:
SQL> RECOVER STANDBY DATABASE;
This command tells Oracle to use the location specified in the initialization file.
Alternatively, you can start manual recovery with this command:
SQL> RECOVER FROM '/u02/arch/STDBY' STANDBY DATABASE;
As you can see, this command gives the location of the archived log files to be used
for the current recovery session.
You can also use the above commands with the UNTIL CANCEL option:
SQL> RECOVER STANDBY DATABASE UNTIL CANCEL;
SQL> RECOVER FROM '/u02/arch/STDBY' STANDBY DATABASE UNTIL CANCEL;
When using this command, Oracle will prompt you for each archived redo log file
that it wants to apply and wait for you to acknowledge the prompt before continuing.
Enabling Managed Recovery
If you are using the managed recovery option, you will first need to check to see if a
gap sequence exists. Since Oracle rolls the standby database forward by sequentially
applying the archived redo log files, it cannot enter into a managed recovery session
if there is a missing log. A gap sequence exists when the primary database generated a
log sequence that was not archived to the standby site. A gap sequence may occur for
a number of reasons, such as the following:
• The standby database was created from an old backup or from an inconsistent
(online) backup.
• You shut down the standby database(s) before shutting down the primary
database.
• There was a network failure.
If a gap sequence exists, you will need to manually apply the missing archived redo
log files to bring the standby database in synch with the primary site.
You can query the standby database to see if a gap sequence exists using SQL*Plus,
as follows:
SELECT high.thread#, "LowGap#", "HighGap#"
FROM
  (
    SELECT thread#, MIN(sequence#)-1 "HighGap#"
    FROM
      (
        SELECT a.thread#, a.sequence#
        FROM
          (
            SELECT *
            FROM v$archived_log
          ) a,
          (
            SELECT thread#, MAX(next_change#) gap1
            FROM v$log_history
            GROUP BY thread#
          ) b
        WHERE a.thread# = b.thread#
        AND a.next_change# > gap1
      )
    GROUP BY thread#
  ) high,
  (
    SELECT thread#, MIN(sequence#) "LowGap#"
    FROM
      (
        SELECT thread#, sequence#
        FROM v$log_history, v$datafile
        WHERE checkpoint_change# <= next_change#
        AND checkpoint_change# >= first_change#
      )
    GROUP BY thread#
  ) low
WHERE low.thread# = high.thread#;
This query will generate an output similar to the following:
THREAD#   LowGap#  HighGap#
-------  --------  --------
      1       171       174
This shows that a gap sequence exists for thread #1 and that logs 171 through 174
should be manually applied to the standby database. If the LowGap# and HighGap#
columns both show the same number, a gap sequence does not exist for that particular
thread.
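As a quicker cross-check, and only as a sketch rather than a substitute for the query above, you can compare the highest log sequence recorded per thread on each site. It assumes that V$LOG_HISTORY on the standby reflects the logs applied so far; a standby value that trails the primary by more than the log currently in transit suggests that logs are missing.

```sql
-- Run on the primary and on the standby, then compare the results by thread
SELECT thread#, MAX(sequence#) AS last_sequence
FROM v$log_history
GROUP BY thread#;
```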
If there is a gap sequence, query the V$ARCHIVED_LOG view in the primary data-
base to obtain the names of the log files that need to be manually applied, as follows:
SQL> SELECT name
2> FROM v$archived_log
3> WHERE sequence# >= 171
4> AND sequence# <= 174;
NAME
--------------------------------------
/u02/arch/primary/arch_PRMRY_1_171.log
/u02/arch/primary/arch_PRMRY_1_172.log
/u02/arch/primary/arch_PRMRY_1_173.log
/u02/arch/primary/arch_PRMRY_1_174.log
You will then need to copy these archived log files from the primary database’s
archiving destination to the standby database’s receiving destination. After the files
have been copied, issue either the RECOVER STANDBY DATABASE UNTIL CANCEL
or RECOVER AUTOMATIC STANDBY DATABASE command to apply the necessary
log files.
Once the gap sequence has been resolved, you can initiate a managed recovery ses-
sion for your standby database, as follows:
SQL> RECOVER MANAGED STANDBY DATABASE;
When you use this statement, Oracle will wait indefinitely for the new archived log
files to be received and automatically apply them to your standby database.
If you want to specify a waiting period, use the TIMEOUT option and specify the
length of time to wait, in minutes:
SQL> RECOVER MANAGED STANDBY DATABASE TIMEOUT 10;
In this example, Oracle will wait for 10 minutes for new archived log files to arrive
before it will time out and cancel the managed recovery session.
Testing the Standby Database
You are now ready to test the failover functionality of your newly created standby
database. However, if you simply activated the standby database, you would need to
open the database using the RESETLOGS command. Of course, once the standby data-
base has been opened with RESETLOGS, the only way to resume in standby mode is
to re-create the standby database. The solution to this problem is to test your standby
database failover functionality by first using cancel-based recovery, and then starting
the standby database in read-only mode.
For a managed recovery process, cancel the recovery session and shut down the
standby instance by issuing this command:
SQL> RECOVER MANAGED STANDBY DATABASE CANCEL
For manual recovery, use the following commands:
SQL> RECOVER CANCEL
SQL> SHUTDOWN IMMEDIATE
Next, start the standby database and put it in MOUNT mode:
SQL> STARTUP NOMOUNT
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
Then, in MOUNT mode, open the database in read-only mode:
SQL> ALTER DATABASE OPEN READ ONLY;
Once the standby database has been opened in read-only mode, you can query the
standby database to verify that the transactions executed on the primary site are
propagated to the standby site.
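One way to confirm the state of the standby before running test queries is shown below. This is a sketch that assumes the CONTROLFILE_TYPE and OPEN_MODE columns are present in V$DATABASE in your release; check the view definition if the query fails.

```sql
-- On the standby: expect a standby control file and a read-only open mode
SELECT name, controlfile_type, open_mode
FROM v$database;
```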
Maintaining a Standby Database
The previous sections walked you through a step-by-step process of creating a standby
database. So now that you have a fully operational standby database, what’s next?
Like everything else, your standby database will require some administration and
maintenance. Generally, the only maintenance that is required for a standby database
is to keep it in synch with the primary database. Some problems you may run into
while managing standby databases include the need to resolve redo log sequence
number gaps as they occur. Other tasks include making sure that the physical struc-
ture of the standby database matches that of the primary database and manually
propagating any data that was applied to the primary site via the DIRECT or UNRE-
COVERABLE mode. Let’s take a closer look at these situations.
TIP The standby database can also be backed up, but not while the database is in
recovery mode. Either shut down the standby database or open it in read-only mode.
Make the backups. Then resume recovery.
Resolving Gap Sequences

Common causes for gap sequences (after the standby database is operational) are net-
work failures and shutting down the standby database while the primary database is
still open. You will need to be aware of these situations and regularly monitor the
standby database to ensure that a gap sequence does not occur. If you do find a gap
sequence, you will need to manually resolve it, as explained in the “Enabling Man-
aged Recovery” section earlier in this chapter.
Matching Changes in the Physical Structure
Standby database recovery is possible only as long as the physical structure matches
that of the primary database. Certain physical structure changes are transmitted auto-
matically to the standby database via redo application. Such changes include renam-
ing a primary database’s datafile, changing the mode of a tablespace from offline to
online or vice versa, changing the status of a tablespace from read-only to read/write
or vice versa, and dropping a tablespace.
Other structural changes require a certain degree of manual maintenance on the
standby database. Such changes include re-creating the control file and adding a
tablespace/datafile to the primary database.
Re-creating the Standby Control File
Creating new redo log members or groups on the primary database does not affect the
standby database. Similarly, enabling or disabling threads does not affect the standby
database. However, it is recommended that if you enable or disable threads, or add or
drop groups or members on the primary database, you re-create the standby control
file. This keeps the redo log file configuration on the primary and the standby
databases in synch.
WARNING If for some reason, the unarchived log files need to be cleared on the pri-
mary instance, you will invalidate the entire standby database. Your only option at that
time will be to re-create the standby database.
Changing certain initialization parameters requires re-creating the control file. If
you re-create the primary database’s control file, you will invalidate the standby data-
base’s control file. Therefore, you will need to re-create the standby database’s control
file. Follow these steps to re-create a standby control file (note that these steps assume
that the primary database’s control files have been re-created):
1. Cancel the recovery session and shut down the standby instance.
• For managed recovery, issue:
SQL> RECOVER MANAGED STANDBY DATABASE CANCEL
• For manual recovery, use:
SQL> RECOVER CANCEL
SQL> SHUTDOWN IMMEDIATE
2. Log in to the primary instance and use the following statement to create a
standby control file:
SQL> ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/stdby.ctl';
3. Archive the online redo log file on the primary database:
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
4. Copy the newly created standby control file from the primary host to the
standby host and start the standby instance in recovery mode:

SQL> STARTUP NOMOUNT
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
SQL> RECOVER STANDBY DATABASE
NOTE If you restored a control file from backup on the primary database or opened the
primary database with the RESETLOGS command, you will need to re-create the standby
database.
Adding Tablespaces/Datafiles
Whenever a new datafile is added to the primary site, the same file needs to be added
to the standby site (or sites) as well. Suppose that you add a new datafile to the pri-
mary site like this:
SQL> ALTER TABLESPACE TEMP ADD DATAFILE '/u01/oradata/PRMRY/temp02.dbf'
SIZE 200M;
Redo generated from this statement will add the name of the new datafile to the
standby control file. However, in order for the standby database to continue recovery,
you must copy the file to its corresponding location. You can also cancel the recovery
session on the standby site and create the datafile.
To switch log files on the primary site to initiate archival of the redo logs to the
standby site, issue the following statement:
SQL> ALTER SYSTEM SWITCH LOGFILE;
Next, ensure that the standby database is running in recovery mode. Start recovery,
if necessary (managed recovery is shown here):
SQL> CONNECT INTERNAL
SQL> STARTUP NOMOUNT
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
SQL> RECOVER MANAGED STANDBY DATABASE;
For manual recovery of the standby database, substitute the last statement with
this one:
SQL> RECOVER STANDBY DATABASE UNTIL CANCEL;
Once all the archived redo log files have been applied to the standby database, use
CANCEL or RECOVER MANAGED STANDBY DATABASE CANCEL statements (or press
Ctrl+C) to cancel the recovery session.
Now that you have added related information about the new datafile to the
standby control file, you need to create the relevant file(s) on the standby site. To cre-
ate the datafile, use the ALTER DATABASE CREATE DATAFILE command:
SQL> ALTER DATABASE CREATE DATAFILE '/u01/oradata/STDBY/temp02.dbf'
AS '/u01/oradata/STDBY/temp02.dbf';
You can now resume normal activity on both the primary database (for example,
transaction processing) and the standby database (such as managed recovery mode).
Making Manual Changes
Since direct-path loads and changes made using the NOLOGGING / UNRECOVER-
ABLE option do not generate redo, changes made to the primary database cannot be
automatically transmitted to the standby database. The recovery process on the
standby database will still read all of the archived log files in a sequential manner and
continue the recovery process. However, an error message will be generated in the
standby database’s alert log file, stating that the block was changed using the
NOLOGGING option and therefore cannot be recovered.
Your only options to synchronize the primary and the standby databases in such
cases are as follows:
• Re-create the standby database.

• Back up the affected tablespaces from the primary database, transfer them to the
standby database, and resume recovery.
• Cancel recovery on the standby database, take the affected datafiles offline, and
then drop them when the standby database is activated. This option may not be
feasible because it will result in data loss.
Opening a Standby Database in Read-Only Mode
You can open a standby database in read-only mode once the recovery session has
been canceled. Tablespaces used for sorting in a read-only standby database need to be
created as temporary and locally managed, and they should contain only temporary files.
Also, all the required user permissions and privileges must be set to access these
tablespaces. Since this cannot be done at the standby site, all these privileges must be
given at the primary site. This allows for queries that generate on-disk sorting operations
without generating any redo entries or affecting the data dictionary.
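The temporary tablespace described above has to be prepared on the primary site. The statements below are a sketch with hypothetical names: the tablespace temp_ro, its tempfile path and sizes, and the user report_user are illustrative and do not come from the original text.

```sql
-- On the primary: a locally managed temporary tablespace backed by a tempfile
CREATE TEMPORARY TABLESPACE temp_ro
  TEMPFILE '/u01/oradata/PRMRY/temp_ro01.dbf' SIZE 100M
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;

-- On the primary: direct a reporting user's sorts to that tablespace
ALTER USER report_user TEMPORARY TABLESPACE temp_ro;
```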
To open a standby database in read-only mode, first start the standby database in
MOUNT mode:
SQL> STARTUP NOMOUNT
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
In MOUNT mode, you can open the database in read-only mode with the follow-
ing command:
SQL> ALTER DATABASE OPEN READ ONLY;
Data Guard to the Rescue
Oracle introduced Data Guard with Oracle8i as an option to administer and maintain
your standby databases on Unix platforms. Built to protect against data corruptions and
disasters, Data Guard can be configured to automate failovers. Another feature of Data
Guard is that it can allow for a time delay. This means that it waits a specified amount
of time before applying the archived redo log files to the standby database. In Oracle8i,
Data Guard also can be used in conjunction with a remote mirroring technology to sup-
port a zero data loss environment.
Data Guard will be fully released with Oracle9i. It is supposed to have a GUI that will
plug into Oracle Enterprise Manager (OEM), providing for ease of management. Also,
the zero data loss option will be built in with the Oracle9i version of Data Guard—the
LGWR process on the primary database will write to its online redo log files and to the
archived log files on a remote site simultaneously.
Using Oracle Parallel Server
Oracle Parallel Server (OPS) is a combination of hardware and software configuration
that allows you to use the power of multiple servers as one server. OPS is commonly
installed in a clustered environment. Several nodes (servers) combine to form a cluster.
The idea behind the OPS architecture is to allow transactions to be divided and exe-
cuted simultaneously on several nodes. These nodes access a single database that usu-
ally resides on a centralized storage array.
OPS can be used for various database configurations. Data warehousing and deci-
sion support systems can readily use the robustness of the OPS architecture to split and
run multiple batch jobs concurrently. OPS can also be configured in various OLTP and
hybrid configurations, primarily for the purposes of load balancing and availability.

To a front-end application, an OPS configuration is transparent. This means that
functionally, running on an OPS does not appear any different to the application
than running on a single-instance database. However, since the application can be
distributed and the database accordingly partitioned, there is a large performance
gain in an OPS environment.
NOTE In earlier versions of OPS (versions 8.0.x and earlier), significant performance
gains were experienced in cases where the data-to-node affinity was greater. In
environments where the applications were not OPS-aware (properly distributed), a database
running an OPS configuration would actually perform more slowly than a single-instance
database. This was because of processes called pinging and false pinging. These topics,
along with enhancements in Oracle8i that avoid database pings, are discussed later in this
chapter, in the “Oracle8i OPS Enhancements” section.
The OPS Architecture
OPS allows for multiple instances (running on the same or different nodes) to mount
and access the same database. OPS allows read-consistent data to be read from multi-
ple instances. This is handled using Parallel Cache Management (PCM) locks. Each
PCM lock occupies approximately 100 bytes of memory and is responsible for inter-
instance data block access.
As you might expect, there is a lot that goes on behind the scenes to allow multiple
instances to access a single database. A fairly complex hardware configuration and
software solution are combined to allow for such an architecture. Figure 26.2 illus-
trates the OPS architecture, and the following sections describe its components in
more detail.
FIGURE 26.2 The Oracle Parallel Server architecture
[Figure not reproduced. Labels: Node A and Node B, each with LMON, LMDn, LCKn, BSP, SMON, PMON, DBWn, CKPT, LGWR, ARCH, shared pool, Lock Database, buffer cache, log buffer, IPC communication layer, and Cluster Manager; interconnect(s) carrying inter-node communication and RPC calls; shared disks holding the database files plus redo log files and rollback segments for each node.]
Hardware Cluster Configurations
A clustered environment is built by linking two or more nodes via a high-speed
interconnect. A node is essentially a server that joins with other servers to form a cluster. A
node, like any other server, is made up of a CPU, memory, and storage. Interconnects
are high-speed communication links between nodes. They form a private network for
all inter-node and inter-instance communications. An interconnect can be an Ethernet
or a fiber connection between two nodes.
There are practically an infinite number of configurations for a node, and therefore
cluster configurations vary significantly from vendor to vendor, as well as from
environment to environment.
CPU and Memory Configurations
CPUs and memory configurations can be divided into two main categories:
• In uniform memory access (UMA), the CPUs in a cluster access shared memory
at the same speed. This is also referred to as symmetrical multiprocessing (SMP).
• In nonuniform memory access (NUMA), the CPUs access all parts of the memory,
but their speed varies.
Storage-Access Configurations
Like memory access, disk access can also be uniform and nonuniform. In a uniform
disk access configuration, nodes access a shared centralized storage array via SCSI or
fiber connections. In a nonuniform disk access configuration, each node has a certain
number of storage devices locally attached to it. Disk access for that node is much
faster because it is local; for other nodes, the disk access becomes remote and is han-
dled via special software over the interconnects.
The most common hardware configuration for an OPS environment is multiple
SMP nodes connected to a shared disk farm.
Software Components
Several software components collectively support the OPS environment. These include
the Cluster Manager, the Distributed Lock Manager (DLM or iDLM), the Block Server
Process (BSP), and the Inter-Process Communication (IPC) component, which serve
the following functions:
• The Cluster Manager is a vendor-provided software component that resides on
each node locally and manages the node’s membership in a cluster. The Clus-
ter Manager provides failure-detection services (stopping services and isolat-
ing a node from the cluster if a failure is detected), monitors each node for any
hardware and software changes, and scans all the nodes for any new instances
that start.
• The DLM provides transparency to data access from different instances, as well
as fault tolerance and deadlock detection. The DLM in Oracle8i consists of the
Lock Database function and four background processes (LMON, LMDn, LCKn,
and BSP), which work as follows:
• The Lock Database contains resources for all inter-instance activities, such
as locks and PCM and non-PCM enqueues.
• The LMON background process monitors the entire cluster and manages
locks and enqueue resources. If a failure occurs, LMON triggers the DLM to
rebuild the Lock Database, and SMON initiates instance recovery.
LMON also monitors incoming status messages from other instances.
• The LMD background process manages and processes all incoming PCM
enqueue requests.
• The LCK background process manages and processes all incoming non-PCM
enqueue requests.
• The BSP produces and transmits consistent images of data blocks to the
requesting instances over the interconnect. This process is new in Oracle8i
and is an integral part of the Cache Fusion technology.
• The IPC component provides the protocol and interfaces that Oracle uses to
transmit and receive messages among its instances. The IPC is based on an
asynchronous, queued messaging model.
Oracle8i OPS Enhancements
As mentioned earlier, OPS uses PCM locks to allow read-consistent data to be read
from multiple instances. In earlier Oracle versions, PCM locks were statically specified
in the init.ora file by the GC_FILES_TO_LOCKS initialization parameter. Any
adjustments to this parameter required the database to be taken down and brought
back up again (“bounced”). Since this parameter specifies locks per datafile, it should
be set to the same value in all of the different init.ora files. (See Table 26.1, later in
this chapter, for more details on instance-specific initialization parameters.) The result
was that there would be a set number of instances using a specified number of PCM
locks to cover each and every database block.
For example, suppose that you had an 80GB database with an 8KB block size, and
you allowed a 1:20 ratio for the PCM locks. This would mean that for a total of
10,485,760 database blocks, there were 524,288 PCM locks. Setting these locks would
require an additional 50MB of memory per instance. What would happen when a
request was made to modify one of the database blocks?
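Before following that request, the figures in this example can be checked with a short sketch. The block and lock counts follow directly from the numbers in the text; the per-lock size of 100 bytes is an assumption, chosen only because it reproduces the 50MB figure, and the function name is hypothetical.

```python
# Worked arithmetic for the 80GB / 8KB / 1:20 example.
# ASSUMPTION: bytes_per_lock=100 is illustrative only; the text gives the
# 50MB total but not the size of an individual PCM lock.
GB = 1024 ** 3
KB = 1024
MB = 1024 ** 2


def pcm_lock_estimate(db_bytes, block_bytes, blocks_per_lock, bytes_per_lock=100):
    """Return (database blocks, PCM locks, lock memory in MB per instance)."""
    blocks = db_bytes // block_bytes
    locks = blocks // blocks_per_lock
    memory_mb = locks * bytes_per_lock / MB
    return blocks, locks, memory_mb


blocks, locks, memory_mb = pcm_lock_estimate(80 * GB, 8 * KB, 20)
print(f"{blocks:,} blocks, {locks:,} PCM locks, ~{memory_mb:.0f}MB per instance")
```

Changing the ratio argument shows the trade-off discussed next: fewer blocks per lock means more locks and more memory, while more blocks per lock means each lock covers a wider range of blocks.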
Let’s say that a server process on Instance A on Node 1 needed to modify a block in
a specified file. Because the PCM lock ratio was set to 1:20, the server process would
not only read that particular block, but would also read the next 19 blocks covered by
that PCM lock. In this process, a lock would be placed on the affected data block,
indicating that it was being modified by Instance A.
Now, let’s say that Instance B on Node 2 made a read-only request for the same
block. In this scenario, Instance B would need to wait for Instance A to either write
the block back to the datafile or provide it with a read-consistent image. In either
case, Instance A would need to flush some data back to the datafile in order for
Instance B to be able to read. This is known as a database ping.
What would happen if Instance B made a read-only request for any of the other 19
blocks covered by that particular PCM lock? Again, Instance A would need to flush
blocks back to the datafile before Instance B could read the requested block, even
though Instance A had not modified that block. This process is known as false pinging.
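The difference between a ping and a false ping can be sketched as a mapping from block numbers to locks. This is a simplified model, not Oracle's allocation logic: it assumes each lock covers a contiguous run of 20 blocks, as the example above implies, and the function names are hypothetical.

```python
BLOCKS_PER_LOCK = 20  # the 1:20 ratio from the example


def pcm_lock_for(block_no, blocks_per_lock=BLOCKS_PER_LOCK):
    """Return the id of the lock covering a block (ASSUMES contiguous groups)."""
    return block_no // blocks_per_lock


def classify_request(modified_block, requested_block):
    """Classify another instance's read request against a block being modified."""
    if pcm_lock_for(modified_block) != pcm_lock_for(requested_block):
        return "no ping"      # different lock, so no cross-instance conflict
    if modified_block == requested_block:
        return "ping"         # the requested block really is being modified
    return "false ping"       # same lock, but the requested block is untouched


# Instance A is modifying block 100; Instance B asks for three different blocks.
for requested in (100, 107, 120):
    print(requested, classify_request(100, requested))
```

In this model, blocks 100 and 107 fall under the same lock, so a request for block 107 is a false ping, while block 120 is covered by a different lock.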
As new instances were added, the chances of pinging as well as false pinging would
increase. This could result in serious performance problems. Therefore, in
environments where applications were not OPS-aware (properly distributed), earlier
versions of OPS could cause headaches.
With Oracle 8.0.x, Oracle introduced a new mechanism for dynamically setting
PCM locks called DBA (data block address) locking, or shared or dynamic locking.
With this mechanism, a PCM lock could be dynamically assigned to an individual
database block for a specific period of time. This provided relief from false pinging,
but with the added management costs of assigning and releasing the PCM locks.
Now in Oracle8i, Oracle has introduced a new concept called Cache Fusion. Cache
Fusion allows for an instance to transmit a read-consistent image of a data block to a
requesting instance directly over the high-speed interconnect, thus significantly
reducing database pings. Cache Fusion technology in Oracle8i works well for scenarios
where an instance requests a particular block that is being modified by another
instance (a “write-read” scenario). However, it does not prevent a ping in cases where
an instance requests to modify a block that is being read by another instance (a
“read-write” scenario) or where an instance requests to modify a data block that is
currently being modified by another instance (a “write-write” scenario).
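The three scenarios can be summarized as a small decision function. This is only a restatement of the paragraph above in code, with a hypothetical function name; the read-read case is not discussed in the text and is treated here, as an assumption, as conflict-free.

```python
def oracle8i_block_transfer(holder_activity, requester_activity):
    """Say how a cross-instance block request is resolved in the Oracle8i model.

    holder_activity:    what the instance holding the block is doing ("read"/"write")
    requester_activity: what the requesting instance wants to do ("read"/"write")
    """
    if holder_activity == "write" and requester_activity == "read":
        # Write-read: a read-consistent image is shipped over the interconnect.
        return "cache fusion"
    if requester_activity == "write":
        # Read-write and write-write: Cache Fusion does not prevent the ping.
        return "ping"
    # ASSUMPTION: two readers do not conflict; the text does not cover this case.
    return "no conflict"


for holder, requester in [("write", "read"), ("read", "write"),
                          ("write", "write"), ("read", "read")]:
    print(f"{holder}-{requester}: {oracle8i_block_transfer(holder, requester)}")
```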
OPS Limitations
OPS is a robust, highly scalable, and highly available configuration. However, it may
not be ideal for every environment. There are certain factors that need to be
considered before implementing OPS. One of the biggest drawbacks of OPS is that the
applications need to be built with OPS in mind. This means that OPS may not be a
viable solution for certain third-party, out-of-the-box solutions.
OPS requires all of the database files to be created as raw devices. When creating a
database on raw devices, you need to spend a lot more time planning the file sizes as
well as their locations. Certain Cluster Managers also need to be shut down whenever
adding new raw devices. This means that you need to either create all of the files for
the entire lifetime of the database prior to its creation or allow for some amount of
downtime. In addition to this, backup and recovery processes also require additional
overhead.
Creating the OPS Database

Creating an OPS environment includes installing and configuring several hardware
and software components. The following sections provide details on creating an OPS
database, preparing multiple initialization files, and then starting instances and
mounting the database in parallel mode.
NOTE We assume that you have already configured the required underlying software
and hardware components. Follow the manufacturers’ recommendations to configure
your hardware and OPS environment infrastructure.
Preparing for Installation
The preparatory steps include installing and configuring the following components
on each participating node in the cluster:
Operating system-specific software: Load the specific operating system
on each participating node in the cluster. Also ensure that all vendor-recommended
patches have been installed. It is always good practice to check the hardware
compatibility list for every software component to be installed. Any conflicts
should be resolved prior to proceeding.
Operating system-dependent layer: Install software components such
as the Cluster Manager, the IPC component, and shared disk subsystem
management software. Since each of these components is a vendor-supplied component
(from a vendor other than Oracle), follow the manufacturer’s installation and
configuration procedures. It is always good practice to run any vendor-recommended
diagnostics after installation of these components.
Raw devices: One of the limitations of implementing OPS is that all the
datafiles, control files, and redo log files must be created as raw devices. Also,
some operating systems mandate that the Cluster Manager be stopped when
adding new raw devices. This means that you need to create all the raw devices
for the entire lifetime of the database prior to creating the database, or you must
allow for a certain amount of downtime later.
NOTE You must create at least the initial raw devices prior to creating the database.
Also, if you intend to use the Oracle Universal Installer and the Database Configuration
Assistant, you need to create an ASCII file with all the raw device filenames. The
DBCA_RAW_CONFIG environment variable must be set to point to this file.
Oracle software: The Oracle Universal Installer (OUI) is OPS-aware. When
you use it to install the Oracle software, it will present you with the choice of
selecting nodes. But before you invoke the OUI, there is a certain amount of
preconfiguration that is required. Basically, OUI requires that the Oracle user
be able to log in remotely (rlogin) to all the nodes. The .rhosts or the
hosts.equiv file on Unix systems should be set up accordingly. You will also need to
create the OSOPER and OSDBA groups.
Creating the Database Manually
Creating an OPS database is much like creating a single-instance database, although
there are a few differences, as you will learn in this section. You can use the Database
Configuration Assistant to create the OPS database, or you can manually create it. The
following are the basic steps to manually create the OPS database:
1. Create instance-specific initialization files.
2. Create the database.
3. Create additional rollback segments.
4. Start the database in parallel mode.
5. Configure Net8 on each node.
The following sections describe these steps in more detail.
Creating the Initialization Files
You will need an initialization file for every instance that you start. On Unix, you
can use the $ORACLE_HOME/opsm/admin/init.ora file as a starting point. This file
contains some of the OPS-specific initialization parameters. Copy and rename this file
accordingly. Table 26.1 describes the instance-specific initialization parameters, and
Table 26.2 describes the database-specific initialization parameters.
TABLE 26.1: INSTANCE-SPECIFIC INITIALIZATION PARAMETERS
Parameter           Description
IFILE               Identifies the path and name of the include file. In an OPS
                    environment, some initialization parameters are identical
                    for all the instances, so this parameter can be used to
                    specify a common file that contains all database-specific
                    initialization parameters.
INSTANCE_NAME       Identifies the name of the instance. Each instance in an
                    OPS configuration must have a unique name.
INSTANCE_NUMBER     Maps the instance to a free list group of a database object
                    created with the FREELIST GROUPS storage parameter. This
                    value must be the same as the value of the THREAD
                    parameter for that instance.
THREAD              Specifies the number of the redo thread to be used by the
                    instance. Each instance in an OPS configuration must have
                    its own redo thread to write to. Also, the instance will
                    not mount the database if its redo thread has been
                    disabled.
ROLLBACK_SEGMENTS   Sets the rollback segments to be used by an instance. At
                    least two rollback segments per instance are required.
                    When creating the database, you need to create enough
                    rollback segments for every instance. Note that except for
                    the system rollback segment, public rollback segments
                    cannot be shared among instances.
PARALLEL_SERVER     Allows instances to mount the database in shared mode.
                    This should be set to TRUE for all instances.
GC_FILES_TO_LOCKS   Statically allocates and distributes PCM locks per
                    datafile. Because the PCM locks control inter-instance
                    access to blocks within a datafile, the values for
                    GC_FILES_TO_LOCKS must be set identically in all
                    instances.
TABLE 26.2: DATABASE-SPECIFIC INITIALIZATION PARAMETERS
Parameter             Description
DB_NAME               Specifies the name of the database.
DB_DOMAIN             Specifies the database domain. It is always good practice
                      to set the database domain identical to the network
                      domain.
CONTROL_FILES         Specifies the names of the raw devices to be used as
                      control files.
BACKGROUND_DUMP_DEST  Specifies the location where the background processes
                      write their trace files.
USER_DUMP_DEST        Specifies the location where user processes write their
                      trace files.
SERVICE_NAMES         Specifies the database service names. By default, the
                      service name is set to the global name of the database.
                      It is possible to have multiple service names in an OPS
                      configuration. To implement this, set the service_names
                      initialization parameter in the instance-specific
                      init.ora file.
The trick to creating an instance-specific init.ora file is that most of the
instance-specific parameters will have different values, whereas all the database-related
parameters will be set the same across all instances.
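As a rough illustration of that split, the sketch below generates one small init.ora per instance: each file carries its own instance-specific values and pulls the shared, database-specific parameters from a common include file through IFILE. The parameter names come from Table 26.1; the instance names, rollback segment names, and file path are hypothetical, and this is not a complete parameter file.

```python
def instance_init_ora(number, common_file, rollback_segments):
    """Build the text of a minimal instance-specific init.ora (illustrative only)."""
    lines = [
        f"ifile = {common_file}",            # shared database-specific parameters
        f"instance_name = TEST{number}",     # hypothetical naming scheme
        f"instance_number = {number}",
        f"thread = {number}",                # kept equal to instance_number
        f"rollback_segments = ({', '.join(rollback_segments)})",
        "parallel_server = true",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical path to the common include file used by every instance.
COMMON_FILE = "/u01/app/oracle/admin/TEST/pfile/initcommon.ora"

files = {
    n: instance_init_ora(n, COMMON_FILE, [f"rbs{n}_1", f"rbs{n}_2"])
    for n in (1, 2)
}

for n, text in files.items():
    print(f"# initTEST{n}.ora")
    print(text)
```

Keeping the database-specific parameters in a single include file means that a change such as a new control file name is made once rather than in every instance file.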
Using the CREATE DATABASE Command to Create the Database
Certain CREATE DATABASE options play an important role in how your OPS environ-
ment will be configured:
MAXDATAFILES: A database that is going to be accessed by multiple instances
generally tends to have a larger number of datafiles than a single-instance
database. This is because every instance requires its own redo thread as well as
rollback segments. Since any change to the MAXDATAFILES parameter requires
re-creating the control file, it is a good idea to initially set this to a large value.
MAXINSTANCES: This parameter sets the upper limit on the number of
instances that can access the database concurrently. It is always recommended
that you set the MAXINSTANCES value higher than the maximum number of
instances that you ever plan to run concurrently.
MAXLOGHISTORY: This parameter sets the upper limit on the number of
archived redo log files that can be recorded in the control file. This number should
be set much higher than the default value (which varies by operating system).
This value directly impacts the ability to automatically recover an OPS node.
MAXLOGFILES: This parameter sets the upper limit on the number of log
groups that can be created for the database. Since each instance that you start
requires its own redo thread, it is wise to set MAXLOGFILES to a high value.
Listing 26.3 shows a sample script that creates a database called TEST to be opened
by multiple instances.
Listing 26.3: Creating an OPS Database
CREATE DATABASE "TEST"
    MAXDATAFILES 1024
    MAXINSTANCES 10
    MAXLOGFILES 15
    MAXLOGMEMBERS 5
    MAXLOGHISTORY 1000
    CHARACTER SET UTF8
    CONTROLFILE REUSE
    -- Two redo log groups, each with three members on separate raw devices
    LOGFILE
        GROUP 1 ('/dev/vx/rdsk1/TEST/redo01_log01.dbf',
                 '/dev/vx/rdsk2/TEST/redo01_log02.dbf',
                 '/dev/vx/rdsk3/TEST/redo01_log03.dbf') SIZE 10M REUSE,
        GROUP 2 ('/dev/vx/rdsk1/TEST/redo02_log01.dbf',
                 '/dev/vx/rdsk2/TEST/redo02_log02.dbf',
                 '/dev/vx/rdsk3/TEST/redo02_log03.dbf') SIZE 10M REUSE
    -- SYSTEM tablespace datafile
    DATAFILE
        '/dev/vx/rdsk/oradata/TEST/system01.dbf' SIZE 400M REUSE;
The script in Listing 26.3 is no different from a regular CREATE DATABASE script,
other than the fact that it creates two redo log groups. Also, notice the high values
of the MAXDATAFILES, MAXINSTANCES, MAXLOGFILES, and MAXLOGHISTORY
parameters.
Once you have created the database and all of the required tablespaces, run the
$ORACLE_HOME/rdbms/admin/catalog.sql and
$ORACLE_HOME/rdbms/admin/catproc.sql scripts. Also run the
$ORACLE_HOME/rdbms/admin/catparr.sql script.
Creating Additional Rollback Segments
You will need to create additional private or public rollback segments for each of your
instances. Use the CREATE ROLLBACK SEGMENT or the CREATE PUBLIC ROLLBACK
SEGMENT statement to create additional rollback segments.
Private rollback segments are set in the instance-specific initialization files and can
be used only by that specific instance. They are brought online when the instance is
started. On the other hand, public rollback segments can be created by any instance
and used by any other instance. Once an instance acquires a public rollback segment,
it continues to use it until the instance is shut down. After the instance releases the
public rollback segment, it can be acquired and used by any other instance.
Starting the Database in Parallel Mode
To start the database in parallel mode, use either of the following commands:
SQL> STARTUP SHARED
SQL> STARTUP PARALLEL
You can also set the RETRY option, specifying the number of retries, as follows:
SQL> STARTUP OPEN TEST RETRY 10
With the RETRY option set, Oracle will retry mounting the database every 5
seconds until either the database is opened or the retry limit has been reached. In this
example, Oracle will try up to ten times to mount and open the TEST database.
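The retry behavior described above amounts to a bounded loop with a fixed pause between attempts. The sketch below models only that control flow; it does not call Oracle, and the try_open callable, the function name, and the injectable sleep are hypothetical stand-ins.

```python
import time


def startup_with_retry(try_open, retries, interval_seconds=5, sleep=time.sleep):
    """Call try_open() up to `retries` times, pausing between failed attempts.

    Returns the attempt number that succeeded, or None if the limit was reached.
    """
    for attempt in range(1, retries + 1):
        if try_open():
            return attempt
        if attempt < retries:
            sleep(interval_seconds)  # wait before the next attempt
    return None


# Simulated run: the database opens on the third attempt. A recording function
# replaces time.sleep so that the example finishes immediately.
outcomes = iter([False, False, True])
pauses = []
result = startup_with_retry(lambda: next(outcomes), retries=10, sleep=pauses.append)
print(result, pauses)
```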
Configuring Net8
When you use OPS, you can automatically fail over applications to other open instances.
Listing 26.4 illustrates how to configure the tnsnames.ora file so that the same alias
will allow you to connect to the database via an alternate instance, in the event that
the primary instance is down.
Listing 26.4: Configuring Automatic Failover
test = (description=
    (address_list=
      (address=
        (protocol=tcp)(host=node1)(port=1521))
      (address=
        (protocol=tcp)(host=node2)(port=1521))
    )
    (connect_data=(sid=test))
  )
Starting with Oracle8i, Oracle introduced the Transparent Application Failover
(TAF) component. You can use the TAF component to reconnect an established
connection through a chosen backup instance in cases of failure. Listing 26.5 shows
how to use the TAF component.
Listing 26.5: Configuring the TAF Component
test = (description=
    (address=
      (protocol=tcp)(host=node1)(port=1521)
    )
    (address=
      (protocol=tcp)(host=node2)(port=1521)
    )
    (connect_data=
      (service_name=test)
      (failover_mode=(type=select)(method=basic))
    )
  )
You can also use the LOAD_BALANCE option to perform a random
selection of the address. If the chosen address does not respond, Oracle automatically
transfers the connection request to the next available address. Listing 26.6 shows how
to use the LOAD_BALANCE option for this function.
Listing 26.6: Configuring Load Balancing
test = (description=
    (load_balance=on)
    (address=
      (protocol=tcp)(host=node1)(port=1521)
    )
    (address=
      (protocol=tcp)(host=node2)(port=1521)
    )
    (connect_data=
      (service_name=test)
    )
  )
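The address-selection behavior behind Listings 26.4 and 26.6 can be sketched as follows. This is a simplified model of the idea, not Net8's implementation: without load balancing the addresses are tried in the order listed, with load balancing the order is randomized first, and in both cases an address that does not respond is skipped. The function name and the is_up callback are hypothetical.

```python
import random


def pick_address(addresses, is_up, load_balance=False, rng=random):
    """Return the first responding address, optionally in randomized order."""
    order = list(addresses)
    if load_balance:
        rng.shuffle(order)          # random selection among the listed addresses
    for address in order:
        if is_up(address):          # stand-in for "the listener responded"
            return address
    raise ConnectionError("no address in the list responded")


addresses = [("node1", 1521), ("node2", 1521)]
up = {("node2", 1521)}              # simulate node1 being down

# Connect-time failover: node1 is tried first, fails, and node2 is used.
print(pick_address(addresses, lambda a: a in up))

# Load balancing: with both nodes up, either address may be chosen.
print(pick_address(addresses, lambda a: True, load_balance=True))
```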
Using Oracle Fail Safe

Oracle Fail Safe is an add-on option for Oracle databases and is used to deploy highly
available solutions on Windows NT and Windows 2000 clusters. It consists of the Fail
Safe Server and the Fail Safe Manager. The server component works in conjunction
with the Microsoft Cluster Server (MSCS) software and other resource libraries to
provide fast, automated failovers.
The Fail Safe Manager is a GUI component used for configuration, maintenance,
and load balancing within the cluster. A command-line utility, FSCMD, is also
available for performing configuration and maintenance tasks via scripts.
NOTE Check the Hardware Compatibility List (HCL) for Microsoft Cluster Server before
beginning the installation and configuration of the Windows cluster. Fail Safe 3.1 also
supports the Windows 2000 Datacenter configurations.
Cluster configuration in a Fail Safe environment is similar to cluster configuration
in an OPS environment. The Fail Safe Server software, along with the Oracle database
software, is installed on each node in the cluster. The datafiles reside on the shared
disk subsystem. The shared disk subsystem is attached to each node via a SCSI or fiber
connection.
Even though the cluster configuration of Fail Safe is similar to that of OPS, the
logical configuration can vary. Two configurations are common:
Active/passive: In an active/passive configuration, the primary node picks
up the entire workload, while the secondary node waits in standby mode. As
soon as a failure is detected, applications fail over to the secondary node, which
then becomes the primary node. Failovers in this configuration are much faster
than in other high-availability solutions. Figure 26.3 illustrates this configuration.
FIGURE 26.3: An active/passive Fail Safe configuration
Active/active: In an active/active configuration, applications access both the
primary and the secondary nodes for different tasks. Of course, there is the
trade-off of resource availability and usability versus performance and load balancing.
Also, since both nodes are active, failovers are not as responsive as they are in
active/passive mode. Typically, most of the workload is still picked up by the
primary node, and only critical pieces of applications are allowed to fail over to the
secondary node. A good example of the active/active configuration is where the
primary node is the primary database server and the secondary node also serves as
a web server. Figure 26.4 illustrates this configuration.
[Figure 26.3 labels: The primary node and the standby node each run Windows NT or
2000, the MSCS Cluster Manager, and Oracle Fail Safe Server; the nodes are linked by
IPC and attached to shared disks that hold the database files. Client applications
access the database through the primary node until a failover occurs. The standby
node is in passive mode until a failover is detected; after failover detection, the
standby node becomes primary, and the client applications access the database
through this node.]