DG 110wp
IMPLEMENTING ORACLE 10G DATA GUARD FOR HIGHER AVAILABILITY
Daniel T. Liu, Senior DBA, First American Real Estate Solutions
INTRODUCTION
One of the biggest responsibilities of a database administrator is to provide high availability and reduce unplanned downtime for a database. This has become a major challenge as database sizes have increased dramatically over the years and critical business information systems require 24x7 uptime. When a terabyte database is corrupted during an unplanned outage, it may take hours or even days to restore it. To minimize downtime and avoid data loss, we need a standby database that can take over the role of the primary database in a timely fashion. Oracle 10g Data Guard technology meets this challenge.
Oracle version 7.3 was the first release to support a standby database; however, the process of transferring redo logs was manual, and the standby database had no other use until it took over the role of the primary database. Oracle8i introduced automatic shipping and application of redo log files from the primary site to the standby site. It also allowed the standby database to be opened read-only while the recovery process is stopped. Oracle9i Release 1 introduced the concept of protection modes, which prevent the primary and standby databases from diverging. It also introduced Data Guard broker, an interface for managing the Data Guard environment. Oracle9i Release 2 introduced the logical standby database. Oracle 10g introduces many enhancements: Real Time Apply, Integration with Flashback Database, Zero Downtime Instantiation, Rolling Upgrades, Support for New Data Types, and RAC Support.
This paper provides an overview of Oracle 10g Data Guard technology. It offers an introduction to the basic concepts and architectures of Data Guard. It discusses the selection of a data protection mode, the steps to set up a Data Guard environment, and the steps to perform failover and switchover operations. It also provides tips for implementing Data Guard.
A logical standby database is logically identical to the primary database. It is updated using SQL statements.
LOG TRANSPORT SERVICES: Log transport services control the automated transfer of archived redo from the primary database to one or more standby sites.
NETWORK CONFIGURATION: The primary database is connected to one or more remote standby databases via Oracle Net.
LOG APPLY SERVICES: Log apply services apply the archived redo logs to the standby database.
DATA GUARD BROKER: Data Guard Broker is the management and monitoring component with which you configure, control, and monitor a fault-tolerant system consisting of a primary database protected by one or more standby databases.
[Figure: Data Guard Broker diagram. Labels: CLI, GUI, Primary Database, Standby Database, Oracle Net]
DATA GUARD BROKER GUI INTERFACE (DATA GUARD MANAGER)
Data Guard Manager is a GUI version of the Data Guard broker interface that allows you to automate many of the tasks involved in configuring and monitoring a Data Guard environment.
DATA GUARD BROKER COMMAND-LINE INTERFACE (CLI)
The CLI is an alternative to Data Guard Manager. It is useful if you want to use the broker from batch programs or scripts. You can perform most of the activities required to manage and monitor the Data Guard environment using the CLI. The following example lists the available commands:
$ dgmgrl
DGMGRL for Solaris: Version 9.2.0.1.0 - Production.
(c) Copyright 2002 Oracle Corporation. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> help
The following commands are available:
quit
exit
show          See "help show" for syntax
enable        See "help enable" for syntax
disable       See "help disable" for syntax
help          [<command>]
connect       <user>/<password> [@<connect>]
alter         See "help alter" for syntax
create        See "help create" for syntax
remove        See "help remove" for syntax
switchover    See "help switchover" for syntax
failover      See "help failover" for syntax
startup       See "help startup" for syntax
shutdown      See "help shutdown" for syntax
DGMGRL>
Note: The use of an SPFILE is required with Oracle9i Release 2 and Oracle 10g when using a Data Guard Broker Configuration.
PROCESS ARCHITECTURE
PHYSICAL STANDBY PROCESSES ARCHITECTURE
The log transport services and log apply services use the following processes to ship and apply redo logs to the physical standby database:
On the primary database site, the log writer process (LGWR) collects transactions from the log buffer and writes them to the online redo logs. The archiver process (ARCH) creates a copy of the online redo logs and writes it to the local archive destination. Depending on the configuration, the archiver process or the log writer process can also transmit redo logs to the standby database. When using the log writer process, you can specify synchronous or asynchronous network transmission of redo logs to remote destinations. Data Guard achieves synchronous network I/O using the LGWR process and asynchronous network I/O using the LGWR network server process (LNS). These network server processes are deployed by the LOG_ARCHIVE_DEST_n initialization parameter.
On the standby database site, the remote file server process (RFS) receives archived redo logs from the primary database. The primary site launches the RFS process during the first log transfer. The redo log information received by the RFS process can be stored as either standby redo logs or archived redo logs. Data Guard introduces the concept of standby redo logs (a separate pool of log file groups). Standby redo logs must be archived by the ARCH process to the standby archive destination before the managed recovery process (MRP) applies the redo log information to the standby database.
The fetch archive log (FAL) client is the MRP process. The FAL server is a foreground process that runs on the primary database and services the fetch archive log requests coming from the FAL client. A separate FAL server is created for each incoming FAL client.
When using Data Guard broker (dg_broker_start = true), the monitor agent process named Data Guard Broker Monitor (DMON) runs on every site (primary and standby) and maintains two-way communication.
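[Figure: Physical standby process architecture. Labels: Primary Database, DMON, SYNC, LNS, ASYNC, FAL Server]
[Figure: Logical standby process architecture. Labels: Primary Database, DMON, SYNC, LNS, ASYNC, RFS, PX, PX Applying Group, Oracle Net, LSP0]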
Note: Logical Standby database is an Oracle9i Release 2 and Oracle 10g feature. In 9.2, the LGWR SYNC actually does use the LNS as well. Only SYNC=NOPARALLEL goes directly from the LGWR. The default SYNC mode is SYNC=PARALLEL.
Note: Oracle recommends Standby Redo Logs on all of the top three modes.
ORACLE9I RELEASE 2 AND ORACLE 10G PROVIDE THREE DATA PROTECTION MODES:
Maximum Protection: It offers the highest level of data availability for the primary database. Redo records are synchronously transmitted from the primary database to the standby database using the LGWR process. A transaction is not committed on the primary database until it has been confirmed that the transaction data is available on at least one standby database. This mode is usually configured with at least two standby databases. If all standby databases become unavailable, the primary instance may shut down. This ensures that no data is lost when the primary database loses contact with all the standby databases. Standby online redo logs are required in this mode. Therefore, a logical standby database cannot participate in a maximum protection configuration. This mode is similar to 9iR1's guaranteed mode.
Maximum Availability: It offers the next highest level of data availability for the primary database. Redo records are synchronously transmitted from the primary database to the standby database using the LGWR process. The transaction is not complete on the primary database until it has been confirmed that the transaction data is available on the standby database. If the standby database becomes unavailable, the primary database is not shut down. Instead, the protection mode is temporarily switched to maximum performance mode until the fault has been corrected and the standby database has re-synchronized with the primary database. This protection mode supports both physical and logical standby databases, and is only available in Oracle9i Release 2 and Oracle 10g Data Guard.
Maximum Performance: It is the default protection mode. It offers slightly less primary database protection than maximum availability mode but with higher performance. Redo logs are asynchronously shipped from the primary database to the standby database using either the LGWR or the ARCH process. When operating in this mode, the primary database continues its transaction processing without regard to data availability on any standby database, so there is little or no effect on performance. This protection mode is similar to the combination of 9iR1's Instance, Rapid, and Delay modes. It supports both physical and logical standby databases.
Mode                  Redo Log Reception Option        Supported on
Maximum Protection    Standby redo logs are required   Physical standby databases
Maximum Availability  Standby redo logs                Physical and logical standby databases
Maximum Performance   Standby redo logs                Physical and logical standby databases
Maximum Performance transport settings: LGWR or ARCH; ASYNC if LGWR; NOAFFIRM
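The relationship between a protection mode and the redo transport attributes of a remote destination can be sketched as a small shell helper. This is an illustration only: the function names dest_attributes and dest_line are hypothetical, the "lgwr sync affirm" pairing for the two synchronous modes is taken from the general Oracle documentation rather than from this paper, and the prod_02 service name is reused from the sample configuration.

```shell
#!/bin/sh
# Hypothetical helper (not from the paper): print the redo transport
# attributes commonly paired with each data protection mode.
dest_attributes() {
    case "$1" in
        protection|availability)
            # Both modes ship redo synchronously through LGWR.
            echo "lgwr sync affirm" ;;
        performance)
            # Default mode; same attributes as the sample init.ora.
            echo "lgwr async noaffirm" ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}

# Build a log_archive_dest_2 line for the prod_02 standby service.
dest_line() {
    attrs=$(dest_attributes "$1") || return 1
    echo "log_archive_dest_2 = 'service=prod_02 $attrs'"
}

# Print the setting for the default (maximum performance) mode.
dest_line performance
```

The helper only prints text; the resulting line would still have to be reviewed and placed in the primary initialization file by hand.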
Primary init.ora file:
db_name = prod
#drs_start = false # for 9.0.1
dg_broker_start = false # for 9.2.0
#fal_server = prod_02
#fal_client = prod_01
log_archive_dest_1 = 'location=/u02/arch/prod/ mandatory'
log_archive_format = prod_%s.arc
log_archive_start = true
standby_archive_dest = /u02/arch/prod
log_archive_dest_2 = 'service=prod_02 optional reopen=60 lgwr async noaffirm'
log_archive_dest_state_2 = enable
Primary listener.ora file:
listener_prod_01 =
  (address_list =
    (address = (protocol = tcp)(host = server_01)(port = 1522))
  )
sid_list_listener_prod_01 =
  (sid_list =
    (sid_desc =
      (oracle_home = /u01/app/oracle/product/9.2)
      (sid_name = prod_01)
    )
  )
Standby init.ora file:
db_name = prod
#drs_start = false
dg_broker_start = false
fal_server = prod_01
fal_client = prod_02
log_archive_dest_1 = 'location=/u02/arch/prod/ mandatory'
log_archive_format = prod_%s.arc
log_archive_start = true
standby_archive_dest = /u02/arch/prod
#log_archive_dest_2 = 'service=prod_01 optional reopen=60 lgwr async noaffirm'
#log_archive_dest_state_2 = enable
Standby listener.ora file:
listener_prod_02 =
  (address_list =
    (address = (protocol = tcp)(host = server_02)(port = 1522))
  )
sid_list_listener_prod_02 =
  (sid_list =
    (sid_desc =
      (oracle_home = /u01/app/oracle/product/9.2)
      (sid_name = prod_02)
    )
  )
Primary tnsnames.ora file:
prod =
  (description =
    (address = (protocol = tcp)(host = server_01)(port = 1522))
    (connect_data = (sid = prod_01)))
prod_01 =
  (description =
    (address = (protocol = tcp)(host = server_01)(port = 1522))
    (connect_data = (sid = prod_01)))
prod_02 =
  (description =
    (address = (protocol = tcp)(host = server_02)(port = 1522))
    (connect_data = (sid = prod_02)))
Standby tnsnames.ora file:
prod =
  (description =
    (address = (protocol = tcp)(host = server_01)(port = 1522))
    (connect_data = (sid = prod_01)))
prod_01 =
  (description =
    (address = (protocol = tcp)(host = server_01)(port = 1522))
    (connect_data = (sid = prod_01)))
prod_02 =
  (description =
    (address = (protocol = tcp)(host = server_02)(port = 1522))
    (connect_data = (sid = prod_02)))
Set up the init.ora file for both the primary and standby databases. Set up the listener.ora file for both the primary and standby databases. Set up the tnsnames.ora file for both the primary and standby sites.
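Because listener.ora and tnsnames.ora entries are built from nested parentheses, a quick sanity check before restarting the listener can catch a dropped bracket. The following is a minimal sketch, not part of the original procedure: parens_balanced is a hypothetical helper, and it only compares the counts of opening and closing parentheses without verifying nesting order or Oracle Net syntax.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file contains as many "("
# characters as ")" characters. Nesting order is not checked.
parens_balanced() {
    opens=$(tr -cd '(' < "$1" | wc -c)
    closes=$(tr -cd ')' < "$1" | wc -c)
    [ "$opens" -eq "$closes" ]
}

# Demonstration on a temporary file holding one sample entry.
entry=$(mktemp)
echo 'prod_02 = (description = (address = (protocol = tcp)(host = server_02)(port = 1522)) (connect_data = (sid = prod_02)))' > "$entry"
if parens_balanced "$entry"; then
    echo "balanced"
else
    echo "unbalanced"
fi
rm -f "$entry"
```

In practice the function would be pointed at the real files under the TNS_ADMIN directory; a successful connection test through Oracle Net remains the real proof that the entries work.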
STEP 2: BACKUP
Shut down the primary database. Back up the primary database datafiles.
$ cp /u02/oradata/prod/* /u03/backup/prod/
STEP 4: TRANSFER THE DATAFILES AND CONTROL FILE TO THE STANDBY SITE
Transfer the datafiles.
$ rcp /u03/backup/prod/* server_02:/u02/oradata/prod
STEP 8: MONITOR THE LOG TRANSPORT SERVICES AND LOG APPLY SERVICES
Issue a few log switches on the primary database.
SQL> alter system switch logfile;
Confirm that the log files have been received in the standby archive destination. Check the standby alert log file to see whether the new logs have been applied to the standby database.
Media Recovery Log /u02/arch/prod/prod_1482.arc
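The alert log check above can be scripted. The sketch below assumes that applied logs appear in the standby alert log as lines beginning with "Media Recovery Log", as in the sample line, and that log_archive_format = prod_%s.arc as in the sample init.ora. The function names and the sample file contents are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the last archived log that
# media recovery reported in the given alert log.
last_applied_log() {
    # Matching lines look like:
    #   Media Recovery Log /u02/arch/prod/prod_1482.arc
    grep '^Media Recovery Log ' "$1" | tail -1 | awk '{ print $4 }'
}

# Extract the sequence number from a file named prod_<seq>.arc.
log_sequence() {
    basename "$1" .arc | sed 's/^prod_//'
}

# Demonstration on made-up sample contents, not a real alert log.
sample=$(mktemp)
printf '%s\n' \
    'Media Recovery Log /u02/arch/prod/prod_1481.arc' \
    'Media Recovery Log /u02/arch/prod/prod_1482.arc' \
    'Media Recovery Waiting for thread 1 seq# 1483' > "$sample"
last=$(last_applied_log "$sample")
echo "last applied log: $last (sequence $(log_sequence "$last"))"
rm -f "$sample"
```

Comparing the printed sequence with the latest log switch on the primary gives a rough view of how far the standby lags behind.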
FAILOVER STEPS
Failover is only performed as a result of an unplanned outage of the primary database. During a failover, the standby database (prod_02) becomes the new primary database. It is possible to have data loss. In 9.0.1, since you do not have Standby Redo Log files, you issue the following command on the standby site to activate the new primary database:
SQL> alter database activate standby database;
The ACTIVATE STANDBY DATABASE clause automatically creates online redo logs. It also performs a resetlogs operation. New logs generated from the new primary database (prod_02) cannot be applied to the old primary database (prod_01). In 9.2.0, you can fail over gracefully even without standby redo log files. Issue the following command on the standby site to fail over to a new primary database:
SQL> alter database recover managed standby database skip standby logfiles;
This will apply all available redo and make the standby available to become a Primary. Complete the operation by switching the standby over to the primary role with the following command:
SQL> alter database commit to switchover to primary;
The old primary (prod_01) has to be discarded and cannot be used as the new standby database. You need to create a new standby database by backing up the new primary and restoring it on host server_01. While the new standby database is being created, the new primary runs without standby protection. After the failover operation, you need to modify the TNS entry for prod to point to the new instance and host name (see next section: Switchover Step 7).
SWITCHOVER STEPS
Unlike failover, a switchover is a planned operation. All the archive logs required to bring the standby to the primary's point in time need to be available. The primary database's online redo logs also must be available and intact. During a switchover operation, the primary and standby databases switch roles. The old standby database (prod_02) becomes the new primary, and the old primary (prod_01) becomes the new standby database. The following are the steps for a switchover operation:
STEP 2: SHUTDOWN THE PRIMARY DATABASE AND BRING UP AS THE NEW STANDBY DATABASE
Shut down the primary database normally.
SQL> shutdown normal;
Modify the former primary database's initialization file. Add the following two parameters. These two parameters can also be set on the primary database ahead of time for future switchover operations.
fal_server = prod_02
fal_client = prod_01
Remove the parameters log_archive_dest_2 and log_archive_dest_state_2, or just defer the destination if you like.
PHYSICAL STANDBY
TO PRIMARY
STEP 4: SHUTDOWN THE STANDBY DATABASE AND BRING UP AS THE NEW PRIMARY DATABASE
Shut down the standby database.
SQL> shutdown normal;
Modify the former standby database's initialization file.
fal_server = prod_01
fal_client = prod_02
Add the parameters log_archive_dest_2 and log_archive_dest_state_2.
IMPLEMENTATION TIPS
Here are several tips for implementing Data Guard:
TIP #2: STANDBY ONLINE REDO LOGS VS. STANDBY ARCHIVED REDO LOGS
Online redo logs transferred from the primary database are stored as either standby redo logs or archived redo logs. Which redo log reception option should we choose? Here is the comparison chart:
Standby Online Redo Logs: Advantages
    Pre-allocated files
    Can place on raw devices
    Can be duplexed for more protection
    Improve redo data availability
    No Data Loss capable
Standby Archived Redo Logs: Advantages
    No extra ARCH process
    Reduce lag time
TIP #5: DISABLE LOG TRANSPORT SERVICES WHEN STANDBY DATABASE IS DOWN
When a standby database or host is down for maintenance, it is advisable to temporarily disable the log transport services for that site. One behavior observed in Oracle9i R1, especially during periods of heavy transaction activity, is that taking one of the standby databases down for maintenance can temporarily freeze the primary database, even when the data protection mode is set to rapid mode. To avoid this problem, issue the following command on the primary database before bringing down the standby database:
SQL> alter system set log_archive_dest_state_2 = defer;
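For maintenance windows that are scripted, the defer and re-enable statements can be generated by a small wrapper. This is a sketch under stated assumptions: dest_state_sql is a hypothetical function name, destination 2 is the remote standby destination from the sample configuration, and the statement is only printed here rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: print the statement that changes the state of
# a remote archive destination.
#   $1 = destination number (2 in the sample configuration)
#   $2 = defer | enable
dest_state_sql() {
    case "$2" in
        defer|enable) ;;
        *) echo "state must be defer or enable" >&2; return 1 ;;
    esac
    echo "alter system set log_archive_dest_state_$1 = $2;"
}

# Before standby maintenance:
dest_state_sql 2 defer
# After the standby is back up:
dest_state_sql 2 enable
```

On the primary host the printed statement could be piped into SQL*Plus, for example with sqlplus -s "/ as sysdba", assuming operating system authentication is configured for the oracle account.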
TIP #9: ALWAYS MONITOR LOG APPLY SERVICES AND CHECK ALERT.LOG FILE FOR ERRORS
If you are not using Data Guard broker, here is a script to help you monitor your standby database recovery process:
$ cat ckalertlog.sh
#!/bin/ksh
####################################################################
## ckalertlog.sh                                                  ##
####################################################################
# Environment for the Oracle installation being monitored.
export EDITOR=vi
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/9.2.0
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export TNS_ADMIN=/var/opt/oracle
export ORATAB=/var/opt/oracle/oratab
PATH=$PATH:$ORACLE_HOME:$ORACLE_HOME/bin:/usr/ccs/bin:/bin:/usr/bin:/usr/sbin:/sbin:/usr/openwin/bin:/opt/bin:.
export PATH
DBALIST="primary.dba@company.com,another.dba@company.com"
export DBALIST

# Check the alert log of every SID listed in $ORACLE_HOME/sidlist.
for SID in `cat $ORACLE_HOME/sidlist`
do
    cd $ORACLE_BASE/admin/$SID/bdump || continue
    if [ -f alert_${SID}.log ]
    then
        # Rotate the live log, keep a history copy, collect ORA- errors.
        mv alert_${SID}.log alert_work.log
        touch alert_${SID}.log
        cat alert_work.log >> alert_${SID}.hist
        grep ORA- alert_work.log > alert.err
    fi
    # Mail the DBAs only when the error file exists and is not empty.
    if [ -s alert.err ]
    then
        mailx -s "${SID} ORACLE ALERT ERRORS" $DBALIST < alert.err
    fi
    rm -f alert.err
    rm -f alert_work.log
done
ROLLING UPGRADES
Oracle Database 10g supports database software upgrades (from Oracle Database 10g Patchset 1 onwards) in a rolling fashion, with near zero database downtime, by using Data Guard SQL Apply. The steps involve upgrading the logical standby database to the next release, running in a mixed mode to test and validate the upgrade, doing a role reversal by switching over to the upgraded database, and then finally upgrading the old primary database. While running in a mixed mode for testing purpose, the upgrade can be aborted and the software downgraded, without data loss. For additional data protection during these steps, a second standby database may be used.
RAC SUPPORT
It is now possible to use the Data Guard Broker and the Broker's Command Line Interface (DGMGRL), as well as Enterprise Manager, to create and manage Data Guard configurations that contain RAC primary and RAC standby databases. In Oracle9i, such administration is possible only through SQL*Plus. In Data Guard 10g, Data Guard Broker interfaces with Oracle Clusterware so that it has control over critical operations during specific Data Guard state transitions, such as switchovers, failovers, protection mode changes, and state changes.
CONCLUSION
This paper provides an overview of Oracle 10g Data Guard technology. The paper offers an introduction to the basic concepts and architectures of Data Guard. It reviews the different data protection modes. It discusses the following implementation steps: planning for higher availability, creating the standby database environment, setting up the log transport services, managing the log apply services, and administering the Data Guard environment. It also shows the steps to perform switchover and failover operations, along with some implementation tips. By implementing Oracle Data Guard technology, organizations can achieve higher availability and, with the appropriate protection mode, avoid data loss.
REFERENCES
Oracle Data Guard Concepts and Administration, 10g Release 1 (10.1)
Oracle9i Data Guard Concepts and Administration, Release 1 (9.0.1)
Oracle9i Data Guard Concepts and Administration, Release 2 (9.2)
Oracle9i Data Guard Broker, Release 2 (9.2)
Oracle Metalink Support
Top DBA Shell Scripts for Monitoring Database, Daniel T. Liu, DBAZine
I would also like to acknowledge the assistance of Bob Polak of the Allants Groups, Larry Barry, Ann Collins, Archana Sharma and Husam Tomeh of FARES, and Larry Carpenter and Joseph Meeks of Oracle Corporation.
All companies and product names are trademarks or registered trademarks of the respective owners. Please report errors in this article to the author. Neither FARES nor the author warrants that this document is error-free.