This document discusses various topics related to Oracle Data Guard configurations including:
- Choosing the appropriate protection mode based on bandwidth, latency, and data loss tolerance.
- Performance tuning techniques such as enabling SYNC parallelization in 11g and adjusting redo transport parameters.
- Best practices for role transitions like switchovers and failovers, including using flashback and real-time redo apply.
- Parameters for corruption detection and techniques for automatic block repair using standby databases.
Data Guard Deep Dive UKOUG 2012
1. 1 – Configuration Considerations
2 – Performance Tuning
3 – Role Transition Best Practices
4 – Corruption Detection
5 – Integration Issues
Emre Baransel – DBA, Employee ACE- Oracle
Data Guard Deep Dive
2. Data Guard Deep Dive
Choosing the Protection Mode
MODE: Maximum Protection
REDO TRANSPORT: SYNC & LGWR
ACTION WITH NO STANDBY DATABASE CONNECTION: The primary database has to write redo to at least one standby database; otherwise it shuts down.
RISK OF DATA LOSS: Zero data loss is guaranteed.
MODE: Maximum Availability
REDO TRANSPORT: SYNC & LGWR
ACTION WITH NO STANDBY DATABASE CONNECTION: Normally works with SYNC. If the primary database cannot write redo to any of its standby databases, it continues as in ASYNC mode.
RISK OF DATA LOSS: Zero data loss in normal operation, but not guaranteed.
MODE: Maximum Performance
REDO TRANSPORT: ASYNC & (LGWR or ARCH)
ACTION WITH NO STANDBY DATABASE CONNECTION: Never waits for acknowledgment from the standby database.
RISK OF DATA LOSS: Potential for minimal data loss in normal operation.
3. Data Guard Deep Dive
Choosing the Protection Mode
•If there are network bandwidth or latency issues
•use Maximum Performance with LGWR
•ARCH is not recommended: in 11g it has no performance benefit and gives less data protection
•When no data loss is acceptable and a service outage is preferred over any data loss
•make your network bandwidth high enough
•and use Maximum Protection
•If data loss is not strictly intolerable and you have high bandwidth
•use Maximum Availability
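The rules of thumb above can be written as a small decision helper. This is only a sketch of one reading of the bullets: the function name and its three boolean inputs are hypothetical, and it assumes the zero-data-loss rule wins over the network rule, since the slide says to fix the network in that case.

```python
def choose_protection_mode(zero_data_loss_required: bool,
                           outage_preferred_over_data_loss: bool,
                           network_constrained: bool) -> str:
    """Return a Data Guard protection mode following the slide's rules of thumb.

    All three parameters are illustrative inputs, not Oracle settings.
    """
    # Zero data loss is mandatory and an outage is acceptable to guarantee it.
    # The slide's advice is to size the network for this case, so it is
    # checked before the network condition.
    if zero_data_loss_required and outage_preferred_over_data_loss:
        return "Maximum Protection"
    # Limited bandwidth or high latency: avoid waiting on the standby.
    if network_constrained:
        return "Maximum Performance"
    # Good network, and an outage is not preferred over a rare loss of data.
    return "Maximum Availability"


print(choose_protection_mode(True, True, False))    # Maximum Protection
print(choose_protection_mode(False, False, True))   # Maximum Performance
print(choose_protection_mode(False, False, False))  # Maximum Availability
```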
Required bandwidth (Mbps) = ((max redo rate in bytes per second / 0.7) * 8) / 1,000,000
If the maximum redo generation rate is 500 MB per minute, which is 8,738,133 bytes per second, then the required bandwidth is about 100 Mbps.
* This figure covers Data Guard traffic only.
* Latency is also important.
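The arithmetic behind the 100 Mbps figure can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and calling the 0.7 divisor a "utilization" factor is an assumption, since the slide gives the number without naming it.

```python
def required_bandwidth_mbps(max_redo_bytes_per_sec: float,
                            utilization: float = 0.7) -> float:
    """Apply the slide's formula: ((redo rate / 0.7) * 8) / 1,000,000."""
    # Scale up by the 0.7 factor, then convert bytes to bits.
    bits_per_sec = (max_redo_bytes_per_sec / utilization) * 8
    # Convert bits per second to megabits per second.
    return bits_per_sec / 1_000_000


# 500 MB per minute, with 1 MB taken as 1024 * 1024 bytes as on the slide.
redo_bytes_per_sec = 500 * 1024 * 1024 / 60
print(round(redo_bytes_per_sec))                               # 8738133
print(round(required_bandwidth_mbps(redo_bytes_per_sec), 1))   # 99.9
```

The result of about 99.9 Mbps matches the slide's rounded value of 100 Mbps.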
4. Data Guard Deep Dive
SYNC Enhancement in 11g
Before 11g, the primary database first finished writing to the online redo log and only then sent the redo to the standby database. A commit therefore had to wait for two consecutive I/O operations.
[Diagram, before 11g. Primary: Commit, Redo Log Buffer, Online Redo Log. Standby: Standby Redo Log. Acknowledgments (OK) return to the commit.]
5. Data Guard Deep Dive
SYNC Enhancement in 11g
In 11g these two I/O operations run in parallel. The primary database does not wait for the write to the online redo log to finish; it sends the redo data to the standby at the same time.
[Diagram, in 11g. Primary: Commit, Redo Log Buffer, Online Redo Log. Standby: Standby Redo Log. Acknowledgments (OK) return to the commit.]
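The effect on commit time can be illustrated with a deliberately simplified model: two consecutive waits add up, while two parallel waits cost only as much as the slower one. The function names and the millisecond figures below are hypothetical and are not measurements from the deck.

```python
def commit_wait_before_11g(local_write_ms: float, standby_ack_ms: float) -> float:
    """Serial model: local redo write first, then ship redo and wait for the ack."""
    return local_write_ms + standby_ack_ms


def commit_wait_11g(local_write_ms: float, standby_ack_ms: float) -> float:
    """Parallel model: both operations start together; wait for the slower one."""
    return max(local_write_ms, standby_ack_ms)


# Hypothetical timings: 2 ms local write, 5 ms standby round trip.
print(commit_wait_before_11g(2.0, 5.0))  # 7.0
print(commit_wait_11g(2.0, 5.0))         # 5.0
```

Under this model the gain is largest when the local write and the standby round trip take similar time, and it shrinks when one of them dominates.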
6. Data Guard Deep Dive
No More Delay to Decrease RTO
Prefer Real-Time Apply with Flashback Database turned on rather than a DELAY setting. A delayed configuration increases RTO.
Instead of a delayed destination:
LOG_ARCHIVE_DEST_2='SERVICE=STANDBY LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=ORCLSTD DELAY=120'
TURN ON FLASHBACK:
DB_RECOVERY_FILE_DEST='+FRA'
DB_RECOVERY_FILE_DEST_SIZE=500G
DB_FLASHBACK_RETENTION_TARGET=120
ALTER DATABASE FLASHBACK ON;
USE REAL-TIME APPLY:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
7. Data Guard Deep Dive
Using Flashback Database...
You can reinstate the original primary database as a new standby database following a failover
A failed switchover process can be reversed easily
If flashback is not being used on the primary, unwanted changes made on the primary database can be reversed on the standby database and queried from there.
8. Data Guard Deep Dive
Using Real Time Apply...
Prefer Real-Time Apply to avoid ORA-01555 "Snapshot Too Old" errors on Active Data Guard standby databases.
Query fresh data from the standby
RTO is decreased
9. Data Guard Deep Dive
11g Performance Improvements
11g Recovery performance improvements include:
•More parallelism by default
•More efficient asynchronous redo read, parse, and apply
•Fewer synchronization points in the parallel apply algorithm
•The media recovery checkpoint at a redo log boundary no longer blocks the apply of the next log
Source: Active Data Guard 11g Best Practices, Oracle Maximum Availability Architecture White Paper
10. Data Guard Deep Dive
Determining Redo Apply Rate
Method 1:
SQL> select * from v$recovery_progress;
23-SEP-11 Media Recovery Active Apply Rate KB/sec 15564 0
23-SEP-11 Media Recovery Average Apply Rate KB/sec 20890 0
Method 2:
SQL> select APPLY_RATE from V$STANDBY_APPLY_SNAPSHOT;
APPLY_RATE
----------
16305
11. Data Guard Deep Dive
Determining Redo Apply Rate
Method 3:
SQL> SELECT PROCESS, SEQUENCE#, THREAD#, BLOCK#, BLOCKS, TO_CHAR(SYSDATE, 'DD-MON-YYYY HH:MI:SS') TIME FROM V$MANAGED_STANDBY WHERE PROCESS='MRP0';
Sample at second 0:
PROCESS   SEQUENCE#  THREAD#    BLOCK#     BLOCKS     TIME
--------- ---------- ---------- ---------- ---------- --------------------
MRP0      276877     1          147338     4097947    19-APR-2011 12:25:34
Sample at second 5:
PROCESS   SEQUENCE#  THREAD#    BLOCK#     BLOCKS     TIME
--------- ---------- ---------- ---------- ---------- --------------------
MRP0      276877     1          645542     4097947    19-APR-2011 12:25:39
SQL> SELECT lebsz LOG_BLOCK_SIZE FROM x$kccle;
This returns the redo block size in bytes.
Media Recovery Rate (MB/sec):
((BLOCK#_END - BLOCK#_BEG) * LOG_BLOCK_SIZE) / ((TIME_END - TIME_BEG) * 1024 * 1024)
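As a worked example, the formula can be applied to the two MRP0 samples shown on this slide. This is a sketch with one loud assumption: the slide does not show the value returned by the x$kccle query, so a redo block size of 512 bytes is assumed here. The function name is hypothetical, and the calculation only makes sense while both samples report the same SEQUENCE#, as they do above (276877).

```python
def media_recovery_rate_mb_per_sec(block_beg: int, block_end: int,
                                   log_block_size: int, seconds: float) -> float:
    """Redo applied between two samples, in MB per second."""
    if seconds <= 0:
        raise ValueError("the two samples must be taken at different times")
    # Redo blocks applied between the samples, converted to bytes.
    applied_bytes = (block_end - block_beg) * log_block_size
    # Bytes per second, converted to MB per second.
    return applied_bytes / (seconds * 1024 * 1024)


# BLOCK# values from the slide, sampled 5 seconds apart.
# ASSUMPTION: 512-byte redo blocks; use the LOG_BLOCK_SIZE your system reports.
rate = media_recovery_rate_mb_per_sec(147338, 645542, 512, 5)
print(round(rate, 2))  # 48.65
```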
12. Data Guard Deep Dive
Redo Apply Tuning
•By default, recovery parallelism = CPU count - 1. Do not use any other value.
•Keep PARALLEL_EXECUTION_MESSAGE_SIZE >= 8192
•Keep DB_CACHE_SIZE >= the primary's value
•Set DB_BLOCK_CHECKING = FALSE (only if you have to)
•System resources need to be assessed
Query what the MRP process is waiting on:
SQL> select a.sid, b.username, b.osuser, a.event, a.wait_time, a.p1, a.p1text, a.seconds_in_wait from gv$session_wait a, gv$session b where a.sid=b.sid and b.sid=(select SID from v$session where PADDR=(select PADDR from v$bgprocess where NAME='MRP0'));
13. Data Guard Deep Dive
Redo Transport Tuning
1 - Tune the LOG_ARCHIVE_MAX_PROCESSES parameter on the primary.
•Specifies the parallelism of redo transport
•Default value is 2 in 10g, 4 in 11g
•Increase it if there is a high redo generation rate and/or multiple standbys
•May need to be raised as high as 30 in some cases
•Significantly increases the redo transport rate
2 - Consider using Redo Transport Compression:
•In 11.2.0.2 redo transport compression can be always on
•Use it if network bandwidth is insufficient
•and CPU power is available
Also consider:
3 - Configuring TCP Send / Receive Buffer Sizes (RECV_BUF_SIZE / SEND_BUF_SIZE)
4 - Increasing SDU Size
5 - Setting TCP.NODELAY to YES
Source: Redo Transport Services Best Practices, Oracle® Database High Availability Best Practices 11g Release 1
14. Data Guard Deep Dive
Switchover Best Practices
Set the JOB_QUEUE_PROCESSES and AQ_TM_PROCESSES parameters to 0.
Use Real-Time Apply
Reduce LOG_ARCHIVE_MAX_PROCESSES to the minimum.
Properly set archiving destinations on the standby database.
Set LOG_ARCHIVE_TRACE=8191;
Enable Flashback Database or use Guaranteed Restore Points
15. Data Guard Deep Dive
Failover Best Practices
Enable Flashback Database
Use Real-Time Apply
Consider configuring multiple standby databases.
Consider using Fast-Start Failover
Set FastStartFailoverThreshold
Set FastStartFailoverAutoReinstate
16. Data Guard Deep Dive
Corruption Detection Parameters
DB_BLOCK_CHECKSUM (physical corruption): OFF (FALSE), TYPICAL (TRUE), FULL
DB_BLOCK_CHECKING (logical corruption): OFF (FALSE), LOW, MEDIUM, FULL (TRUE)
Source: Best Practices for Corruption Detection, Prevention, and Automatic Repair - in a Data Guard Configuration [ID 1302539.1]
17. Data Guard Deep Dive
Automatic Block Corruption Repair
•11gR2 feature
•ON with Physical Standby & Active Data Guard
•Corruptions are repaired automatically using the remote database.
You can also repair the corruption with the RMAN "RECOVER BLOCK" command. This operation tries to use the standby database first. If you do not want the standby database to be used for corruption repair, you must add the EXCLUDE STANDBY option to the "RECOVER BLOCK" command.
18. Data Guard Deep Dive
“Lost – Write” detection
•11gR1 feature
•A lost write is a serious corruption that originates in the I/O subsystem.
•Physical Standby, Active Data Guard and Real-Time Apply are needed
•DB_LOST_WRITE_PROTECT = "TYPICAL" on both the primary and the standby.
•When a lost write is detected, standby recovery stops
•The way to get rid of this corruption is to fail over to the standby database.
19. Data Guard Deep Dive
RMAN Integration
Integration Requirements and Best Practices
•Only a Physical Standby can be used for interchangeable backups.
•An RMAN Catalog must be used (in a separate location if possible).
•DB_UNIQUE_NAME must be different.
•General RMAN best practices must be preserved.
Beginning with 11g, standby databases can also be used for RMAN's "Block Change Tracking" feature, which records the changed blocks for incremental backups. This requires Active Data Guard. There are important bugs in this feature. Check bugs 9869287, 9068088, 10094823.
20. Data Guard Deep Dive
Integration with Oracle Applications
•Applications developed with Oracle TopLink can be configured as "Active Data Guard aware":
•Write operations are directed to the primary
•All read operations go to the Active Data Guard standby
•An ongoing study:
•Writes will work on the primary and reads on the standby
•Automatic redirection to the primary in case of lag
Configuring Oracle TopLink Applications with Oracle Active Data Guard, Oracle Maximum Availability Architecture White Paper
Configuring Oracle BI EE Server with Oracle Active Data Guard, Oracle Maximum Availability Architecture White Paper
Using Active Data Guard Reporting with Oracle E-Business Suite Release 12.1 and Oracle Database 11g [ID 1070491.1]
•Redirect reports to Active Data Guard
•"fnd_adg_utility.enable_adg_support"
21. Data Guard Deep Dive
Best Practices Papers
http://www.oracle.com/technetwork/database/features/availability/dataguard11g-bestpractices-161724.html