
Data Guard Monitoring


Data Guard Monitoring: Ensuring High Availability and Disaster Recovery


Scripts and Commands with Examples

Asfaw Gedamu

Download this and similar documents from:

https://t.me/paragonacademy

5/2/2024
Table of Contents
Data Guard Monitoring: Ensuring High Availability and Disaster Recovery
Check DB role (PRIMARY/STANDBY)
Monitor standby background process
View Data Guard messages or errors
Last log applied/Received in standby
Get standby redo log information
Monitor lag in standby including RAC
Monitor recovery progress in standby db
Stop/start MRP process in standby

Data Guard Monitoring: Ensuring High Availability and Disaster Recovery

Data Guard Monitoring is a vital practice for ensuring the health and effectiveness of your
Oracle Data Guard configuration. It involves actively tracking various metrics and logs to
proactively identify potential issues and ensure seamless failover in case of primary site outages.
Monitoring focuses on key areas like:

• Standby database status: Verifying the standby database is synchronized with the
primary, including lag monitoring and process health checks.
• Redo log switching and application: Ensuring redo logs are transferred and applied
efficiently on the standby.
• Data Guard alerts and errors: Promptly identifying and addressing any Data Guard-
related alerts or errors that may indicate problems.
• Recovery progress: Monitoring ongoing recovery operations on the standby in case of
failover.
• Resource utilization: Tracking resource usage on both primary and standby to optimize
performance and identify potential bottlenecks.

Major Benefits:

• Enhanced Availability: Early detection and resolution of issues minimize downtime and
ensure faster switchover to the standby in case of failures.
• Improved Disaster Recovery: Proactive monitoring ensures your standby database is
always ready to take over, minimizing data loss and disruption.
• Reduced risk: Identifying and addressing potential problems early on prevents them
from escalating into major outages.
• Optimized performance: Monitoring resource utilization helps ensure optimal
performance of both primary and standby databases.
• Compliance: Some regulations mandate continuous monitoring of disaster recovery
solutions, which Data Guard monitoring helps fulfill.
Overall, Data Guard monitoring is crucial for maintaining a highly available and disaster-
resistant Oracle environment. By proactively tracking key metrics and addressing potential issues
promptly, you can minimize downtime, maximize data protection, and ensure business
continuity.

Now to the steps:

Check DB role (PRIMARY/STANDBY)

Description: Checks if the current database is primary or standby.

Script:

SELECT database_role FROM v$database;

DATABASE_ROLE
---------------
PRIMARY

Script-2:

SELECT DATABASE_ROLE, DB_UNIQUE_NAME INSTANCE, OPEN_MODE,
PROTECTION_MODE, PROTECTION_LEVEL, SWITCHOVER_STATUS FROM
V$DATABASE;
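
On a standby, DATABASE_ROLE reads PHYSICAL STANDBY, LOGICAL STANDBY, or SNAPSHOT STANDBY rather than the bare word STANDBY, so a monitoring wrapper should not compare against the literal 'STANDBY'. The Python sketch below shows one way to make that check. The function name and the way the role string reaches it are assumptions for illustration, not part of the original scripts.

```python
def is_standby(database_role: str) -> bool:
    """Return True when a V$DATABASE.DATABASE_ROLE value denotes a standby."""
    # Normalize case and collapse whitespace before comparing.
    role = " ".join(database_role.upper().split())
    return role.endswith("STANDBY")


# Hypothetical role strings as they might be fetched from V$DATABASE.
print(is_standby("PRIMARY"))
print(is_standby("PHYSICAL STANDBY"))
```

A wrapper script could use this to decide whether the standby-only queries in the following sections should run at all.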

Monitor standby background process

Description: Displays standby background processes and their status.

Script-1:

-- Redo apply (MRP) and remote file server (RFS) sessions, matched on PROGRAM.
-- The exact PROGRAM text is platform dependent; the patterns below assume the
-- process name appears in parentheses, e.g. (MRP0).
SELECT sid, program, status FROM v$session
WHERE program LIKE '%(MRP%' OR program LIKE '%(RFS%';

Script-2:

SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM
V$MANAGED_STANDBY;
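
V$MANAGED_STANDBY returns one row per process, and a health check usually only needs to know whether a managed recovery process (a name beginning with MRP) is present and what its status is. The Python sketch below assumes the rows have already been fetched into dictionaries with "process" and "status" keys; both the row shape and the sample values are hypothetical.

```python
from typing import Dict, List, Optional


def mrp_status(rows: List[Dict[str, str]]) -> Optional[str]:
    """Return the STATUS of the first MRP process, or None if none is running."""
    for row in rows:
        if row["process"].upper().startswith("MRP"):
            return row["status"]
    return None


# Hypothetical rows shaped like the PROCESS and STATUS columns.
sample_rows = [
    {"process": "ARCH", "status": "CONNECTED"},
    {"process": "RFS", "status": "IDLE"},
    {"process": "MRP0", "status": "APPLYING_LOG"},
]
print(mrp_status(sample_rows))
```

A result of None would mean redo apply is not running on the standby.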

View Data Guard messages or errors

Description: Shows Data Guard messages and errors recorded in V$DATAGUARD_STATUS.

Script-1:

SELECT timestamp, severity, error_code, message FROM v$dataguard_status
WHERE severity IN ('Error', 'Fatal') ORDER BY timestamp DESC;

Script-2:

SELECT MESSAGE FROM V$DATAGUARD_STATUS;
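
Besides MESSAGE, V$DATAGUARD_STATUS has a SEVERITY column, so informational rows can be separated from real problems after fetching. The Python sketch below keeps only Error and Fatal rows. It assumes rows arrive as dictionaries with "severity" and "message" keys, and the sample messages are placeholders rather than real Oracle output.

```python
ALERT_SEVERITIES = {"ERROR", "FATAL"}


def alert_messages(rows):
    """Return the MESSAGE text of rows whose SEVERITY is Error or Fatal."""
    return [
        row["message"]
        for row in rows
        if row["severity"].upper() in ALERT_SEVERITIES
    ]


# Placeholder rows; the message text is illustrative only.
sample_rows = [
    {"severity": "Informational", "message": "example informational message"},
    {"severity": "Error", "message": "example error message"},
    {"severity": "Fatal", "message": "example fatal message"},
]
for message in alert_messages(sample_rows):
    print(message)
```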


Last log applied/Received in standby

Description: Shows the last archived redo log applied/received on the standby.

Script-1:

-- COMPLETION_TIME is when archiving of the log completed, not when it was applied.
SELECT completion_time, dest_id, sequence# FROM v$archived_log WHERE applied = 'YES'
ORDER BY sequence# DESC;

COMPLETION_TIME     DEST_ID SEQUENCE#
------------------- ------- ---------
2024-02-06 02:15:20       1      1005

Script-2:

select 'Last Log applied : ' Logs, to_char(next_time,'DD-MON-YY:HH24:MI:SS') Time
from v$archived_log
where sequence# = (select max(sequence#) from v$archived_log where applied='YES')
union
select 'Last Log received : ' Logs, to_char(next_time,'DD-MON-YY:HH24:MI:SS') Time
from v$archived_log
where sequence# = (select max(sequence#) from v$archived_log);

Get standby redo log information

Description: Displays information about standby redo logs.

Script:

SELECT group#, thread#, sequence#, status, archived, first_time FROM v$standby_log;

Script-2:

set lines 100 pages 999
col member format a70
select st.group#
, st.sequence#
, ceil(st.bytes / 1048576) mb
, lf.member
from v$standby_log st
, v$logfile lf
where st.group# = lf.group#
/

Monitor lag in standby including RAC

Description: Shows the transport lag and apply lag between primary and standby.

Script (Single Instance):

-- Run on the standby. VALUE is interval text in +DD HH:MI:SS form.
SELECT name, value, unit, time_computed FROM v$dataguard_stats
WHERE name IN ('transport lag', 'apply lag');

Script (RAC):

SELECT inst_id, name, value, unit, time_computed FROM gv$dataguard_stats
WHERE name IN ('transport lag', 'apply lag') ORDER BY inst_id;
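
V$DATAGUARD_STATS reports 'transport lag' and 'apply lag' in its VALUE column as day-to-second interval text such as +00 00:00:05, which is awkward to compare against a threshold. The Python sketch below converts that text to whole seconds. The function name is hypothetical, and the parser assumes the +DD HH:MI:SS layout with optional fractional seconds, which it discards.

```python
import re

# Sign, days, then HH:MI:SS with optional fractional seconds.
_INTERVAL = re.compile(r"^([+-])(\d+) (\d{2}):(\d{2}):(\d{2})(?:\.\d+)?$")


def lag_to_seconds(value: str) -> int:
    """Convert interval text like '+00 00:00:05' to whole seconds."""
    match = _INTERVAL.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized interval text: {value!r}")
    sign, days, hours, minutes, seconds = match.groups()
    total = (
        int(days) * 86400
        + int(hours) * 3600
        + int(minutes) * 60
        + int(seconds)
    )
    return -total if sign == "-" else total


print(lag_to_seconds("+00 00:00:05"))
```

An alerting script could then raise a warning when the returned number exceeds whatever lag the site considers acceptable.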

Script-2:

-- Applicable for 2 NODE RAC ALSO
column applied_time for a30
set linesize 140
select to_char(sysdate,'mm-dd-yyyy hh24:mi:ss') "Current Time" from dual;
SELECT DB_NAME, APPLIED_TIME, LOG_ARCHIVED-LOG_APPLIED LOG_GAP ,
(case when ((APPLIED_TIME is not null and (LOG_ARCHIVED-LOG_APPLIED) is null) or
(APPLIED_TIME is null and (LOG_ARCHIVED-LOG_APPLIED) is not null) or
((LOG_ARCHIVED-LOG_APPLIED) > 5))
then 'Error! Log Gap is '
else 'OK!'
end) Status
FROM
(
SELECT INSTANCE_NAME DB_NAME
FROM GV$INSTANCE
where INST_ID = 1
),
(
SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES' and
THREAD#=1
),
(
SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES' and THREAD#=1
),
(
SELECT TO_CHAR(MAX(COMPLETION_TIME),'DD-MON/HH24:MI') APPLIED_TIME
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES' and THREAD#=1
)
UNION
SELECT DB_NAME, APPLIED_TIME, LOG_ARCHIVED-LOG_APPLIED LOG_GAP,
(case when ((APPLIED_TIME is not null and (LOG_ARCHIVED-LOG_APPLIED) is null) or
(APPLIED_TIME is null and (LOG_ARCHIVED-LOG_APPLIED) is not null) or
((LOG_ARCHIVED-LOG_APPLIED) > 5))
then 'Error! Log Gap is '
else 'OK!'
end) Status
from (
SELECT INSTANCE_NAME DB_NAME
FROM GV$INSTANCE
where INST_ID = 2
),
(
SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES' and
THREAD#=2
),
(
SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES' and THREAD#=2
),
(
SELECT TO_CHAR(MAX(COMPLETION_TIME),'DD-MON/HH24:MI') APPLIED_TIME
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES' and THREAD#=2
)
/
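
The CASE expression in Script-2 flags a thread when the gap exceeds 5 logs, or when exactly one of the applied time and the gap is missing. The same rule can be written as a small Python function, which makes the NULL handling easier to see. The function and parameter names are hypothetical, None stands in for SQL NULL, and the status strings are copied from the script.

```python
from typing import Optional


def gap_status(
    log_archived: Optional[int],
    log_applied: Optional[int],
    applied_time: Optional[str],
    threshold: int = 5,
) -> str:
    """Mirror the Status column of the log-gap query."""
    # In SQL, subtracting with a NULL operand yields NULL.
    if log_archived is None or log_applied is None:
        gap = None
    else:
        gap = log_archived - log_applied

    one_side_missing = (applied_time is None) != (gap is None)
    gap_too_large = gap is not None and gap > threshold
    if one_side_missing or gap_too_large:
        return "Error! Log Gap is "
    return "OK!"


# Hypothetical sequence numbers for one thread.
print(gap_status(1010, 1008, "06-FEB/02:15"))
print(gap_status(1010, 1000, "06-FEB/02:15"))
```

Note that when both the gap and the applied time are missing, the rule returns OK!, exactly as the SQL does; a stricter check may want to treat that case as an error too.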

Monitor recovery progress in standby db

Description: Shows progress information for ongoing recovery on the standby.

Script:

SELECT start_time, type, item, units, sofar, total FROM v$recovery_progress;

Script-2:

select to_char(START_TIME,'DD-MON-YYYY HH24:MI:SS') "Recovery Start Time",
to_char(item)||' = '||to_char(sofar)||' '||to_char(units) "Progress"
from v$recovery_progress where start_time=(select max(start_time) from v$recovery_progress);
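
V$RECOVERY_PROGRESS also carries a TOTAL column, which is zero for many items but can give a percentage where it is populated. The Python sketch below builds the same "item = sofar units" text as the query above and appends a percentage only when TOTAL is positive. The function name and the sample values are hypothetical.

```python
def progress_line(item: str, sofar: float, units: str, total: float = 0) -> str:
    """Format one V$RECOVERY_PROGRESS row as 'item = sofar units'."""
    line = f"{item} = {sofar} {units}"
    # Only show a percentage when TOTAL is known and positive.
    if total and total > 0:
        line += f" ({100.0 * sofar / total:.1f}% of {total})"
    return line


# Hypothetical values for two progress items.
print(progress_line("Log Files", 3, "Files", 4))
print(progress_line("Active Apply Rate", 1024, "KB/sec"))
```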

Stop/start MRP process in standby

Description: Stops or starts the MRP (Managed Recovery Process) on the standby.

Script (Stop):

-- Cancel MRP (media recovery) on the standby
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Script (Start):

-- Start MRP (media recovery) in the background
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

Script (Start, real-time apply):

-- Apply redo from the standby redo logs as it arrives
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

Note: These scripts are basic examples and might need adjustments based on your specific
environment and Data Guard configuration. Remember to adapt them to your needs and use
them with caution in a production environment.
