RAC Important PDF
Log Directory Structure in Cluster Ready Services
To diagnose any problem, the first thing Oracle Support examines is the installation log files. Anyone
familiar with database administration knows the importance of the dump directories (bdump, udump, and
cdump). Similarly, each component in the CRS stack has its own directories created under the CRS home:
$ORA_CRS_HOME/crs/log: Contains trace files for the CRS resources.
$ORA_CRS_HOME/crs/init: Contains trace files of the CRS daemon during startup. A good place to start with any CRS login problems.
$ORA_CRS_HOME/css/log: The Cluster Synchronization Services (CSS) logs record all actions such as reconfigurations, missed check-ins, connects, and disconnects from the client CSS listener. In some cases, the logger logs messages with the category auth.crit for reboots done by Oracle. These entries can be used to check the exact time the reboot occurred.
$ORA_CRS_HOME/css/init: Contains core dumps from the Oracle Cluster Synchronization Services daemon (OCSSd) and the process ID (PID) of the CSS daemon, whose death is treated as fatal. If CSS has restarted abnormally, the core files are named core.<pid>.
$ORA_CRS_HOME/evm/log: Log files for the Event Manager (EVM) and evmlogger daemons. Not used as often for debugging as the CRS and CSS directories.
$ORA_CRS_HOME/evm/init: PID and lock files for EVM. Core files for EVM should also be written here.
$ORA_CRS_HOME/srvm/log: Log files for the Oracle Cluster Registry (OCR), which contains the details at the Oracle cluster level.
$ORA_CRS_HOME//log: Log files for Oracle Clusterware (known as the cluster alert log), which contain diagnostic messages at the Oracle cluster level. This is available from Oracle Database 10g R2.
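As a rough aid, the directories listed above can be enumerated with a short script. This is only a sketch: the function name crs_log_dirs is made up for illustration, the script assumes the ORA_CRS_HOME environment variable is set, and /u01/crs is a purely hypothetical fallback rather than a documented default.

```python
import os

# Sub-directories of the CRS home named in the list above.
CRS_SUBDIRS = [
    "crs/log", "crs/init",
    "css/log", "css/init",
    "evm/log", "evm/init",
    "srvm/log",
]

def crs_log_dirs(crs_home):
    """Return the full path of each CRS diagnostic directory."""
    base = crs_home.rstrip("/")
    return [f"{base}/{sub}" for sub in CRS_SUBDIRS]

if __name__ == "__main__":
    # /u01/crs is a hypothetical fallback, not a documented default.
    home = os.environ.get("ORA_CRS_HOME", "/u01/crs")
    for path in crs_log_dirs(home):
        state = "present" if os.path.isdir(path) else "missing"
        print(f"{path}: {state}")
```

Running it on a cluster node prints one line per directory, which is a quick way to see which of the locations exist in a given release.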
Services
Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd): Manages cluster resources (a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource configuration information stored in the OCR. This includes start, stop, monitor, and failover operations. This process runs as the root user.
Event Manager daemon (evmd): A background process that publishes the events that crs creates.
Process Monitor Daemon (OPROCD): Monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake-up is beyond the expected time, resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
RAC Background Processes:
1. Lock Monitor Processes (LMON)
2. Lock Monitor Services (LMS)
3. Lock Monitor Daemon Process (LMD)
4. LCKn (Lock Process)
5. DIAG (Diagnostic Daemon)
1. Lock Monitor Processes (LMON)
LMON maintains the GCS memory structures and handles the abnormal termination of processes and instances.
It reconfigures locks and resources when an instance joins or leaves the cluster, and writes trace files
during reconfiguration. It is also responsible for executing dynamic lock remastering every 10 minutes
(10g R2 and later only). LMON manages the global locks and resources.
It monitors all instances in the cluster, primarily for dictionary cache locks, library cache locks, and
deadlocks on deadlock-sensitive enqueues and resources. LMON also provides cluster group services. It is
also called the Global Enqueue Service Monitor.
2. Lock Monitor Services (LMS)
LMS is the most active of the background processes and can consume a significant amount of CPU time
(10g R2 ensures that the LMS process does not suffer CPU starvation). Its primary job is to transport
blocks across the nodes for Cache Fusion requests.
For a consistent-read request, the LMS process rolls back the block, makes a consistent-read image of
the block, and then ships this image across the HSI (High Speed Interconnect) to the requesting process
on the remote node. LMS must also check constantly with the LMD background process (the GES process)
to get the lock requests placed by the LMD process. Each node typically has two or more LMS processes.
GCS_SERVER_PROCESSES: the init.ora parameter that specifies the number of LMS processes.
In 10g R2 the parameter value is set based on the number of CPUs (MIN(CPU_COUNT/2, 2)); on a single-CPU
system only one LMS process is started. Increase the parameter value if global cache activity is very
high. LMS processes are also called the GCS (Global Cache Services) processes.
Internal View: X$KJMSDP
3. Lock Monitor Daemon Process (LMDn)
The LMD process performs global lock deadlock detection and monitors for lock conversion timeouts.
It is sometimes referred to as the GES (Global Enqueue Service) daemon, since its job is to manage
global enqueues and global resource access. LMD also handles remote enqueue requests, that is,
resource requests originating from another instance.
Internal View: X$KJMDDP
4. LCKn (Lock Process)
Manages instance resource requests and cross-instance calls for shared resources. During instance
recovery, it builds a list of invalid lock elements and validates lock elements.
5. DIAG (Diagnostic Daemon)
A new background process in Oracle 10g, part of the enhanced diagnosability framework. It regularly
monitors the health of the instance and checks for instance hangs and deadlocks.
It captures vital diagnostic data for instance and process failures.
What are Oracle Clusterware Components
Voting Disk: Oracle RAC uses the voting disk to manage cluster membership by way of a health
check, and to arbitrate cluster ownership among the instances in case of network failures. The voting
disk must reside on shared disk.
Oracle Cluster Registry (OCR): Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must reside on
shared disk that is accessible by all of the nodes in your cluster.
Status
srvctl status database -d <database>
srvctl status instance -d <database> -i <instance>
srvctl status nodeapps -n <node>
srvctl status service -d <database>
srvctl status asm -n <node>
Stop
srvctl stop database -d <database>
srvctl stop instance -d <database> -i <instance>,<instance>
srvctl stop service -d <database> [-s <service>,<service>] [-i <instance>,<instance>]
srvctl stop nodeapps -n <node>
srvctl stop asm -n <node>
Start
srvctl start database -d <database>
srvctl start instance -d <database> -i <instance>,<instance>
srvctl start service -d <database> -s <service>,<service> -i <instance>,<instance>
srvctl start nodeapps -n <node>
srvctl start asm -n <node>
srvctl config database -d orcl
crs_stat -t
OCR
ocrconfig -showbackup
ocrconfig -replace <ocr_file_name>
ocrcheck
Voting Disk
dd if=<voting_disk_name> of=<backup_location>
crsctl query css votedisk
crsctl add css votedisk <path>
crsctl delete css votedisk <path>
crsctl query crs softwareversion
crsctl stop crs
crs_start -all
crs_stop -all
crsctl check crs
crsctl check cssd
crsctl check crsd
crsctl check evmd
crsctl check cluster
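When these commands are scripted, it can help to build the argument list in one place. The helper below is a minimal sketch, not part of any Oracle tooling: the function name srvctl_args and its parameters are hypothetical, and it only assembles the -d, -i and -n options of srvctl without running anything.

```python
def srvctl_args(action, target, database=None, instances=None, node=None):
    """Build an srvctl argument list (hypothetical helper; runs nothing)."""
    args = ["srvctl", action, target]
    if database:
        args += ["-d", database]
    if instances:
        # Several instances are passed as one comma-separated value.
        args += ["-i", ",".join(instances)]
    if node:
        args += ["-n", node]
    return args

if __name__ == "__main__":
    print(" ".join(srvctl_args("status", "database", database="orcl")))
    print(" ".join(srvctl_args("stop", "instance", database="orcl",
                               instances=["orcl1", "orcl2"])))
    print(" ".join(srvctl_args("status", "nodeapps", node="linux1")))
```

The returned list can be passed to subprocess.run on a node where srvctl is on the PATH; keeping it as a list avoids shell quoting problems.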
How do we verify that RAC instances are running?
Issue the following query from any one node, connected through SQL*Plus.
SQL> connect sys/sys as sysdba
SQL> select * from V$ACTIVE_INSTANCES;
The query returns the instance number in the INST_NUMBER column and host_name:instance_name in the
INST_NAME column.
SQL> select * from gv$instance;
Stop/Start Oracle Cluster
crsctl start crs
crsctl stop crs
Enabling Archive Logs in a RAC Environment
1. Log in to one of the nodes (e.g. linux1) and disable the cluster instance parameter by setting
cluster_database to FALSE from the current instance:
$ sqlplus "/ as sysdba"
SQL> alter system set cluster_database=false scope=spfile sid='orcl1';
2. Shut down all instances accessing the clustered database:
$ srvctl stop database -d orcl
3. Using the local instance, MOUNT the database:
$ sqlplus "/ as sysdba"
SQL> startup mount
4. Enable archiving:
SQL> alter database archivelog;
5. Re-enable support for clustering by setting the instance parameter cluster_database to
TRUE from the current instance:
SQL> alter system set cluster_database=true scope=spfile sid='orcl1';
6. Shut down the local instance:
SQL> shutdown immediate
7. Bring all instances back up using srvctl:
$ srvctl start database -d orcl
8. (Optional) Bring any services (e.g. TAF) back up using srvctl:
$ srvctl start service -d orcl
9. Log in to the local instance and verify that Archive Log Mode is enabled:
$ sqlplus "/ as sysdba"
SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     83
Next log sequence to archive   84
Current log sequence           84
After enabling Archive Log Mode, each instance in the RAC configuration can automatically archive
its redo logs.
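The order of operations matters here, so it can be useful to keep the sequence as data, for example to print a checklist for a different database. The sketch below only generates text and runs nothing; the function name archivelog_steps and the (tool, command) layout are illustrative assumptions.

```python
def archivelog_steps(db_name, sid):
    """Return (tool, command) pairs in the order of the procedure above."""
    return [
        ("sqlplus", f"alter system set cluster_database=false scope=spfile sid='{sid}';"),
        ("shell", f"srvctl stop database -d {db_name}"),
        ("sqlplus", "startup mount"),
        ("sqlplus", "alter database archivelog;"),
        ("sqlplus", f"alter system set cluster_database=true scope=spfile sid='{sid}';"),
        ("sqlplus", "shutdown immediate"),
        ("shell", f"srvctl start database -d {db_name}"),
        ("shell", f"srvctl start service -d {db_name}"),
        ("sqlplus", "archive log list"),
    ]

if __name__ == "__main__":
    # orcl and orcl1 are the names used in the walkthrough above.
    for number, (tool, command) in enumerate(archivelog_steps("orcl", "orcl1"), 1):
        print(f"{number}. [{tool}] {command}")
```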
Enabling Flashback On RAC Database
Enabling Flashback or Archive Log mode on a single-instance database is quite straightforward. In
the case of RAC, you need to follow additional steps.
The requirements for enabling Flashback Database are:
Your database must be running in ARCHIVELOG mode, because archived logs are used in the
Flashback Database operation. You must have a flash recovery area enabled, because flashback logs
can only be stored in the flash recovery area. For Real Application Clusters databases, the flash
recovery area must be stored in a clustered file system or in ASM.
First, configure the flash recovery area by setting db_recovery_file_dest_size and
db_recovery_file_dest:
ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE = 20G SCOPE=BOTH;
ALTER SYSTEM SET DB_RECOVERY_FILE_DEST = '+DG1' SCOPE=BOTH;
We are using an ASM disk group here, which is shared and available to both nodes. The next step is to
enable archivelog mode and then turn on flashback. To do this, the database needs to be in mount
mode.
We can use srvctl to stop any associated database service and then stop the database:
/home/oracle> srvctl stop service -d TESTDB
/home/oracle> srvctl stop database -d TESTDB
Now set cluster_database=false so that archivelog mode can be enabled. This is an additional step
required for a RAC database; a single-instance database does not need it.
/home/oracle> sqlplus "/ as sysdba"
Connected to an idle instance.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 1073741824 bytes
Fixed Size                  1271564 bytes
Variable Size             314575092 bytes
Database Buffers          750780416 bytes
Redo Buffers                7114752 bytes
SQL> alter system set cluster_database=false scope=spfile;
System altered.
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> exit
/home/oracle> sqlplus "/ as sysdba"
SQL> startup mount
ORACLE instance started.
Total System Global Area 1073741824 bytes
Fixed Size                  1271564 bytes
Variable Size             314575092 bytes
Database Buffers          750780416 bytes
Redo Buffers                7114752 bytes
Database mounted.
SQL> alter database archivelog;
SQL> alter database flashback on;
Database altered.
Set the cluster_database parameter back to true.
SQL> alter system set cluster_database=true scope=spfile;
System altered.
SQL> shutdown immediate
We will again use srvctl to start the database and the associated service:
/home/oracle> srvctl start database -d TESTDB
/home/oracle> srvctl start service -d TESTDB
We can confirm that archivelog mode and flashback are enabled by querying V$DATABASE:
SQL> SELECT LOG_MODE, FLASHBACK_ON FROM V$DATABASE;
LOG_MODE     FLASHBACK_ON
ARCHIVELOG   YES
Oracle RAC load balancing and failover
LOAD BALANCING: An Oracle RAC system can distribute the load over many nodes; this feature is
called load balancing.
There are two methods of load balancing:
1. Client load balancing
2. Server load balancing
Client Load Balancing distributes new connections among Oracle RAC nodes so that no one server
is overloaded with connection requests. It is configured at the net service name level by providing
multiple descriptions in a description list or multiple addresses in an address list. For example, if
connections fail over to another node after a failure, client load balancing ensures that the
redirected connections are distributed among the other nodes in the RAC.
Configure client-side connect-time load balancing by setting LOAD_BALANCE=ON in the
corresponding client-side TNS entry.
TESTRAC =
(DESCRIPTION =
(ADDRESS_LIST=
(LOAD_BALANCE = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2VIP)(PORT = 1521))
)
(CONNECT_DATA = (SERVICE_NAME = testdb.oracleracexpert.com))
)
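The effect of LOAD_BALANCE=ON can be pictured with a small simulation. This is a sketch of the idea only, assuming the client picks an address uniformly at random; it does not reproduce Oracle Net's actual implementation, and the function names are made up for illustration.

```python
import random
from collections import Counter

def pick_address(addresses, rng):
    """Pick one address at random, as a client does with LOAD_BALANCE=ON."""
    return rng.choice(addresses)

def simulate(addresses, connections, seed=0):
    """Count how many simulated connections land on each address."""
    rng = random.Random(seed)
    return Counter(pick_address(addresses, rng) for _ in range(connections))

if __name__ == "__main__":
    # Host names taken from the TESTRAC entry above.
    counts = simulate(["TESTRAC1VIP", "TESTRAC2VIP"], 10000)
    for host, n in sorted(counts.items()):
        print(f"{host}: {n} connections")
```

Over many connections the two addresses receive roughly equal shares, which is all that client-side balancing promises: it spreads connection attempts and knows nothing about how busy each node is.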
Server Load Balancing distributes processing workload among Oracle RAC nodes. It divides the
connection load evenly between all available listeners and distributes new user session connection
requests to the least loaded listener(s) based on the total number of sessions which are already
connected. Each listener communicates with the other listener(s) via each database instance’s PMON
process.
TESTRAC_LISTENERS =
(DESCRIPTION =
(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1)(PORT = 1521)))
(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2)(PORT = 1521)))
)
Set the initialization parameter *.remote_listener='TESTRAC_LISTENERS' in the database's shared
SPFILE and add the TESTRAC_LISTENERS entry to the TNSNAMES.ORA file in the Oracle Home of
each node in the cluster.
Once you configure server-side connect-time load balancing, each database's PMON process
automatically registers the database with the database's local listener and cross-registers it
with the listeners on all other nodes in the cluster. The nodes themselves then decide which
node is least busy and connect the client to that node.
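The "least busy node" decision can be sketched as follows. This is a simplification that follows the description above and uses only the number of connected sessions per node; the load information a real listener receives from PMON is richer, and the function names and session counts here are invented for illustration.

```python
def least_loaded(session_counts):
    """Return the node with the fewest connected sessions."""
    return min(session_counts, key=session_counts.get)

def route(session_counts, new_connections):
    """Place each new connection on the currently least loaded node."""
    counts = dict(session_counts)   # work on a copy
    placements = []
    for _ in range(new_connections):
        node = least_loaded(counts)
        counts[node] += 1
        placements.append(node)
    return placements, counts

if __name__ == "__main__":
    # Hypothetical starting load: TESTRAC1 has 3 sessions, TESTRAC2 has 1.
    placements, counts = route({"TESTRAC1": 3, "TESTRAC2": 1}, 4)
    print(placements)
    print(counts)
```

Starting from an uneven load, new sessions go to the lighter node until the counts even out, after which they alternate.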
FAILOVER:
An Oracle RAC system can protect against failures caused by O/S or server crashes or hardware
failures. When a node failure occurs in a RAC system, connection attempts can fail over to other
surviving nodes in the cluster; this feature is called failover.
There are two methods of failover:
1. Connection Failover
2. Transparent Application Failover (TAF)
Connection Failover: If a connection failure occurs at connect time, the application fails over the
connection to another active node in the cluster. This feature enables the client to connect to another
listener if the initial connection to the first listener fails.
Enable client-side connect-time failover by setting FAILOVER=ON in the corresponding client-side
TNS entry.
TESTRAC =
(DESCRIPTION =
(ADDRESS_LIST=
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2VIP)(PORT = 1521))
)
(CONNECT_DATA = (SERVICE_NAME = testdb.oracleracexpert.com))
)
If LOAD_BALANCE is set to ON, clients randomly attempt connections to any of the nodes. If a client
attempts a connection to a node that is down, it has to wait until it receives the information that
the node is not accessible before trying an alternate address in the ADDRESS_LIST.
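The interaction between LOAD_BALANCE and FAILOVER at connect time can be sketched like this. It is a toy model under stated assumptions: is_up is a hypothetical callback standing in for a real connection attempt, and the timeout a real client waits for on a dead node is not modelled.

```python
import random

def connect(addresses, is_up, load_balance=True, failover=True, rng=None):
    """Return the first reachable address, or None if none can be reached."""
    rng = rng or random.Random()
    order = list(addresses)
    if load_balance:
        rng.shuffle(order)      # LOAD_BALANCE=ON: try addresses in random order
    if not failover:
        order = order[:1]       # FAILOVER=OFF: only the first choice is tried
    for address in order:
        if is_up(address):      # stands in for a real connection attempt
            return address
    return None

if __name__ == "__main__":
    surviving = {"TESTRAC2VIP"}  # pretend TESTRAC1VIP is down
    hosts = ["TESTRAC1VIP", "TESTRAC2VIP"]
    print(connect(hosts, lambda a: a in surviving))
    print(connect(hosts, lambda a: a in surviving,
                  load_balance=False, failover=False))
```

With failover enabled the client always ends up on the surviving node; with both options off it tries only the first address and gives up.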
Transparent Application Failover (TAF): If a connection failure occurs after a connection is
established, the connection fails over to one of the surviving nodes. Any uncommitted transactions are
rolled back, and server-side program variables and session properties are lost. In some cases,
select statements are automatically re-executed on the new connection, with the cursor positioned on the
row on which it was positioned prior to the failover.
TESTRAC =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
)
)
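In FAILOVER_MODE, RETRIES is the number of reconnection attempts and DELAY is the wait in seconds between attempts, so their product gives a rough upper bound on how long a session keeps waiting between attempts before giving up. The helper name below is hypothetical, and the figure ignores the time each attempt itself takes.

```python
def taf_retry_window(retries, delay_seconds):
    """Upper bound, in seconds, on the waiting between reconnect attempts."""
    return retries * delay_seconds

if __name__ == "__main__":
    window = taf_retry_window(retries=180, delay_seconds=5)
    print(f"{window} seconds, about {window // 60} minutes")  # 900 seconds, about 15 minutes
```

With RETRIES=180 and DELAY=5, as in the entry above, a session keeps trying for about 15 minutes, which is worth comparing with how long a node restart usually takes in your environment.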
Let's take an example of a RAC service called SVC with two instances, SVC1 and SVC2, running on
host1 and host2 (with virtual addresses host1_vip and host2_vip). The client tnsnames entry would look
something like this:
SVC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = host1_vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = host2_vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = SVC)
(FAILOVER_MODE=(TYPE=select)(METHOD=basic)(RETRIES=10)(DELAY=1))
)
)