Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
SlideShare a Scribd company logo

1

Oracle Application Continuity with Oracle RAC for java
How to create Oracle Database Services for High Availability
Introduction
To use High Availability features like Fast Application Noti
fi
cation (FAN), Draining, and Application
Continuity, you need to create a custom database service and set the high availability attributes
accordingly. This is not possible when using the default database service, which is the service
with the same name as the PDB. Additionally, custom database services provide location
transparency for the application.
FEATURE SET FOR KEEPING YOUR APPLICATIONS
CONTINUOUSLY AVAILABLE
Applications achieve continuous availability when planned maintenance, unplanned outages, and
load imbalances of the database tier are hidden.
Oracle provides a set of features that you can choose from to keep your application available
during planned events, unplanned outages and load imbalances.
You can think of these features as an insurance policy protecting your applications from service
interruptions.
The best features are those that are fully transparent to your application so your application
developers can focus on building functionality, not infrastructure, and that continue to protect the
application when it changes in the future. We call this future-proo
fi
ng.
Start with the feature set:
Draining and Rebalancing Sessions for Planned Maintenance:
When planned maintenance starts, sessions that need to be drained from an instance, PDB, or
database are marked to be drained. Idle sessions are released gradually. Active sessions are
drained when the work executing in that session completes. Draining of sessions is in wide use
with Oracle connection pools and mid-tiers con
fi
gured for Fast
Application Noti
fi
cation (FAN):
Starting with Oracle Database 18c, the database itself drains sessions when PDBs and instances
are stopped or relocated. Draining is always the best solution for hiding planned maintenance.
Failover solutions such as Application Continuity are the fallback when work will not drain in the
time allocated.
Transparent Application Failover (TAF):
TAF is a feature dating back to Oracle8i. Following an instance failure, TAF creates a new session
and, when using SELECT mode, on demand, replays queries back to where they were before the
failure occurred. Starting with Oracle Database 12.2, TAF o
ff
ers FAILOVER_RESTORE, matching
Application Continuity, to restore the initial session state before queries are replayed. Cursors are
replayed using the state re-established initially. Applications using TAF must not change session
state later in the session, (for example PLSQL, temp tables, temp lobs, sys context) as this
session state is not restored.

2

Application Continuity (AC):
Application Continuity hides outages starting with Oracle database 12.1 for thin Java-based
applications and Oracle Database 12.2.0.1 for OCI and ODP.NET based applications. Application
Continuity rebuilds the session by recovering the session from a known point which includes
session states and transactional states. Application Continuity rebuilds all in-
fl
ight work. The
application continues as it was, seeing a slightly delayed execution time when a failover occurs.
The standard mode for Application Continuity is for OLTP-style pooled applications. Transparent
Application Continuity (TAC) Starting with Oracle Database18c, Transparent Application Continuity
(TAC) transparently tracks and records session and transactional state so the database session
can be recovered following recoverable outages. This is done with no reliance on application
knowledge or application code changes, allowing Transparent Application Continuity to be
enabled for your applications. Application transparency and failover are achieved by consuming
the state-tracking information that captures and categorizes the session state usage as the
application issues user calls.
Transparent Application Continuity with Draining is the recommended solution for Continuous
Availability.
If you are using a 12c driver, or customization is required for initial states or side e
ff
ects, then you
should use Application Continuity.
TAF continues to be available and is fully supported.
INSTRUCTIONS FOR APPLICATION CONFIGURATION :
Follow these instructions when implementing your solution.
1. Use an Oracle Clusterware managed service that is not the default database service (the
default service has the same name as the database or PDB). The services that you create provide
location transparency and HA features.
2. Use the recommended Connection String (explained later in this paper) with built in timeouts,
retries and delay so incoming connections do not see errors during outages.
3. Fast Application Noti
fi
cation (FAN) is a mandatory component to initiate draining, to break out
of failures, and to rebalance sessions when services resume and when load imbalances occur. For
outages such as node and network failures, fast failover of the application does not happen if the
client is not interrupted by FAN.

3

FAN applies to all failover solutions. When con
fi
guring FAN, use Auto-con
fi
guration of ONS. Use
the recommended TNS format strictly. Do not alter this format. (Exception: if your client is
pre-12c, you will manually con
fi
gure FAN.)
4.Before maintenance starts, drain your work from the instances or nodes targeted for
maintenance. Enable FAN with Oracle Connection Pools or Connection tests (or both). Oracle
connection pools with FAN are the best solution as pools provides full life cycle of session
movement. That is, draining and rebalancing of work as maintenance progresses. When using
FAN, return your connections to the pool. If you are using server draining (the alternate plan
explained later in this paper) and your test is not a standard test, add your test to the server using
DBMS_APP_CONT_ADMIN. Sessions that do not drain within the DRAIN_TIMEOUT will be failed
over.
5. The standard solution for failing over sessions is Transparent Application Continuity (TAC). Use
Application Continuity (AC) if you are using Oracle Database 12c Release 2, or you want to
customize with side e
ff
ects or callbacks, or have an application that uses state such as temporary
tables and never cleans up. Use Transparent Application Failover (TAF) if your application is read
only and does not change Oracle session state in the session after the initial setup.
BUILDING BLOCKS FOR CONTINUOUS AVAILABILITY OF
YOUR APPLICATION :
Use Services
Service is a logical abstraction for managing work. Services allow applications to bene
fi
t from the
reliability of the redundant parts of the MAA system. The services hide the complexity of the
underlying system from the client by providing a single system image for managing work.
The service is :
• A unit for management:
a handful of services are manageable, many nodes, instances, listeners, and network interfaces
are not manageable. The service provides location transparency for sites and databases.
• A unit for availability:
resources are recovered quickly, independently and in parallel for each service and without the
need to start entire software stacks; and
• A unit for performance: work is routed transparently across the MAA system according to
service quality and priority. Services are measured against service level thresholds and violations
are reported to management with advised solutions in AWR.
Con
fi
gure the Connection String or URL for High Availability:
All Oracle-supplied connect strings will conform to the following recommendations. There is no
need to do anything if you use the Oracle-supplied wallet.
The following TNS/URL con
fi
guration is recommended for use for connecting at failover,
switchover, fallback and basic startup. Set RETRY_COUNT, RETRY_DELAY, CONNECT_TIMEOUT
and TRANSPORT_CONNECT_TIMEOUT parameters in the tnsnames.ora
fi
le or the URL to allow
connection requests to wait for the service and connect successfully. Connection attempts and
retries are managed by Oracle Database Net Services.
Set CONNECT_TIMEOUT to a high value to prevent login storms such as 90s or 120s.

4

Low values can result in frenzied login-attempts due to the application or pool cancelling and
retrying connection attempts.
Do not set (RETRY_COUNT+1)*RETRY_DELAY or CONNECT_TIMEOUT larger than your response
time SLA.
The application should either connect or receive an error within the response time SLA.
These are general recommendations for con
fi
guring the connections for high availability. Do not
use Easy Connect Naming on the client as EZCONNECT has no high availability capabilities. Note
that the standby-scan speci
fi
ed below refers to the SCAN address available on the STANDBY site
speci
fi
ed in your (Active) Data Guard con
fi
guration. Attempt will be made to connect to the
PRIMARY site
fi
rst, and if the service is not available, attempt to connect to this service at the
standby. Once the location of the service is known, Oracle drivers 12.2 and later remember the
address_list with that service o
ff
ered and chooses this
fi
rst, until the service next moves.
Adding the standby-scan to TNS connection descriptor to transparently fail over to the standby-
scan is optional.
Failing over to a standby database within the same region will have acceptable performance in
most cases versus failing over to a standby database in a di
ff
erent region where additional
network latency may result in unacceptable response time performance.
In the latter case, a site failover operation will be required which involves DNS failover to another
region containing mid-tier resources and standby database. Use this Connection String for ALL
Oracle driver version 12.2 or higher:
Alias (or URL) = (DESCRIPTION = (CONNECT_TIMEOUT= 90)(RETRY_COUNT=50)
(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3) (ADDRESS_LIST =
(LOAD_BALANCE=on) (ADDRESS = (PROTOCOL = TCP)(HOST=primary-scan)(PORT=1521)))
(ADDRESS_LIST = (LOAD_BALANCE=on) (ADDRESS = (PROTOCOL = TCP)(HOST=standby-
scan)(PORT=1521))) (CONNECT_DATA=(SERVICE_NAME = YOUR SERVICE)))
FAST APPLICATION NOTIFICATION (FAN) :
FAN must be used. FAN is a required component for interrupting the application to failover.
Without FAN, applications can hang on TCP/IP timeout following hardware and network failures,
and omit to rebalance when resources resume. All Oracle pools and all Oracle application servers
use FAN. Third-party JAVA application servers can use UCP to enable FAN. No application
changes are required to use FAN. These are con
fi
guration changes only.
For continuous service during planned maintenance, use FAN with:
• Oracle pools or
• UCP with third-party JDBC application servers or
• The latest Oracle client drivers For continuous service during unplanned outages, use FAN with •
Application Continuity or
• Transparent Application Continuity The format of the ADDRESS_LIST described above is
important for a number of reasons, one being that this format allows for autocon
fi
guration of ONS.
ONS is used to propagate FAN events to mid-tier pools and clients.
When a database connection is made, database-tier ONS information is sent back to the mid-tier,
allowing the mid-tier to establish ONS communication paths automatically.
This means that con
fi
guration can be dynamic and need not be maintained at the mid-tier. ONS
connections will be made from both the primary and standby sites.
To setup FAN ON clients:
Use the TNS alias or URL shown in the this discussion.
This connection string is used to auto-con
fi
gure the Oracle Noti
fi
cation Service (ONS)
subscription at the client for FAN-event receipt when using an Oracle Database 12c or later client
driver.
For older drivers, refer to the FAN white paper in the Appendix for con
fi
guration details. ONS
provides a secure communication path between the database tier and the client-tier allowing the
client to be noti
fi
ed of service availability (components stopping or starting) as well as runtime
load balancing advice for better work placement during normal operation.

5

To enable FAN in the client:
1. Use the TNS alias or URL provided. This connect string will auto-con
fi
gure ONS (auto-ONS)
subscription at the client for FAN event receipt when using a 12c driver or later.
For older drivers, refer to the FAN white paper.
2. Depending on the client, enable FAN in the application con
fi
guration properties as follows
2a-Universal Connection Pool or JDBC thin driver (starting 12.2) :
Set the property FastConnectionFailoverEnabled
2b-WebLogic Active GridLink for Oracle RAC :
FAN and Fast Connection Failover are enabled by default
2c-Oracle WebLogic Server, IBM WebSphere, IBM Liberty, Apache Tomcat, Red Hat WildFly
(JBoss), JDBC Applications :
Use Universal Connection Pool as a connection pool replacement
3c- ODP.Net clients (Managed and Unmanaged Providers)
Set “HA events = true;pooling=true”
in the connect string if using ODP.Net 12.1 or earlier
4c-Oracle Call Interface (OCI) clients and OCI-based drivers :
Oracle Call Interface (OCI) clients without native settings can use oraacces.xml and set events to
true Python, Node.js and PHP have native options.
In Python and Node.js you can set an events mode when creating a connection pool.
In PHP, edit php.ini add the entry oci8.events=on SQL*Plus enables FAN by default
3. You can con
fi
gure ONS to use TLS (wallet-based) authentication.
For JDBC applications:
a) Ensure the following JAR
fi
les are present in your application’s CLASSPATH :
(ons.jar,osdt_cert.jar,osdt_core.jar,oraclepki.jar). The Oracle ATP-D service always uses TLS.

6

b) Specify the wallet for FAN do one of the following:
• To use AUTO-ONS with wallets an application must set the following Java system properties: “-
Doracle.ons.wallet
fi
le=/replace_this_with_host_path/onswallet”
and
“Doracle.ons.walletpassword=myONSwalletPassword”
This cannot be set on a per-pool or per-connection basis
• To use explicit ONS con
fi
guration (instead of AUTO-ONS) do one of the following:
i) Programmatically within UCP, call setONSCon
fi
guration(), for example
ii) Set declaratively using an UCP XML
con
fi
guration
fi
le:
ii) For Oracle Call Interface (OCI) applications Oracle client drivers 12.2 or later are required:
The <wallet_location> path should be the name of the directory containing the wallet.
Other parameters may be set in the ons section of oraaccess.xml, including <hosts>,
<max_connections>, and <subscription_wait_timeout>.
Drivers that support native event setting controls may omit the <events> section and use the
driver setting instead.
By default connections will be established to the database even if ONS fails.
If you prefer connections to fail in this scenario, you can add a section to the same level as
<events> and <ons>:

7

<fan>
<subscription_failure_action>
error
</subscription_failure_action>
</fan>
Place the oraaccess.xml
fi
le in the same directory as the tnsnames.ora and sqlnet.ora network
fi
les. For example, when using Oracle Instant Client these
fi
les might be in the default directory
network/admin. Alternatively, all network con
fi
guration
fi
les can be put in another accessible
directory. Then set the environment variable TNS_ADMIN to that directory name.
FAN and Draining
Service status changes like stopping and starting the service can be propagated to Oracle client
drivers and Oracle connection pools using FAN, a high-availability noti
fi
cation mechanism. FAN
events are published using Oracle Noti
fi
cation Service (ONS).
The clients take immediate action accordingly, e.g., to balance connections or to establish new
connections and route work to available database instances or where the service is started.
Draining means allowing connected sessions to complete their current work in a speci
fi
c duration
(drain timeout) before maintenance is started.
Set the database service attribute noti
fi
cation to true to enable FAN and drain_timeout to
time_in_seconds to enable draining:
srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name>
-preferred <instance_name>
-available <instance_name>
-stopoption IMMEDIATE
-noti
fi
cation true
-drain_timeout 300
#example
srvctl add service -db RACCDB_POP -pdb PDB1 -service drainsrv -preferred
RACCDB1,RACCDB2 -stopoption IMMEDIATE -noti
fi
cation true -drain_timeout 300
#start the service
srvctl start service -db RACCDB_POP -service drainsrv
#check status
srvctl status service -db RACCDB_POP -service drainsrv
Service drainsrv is running on instance(s) RACCDB1,RACCDB2
When you stop the service on instance 1, RACCDB1:
#stop service on instance 1
srvctl stop service -db RACCDB_POP -service drainsrv -instance RACCDB1
#check status
srvctl status service -db RACCDB_fra -service drainsrv
Service drainsrv is running on instance(s) RACCDB2
You can specify a di
ff
erent drain_timeout when stopping the service:
srvctl stop service -db RACCDB_POP -service drainsrv -instance RACCDB1 -drain_timeout 10
New connections will immediately be directed to instance 2.
Existing sessions to instance 1 remain connected to instance 1 for the duration of drain_timeout
and can
fi
nish their work (commit) during this time without any interruption.

8

Ideally, all sessions drain (=
fi
nish their work) within the drain_timeout speci
fi
ed. You can set the
drain_timeout as high as needed for your application, e.g., 30 minutes or 1 hour. After that, the
database node will be ready for maintenance, and the application will not encounter any
interruption if all sessions drain.
Transaction Guard:
Transaction Guard (TG) is a database feature for applications to ensure that every transaction has
been executed at most once in case of planned or unplanned outage. In the background each
transaction is associated with a logical transaction IDthat let you determine, afterward, if the
transaction has committed and has been completed. Without Transaction Guard (TG) the typical
example that is given is a transaction that is increasing by 5% the salary of all company
employees. If the application fail, outside of the fact that you will make happy employees by
giving them Double% you cannot rerun blindly the transaction from scratch. With Transaction
Guard you can know from application layout how it went and present a warning or gray the
submit button to avoid a second dramatic rerun.
By default the Java connection has autocommit property set to TRUE
From Oracle Technology Network the requirements for TG are:
-Use Oracle Database Release 12.1 or later.
-Use an application service for all database work.
Create the service using srvctl if using RAC or DBMS_SERVICE if not using RAC. You may also
use GDSCTL.
-Set the following properties on the service – COMMIT_OUTCOME = TRUE for Transaction Guard.
-Grant execute permission on DBMS_APP_CONT package to the application user.
-Increase DDL_LOCK_TIMEOUT if using Transaction Guard with DDL statements (for example, 10
seconds).
Problem That Transaction Guard Solves
In applications without Transaction Guard, a fundamental problem for recovering applications
after an outage is that the commit message that is sent back to the client is not durable.
If there is a break between the client and the server, the client sees an error message indicating
that the communication failed. This error does not inform the application if the submission
executed any commit operations, if a procedural call completed and executed all expected
commits and session state changes, or if a call failed part way through or, yet worse, is still
running disconnected from the client.
Without Transaction Guard, it is impossible or extremely di
ffi
cult to determine the outcome of the
last commit operation, in a guaranteed and scalable manner, after a communication failure to the
server. If an application must determine whether the submission to the database was committed,
the application must add custom exception code to query the outcome for every possible commit
point in the application. Given that a system can fail anywhere, this is almost impractical because
the query must be speci
fi
c to each submission. After an application is built and is in production,
this is completely impractical. Moreover, a query cannot give the correct answer because the
transaction could commit immediately after that query executed. Indeed, after a communication
failure the server may still be running the submission not yet aware that the client has
disconnected. For PL/SQL or Java in the database, for a procedural submission, there is also no
record as to whether that submission ran to completion or was aborted part way through. While
such a procedure may have committed, subsequent work may not have been done for the
procedure.
Failing to recognize that the last submission has committed, or will commit sometime soon or has
not run to completion, can lead applications that attempt to replay, thus causing duplicate
transaction submissions and other forms of "logical corruption" because the software might try to
reissue already persisted changes.
Without Transaction Guard, if a transaction has been started and commit has been issued, the
commit message that is sent back to the client is not durable. The client is left not knowing
whether the transaction committed. The transaction cannot be validly resubmitted if the
nontransactional state is incorrect or if it already committed. In the absence of guaranteed commit

9

and completion information, resubmission can lead to transactions applied more than once and in
a session with the incorrect state.
Solution That Transaction Guard Provides
E
ff
ective with Oracle Database 12c Release 1 (12.1.0.1), Transaction Guard provides new,
integrated tools for applications to use to achieve idempotence automatically and transparently,
and in a manner that scales. Its key features are the following:
• Durability of COMMIT outcome by saving a logical transaction identi
fi
er (LTXID) at commit for all
supported transaction types against the database (Oracle Database 12cRelease 1 (12.1.0.1) or
later). This includes idempotence for transactions executed using autocommit, from inside PL/
SQL, from remote transactions, One-Phase XA transactions, and from callouts that cannot
otherwise be identi
fi
ed using generic means.
• Use of the LTXID to support at-most-once execution semantics, such that database
transactions protected by logical transaction identi
fi
ers cannot be duplicated when there are
multiple copies of that transaction in
fl
ight identi
fi
ed by the LTXID.
• Blocking of a commit of in-
fl
ight work to ensure that regardless of the outage situation, another
submission of the same transaction protected by that LTXID cannot commit.
• Identi
fi
cation of whether work committed at an LTXID was committed as part of a top-level call
(client to server), or was embedded in a procedure (such as PL/SQL) at the server. An
embedded commit state indicates that while a commit completed, the entire procedure in which
the commit executed has not yet run to completion. Any work beyond the commit cannot be
guaranteed to have completed until that procedure itself returns to the database engine.
• Identi
fi
cation of whether the database to which the commit resolution is directed is ahead of, in
sync with, or behind the original submission, and rejection when there are gaps in the
submission sequence of transactions from a client. It is considered an error to attempt to obtain
an outcome if the server or client are not in sync on an LTXID sequence.
• A callback on the JDBC Thin client driver that
fi
res when the LTXID changes. This can be used
by higher layer applications such as WebLogic Server and third parties to maintain the current
LTXID ready to use if needed.
• Namespace uniqueness across globally disparate databases and across databases that are
consolidated into a Multitenant infrastructure. This includes Oracle Real Application Clusters
(Oracle RAC) and RAC One, Data Guard, and Multitenant databases.
• Service name uniqueness across global databases and across databases that are consolidated
into a Multitenant infrastructure. This ensures that connections are properly directed to the
transaction information.
At-Most-Once Execution
Transaction Guard uses the logical transaction identi
fi
er (LTXID) to avoid duplicate transactions.
This ability to ensure at most one execution of a transaction is referred to as transaction
idempotence. The LTXID is persisted on commit and is reused following a rollback. During normal
runtime, an LTXID is automatically held in the session at both the client and server for each
database transaction. At commit, the LTXID is persisted as part of committing the transaction.
The at-most-once protocol requires that the database maintain the LTXID for the retention period
agreed for replay. The default retention period is 24 hours, although you might need a shorter or
longer period, conceivably even a week or longer. The longer the retention period, the longer the
at-most-once check blocks an old transaction using an old LTXID from replay. The setting is
available on each service. When multiple databases are involved, as is the case when using Data
Guard and Active Data Guard, the LTXID is replicated to each database involved through the use
of redo.

10

The getLTXID API, provided for Oracle JDBC Thin (with similar APIs for OCI, OCCI, and ODP.NET
clients), lets an application retrieve the logical transaction identi
fi
er that was in use on the dead
session. This is needed to determine the status of this last transaction.
The DBMS_APP_CONT.GET_LTXID_OUTCOME PL/SQL subprogram lets an application
fi
nd the
outcome of an action for a speci
fi
ed logical transaction identi
fi
er. Calling
DBMS_APP_CONT.GET_LTXID_OUTCOME may involve the server blocking the LTXID from
committing so that the outcome is known. This is a requirement if a transaction using that LTXID
is in
fl
ight or is about to commit. An application using Transaction Guard obtains the LTXID
following a recoverable error, and then calls DBMS_APP_CONT.GET_LTXID_OUTCOMEbefore
attempting a replay.
The Oracle PL/SQL supplied package to know if a transaction has committed and has completed
is DBMS_APP_CONT.GET_LTXID_OUTCOME:
SQL> DESC DBMS_APP_CONT
PROCEDURE GET_LTXID_OUTCOME
Argument TYPE IN/OUT
--------- --------- ------
CLIENT_LTXID RAW IN
COMMITTED BOOLEAN OUT
USER_CALL_COMPLETED BOOLEAN OUT
This procedure has two Boolean out parameters and until recently it was possible to map a
PL/SQL Boolean in Java from Oracle 12cR2 above, are wrapping this procedure in a PL/SQL
block to return numbers instead of Boolean.
Application Continuity:
What happens to sessions with in-
fl
ight work that do not complete their work within
drain_timeout?
This is where Application Continuity comes into play!
Application Continuity will restore the session state and replay the in-
fl
ight work on a servicing
instance transparently for the application.
The database service needs to have the Application Continuity attributes set:
srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name>
-preferred <instance_name>
-available <instance_name>
-stopoption IMMEDIATE
-noti
fi
cation true
-drain_timeout 300
-commit_outcome true
-retention 86400
-replay_init_time 600
-session_state DYNAMIC
-failover_restore LEVEL1
-failovertype TRANSACTION
For Transparent Application Continuity (an operation mode of Application Continuity) set
failover_restore and failovertype to AUTO:
srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name>
...

11

-session_state defaults to AUTO if failover_type is set to AUTO
-failover_restore AUTO
-failovertype AUTO
A key di
ff
erence is that Application Continuity (AC) requires the application to use a
Connection Pool, while Transparent Application Continuity(TAC) works with or without
Connection Pools.
Explanation of the following attributes:
• commit_outcome set to true enables Transaction Guard, which is required by Application
Continuity and Transparent Application Continuity.
• retention determines the amount of time (in seconds) that the commit outcome is retained in the
database.
• replay_init_time speci
fi
es the di
ff
erence between the time, in seconds, of the original execution
of the
fi
rst operation of a request and the time that the replay is ready to start after a successful
reconnect. Application Continuity will not replay after the speci
fi
ed amount of time has passed.
This parameter is intended to avoid the unintentional execution of a transaction when a system
is recovered after a long period.
• session_state describes how the non-transactional session state is changed by the application
within a request. Examples of the session states are NLS settings, optimizer preferences, event
settings, PL/SQL global variables, and temporary tables.
Services for Active Data Guard
Create role-based services when using Active Data Guard. Set the role attribute to primary for
services to be used for the primary site (read/write) and standby for services to be open on the
standby site (read-only). The services start and stop automatically at a site based on their role.
srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name>
...
-failover_restore AUTO
-failovertype AUTO
-role primary
Create the same service name with the same role on both primary and standby sites, e.g.:
#on primary and standby sites
srvctl add service -db RACCDB_POP -pdb pdb1 -service tacsrv -preferred RACCDB1,RACCDB2
-stopoption IMMEDIATE -noti
fi
cation true -drain_timeout 300 -commit_outcome true -retention
86400 -replay_init_time 600 -session_state AUTO -failover_restore AUTO -failovertype AUTO
-role primary
#on primary site
srvctl start service -db RACCDB_POP -service tacsrv
Note that we didn’t start the service on the standby site.
The service will automatically start after a switchover or failover operation.
Note 1: When using Active Data Guard with single-instance databases, remove the preferred and
available attributes that are related to Oracle RAC databases.
Note 2: To create the same service name on both sites in Oracle Cloud, include the domain name
in the service, e.g., tacsrv.company.com,

12

Using the default PDB or DB name is not recommended, as high availability attributes cannot be
set for these services. Always create a custom service for your application and set high availability
attributes accordingly.
Draining in Data Guard Environments
To allow services to drain during switchover operations, use the WAIT option of the switchover
Data Gaurd broker command:
DGMGRL> switchover to <standby_db_unique_name> wait;
--example
DGMGRL> switchover to CDB01_POP wait;
Stopping services and waiting for sessions to drain...
The broker determines the maximum drain_timeout value for all active services and waits for up to
that amount of time for all current client requests to be processed, before proceeding with the
switchover.
Alternatively, you can specify the number of seconds to wait for sessions to drain.
DGMGRL> switchover to <standby_db_unique_name> wait <timeout_in_seconds>;
Services in Autonomous Database
Autonomous Database provides prede
fi
ned services with di
ff
erent characteristics to be used for
applications depending on their requirements.
(Transparent) Application Continuity can be enabled as follows:
https://docs.oracle.com/en/cloud/paas/autonomous-database/adbsa/application-continuity-
con
fi
gure.html
Connection String
The application should use the recommended connection string con
fi
gured for high availability:
Alias = (DESCRIPTION = (CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3)
(TRANSPORT_CONNECT_TIMEOUT=3)
(ADDRESS_LIST = (LOAD_BALANCE=ON)
(ADDRESS = (PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521)))
(ADDRESS_LIST = (LOAD_BALANCE=ON)
(ADDRESS = (PROTOCOL=TCP)(HOST=secondary-scan)(PORT=1521)))
(CONNECT_DATA = (SERVICE_NAME=gold-cloud)))
How To Con
fi
gure Server Side Transparent Application
Failover
This document provides basic instructions for con
fi
guring Server side (versus client) TAF or
Transparant Application Failover.
Server side TAF settings override client-side counterparts that might be con
fi
gured in TNS
connect descriptors. If TAF is not con
fi
gured on the client side, then at a minimum, the failover
type must be set to enable TAF. If the failover type is set on the server side, then the failover
method will default to BASIC. Delay and retries are optional and may be speci
fi
ed independently.
SOLUTION
1. Create a service on the RAC cluster to setup for TAF. Example creates a service called
server_taf to the database called rac, with instance names rac1 & rac2.

13

Please note for the service name, select a name that is unique and not the same as the default
service name created. A special Oracle database service is created by default for your Oracle
RAC database. This default service is always available on all instances in an Oracle RAC
environment, unless an instance is in restricted mode. You cannot alter this service or its
properties.
$srvctl add service -d rac -s server_taf -r "rac1,rac2" -P BASIC
2. Start the service server_taf
$srvctl start service -d rac -s server_taf
3. Check service is running
$srvctl con
fi
g service -d rac
ractest
PREF: rac1 rac2
AVAIL: server_taf
PREF: rac1 rac2
AVAIL:
4. Find the service_id value for the service just created
sqlplus /nolog
Connect / as sysdba
SQL> select name,service_id from dba_services where name = 'server_taf';
NAME SERVICE_ID
-------------- ----------
server_taf 6
5. Review the standard setup for the services
SQL>col name format a15
col failover_method format a11
heading 'METHOD'
col failover_type format a10
heading 'TYPE'
col failover_retries format 9999999
heading 'RETRIES'
col goal format a10
col clb_goal format a8
col AQ_HA_NOTIFICATIONS format a5
heading 'AQNOT'
SQL>select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_noti
fi
catio
ns
from dba_services where service_id = 6
NAME METHOD TYPE RETRIES GOAL CLB_GOAL AQNOT
--------- ----------- ---------- -------- ---------- -------- -----
server_taf LONG NO
Please note there is no values for method, type or retries.
These are required to do server side TAF.
The cause of this problem has been identi
fi
ed and veri
fi
ed in an unpublished Bug 6886239

14

DBMS_SERVICE parameters are not added using srvctl add service. This is
fi
xed in release 11.2
onwards.
6. Add the server side failover parameters to the service. (📍 Pre 11.2)
Server side TAF method is BASIC.
BASIC is the only value currently supported.
This means that a new connection is established at failure time. It is not possible to pre-establish
a backup connection.
(which is to say, PRECONNECT is not supported)
SQL> execute dbms_service.modify_service (service_name => 'server_taf' -
, aq_ha_noti
fi
cations => true -
, failover_method => dbms_service.failover_method_basic -
, failover_type => dbms_service.failover_type_select -
, failover_retries => 180 -
, failover_delay => 5 -
, clb_goal => dbms_service.clb_goal_long);
PL/SQL procedure successfully completed.
📍 For 11.2 version and newer use SVRCTL to modify the service
srvctl modify service -d RAC -s server_taf -m BASIC -e SELECT -q TRUE -j LONG
Service can be checked with the command:
$srvctl con
fi
g service -d RAC
Service name: server_taf
Service is enabled
Server pool: RAC_server_taf
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA noti
fi
cations: true
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy speci
fi
cation: BASIC
Edition:
Preferred instances: RAC1,RAC2
Available instances:
7. Check the service and we can now see values for Method, Type and Retries.
SQL>select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_noti
fi
catio
ns
from dba_services where service_id = 6
NAME METHOD TYPE RETRIES GOAL CLB_GOAL AQNOT
--------- ----------- ---------- -------- ---------- -------- -----
server_taf BASIC SELECT 180 NONE LONG YES

15

8. Check the listener has the service registered. (output will look similar too following, depending
on version used)
$lsnrctl services
Service "server_taf.za.oracle.com" has 2 instance(s).
Instance "rac1", status READY, has 2 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521))
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Instance "rac2", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=node2)(PORT=1521))
9. Create a net service name. Here we have client load balancing between the two nodes.
SERVERTAF =
(DESCRIPTION =
(LOAD_BALANCE = yes)
(ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = server_taf.oracle.com)
)
)
10. Testing...
SQL> select host_name,instance_name from v$instance;
HOST_NAME INSTANCE_NAME
---------------- --------------------
node2 rac2
11. Shutdown the database in the node the connection has routed to
SQL> select instance_name from V$instance;
INSTANCE_NAME
----------------
rac2
SQL> shutdown abort;
ORACLE instance shut down.
12. TAF will now kick in
SQL> /
HOST_NAME INSTANCE_NAME
---------------- ----------------
node1 rac1

16

How to Setup SCAN Listener and Client for TAF and Load Balancing
Setup SCAN listener and Clients to use SCAN listener to connect to RAC using SCAN.
SOLUTION
For the purposes of this document, the following
fi
ctitious environment is used as an example to
describe the procedure:
Service name : MYRAC, MYRAC_TAF and TEST
Verify SCAN Listener Con
fi
guration on Server
After grid installation is completed you can verify the SCAN Listener con
fi
guration on your server -
•At the $GRID_HOME/network/admin directory you will have two listener related
fi
les.
-rw-r--r-- 1 grid oinstall 887 Jul 13 09:33 listener.ora
-rw-r--r-- 1 grid oinstall 375 Jul 13 09:33 endpoints_listener.ora
example of listener.ora
LISTENER_SCAN3=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)
(KEY=LISTENER_SCAN3))))
# line added by Agent
LISTENER_SCAN2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)
(KEY=LISTENER_SCAN2))))
# line added by Agent
LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER))))
# line added by Agent
LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)
(KEY=LISTENER_SCAN1))))
# line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON
# line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON
# line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN2=ON
# line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN3=ON
# line added by Agent
example of endpoints_listener.ora

17

LISTENER_DBGRAC1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)
(HOST=xxxxx)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=xx.xx.xx.xx.x.xx)(PORT=1521)
(IP=FIRST))))
# line added by Agent
Check Status of SCAN IPs and SCAN Listener
srvctl command can be used to check the status of SCAN IPs and SCAN listener -
$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node myrac2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node myrac1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node myrac1
Note that two SCAN IPs are online on node myrac1 and one is online on myrac2.
Respective SCAN listeners will be online on the nodes where the corresponding SCAN IP is
online.
$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node myrac2
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node myrac1
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node myrac1
You can verify the instance registration with SCAN listeners using lsnrctl command -
$ lsnrctl status listener_scan2
LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 28-AUG-2010 12:53:56
Copyright (c) 1991, 2009, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
------------------------
Alias LISTENER_SCAN2
Version TNSLSNR for Linux: Version 11.2.0.1.0 - Production
Start Date 23-AUG-2010 10:15:16
Uptime 5 days 2 hr. 38 min. 40 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/11.2.0/grid/network/admin/listener.ora

18

Listener Log File /u01/app/11.2.0/grid/log/diag/tnslsnr/myrac1/
listener_scan2/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xxx.xxx.xx)(PORT=1521)))
Services Summary...
Service "myrac" has 2 instance(s).
Instance "myrac1", status READY, has 2 handler(s) for this service...
Instance "myrac2", status READY, has 2 handler(s) for this service...
Service "myracXDB" has 2 instance(s).
Instance "myrac1", status READY, has 1 handler(s) for this service...
Instance "myrac2", status READY, has 1 handler(s) for this service...
The command completed successfully
Note that above output shows that the listener "listener_scan2" is running on myrac1 and both
the instances are registered with this listener.
Each instance within the cluster registers itself with all the SCAN listener using remote_listener
setting.
For e.g
NAME TYPE VALUE
-------------- ----------- ----------
local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)
HOST=xxxxx)(PORT=1521))))
remote_listener string xxxxx:1521
The LOCAL_LISTENER will always point to the local VIP for the node while the
REMOTE_LISTENER will point to <SCAN_NAME>:<PORT>
You can use srvctl command to change or edit the SCAN listener settings.
Refer Note 972500.1 : How to Modify SCAN Setting after Installation.
Con
fi
gure Client to Connect Using SCAN
To con
fi
gure client to connect using SCAN, con
fi
gure the tnsnames.ora. For e.g -
MYRAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = xxxxx)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myrac)
)
)

19

Con
fi
gure Clients for Transparent Application
Failover (TAF) Using SCAN
The above tns entry will connect to cluster using SCAN name "myrac-cluster-scan" to service
"myrac" and assumes that the service is con
fi
gured for server side TAF.
The default Service created at the time of installation and creation of RAC database is not TAF
enabled.
You can create a TAF enabled service using srvctl command.
Refer Note 460982.1: How To Con
fi
gure Server Side Transparent Application Failover
If you wish to use client side TAF setting for clients connecting using SCAN name, you can
con
fi
gure tnsnames.ora entry at the client side as -
MYRAC_TAF =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = xxxxx)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myrac)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))
You can verify the client connections to database for TAF using :
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)
FROM V$SESSION
GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;
MACHINE FAILOVER_TYPE
FAILOVER_METHOD FAILED_OVER COUNT(*)
----- ------- ------- ----- ---
XXXX SELECT BASIC NO 1
Older supported client versions which are using VIP Address TAF enabled
connect string can continue to use VIP to connect to 11gR2 RAC setup.
Client Side Load Balancing
For clients connecting to RAC database via SCAN, the client side load balancing happens during
the connection phase when the SCAN name gets resolved. Client side sqlnet trace can be used to
verify this. For e.g

20

(3086231232) [28-AUG-2013 14:13:06:118] nlad_expand_hst: Expanding xxxxx
(3086231232) [28-AUG-2013 14:13:06:118] snlinGetAddrInfo: entry
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetAddrInfo: exit
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: entry
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: exit
(3086231232) [28-AUG-2013 14:13:11:119] nlad_expand_hst: Adding IP
xxx.xxx.xxx.xx
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: entry
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: exit
(3086231232) [28-AUG-2013 14:13:11:119] nlad_expand_hst: Adding IP
xxx.xxx.xxx.xx
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: entry
(3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: exit
(3086231232) [28-AUG-2013 14:13:11:119] nlad_expand_hst: Adding IP
xxx.xxx.xxx.xx
(3086231232) [28-AUG-2013 14:13:11:119] snlinFreeAddrInfo: entry
Note that the name "xxxxx" was resolved to three di
ff
erent IPs, client will randomly pick any of the
IP for the connection to achieve client side load balancing.
A pre-11g Release 2 client (Oracle Database 11g Release 1 or Oracle Database 10g Release 2, or
older) will not fully bene
fi
t from the advantages of SCAN.
For 12c and newer you can use sacn name and there us not challenging.
The older version Oracle Client will not be able to handle a set of three IPs returned by the DNS
for SCAN.
Hence, it will try to connect to only the
fi
rst address returned in the list and will more or less ignore
the others. If the SCAN Listener listening on this speci
fi
c IP is not available or the IP itself is not
available, the connection will fail.
To workaround this problem :
•If you are using static SCAN IPs, include the address of all SCAN IPs in the tnsnames.ora entry.
TEST =
(DESCRIPTION =
(ADDRESS_LIST =
(LOAD_BALANCE=ON)
(FAILOVER=ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP1>)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP2>)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP3>)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myrac)

21

(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))
12c above:
TEST =
(DESCRIPTION =
(ADDRESS_LIST =
(LOAD_BALANCE=ON)
(FAILOVER=ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = <scan-ip>)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myrac)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))
•If you are using GNS (dynamic SCAN IPs), use VIP listener for the connections.
However in this case the remote_listener parameter on each node should be set accordingly so
that all instance cross registers with each VIP listener within the cluster.
Conclusion:
Always create and use custom database services for your application, as these allow you to set
high availability attributes for Fast Application Noti
fi
cation (FAN), Draining, and Application
Continuity.
In Data Guard environments, use role-based database services and create a service with the
same name on all sites. Specify the WAIT option in the Data Guard broker switchover command
to allow services to drain during switchover operations. Oracle Application Continuity hides
database interruptions from end-users by replaying in-
fl
ight transactions on available database
instances, completely transparent to end-users and without any application code changes
needed. the Java code provided is an example to get you started using Oracle Application
Continuity with Java. For production implementations, please follow the instructions and
recommendations provided by Oracle.
Sincerely,
Alireza Kamrani

More Related Content

Oracle Application Continuity with Oracle RAC for java

  • 1. Oracle Application Continuity with Oracle RAC for java How to create Oracle Database Services for High Availability Introduction To use High Availability features like Fast Application Noti fi cation (FAN), Draining, and Application Continuity, you need to create a custom database service and set the high availability attributes accordingly. This is not possible when using the default database service, which is the service with the same name as the PDB. Additionally, custom database services provide location transparency for the application. FEATURE SET FOR KEEPING YOUR APPLICATIONS CONTINUOUSLY AVAILABLE Applications achieve continuous availability when planned maintenance, unplanned outages, and load imbalances of the database tier are hidden. Oracle provides a set of features that you can choose from to keep your application available during planned events, unplanned outages and load imbalances. You can think of these features as an insurance policy protecting your applications from service interruptions. The best features are those that are fully transparent to your application so your application developers can focus on building functionality, not infrastructure, and that continue to protect the application when it changes in the future. We call this future-proo fi ng. Start with the feature set: Draining and Rebalancing Sessions for Planned Maintenance: When planned maintenance starts, sessions that need to be drained from an instance, PDB, or database are marked to be drained. Idle sessions are released gradually. Active sessions are drained when the work executing in that session completes. Draining of sessions is in wide use with Oracle connection pools and mid-tiers con fi gured for Fast Application Noti fi cation (FAN): Starting with Oracle Database 18c, the database itself drains sessions when PDBs and instances are stopped or relocated. Draining is always the best solution for hiding planned maintenance. Failover solutions such as Application Continuity are the fallback when work will not drain in the time allocated. Transparent Application Failover (TAF): TAF is a feature dating back to Oracle8i. Following an instance failure, TAF creates a new session and, when using SELECT mode, on demand, replays queries back to where they were before the failure occurred. Starting with Oracle Database 12.2, TAF o ff ers FAILOVER_RESTORE, matching Application Continuity, to restore the initial session state before queries are replayed. Cursors are replayed using the state re-established initially. Applications using TAF must not change session state later in the session, (for example PLSQL, temp tables, temp lobs, sys context) as this session state is not restored.
  • 2. Application Continuity (AC): Application Continuity hides outages starting with Oracle database 12.1 for thin Java-based applications and Oracle Database 12.2.0.1 for OCI and ODP.NET based applications. Application Continuity rebuilds the session by recovering the session from a known point which includes session states and transactional states. Application Continuity rebuilds all in- fl ight work. The application continues as it was, seeing a slightly delayed execution time when a failover occurs. The standard mode for Application Continuity is for OLTP-style pooled applications. Transparent Application Continuity (TAC) Starting with Oracle Database18c, Transparent Application Continuity (TAC) transparently tracks and records session and transactional state so the database session can be recovered following recoverable outages. This is done with no reliance on application knowledge or application code changes, allowing Transparent Application Continuity to be enabled for your applications. Application transparency and failover are achieved by consuming the state-tracking information that captures and categorizes the session state usage as the application issues user calls. Transparent Application Continuity with Draining is the recommended solution for Continuous Availability. If you are using a 12c driver, or customization is required for initial states or side e ff ects, then you should use Application Continuity. TAF continues to be available and is fully supported. INSTRUCTIONS FOR APPLICATION CONFIGURATION : Follow these instructions when implementing your solution. 1. Use an Oracle Clusterware managed service that is not the default database service (the default service has the same name as the database or PDB). The services that you create provide location transparency and HA features. 2. Use the recommended Connection String (explained later in this paper) with built in timeouts, retries and delay so incoming connections do not see errors during outages. 3. Fast Application Noti fi cation (FAN) is a mandatory component to initiate draining, to break out of failures, and to rebalance sessions when services resume and when load imbalances occur. For outages such as node and network failures, fast failover of the application does not happen if the client is not interrupted by FAN.
  • 3. FAN applies to all failover solutions. When con fi guring FAN, use Auto-con fi guration of ONS. Use the recommended TNS format strictly. Do not alter this format. (Exception: if your client is pre-12c, you will manually con fi gure FAN.) 4.Before maintenance starts, drain your work from the instances or nodes targeted for maintenance. Enable FAN with Oracle Connection Pools or Connection tests (or both). Oracle connection pools with FAN are the best solution as pools provides full life cycle of session movement. That is, draining and rebalancing of work as maintenance progresses. When using FAN, return your connections to the pool. If you are using server draining (the alternate plan explained later in this paper) and your test is not a standard test, add your test to the server using DBMS_APP_CONT_ADMIN. Sessions that do not drain within the DRAIN_TIMEOUT will be failed over. 5. The standard solution for failing over sessions is Transparent Application Continuity (TAC). Use Application Continuity (AC) if you are using Oracle Database 12c Release 2, or you want to customize with side e ff ects or callbacks, or have an application that uses state such as temporary tables and never cleans up. Use Transparent Application Failover (TAF) if your application is read only and does not change Oracle session state in the session after the initial setup. BUILDING BLOCKS FOR CONTINUOUS AVAILABILITY OF YOUR APPLICATION : Use Services Service is a logical abstraction for managing work. Services allow applications to bene fi t from the reliability of the redundant parts of the MAA system. The services hide the complexity of the underlying system from the client by providing a single system image for managing work. The service is : • A unit for management: a handful of services are manageable, many nodes, instances, listeners, and network interfaces are not manageable. The service provides location transparency for sites and databases. • A unit for availability: resources are recovered quickly, independently and in parallel for each service and without the need to start entire software stacks; and • A unit for performance: work is routed transparently across the MAA system according to service quality and priority. Services are measured against service level thresholds and violations are reported to management with advised solutions in AWR. Con fi gure the Connection String or URL for High Availability: All Oracle-supplied connect strings will conform to the following recommendations. There is no need to do anything if you use the Oracle-supplied wallet. The following TNS/URL con fi guration is recommended for use for connecting at failover, switchover, fallback and basic startup. Set RETRY_COUNT, RETRY_DELAY, CONNECT_TIMEOUT and TRANSPORT_CONNECT_TIMEOUT parameters in the tnsnames.ora fi le or the URL to allow connection requests to wait for the service and connect successfully. Connection attempts and retries are managed by Oracle Database Net Services. Set CONNECT_TIMEOUT to a high value to prevent login storms such as 90s or 120s.
  • 4. Low values can result in frenzied login-attempts due to the application or pool cancelling and retrying connection attempts. Do not set (RETRY_COUNT+1)*RETRY_DELAY or CONNECT_TIMEOUT larger than your response time SLA. The application should either connect or receive an error within the response time SLA. These are general recommendations for con fi guring the connections for high availability. Do not use Easy Connect Naming on the client as EZCONNECT has no high availability capabilities. Note that the standby-scan speci fi ed below refers to the SCAN address available on the STANDBY site speci fi ed in your (Active) Data Guard con fi guration. Attempt will be made to connect to the PRIMARY site fi rst, and if the service is not available, attempt to connect to this service at the standby. Once the location of the service is known, Oracle drivers 12.2 and later remember the address_list with that service o ff ered and chooses this fi rst, until the service next moves. Adding the standby-scan to TNS connection descriptor to transparently fail over to the standby- scan is optional. Failing over to a standby database within the same region will have acceptable performance in most cases versus failing over to a standby database in a di ff erent region where additional network latency may result in unacceptable response time performance. In the latter case, a site failover operation will be required which involves DNS failover to another region containing mid-tier resources and standby database. Use this Connection String for ALL Oracle driver version 12.2 or higher: Alias (or URL) = (DESCRIPTION = (CONNECT_TIMEOUT= 90)(RETRY_COUNT=50) (RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3) (ADDRESS_LIST = (LOAD_BALANCE=on) (ADDRESS = (PROTOCOL = TCP)(HOST=primary-scan)(PORT=1521))) (ADDRESS_LIST = (LOAD_BALANCE=on) (ADDRESS = (PROTOCOL = TCP)(HOST=standby- scan)(PORT=1521))) (CONNECT_DATA=(SERVICE_NAME = YOUR SERVICE))) FAST APPLICATION NOTIFICATION (FAN) : FAN must be used. FAN is a required component for interrupting the application to failover. Without FAN, applications can hang on TCP/IP timeout following hardware and network failures, and omit to rebalance when resources resume. All Oracle pools and all Oracle application servers use FAN. Third-party JAVA application servers can use UCP to enable FAN. No application changes are required to use FAN. These are con fi guration changes only. For continuous service during planned maintenance, use FAN with: • Oracle pools or • UCP with third-party JDBC application servers or • The latest Oracle client drivers For continuous service during unplanned outages, use FAN with • Application Continuity or • Transparent Application Continuity The format of the ADDRESS_LIST described above is important for a number of reasons, one being that this format allows for autocon fi guration of ONS. ONS is used to propagate FAN events to mid-tier pools and clients. When a database connection is made, database-tier ONS information is sent back to the mid-tier, allowing the mid-tier to establish ONS communication paths automatically. This means that con fi guration can be dynamic and need not be maintained at the mid-tier. ONS connections will be made from both the primary and standby sites. To setup FAN ON clients: Use the TNS alias or URL shown in the this discussion. This connection string is used to auto-con fi gure the Oracle Noti fi cation Service (ONS) subscription at the client for FAN-event receipt when using an Oracle Database 12c or later client driver. For older drivers, refer to the FAN white paper in the Appendix for con fi guration details. ONS provides a secure communication path between the database tier and the client-tier allowing the client to be noti fi ed of service availability (components stopping or starting) as well as runtime load balancing advice for better work placement during normal operation.
  • 5. To enable FAN in the client: 1. Use the TNS alias or URL provided. This connect string will auto-con fi gure ONS (auto-ONS) subscription at the client for FAN event receipt when using a 12c driver or later. For older drivers, refer to the FAN white paper. 2. Depending on the client, enable FAN in the application con fi guration properties as follows 2a-Universal Connection Pool or JDBC thin driver (starting 12.2) : Set the property FastConnectionFailoverEnabled 2b-WebLogic Active GridLink for Oracle RAC : FAN and Fast Connection Failover are enabled by default 2c-Oracle WebLogic Server, IBM WebSphere, IBM Liberty, Apache Tomcat, Red Hat WildFly (JBoss), JDBC Applications : Use Universal Connection Pool as a connection pool replacement 3c- ODP.Net clients (Managed and Unmanaged Providers) Set “HA events = true;pooling=true” in the connect string if using ODP.Net 12.1 or earlier 4c-Oracle Call Interface (OCI) clients and OCI-based drivers : Oracle Call Interface (OCI) clients without native settings can use oraacces.xml and set events to true Python, Node.js and PHP have native options. In Python and Node.js you can set an events mode when creating a connection pool. In PHP, edit php.ini add the entry oci8.events=on SQL*Plus enables FAN by default 3. You can con fi gure ONS to use TLS (wallet-based) authentication. For JDBC applications: a) Ensure the following JAR fi les are present in your application’s CLASSPATH : (ons.jar,osdt_cert.jar,osdt_core.jar,oraclepki.jar). The Oracle ATP-D service always uses TLS.
  • 6. b) Specify the wallet for FAN do one of the following: • To use AUTO-ONS with wallets an application must set the following Java system properties: “- Doracle.ons.wallet fi le=/replace_this_with_host_path/onswallet” and “Doracle.ons.walletpassword=myONSwalletPassword” This cannot be set on a per-pool or per-connection basis • To use explicit ONS con fi guration (instead of AUTO-ONS) do one of the following: i) Programmatically within UCP, call setONSCon fi guration(), for example ii) Set declaratively using an UCP XML con fi guration fi le: ii) For Oracle Call Interface (OCI) applications Oracle client drivers 12.2 or later are required: The <wallet_location> path should be the name of the directory containing the wallet. Other parameters may be set in the ons section of oraaccess.xml, including <hosts>, <max_connections>, and <subscription_wait_timeout>. Drivers that support native event setting controls may omit the <events> section and use the driver setting instead. By default connections will be established to the database even if ONS fails. If you prefer connections to fail in this scenario, you can add a section to the same level as <events> and <ons>:
  • 7. <fan> <subscription_failure_action> error </subscription_failure_action> </fan> Place the oraaccess.xml fi le in the same directory as the tnsnames.ora and sqlnet.ora network fi les. For example, when using Oracle Instant Client these fi les might be in the default directory network/admin. Alternatively, all network con fi guration fi les can be put in another accessible directory. Then set the environment variable TNS_ADMIN to that directory name. FAN and Draining Service status changes like stopping and starting the service can be propagated to Oracle client drivers and Oracle connection pools using FAN, a high-availability noti fi cation mechanism. FAN events are published using Oracle Noti fi cation Service (ONS). The clients take immediate action accordingly, e.g., to balance connections or to establish new connections and route work to available database instances or where the service is started. Draining means allowing connected sessions to complete their current work in a speci fi c duration (drain timeout) before maintenance is started. Set the database service attribute noti fi cation to true to enable FAN and drain_timeout to time_in_seconds to enable draining: srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> -preferred <instance_name> -available <instance_name> -stopoption IMMEDIATE -noti fi cation true -drain_timeout 300 #example srvctl add service -db RACCDB_POP -pdb PDB1 -service drainsrv -preferred RACCDB1,RACCDB2 -stopoption IMMEDIATE -noti fi cation true -drain_timeout 300 #start the service srvctl start service -db RACCDB_POP -service drainsrv #check status srvctl status service -db RACCDB_POP -service drainsrv Service drainsrv is running on instance(s) RACCDB1,RACCDB2 When you stop the service on instance 1, RACCDB1: #stop service on instance 1 srvctl stop service -db RACCDB_POP -service drainsrv -instance RACCDB1 #check status srvctl status service -db RACCDB_fra -service drainsrv Service drainsrv is running on instance(s) RACCDB2 You can specify a di ff erent drain_timeout when stopping the service: srvctl stop service -db RACCDB_POP -service drainsrv -instance RACCDB1 -drain_timeout 10 New connections will immediately be directed to instance 2. Existing sessions to instance 1 remain connected to instance 1 for the duration of drain_timeout and can fi nish their work (commit) during this time without any interruption.
  • 8. Ideally, all sessions drain (= fi nish their work) within the drain_timeout speci fi ed. You can set the drain_timeout as high as needed for your application, e.g., 30 minutes or 1 hour. After that, the database node will be ready for maintenance, and the application will not encounter any interruption if all sessions drain. Transaction Guard: Transaction Guard (TG) is a database feature for applications to ensure that every transaction has been executed at most once in case of planned or unplanned outage. In the background each transaction is associated with a logical transaction IDthat let you determine, afterward, if the transaction has committed and has been completed. Without Transaction Guard (TG) the typical example that is given is a transaction that is increasing by 5% the salary of all company employees. If the application fail, outside of the fact that you will make happy employees by giving them Double% you cannot rerun blindly the transaction from scratch. With Transaction Guard you can know from application layout how it went and present a warning or gray the submit button to avoid a second dramatic rerun. By default the Java connection has autocommit property set to TRUE From Oracle Technology Network the requirements for TG are: -Use Oracle Database Release 12.1 or later. -Use an application service for all database work. Create the service using srvctl if using RAC or DBMS_SERVICE if not using RAC. You may also use GDSCTL. -Set the following properties on the service – COMMIT_OUTCOME = TRUE for Transaction Guard. -Grant execute permission on DBMS_APP_CONT package to the application user. -Increase DDL_LOCK_TIMEOUT if using Transaction Guard with DDL statements (for example, 10 seconds). Problem That Transaction Guard Solves In applications without Transaction Guard, a fundamental problem for recovering applications after an outage is that the commit message that is sent back to the client is not durable. If there is a break between the client and the server, the client sees an error message indicating that the communication failed. This error does not inform the application if the submission executed any commit operations, if a procedural call completed and executed all expected commits and session state changes, or if a call failed part way through or, yet worse, is still running disconnected from the client. Without Transaction Guard, it is impossible or extremely di ffi cult to determine the outcome of the last commit operation, in a guaranteed and scalable manner, after a communication failure to the server. If an application must determine whether the submission to the database was committed, the application must add custom exception code to query the outcome for every possible commit point in the application. Given that a system can fail anywhere, this is almost impractical because the query must be speci fi c to each submission. After an application is built and is in production, this is completely impractical. Moreover, a query cannot give the correct answer because the transaction could commit immediately after that query executed. Indeed, after a communication failure the server may still be running the submission not yet aware that the client has disconnected. For PL/SQL or Java in the database, for a procedural submission, there is also no record as to whether that submission ran to completion or was aborted part way through. While such a procedure may have committed, subsequent work may not have been done for the procedure. Failing to recognize that the last submission has committed, or will commit sometime soon or has not run to completion, can lead applications that attempt to replay, thus causing duplicate transaction submissions and other forms of "logical corruption" because the software might try to reissue already persisted changes. Without Transaction Guard, if a transaction has been started and commit has been issued, the commit message that is sent back to the client is not durable. The client is left not knowing whether the transaction committed. The transaction cannot be validly resubmitted if the nontransactional state is incorrect or if it already committed. In the absence of guaranteed commit
  • 9. and completion information, resubmission can lead to transactions applied more than once and in a session with the incorrect state. Solution That Transaction Guard Provides E ff ective with Oracle Database 12c Release 1 (12.1.0.1), Transaction Guard provides new, integrated tools for applications to use to achieve idempotence automatically and transparently, and in a manner that scales. Its key features are the following: • Durability of COMMIT outcome by saving a logical transaction identi fi er (LTXID) at commit for all supported transaction types against the database (Oracle Database 12cRelease 1 (12.1.0.1) or later). This includes idempotence for transactions executed using autocommit, from inside PL/ SQL, from remote transactions, One-Phase XA transactions, and from callouts that cannot otherwise be identi fi ed using generic means. • Use of the LTXID to support at-most-once execution semantics, such that database transactions protected by logical transaction identi fi ers cannot be duplicated when there are multiple copies of that transaction in fl ight identi fi ed by the LTXID. • Blocking of a commit of in- fl ight work to ensure that regardless of the outage situation, another submission of the same transaction protected by that LTXID cannot commit. • Identi fi cation of whether work committed at an LTXID was committed as part of a top-level call (client to server), or was embedded in a procedure (such as PL/SQL) at the server. An embedded commit state indicates that while a commit completed, the entire procedure in which the commit executed has not yet run to completion. Any work beyond the commit cannot be guaranteed to have completed until that procedure itself returns to the database engine. • Identi fi cation of whether the database to which the commit resolution is directed is ahead of, in sync with, or behind the original submission, and rejection when there are gaps in the submission sequence of transactions from a client. It is considered an error to attempt to obtain an outcome if the server or client are not in sync on an LTXID sequence. • A callback on the JDBC Thin client driver that fi res when the LTXID changes. This can be used by higher layer applications such as WebLogic Server and third parties to maintain the current LTXID ready to use if needed. • Namespace uniqueness across globally disparate databases and across databases that are consolidated into a Multitenant infrastructure. This includes Oracle Real Application Clusters (Oracle RAC) and RAC One, Data Guard, and Multitenant databases. • Service name uniqueness across global databases and across databases that are consolidated into a Multitenant infrastructure. This ensures that connections are properly directed to the transaction information. At-Most-Once Execution Transaction Guard uses the logical transaction identi fi er (LTXID) to avoid duplicate transactions. This ability to ensure at most one execution of a transaction is referred to as transaction idempotence. The LTXID is persisted on commit and is reused following a rollback. During normal runtime, an LTXID is automatically held in the session at both the client and server for each database transaction. At commit, the LTXID is persisted as part of committing the transaction. The at-most-once protocol requires that the database maintain the LTXID for the retention period agreed for replay. The default retention period is 24 hours, although you might need a shorter or longer period, conceivably even a week or longer. The longer the retention period, the longer the at-most-once check blocks an old transaction using an old LTXID from replay. The setting is available on each service. When multiple databases are involved, as is the case when using Data Guard and Active Data Guard, the LTXID is replicated to each database involved through the use of redo.
  • 10. The getLTXID API, provided for Oracle JDBC Thin (with similar APIs for OCI, OCCI, and ODP.NET clients), lets an application retrieve the logical transaction identi fi er that was in use on the dead session. This is needed to determine the status of this last transaction. The DBMS_APP_CONT.GET_LTXID_OUTCOME PL/SQL subprogram lets an application fi nd the outcome of an action for a speci fi ed logical transaction identi fi er. Calling DBMS_APP_CONT.GET_LTXID_OUTCOME may involve the server blocking the LTXID from committing so that the outcome is known. This is a requirement if a transaction using that LTXID is in fl ight or is about to commit. An application using Transaction Guard obtains the LTXID following a recoverable error, and then calls DBMS_APP_CONT.GET_LTXID_OUTCOMEbefore attempting a replay. The Oracle PL/SQL supplied package to know if a transaction has committed and has completed is DBMS_APP_CONT.GET_LTXID_OUTCOME: SQL> DESC DBMS_APP_CONT PROCEDURE GET_LTXID_OUTCOME Argument TYPE IN/OUT --------- --------- ------ CLIENT_LTXID RAW IN COMMITTED BOOLEAN OUT USER_CALL_COMPLETED BOOLEAN OUT This procedure has two Boolean out parameters and until recently it was possible to map a PL/SQL Boolean in Java from Oracle 12cR2 above, are wrapping this procedure in a PL/SQL block to return numbers instead of Boolean. Application Continuity: What happens to sessions with in- fl ight work that do not complete their work within drain_timeout? This is where Application Continuity comes into play! Application Continuity will restore the session state and replay the in- fl ight work on a servicing instance transparently for the application. The database service needs to have the Application Continuity attributes set: srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> -preferred <instance_name> -available <instance_name> -stopoption IMMEDIATE -noti fi cation true -drain_timeout 300 -commit_outcome true -retention 86400 -replay_init_time 600 -session_state DYNAMIC -failover_restore LEVEL1 -failovertype TRANSACTION For Transparent Application Continuity (an operation mode of Application Continuity) set failover_restore and failovertype to AUTO: srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> ...
  • 11. -session_state defaults to AUTO if failover_type is set to AUTO -failover_restore AUTO -failovertype AUTO A key di ff erence is that Application Continuity (AC) requires the application to use a Connection Pool, while Transparent Application Continuity(TAC) works with or without Connection Pools. Explanation of the following attributes: • commit_outcome set to true enables Transaction Guard, which is required by Application Continuity and Transparent Application Continuity. • retention determines the amount of time (in seconds) that the commit outcome is retained in the database. • replay_init_time speci fi es the di ff erence between the time, in seconds, of the original execution of the fi rst operation of a request and the time that the replay is ready to start after a successful reconnect. Application Continuity will not replay after the speci fi ed amount of time has passed. This parameter is intended to avoid the unintentional execution of a transaction when a system is recovered after a long period. • session_state describes how the non-transactional session state is changed by the application within a request. Examples of the session states are NLS settings, optimizer preferences, event settings, PL/SQL global variables, and temporary tables. Services for Active Data Guard Create role-based services when using Active Data Guard. Set the role attribute to primary for services to be used for the primary site (read/write) and standby for services to be open on the standby site (read-only). The services start and stop automatically at a site based on their role. srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> ... -failover_restore AUTO -failovertype AUTO -role primary Create the same service name with the same role on both primary and standby sites, e.g.: #on primary and standby sites srvctl add service -db RACCDB_POP -pdb pdb1 -service tacsrv -preferred RACCDB1,RACCDB2 -stopoption IMMEDIATE -noti fi cation true -drain_timeout 300 -commit_outcome true -retention 86400 -replay_init_time 600 -session_state AUTO -failover_restore AUTO -failovertype AUTO -role primary #on primary site srvctl start service -db RACCDB_POP -service tacsrv Note that we didn’t start the service on the standby site. The service will automatically start after a switchover or failover operation. Note 1: When using Active Data Guard with single-instance databases, remove the preferred and available attributes that are related to Oracle RAC databases. Note 2: To create the same service name on both sites in Oracle Cloud, include the domain name in the service, e.g., tacsrv.company.com,
  • 12. Using the default PDB or DB name is not recommended, as high availability attributes cannot be set for these services. Always create a custom service for your application and set high availability attributes accordingly. Draining in Data Guard Environments To allow services to drain during switchover operations, use the WAIT option of the switchover Data Gaurd broker command: DGMGRL> switchover to <standby_db_unique_name> wait; --example DGMGRL> switchover to CDB01_POP wait; Stopping services and waiting for sessions to drain... The broker determines the maximum drain_timeout value for all active services and waits for up to that amount of time for all current client requests to be processed, before proceeding with the switchover. Alternatively, you can specify the number of seconds to wait for sessions to drain. DGMGRL> switchover to <standby_db_unique_name> wait <timeout_in_seconds>; Services in Autonomous Database Autonomous Database provides prede fi ned services with di ff erent characteristics to be used for applications depending on their requirements. (Transparent) Application Continuity can be enabled as follows: https://docs.oracle.com/en/cloud/paas/autonomous-database/adbsa/application-continuity- con fi gure.html Connection String The application should use the recommended connection string con fi gured for high availability: Alias = (DESCRIPTION = (CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3) (TRANSPORT_CONNECT_TIMEOUT=3) (ADDRESS_LIST = (LOAD_BALANCE=ON) (ADDRESS = (PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521))) (ADDRESS_LIST = (LOAD_BALANCE=ON) (ADDRESS = (PROTOCOL=TCP)(HOST=secondary-scan)(PORT=1521))) (CONNECT_DATA = (SERVICE_NAME=gold-cloud))) How To Con fi gure Server Side Transparent Application Failover This document provides basic instructions for con fi guring Server side (versus client) TAF or Transparant Application Failover. Server side TAF settings override client-side counterparts that might be con fi gured in TNS connect descriptors. If TAF is not con fi gured on the client side, then at a minimum, the failover type must be set to enable TAF. If the failover type is set on the server side, then the failover method will default to BASIC. Delay and retries are optional and may be speci fi ed independently. SOLUTION 1. Create a service on the RAC cluster to setup for TAF. Example creates a service called server_taf to the database called rac, with instance names rac1 & rac2.
  • 13. Please note for the service name, select a name that is unique and not the same as the default service name created. A special Oracle database service is created by default for your Oracle RAC database. This default service is always available on all instances in an Oracle RAC environment, unless an instance is in restricted mode. You cannot alter this service or its properties. $srvctl add service -d rac -s server_taf -r "rac1,rac2" -P BASIC 2. Start the service server_taf $srvctl start service -d rac -s server_taf 3. Check service is running $srvctl con fi g service -d rac ractest PREF: rac1 rac2 AVAIL: server_taf PREF: rac1 rac2 AVAIL: 4. Find the service_id value for the service just created sqlplus /nolog Connect / as sysdba SQL> select name,service_id from dba_services where name = 'server_taf'; NAME SERVICE_ID -------------- ---------- server_taf 6 5. Review the standard setup for the services SQL>col name format a15 col failover_method format a11 heading 'METHOD' col failover_type format a10 heading 'TYPE' col failover_retries format 9999999 heading 'RETRIES' col goal format a10 col clb_goal format a8 col AQ_HA_NOTIFICATIONS format a5 heading 'AQNOT' SQL>select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_noti fi catio ns from dba_services where service_id = 6 NAME METHOD TYPE RETRIES GOAL CLB_GOAL AQNOT --------- ----------- ---------- -------- ---------- -------- ----- server_taf LONG NO Please note there is no values for method, type or retries. These are required to do server side TAF. The cause of this problem has been identi fi ed and veri fi ed in an unpublished Bug 6886239
  • 14. DBMS_SERVICE parameters are not added using srvctl add service. This is fi xed in release 11.2 onwards. 6. Add the server side failover parameters to the service. (📍 Pre 11.2) Server side TAF method is BASIC. BASIC is the only value currently supported. This means that a new connection is established at failure time. It is not possible to pre-establish a backup connection. (which is to say, PRECONNECT is not supported) SQL> execute dbms_service.modify_service (service_name => 'server_taf' - , aq_ha_noti fi cations => true - , failover_method => dbms_service.failover_method_basic - , failover_type => dbms_service.failover_type_select - , failover_retries => 180 - , failover_delay => 5 - , clb_goal => dbms_service.clb_goal_long); PL/SQL procedure successfully completed. 📍 For 11.2 version and newer use SVRCTL to modify the service srvctl modify service -d RAC -s server_taf -m BASIC -e SELECT -q TRUE -j LONG Service can be checked with the command: $srvctl con fi g service -d RAC Service name: server_taf Service is enabled Server pool: RAC_server_taf Cardinality: 2 Disconnect: false Service role: PRIMARY Management policy: AUTOMATIC DTP transaction: false AQ HA noti fi cations: true Failover type: SELECT Failover method: BASIC TAF failover retries: 0 TAF failover delay: 0 Connection Load Balancing Goal: LONG Runtime Load Balancing Goal: NONE TAF policy speci fi cation: BASIC Edition: Preferred instances: RAC1,RAC2 Available instances: 7. Check the service and we can now see values for Method, Type and Retries. SQL>select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_noti fi catio ns from dba_services where service_id = 6 NAME METHOD TYPE RETRIES GOAL CLB_GOAL AQNOT --------- ----------- ---------- -------- ---------- -------- ----- server_taf BASIC SELECT 180 NONE LONG YES
  • 15. 8. Check the listener has the service registered. (output will look similar too following, depending on version used) $lsnrctl services Service "server_taf.za.oracle.com" has 2 instance(s). Instance "rac1", status READY, has 2 handler(s) for this service... Handler(s): "DEDICATED" established:0 refused:0 state:ready REMOTE SERVER (ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)) "DEDICATED" established:0 refused:0 state:ready LOCAL SERVER Instance "rac2", status READY, has 1 handler(s) for this service... Handler(s): "DEDICATED" established:0 refused:0 state:ready REMOTE SERVER (ADDRESS=(PROTOCOL=TCP)(HOST=node2)(PORT=1521)) 9. Create a net service name. Here we have client load balancing between the two nodes. SERVERTAF = (DESCRIPTION = (LOAD_BALANCE = yes) (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521)) (CONNECT_DATA = (SERVICE_NAME = server_taf.oracle.com) ) ) 10. Testing... SQL> select host_name,instance_name from v$instance; HOST_NAME INSTANCE_NAME ---------------- -------------------- node2 rac2 11. Shutdown the database in the node the connection has routed to SQL> select instance_name from V$instance; INSTANCE_NAME ---------------- rac2 SQL> shutdown abort; ORACLE instance shut down. 12. TAF will now kick in SQL> / HOST_NAME INSTANCE_NAME ---------------- ---------------- node1 rac1
  • 16. How to Setup SCAN Listener and Client for TAF and Load Balancing Setup SCAN listener and Clients to use SCAN listener to connect to RAC using SCAN. SOLUTION For the purposes of this document, the following fi ctitious environment is used as an example to describe the procedure: Service name : MYRAC, MYRAC_TAF and TEST Verify SCAN Listener Con fi guration on Server After grid installation is completed you can verify the SCAN Listener con fi guration on your server - •At the $GRID_HOME/network/admin directory you will have two listener related fi les. -rw-r--r-- 1 grid oinstall 887 Jul 13 09:33 listener.ora -rw-r--r-- 1 grid oinstall 375 Jul 13 09:33 endpoints_listener.ora example of listener.ora LISTENER_SCAN3=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC) (KEY=LISTENER_SCAN3)))) # line added by Agent LISTENER_SCAN2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC) (KEY=LISTENER_SCAN2)))) # line added by Agent LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))) # line added by Agent LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC) (KEY=LISTENER_SCAN1)))) # line added by Agent ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON # line added by Agent ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON # line added by Agent ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN2=ON # line added by Agent ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN3=ON # line added by Agent example of endpoints_listener.ora
  • 17. LISTENER_DBGRAC1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP) (HOST=xxxxx)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=xx.xx.xx.xx.x.xx)(PORT=1521) (IP=FIRST)))) # line added by Agent Check Status of SCAN IPs and SCAN Listener srvctl command can be used to check the status of SCAN IPs and SCAN listener - $ srvctl status scan SCAN VIP scan1 is enabled SCAN VIP scan1 is running on node myrac2 SCAN VIP scan2 is enabled SCAN VIP scan2 is running on node myrac1 SCAN VIP scan3 is enabled SCAN VIP scan3 is running on node myrac1 Note that two SCAN IPs are online on node myrac1 and one is online on myrac2. Respective SCAN listeners will be online on the nodes where the corresponding SCAN IP is online. $ srvctl status scan_listener SCAN Listener LISTENER_SCAN1 is enabled SCAN listener LISTENER_SCAN1 is running on node myrac2 SCAN Listener LISTENER_SCAN2 is enabled SCAN listener LISTENER_SCAN2 is running on node myrac1 SCAN Listener LISTENER_SCAN3 is enabled SCAN listener LISTENER_SCAN3 is running on node myrac1 You can verify the instance registration with SCAN listeners using lsnrctl command - $ lsnrctl status listener_scan2 LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 28-AUG-2010 12:53:56 Copyright (c) 1991, 2009, Oracle. All rights reserved. Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2))) STATUS of the LISTENER ------------------------ Alias LISTENER_SCAN2 Version TNSLSNR for Linux: Version 11.2.0.1.0 - Production Start Date 23-AUG-2010 10:15:16 Uptime 5 days 2 hr. 38 min. 40 sec Trace Level off Security ON: Local OS Authentication SNMP OFF Listener Parameter File /u01/app/11.2.0/grid/network/admin/listener.ora
  • 18. Listener Log File /u01/app/11.2.0/grid/log/diag/tnslsnr/myrac1/ listener_scan2/alert/log.xml Listening Endpoints Summary... (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2))) (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xxx.xxx.xx)(PORT=1521))) Services Summary... Service "myrac" has 2 instance(s). Instance "myrac1", status READY, has 2 handler(s) for this service... Instance "myrac2", status READY, has 2 handler(s) for this service... Service "myracXDB" has 2 instance(s). Instance "myrac1", status READY, has 1 handler(s) for this service... Instance "myrac2", status READY, has 1 handler(s) for this service... The command completed successfully Note that above output shows that the listener "listener_scan2" is running on myrac1 and both the instances are registered with this listener. Each instance within the cluster registers itself with all the SCAN listener using remote_listener setting. For e.g NAME TYPE VALUE -------------- ----------- ---------- local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP) HOST=xxxxx)(PORT=1521)))) remote_listener string xxxxx:1521 The LOCAL_LISTENER will always point to the local VIP for the node while the REMOTE_LISTENER will point to <SCAN_NAME>:<PORT> You can use srvctl command to change or edit the SCAN listener settings. Refer Note 972500.1 : How to Modify SCAN Setting after Installation. Con fi gure Client to Connect Using SCAN To con fi gure client to connect using SCAN, con fi gure the tnsnames.ora. For e.g - MYRAC = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxxx)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = myrac) ) )
  • 19. Con fi gure Clients for Transparent Application Failover (TAF) Using SCAN The above tns entry will connect to cluster using SCAN name "myrac-cluster-scan" to service "myrac" and assumes that the service is con fi gured for server side TAF. The default Service created at the time of installation and creation of RAC database is not TAF enabled. You can create a TAF enabled service using srvctl command. Refer Note 460982.1: How To Con fi gure Server Side Transparent Application Failover If you wish to use client side TAF setting for clients connecting using SCAN name, you can con fi gure tnsnames.ora entry at the client side as - MYRAC_TAF = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxxx)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = myrac) (FAILOVER_MODE= (TYPE=select) (METHOD=basic)))) You can verify the client connections to database for TAF using : SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER; MACHINE FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER COUNT(*) ----- ------- ------- ----- --- XXXX SELECT BASIC NO 1 Older supported client versions which are using VIP Address TAF enabled connect string can continue to use VIP to connect to 11gR2 RAC setup. Client Side Load Balancing For clients connecting to RAC database via SCAN, the client side load balancing happens during the connection phase when the SCAN name gets resolved. Client side sqlnet trace can be used to verify this. For e.g
  • 20. (3086231232) [28-AUG-2013 14:13:06:118] nlad_expand_hst: Expanding xxxxx (3086231232) [28-AUG-2013 14:13:06:118] snlinGetAddrInfo: entry (3086231232) [28-AUG-2013 14:13:11:119] snlinGetAddrInfo: exit (3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: entry (3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: exit (3086231232) [28-AUG-2013 14:13:11:119] nlad_expand_hst: Adding IP xxx.xxx.xxx.xx (3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: entry (3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: exit (3086231232) [28-AUG-2013 14:13:11:119] nlad_expand_hst: Adding IP xxx.xxx.xxx.xx (3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: entry (3086231232) [28-AUG-2013 14:13:11:119] snlinGetNameInfo: exit (3086231232) [28-AUG-2013 14:13:11:119] nlad_expand_hst: Adding IP xxx.xxx.xxx.xx (3086231232) [28-AUG-2013 14:13:11:119] snlinFreeAddrInfo: entry Note that the name "xxxxx" was resolved to three di ff erent IPs, client will randomly pick any of the IP for the connection to achieve client side load balancing. A pre-11g Release 2 client (Oracle Database 11g Release 1 or Oracle Database 10g Release 2, or older) will not fully bene fi t from the advantages of SCAN. For 12c and newer you can use sacn name and there us not challenging. The older version Oracle Client will not be able to handle a set of three IPs returned by the DNS for SCAN. Hence, it will try to connect to only the fi rst address returned in the list and will more or less ignore the others. If the SCAN Listener listening on this speci fi c IP is not available or the IP itself is not available, the connection will fail. To workaround this problem : •If you are using static SCAN IPs, include the address of all SCAN IPs in the tnsnames.ora entry. TEST = (DESCRIPTION = (ADDRESS_LIST = (LOAD_BALANCE=ON) (FAILOVER=ON) (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP1>)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP2>)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP3>)(PORT = 1521)) ) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = myrac)
  • 21. (FAILOVER_MODE= (TYPE=select) (METHOD=basic)))) 12c above: TEST = (DESCRIPTION = (ADDRESS_LIST = (LOAD_BALANCE=ON) (FAILOVER=ON) (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-ip>)(PORT = 1521)) ) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = myrac) (FAILOVER_MODE= (TYPE=select) (METHOD=basic)))) •If you are using GNS (dynamic SCAN IPs), use VIP listener for the connections. However in this case the remote_listener parameter on each node should be set accordingly so that all instance cross registers with each VIP listener within the cluster. Conclusion: Always create and use custom database services for your application, as these allow you to set high availability attributes for Fast Application Noti fi cation (FAN), Draining, and Application Continuity. In Data Guard environments, use role-based database services and create a service with the same name on all sites. Specify the WAIT option in the Data Guard broker switchover command to allow services to drain during switchover operations. Oracle Application Continuity hides database interruptions from end-users by replaying in- fl ight transactions on available database instances, completely transparent to end-users and without any application code changes needed. the Java code provided is an example to get you started using Oracle Application Continuity with Java. For production implementations, please follow the instructions and recommendations provided by Oracle. Sincerely, Alireza Kamrani