Failover for Oracle RAC using DataDirect Connect for JDBC

Oracle RAC systems offer two failover methods that provide reliable access to data:

  • Connection failover. If a connection failure occurs at connect time, the application can fail over the connection to another active node in the cluster. Connection failover ensures that an open route to your data is always available, even when server downtime occurs.
  • Transparent Application Failover (TAF). If a communication link failure occurs after a connection is established, the connection fails over to another active node. Any disrupted transactions are rolled back, and session properties and server-side program variables are lost. In some cases, if the statement executing at the time of the failover is a Select statement, that statement may be automatically re-executed on the new connection with the cursor positioned on the row on which it was positioned prior to the failover.

Both connection failover and TAF provide a connection retry feature that allows a connection to be retried automatically until a connection with another RAC node is successfully re-established.

The primary difference between connection failover and TAF is that the former method provides protection for connections at connect time and the latter method provides protection for connections that have already been established. Also, because the state of the transaction must be stored at all times, TAF requires more processing overhead than connection failover.

Connection Failover

Enabling connection failover allows a driver to attempt to connect on another node if the connection attempt on one node fails. When an application requests a connection to an Oracle database server via the driver, the driver does not connect to the database server directly. Instead, the driver sends a connection request to a listener process, which forwards the request to the appropriate Oracle database instance.

In an Oracle RAC system, each active Oracle database instance registers with each listener configured for the Oracle RAC. For example, for the Oracle RAC nodes A, B, and C in Figure 2, Instances A, B, and C are each registered with Listeners A, B, and C. If the service name in the connection request specifies the RAC system database name, the requested listener selects one of the registered instances, based on the load each instance is experiencing, and forwards the connection request to it. For example, if Instances A and B are operating under a heavy load, a connection request to Listener A is forwarded to Instance C.

Figure 2: Connection Routing in an Oracle RAC System

Because the requested listener selects from a set of active instances in the RAC to forward connection requests to, it should not route the connection request to an instance that is not running. You may think that connection failover is not needed in an Oracle RAC system; however, if the requested listener is down or the timing of an instance going down is such that the requested listener is not yet aware that an instance is down, the connection request can fail.

The connection failover feature provided by the DataDirect Connect for JDBC Oracle driver handles the case where the requested listener or the server selected by the listener is down by allowing you to specify multiple listeners to which to connect. For example, as shown in Figure 3, if Listener A is down, the DataDirect Connect for JDBC driver can be configured to try Listener B, and then Listener C.

Figure 3: Oracle RAC with Connection Failover

Connection failover provides protection for new connections only and does not preserve states for transactions or queries, so your application needs to provide failure recovery for transactions and queries.

This feature is configured through the AlternateServers connection property of the driver using a connection URL or data source, or through the tnsnames.ora file. The following example shows a connection URL that enables connection failover for the DataDirect Connect for JDBC Oracle driver:

jdbc:datadirect:oracle://serverA:1521;ServiceName=TEST;
AlternateServers=(serverB:1521,serverC:1521)
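
As an illustration only, the following Java sketch shows one way an application might assemble a URL of this form and pass it to the standard JDBC DriverManager. The class FailoverUrl and its methods are hypothetical helpers, not part of the driver, and the user name and password are placeholders. It assumes the driver JAR is on the classpath when connect is called.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper (not part of the driver) that assembles a failover URL.
final class FailoverUrl {
    private FailoverUrl() {}

    /**
     * Builds a connection URL of the form shown in the example.
     * The primary server and each alternate are given as "host:port".
     */
    static String build(String primary, String serviceName, List<String> alternates) {
        StringBuilder url = new StringBuilder("jdbc:datadirect:oracle://")
                .append(primary)
                .append(";ServiceName=").append(serviceName);
        if (!alternates.isEmpty()) {
            // Alternates are tried only if the primary server cannot be reached.
            url.append(";AlternateServers=(")
               .append(String.join(",", alternates))
               .append(')');
        }
        return url.toString();
    }

    /** Opens a connection through the standard JDBC DriverManager. */
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }
}
```

Calling FailoverUrl.build("serverA:1521", "TEST", Arrays.asList("serverB:1521", "serverC:1521")) produces the same URL as the example, written on a single line.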

Transparent Application Failover (TAF)

With TAF, if a communication link failure occurs after a connection is established, the connection is moved to another active Oracle RAC node in the cluster without the application having to re-establish the connection. For example, suppose you have the Oracle RAC environment shown in Figure 4 with multiple connections to Oracle RAC nodes: A, B, and C. As shown in the first case, connections are distributed among the nodes in an Oracle RAC system.

Figure 4: Transparent Application Failover (TAF)

When a communication link failure occurs between an Oracle node and the application as shown in the second case, the driver automatically switches the connection to another available node.

When a user session fails over to an alternate RAC node, the following items are not persisted to the failover node and must be reinitialized by the application:

  • In-use stored procedures
  • Application changes to session state
  • In-flight "write" transactions (local transactions doing database updates)
  • Global transactions

Although Oracle documentation refers to this functionality as transparent, the preceding list shows that it is not completely transparent to an application. The application programmer must include code to handle the necessary "clean-up" caused by rolled back transactions or lost session states. Because of these restrictions, application failover implemented by the driver is beneficial in only a limited number of situations.

An application can perform a failover with the DataDirect Connect for JDBC Oracle driver by taking the following steps:

  1. Catch the communication error exception generated by the driver.
  2. Take the necessary steps to deal with current transactions that were rolled back.
  3. Re-establish the connection to the server.
  4. Re-initialize the session state.
  5. Re-run any transaction that was rolled back.

To make it easy for applications to detect when the connection with the server is lost, all communication error exceptions thrown by the DataDirect Connect for JDBC drivers have a SQL state that begins with 08.
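
The five steps above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the driver's own mechanism: the class FailoverRecovery and its nested interfaces are hypothetical, the unit of work is assumed to be safe to run again from the start, and each attempt uses a fresh connection that is closed when the attempt ends. The only driver behavior relied on is the SQL state prefix 08 described above.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical application-side helper; none of these names come from the driver.
final class FailoverRecovery {
    interface ConnectionFactory { Connection open() throws SQLException; }
    interface SessionInit { void apply(Connection conn) throws SQLException; }
    interface Transaction<T> { T run(Connection conn) throws SQLException; }

    private FailoverRecovery() {}

    /** True if the SQL state marks a communication error (class 08). */
    static boolean isCommunicationError(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("08");
    }

    /** Runs the transaction, starting over on a new connection after a communication error. */
    static <T> T runWithFailover(ConnectionFactory factory, SessionInit init,
                                 Transaction<T> tx, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Connection conn = null;
            try {
                conn = factory.open();   // Step 3: (re-)establish the connection
                init.apply(conn);        // Step 4: (re-)initialize the session state
                return tx.run(conn);     // Step 5: run, or re-run, the transaction
            } catch (SQLException e) {
                if (!isCommunicationError(e)) {
                    throw e;             // not a lost connection, so do not retry
                }
                last = e;                // Steps 1 and 2: caught; discard the failed work
            } finally {
                if (conn != null) {
                    try { conn.close(); } catch (SQLException ignored) { }
                }
            }
        }
        throw last;                      // every attempt failed with a communication error
    }
}
```

Re-running the work is appropriate only when the application knows the failed transaction did not commit. If the link fails while a commit is in progress, the application has to verify the outcome before repeating any updates.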

Oracle's TAF implementation in its OCI driver performs Step 3 in the preceding list for the application, and may also perform Step 5 if the only operation in the transaction is a Select statement.

DataDirect is currently evaluating ways to enhance the failover functionality in the DataDirect Connect for JDBC drivers for a future release.

Connection Retry

DataDirect Connect for JDBC drivers provide a connection retry feature that works with connection failover. You can customize the driver to attempt to reconnect a certain number of times and at a certain time interval. For example, the following connection URL:

jdbc:datadirect:oracle://server1:1521;ServiceName=TEST;
AlternateServers=(server2:1521,server3:1521,server4:1521);
ConnectionRetryCount=10;ConnectionRetryDelay=10

instructs the driver to cycle through the list of servers (the primary server and alternate servers) up to 10 more times if the driver was unable to establish a connection to any of the servers in the list during the initial pass. The driver waits 10 seconds before it cycles through the list of servers again.
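
To make the retry behavior concrete, the following Java sketch models the cycle described above. It is only an illustration of the semantics: the driver performs this logic internally, and the class ConnectionRetrySketch, its methods, and the tryConnect predicate (a stand-in for a real connection attempt) are all hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical model of the retry cycle; the driver does this work itself.
final class ConnectionRetrySketch {
    private ConnectionRetrySketch() {}

    /** Upper bound on attempts: one initial pass plus retryCount further passes. */
    static int maxAttempts(int serverCount, int retryCount) {
        return serverCount * (1 + retryCount);
    }

    /**
     * Cycles through the server list until tryConnect accepts a server.
     * Returns the accepting server, or empty if every pass fails.
     */
    static Optional<String> connect(List<String> servers, int retryCount,
                                    long retryDelayMillis, Predicate<String> tryConnect)
            throws InterruptedException {
        for (int pass = 0; pass <= retryCount; pass++) {
            if (pass > 0) {
                Thread.sleep(retryDelayMillis); // wait before each additional pass
            }
            for (String server : servers) {     // primary first, then the alternates
                if (tryConnect.test(server)) {
                    return Optional.of(server);
                }
            }
        }
        return Optional.empty();
    }
}
```

With the four servers and the values in the example URL, this model makes at most 4 x (1 + 10) = 44 connection attempts, separated by up to ten waits of 10 seconds each, for 100 seconds of delay in total, not counting the time each failed attempt takes.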

Connection retry can be an important strategy in recovering from failures that bring down an Oracle RAC system. For example, suppose you have a power failure scenario in which both the client and the Oracle RAC system go down. When the power is restored and all computers are restarted, the client may be ready to attempt a connection before an Oracle RAC system has completed its startup routines. If connection retry is enabled, the client application would continue to retry the connection until a connection is successfully accepted by a node in the Oracle RAC system.

 
