  Build Your Own RAC Cluster on Linux and FireWire
by Jeffrey Hunter - OTN - June 2004

 Jeffrey Hunter is the author of Conducting the Java Job Interview and
Conducting the J2EE Job Interview
by Rampant TechPress


Learn how to set up and configure an Oracle Real Application Clusters (RAC) system for less than $1,500 (for development and testing only)


Overview

One of the most efficient ways to become familiar with Oracle Real Application Clusters (RAC) technology is to have access to an actual Oracle RAC cluster. In learning this new technology, you will soon start to realize the benefits Oracle RAC has to offer, such as fault tolerance, new levels of security, load balancing, and the ease of upgrading capacity. The challenge, however, is the price of the hardware required for a typical production RAC configuration. A small two-node cluster, for example, can run anywhere from $10,000 to well over $20,000, and that cost does not even include shared storage, the heart of a production RAC environment.

For those who simply want to become familiar with Oracle RAC, this article provides a low-cost alternative for configuring an Oracle9i RAC system using commercial off-the-shelf components and downloadable software. The estimated cost for this configuration could be anywhere from $1,000 to $1,500. The system will comprise a dual-node cluster, both running Linux (Red Hat Linux Fedora Core 1 in this example) with a shared disk array based on IEEE1394 (FireWire) drive technology.

Please note that this is not the only way to build a low-cost Oracle9i RAC system. I have seen other solutions that use SCSI rather than FireWire for shared storage. In most cases, SCSI will cost more than our FireWire solution: a typical SCSI card is priced around $70, and an 80GB external SCSI drive will cost $700-$1,000. Keep in mind that some motherboards may already include built-in SCSI controllers.

It is important to note that this configuration should never be run in a production environment. In a production environment, fiber channel is the technology of choice, since it is the high-speed serial-transfer interface that can connect systems and storage devices in either point-to-point or switched topologies. FireWire is able to offer a low-cost alternative to fiber channel for testing and development, but it is not ready for production.

NOTE: At the time of this writing, I had not verified that these instructions will work with Oracle Database 10g. I will be providing a separate article in the next several months on how to perform a similar install using 10g.


Oracle9i Real Application Clusters (RAC) Introduction

Oracle Real Application Clusters (RAC) is the successor to Oracle Parallel Server (OPS). RAC allows multiple instances to access the same database (storage) simultaneously. RAC provides fault tolerance, load balancing, and performance benefits by allowing the system to scale out. At the same time, because all nodes access the same database, the failure of one instance will not cause the loss of access to the database.

At the heart of Oracle RAC is a shared disk subsystem. All nodes in the cluster must be able to access all of the data, redo log files, control files and parameter files for all nodes in the cluster. The data disks must be globally available in order to allow all nodes to access the database. Each node has its own redo log files, but the other nodes must be able to access them (along with the shared control files) in order to recover that node in the event of a system failure.

Not all clustering solutions use shared storage. Some vendors use an approach known as a federated cluster, in which data is spread across several machines rather than shared by all. With Oracle RAC, however, multiple nodes use the same set of disks for storing data. With Oracle RAC, the data, redo log, control, and archived log files reside on shared storage on raw-disk devices or on a clustered file system. Oracle's approach to clustering leverages the collective processing power of all the nodes in the cluster and at the same time provides failover security.

Although it is not absolutely necessary, Oracle recommends that you install the Oracle Cluster File System (OCFS). OCFS makes disk management much easier by creating the same file system on all the nodes. Without OCFS, you will have to create all partitions manually. (NOTE: This article does not go into the details of installing or utilizing OCFS, but rather uses all manual methods for creating partitions and binding raw devices to those partitions.)

One of the main reasons why I do not use the Oracle Cluster File System for Red Hat Linux is that OCFS comes in the form of RPMs. All the RPM modules and the precompiled modules are tied to the Red Hat Enterprise Linux AS ($1,200) kernel-naming standard and will not load in the supplied 2.4.20 linked kernel.

The biggest difference between Oracle RAC and OPS is the addition of Cache Fusion. With OPS, a request for data from one node to another required the data to be written to disk first; only then could the requesting node read that data. With Cache Fusion, data is passed along with locks.

Pre-configured Oracle9i RAC solutions are available from vendors such as Dell, IBM and HP for production environments. This article, however, focuses on putting together your own Oracle9i RAC environment for development and testing by using Linux servers and a low-cost shared disk solution: FireWire.


What software is necessary for RAC? Does it have a separate installation CD to order?

RAC is contained within the Oracle9i Database Enterprise Edition. (Oracle recently announced that RAC is now available in Oracle Database 10g Standard Edition as well.) If you install Oracle9i Enterprise Edition onto a cluster, and the Oracle Universal Installer (OUI) recognizes the cluster, you will be provided the option of installing RAC. Most UNIX platforms require an OSD installation for the necessary clusterware. For Intel platforms (Linux and Windows), Oracle provides the OSD software within the Oracle9i Enterprise Edition release.


Shared Storage Overview

Today, fiber channel is one of the most popular solutions for shared storage. As mentioned earlier, fiber channel is a high-speed serial-transfer interface that is used to connect systems and storage devices in either point-to-point or switched topologies. Protocols supported by fiber channel include SCSI and IP. Fiber channel configurations can support as many as 127 nodes and have a throughput of up to 2.12 gigabits per second. Fiber channel, however, is very expensive. The fiber-channel switch alone can run as much as $1,000. This does not even include the fiber-channel storage array and high-end drives, which can reach prices of about $300 for a 36GB drive. A basic fiber-channel setup, including fiber-channel cards for the servers, costs roughly $5,000, which does not include the cost of the servers that make up the cluster.

A less expensive alternative to fiber-channel is SCSI. SCSI technology provides acceptable performance for shared storage, but for administrators and developers who are accustomed to GPL-based Linux prices, even SCSI can come in over budget, at around $1,000 to $2,000 for a two-node cluster.

Another popular solution is the Sun NFS (Network File System). It can be used for shared storage but only if you are using a network appliance or something similar. Specifically, you need servers that guarantee direct I/O over NFS.


FireWire Technology

Developed by Apple Computer and Texas Instruments, FireWire is a cross-platform implementation of a high-speed serial data bus. With its high bandwidth, long distances (up to 100 meters in length) and high-powered bus, FireWire is being used in applications such as digital video (DV), professional audio, hard drives, high-end digital still cameras and home entertainment devices. Today, FireWire operates at transfer rates of up to 800 megabits per second, while next-generation FireWire calls for a theoretical bit rate of 1,600 Mbps and then up to a staggering 3,200 Mbps (3.2 gigabits per second). This speed will make FireWire indispensable for transferring massive data files and for even the most demanding video applications, such as working with uncompressed high-definition (HD) video or multiple standard-definition (SD) video streams.

The following chart shows speed comparisons of the various types of disk interface. For each interface, I provide the maximum transfer rates in kilobits (kb), kilobytes (KB), megabits (Mb), and megabytes (MB) per second. As you can see, the capabilities of IEEE1394 compare very favorably with other available disk interface technologies.

 

Disk Interface                                 Speed
---------------------------------------------  ----------------------------------
Serial                                         115 kb/s - (.115 Mb/s)
Parallel (standard)                            115 KB/s - (.115 MB/s)
USB 1.1                                        12 Mb/s - (1.5 MB/s)
Parallel (ECP/EPP)                             3.0 MB/s
IDE                                            3.3 - 16.7 MB/s
ATA                                            3.3 - 66.6 MB/s
SCSI-1                                         5 MB/s
SCSI-2 (Fast SCSI / Fast Narrow SCSI)          10 MB/s
Fast Wide SCSI (Wide SCSI)                     20 MB/s
Ultra SCSI (SCSI-3 / Fast-20 / Ultra Narrow)   20 MB/s
Ultra IDE                                      33 MB/s
Wide Ultra SCSI (Fast Wide 20)                 40 MB/s
Ultra2 SCSI                                    40 MB/s
IEEE1394(b)                                    100 - 400 Mb/s - (12.5 - 50 MB/s)
USB 2.x                                        480 Mb/s - (60 MB/s)
Wide Ultra2 SCSI                               80 MB/s
Ultra3 SCSI                                    80 MB/s
Wide Ultra3 SCSI                               160 MB/s
FC-AL Fiber Channel                            100 - 400 MB/s
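
The megabit and megabyte figures in the chart differ by a factor of eight (8 bits per byte). As a quick check of that arithmetic, the following shell sketch converts a rate in megabits per second to megabytes per second. The function name mbit_to_mbyte is my own and is used only for illustration.

```shell
#!/bin/sh
# Convert megabits per second (Mb/s) to megabytes per second (MB/s).
# 8 bits = 1 byte, so divide by 8. awk handles the fractional result.
# mbit_to_mbyte is a hypothetical helper name, not part of any tool.
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 8 }'
}

mbit_to_mbyte 12     # USB 1.1
mbit_to_mbyte 400    # IEEE1394 at 400 Mb/s
mbit_to_mbyte 480    # USB 2.x
```

The three calls should reproduce the parenthesized MB/s values shown for those rows in the chart.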

 

Hardware & Costs

The hardware used to build our example Oracle9i RAC environment consists of two Linux servers and components that can be purchased at any local computer store or over the Internet.

 

Server 1 (linux1)

  Price   Component
  $400    Dell Dimension XPS D266 Computer
             - 266MHz Pentium II
             - 384MB RAM
             - 60GB Internal HD
             - CDROM and Floppy
          2 - Ethernet LAN Cards
  $20        - Linksys 10/100 Mbps - (To public network)
  $20        - Linksys 10/100 Mbps - (Used for Interconnect to linux2)
  $30     1 - FireWire Card
             - SIIG, Inc. 3-Port 1394 I/O Card
             Note: Cards with chipsets made by VIA or TI are known to work.

Server 2 (linux2)

  Price   Component
  $600    Pentium IV Computer
             - 1.8GHz Pentium IV
             - 300W Power Supply
             - 512MB RAM
             - 40GB Internal HD
             - 32MB AGP Video Card
             - CDROM and Floppy
          2 - Ethernet LAN Cards
  $20        - Linksys 10/100 Mbps - (To public network)
  $20        - Linksys 10/100 Mbps - (Used for Interconnect to linux1)
  $40     1 - FireWire Card
             - Belkin FireWire 3-Port 1394 PCI Card
             Note: Cards with chipsets made by VIA or TI are known to work.

Miscellaneous Components

  Price   Component
  $270    FireWire Hard Drive
             - Maxtor One Touch 200GB USB 2.0 / Firewire External Hard Drive

          Ensure that the FireWire drive you purchase supports multiple logins. If the drive has a chipset that does not allow concurrent access by more than one server, the disk and its partitions can only be seen by one server at a time. Disks with the Oxford 911 chipset are known to work. Here are the details about the disk that I purchased for this test:

             Vendor: Maxtor
             Model: OneTouch
             Mfg. Part No. or KIT No.: A01A200 or A01A250
             Capacity: 200GB or 250GB
             Cache Buffer: 8MB
             Spin Rate: 7200 RPM
             "Combo" Interface: IEEE 1394 and SBP-2 compliant (100 to 400 Mbits/sec) plus USB 2.0 and USB 1.1 compatible

  $15     1 - Extra FireWire Cable
             - Belkin 6-pin to 6-pin 1394 Cable
  $40     1 - Ethernet hub or switch
             - Linksys EtherFast 10/100 5-port Ethernet Switch (used for interconnect int-linux1 / int-linux2)
          4 - Network Cables
  $5         - Category 5e patch cable - (Connect linux1 to public network)
  $5         - Category 5e patch cable - (Connect linux2 to public network)
  $5         - Category 5e patch cable - (Connect linux1 to interconnect ethernet switch)
  $5         - Category 5e patch cable - (Connect linux2 to interconnect ethernet switch)

  $1,495  Total
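
If you substitute your own components, it is easy to lose track of the running total. The following is a small sketch that adds up the prices listed above, grouped the same way as the list. The helper name sum_prices is hypothetical, and the figures are simply the ones quoted in this article.

```shell
#!/bin/sh
# Add up a list of whole-dollar prices passed as arguments.
# sum_prices is an illustrative helper, not a standard command.
sum_prices() {
    total=0
    for price in "$@"; do
        total=$((total + price))
    done
    echo "$total"
}

server1=$(sum_prices 400 20 20 30)      # PC, two NICs, FireWire card
server2=$(sum_prices 600 20 20 40)      # PC, two NICs, FireWire card
misc=$(sum_prices 270 15 40 5 5 5 5)    # drive, cable, switch, four patch cables

echo "Server 1:      \$$server1"
echo "Server 2:      \$$server2"
echo "Miscellaneous: \$$misc"
echo "Total:         \$$(sum_prices "$server1" "$server2" "$misc")"
```

Replace the numbers with your own prices to see how close you stay to the $1,500 target.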

A Brief Walk Through the Process

Before presenting the details of building our Oracle9i RAC system, I thought it would be beneficial to take a brief walk through the steps involved in building the environment. (See Figure 1.)

Our implementation describes a dual node cluster (each with a single processor), each server running Red Hat Linux Fedora Core 1. Note that most of the tasks within this document will need to be performed on both servers. I will indicate at the beginning of each section whether or not the task(s) should be performed on both nodes.

 

     1. Install Red Hat Linux / Fedora Core 1 (on both nodes)
For this example configuration, you will be installing Red Hat Linux (Fedora Core 1) on both nodes that make up the RAC cluster.
 
     2. Configure network settings (on both nodes)
After installing the Red Hat Linux software on both nodes, you will then need to configure the network on both nodes. This includes configuring the public network as well as the interconnect for the cluster. You should also adjust the default and maximum send buffer size settings for the interconnect for better performance when using cache fusion buffer transfers between instances. These settings will be put in your /etc/sysctl.conf file.
 
     3. Obtain and Install a proper Linux Kernel (on both nodes)
In this section, we will be downloading and installing a new Linux kernel, one that supports multiple logins to the FireWire storage device. The kernel can be downloaded from Oracle's Linux Projects development group at http://oss.oracle.com. Once the new kernel is installed, several configuration steps are required in order to load the FireWire stack.
 
     4. Create UNIX oracle user account (dba group) (on both nodes)
We will then create an Oracle UNIX user id on all nodes within the RAC cluster. This section also provides an example login script (.bash_profile) that can be used to set all required environment variables for the oracle user.
 
     5. Create Partitions on the Shared FireWire Storage Device (run once only from a single node)
This is where we create the physical and logical volumes using Logical Volume Manager (LVM). Instructions will be provided on how to remove all partitions from our FireWire drive and then how to use LVM to create all of our logical partitions.
 
     6. Create RAW Bindings (on both nodes)
After creating our logical partitions, we need to configure raw devices on our FireWire shared storage to be used for all physical Oracle database files.
 
     7. Create Symbolic Links From RAW Volumes (on both nodes)
It is helpful to create symbolic links from the RAW volumes to human readable names to make file recognition easier. Although this step is optional, it is highly recommended.
 
     8. Configuring the Linux Servers (on both nodes)
This section will detail the steps involved to configure both Linux machines in order to prepare them for an Oracle9i RAC install.
 
     9. Configuring the hangcheck-timer Kernel Module (on both nodes)
Oracle9i RAC uses a kernel module called the hangcheck-timer to monitor the health of the cluster and to restart a RAC node in case of a failure. This section explains the steps required to configure the hangcheck-timer kernel module. Although the hangcheck-timer module is not required for Oracle Cluster Manager operation, it is highly recommended by Oracle.
 
     10. Configuring RAC Nodes for Remote Access (on both nodes)
When installing Oracle9i RAC, the Oracle Installer will use the rsh command to copy the Oracle software to all other nodes within the RAC cluster. Included in this section are the instructions for configuring all nodes within your RAC cluster to run r* commands like rsh, rcp, and rlogin on a RAC node against other RAC nodes without a password.
 
     11. Configuring a Machine Startup Script (on both nodes)
Up to this point, we have talked in great detail about the parameters and resources that will need to be configured on both nodes for our Oracle9i RAC configuration. This section will take a breather and recap those parameters and commands (from previous sections of this document) that need to run on each node when the machine is rebooted. Although there are several ways to do this, I simply provide a listing of the commands that you can put into a startup script (e.g., /etc/rc.local) that set up all required resources (disks, memory, etc.) each time the machine is booted. Other startup scripts are included within this section so that you can check whether you have updated all the scripts that are required when each machine in the cluster is booted.
 
     12. Update Red Hat Linux System (on both nodes)
There are several RPMs that will need to be applied to all nodes within the RAC cluster in preparation for the Oracle install. All the RPMs are included on the CDs for Fedora Core 1, plus I also put links to the files from this article. After applying all of the RPMs, you will then need to apply Oracle/Linux Patch 3006854. After applying all required patches, you should reboot all nodes within the RAC cluster.
 
     13. Download / Unpack the Oracle9i Installation Files (from a single node)
This section includes the steps to download and unpack the Oracle9i software distribution. The software can be downloaded from http://otn.oracle.com.
 
     14. Install Oracle9i Cluster Manager (from a single node)
Installing Oracle9i RAC is a two-step process: (1) Install the Oracle9i Cluster Manager and (2) Install the Oracle9i RDBMS software. In this section, we will go through the steps to install, configure and start the Oracle Cluster Manager software.

Keep in mind that the installation of Oracle Cluster Manager only needs to be performed on one of the nodes (the installation process will rsh the files out to all other nodes contained within the cluster), but configuring and starting the Cluster Manager needs to be performed on both nodes.
 
     15. Install Oracle9i RAC (only needs to be performed from a single node)
After installing Oracle Cluster Manager, it is time to install the RAC software. This section covers many of the tasks involved in installing the software as well as many post-installation tasks that should be performed before creating the Oracle cluster database.
 
     16. Create the Oracle Database (from a single node)
After all the software has been installed, we will now use the Oracle Database Configuration Assistant (DBCA) to create our clustered database on the shared storage (FireWire) device.
 
     17. Creating TNS Networking Files (on both nodes)
This section simply provides an example listing of my listener.ora and tnsnames.ora files. These will need to be configured for each node in the RAC cluster. The Oracle Installer and Oracle Database Configuration Assistant do a great job in keeping these files up to date. I do, however, like to make a few changes to the tnsnames.ora file.
 
     18. Verify the RAC Cluster / Database Configuration (on both nodes)
After the Oracle Database Configuration Assistant has finished creating the clustered database, you should have a fully functional Oracle9i RAC cluster running. This section provides several commands and SQL queries that can be used to validate your Oracle9i RAC configuration.
 
     19. Starting & Stopping the Cluster (from a single node)
Examples will be given in this section on how to start and stop the cluster. This includes how to fully bring up or down the entire cluster, along with examples of how to bring up and shutdown individual instances within the cluster.
 
     20. Transparent Application Failover (TAF) (on one or both nodes)
Now that we have our cluster up and running, this section provides an example of how to test the Transparent Application Failover features of Oracle9i RAC. I will demonstrate how session failover works and how to set up your TNS configuration to take advantage of TAF.

Install Red Hat Linux (Fedora Core 1)

After procuring the required hardware, it is time to start the configuration process. The first step in the process is to install the Red Hat Linux Fedora Core 1 software on both servers.

NOTE: This article does not provide detailed instructions for installing Red Hat Linux Fedora Core 1. For the purpose of this article, I chose to perform a Custom installation and then "Install Everything" when prompted for which products to install. Documentation for installing Red Hat Linux can be found at http://www.redhat.com/docs/manuals/.


Configure Network Settings

Configuring Public and Private Network

Let's start our Oracle RAC Linux configuration by ensuring the correct network configuration. In our two-node example, we will need to configure the network on both nodes.

The easiest way to configure network settings in Red Hat Linux is via the Network Configuration program. This application can be started from the command line as the "root" user as follows:

# su -
# /usr/bin/redhat-config-network &

NOTE: Do not use DHCP addressing, as the interconnects need fixed (static) IP addresses!

Using the Network Configuration application, you will need to configure both NIC devices as well as the /etc/hosts file. Both of these tasks can be completed using the Network Configuration GUI. Notice that the /etc/hosts settings are the same for both nodes.

Our example configuration will use the following settings:

Server 1 (linux1)

Device  IP Address     Subnet         Purpose
eth0    192.168.1.100  255.255.255.0  Connects linux1 to the public network
eth1    192.168.2.100  255.255.255.0  Connects linux1 (interconnect) to linux2 (int-linux2)

/etc/hosts
127.0.0.1        localhost      loopback
192.168.1.100    linux1
192.168.2.100    int-linux1
192.168.1.101    linux2
192.168.2.101    int-linux2

Server 2 (linux2)

Device  IP Address     Subnet         Purpose
eth0    192.168.1.101  255.255.255.0  Connects linux2 to the public network
eth1    192.168.2.101  255.255.255.0  Connects linux2 (interconnect) to linux1 (int-linux1)

/etc/hosts
127.0.0.1        localhost      loopback
192.168.1.100    linux1
192.168.2.100    int-linux1
192.168.1.101    linux2
192.168.2.101    int-linux2
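
Because the /etc/hosts entries must match on both nodes, a quick check on each machine can catch a mistyped address early. The following is a minimal sketch, assuming the four host names and addresses from the tables above. The function name check_hosts is my own, and the file path is a parameter so that you can try it on a copy first.

```shell
#!/bin/sh
# Report whether each cluster host name appears with the expected IP address
# in a hosts file. check_hosts is a hypothetical helper; the names and
# addresses are the example values used in this article.
check_hosts() {
    hosts_file=$1
    status=0
    while read -r ip name; do
        # Look for a line whose first field is the IP address and which
        # lists the host name among the remaining fields. Comment lines
        # never match because their first field starts with "#".
        if awk -v ip="$ip" -v name="$name" '
               $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
               END { exit !found }' "$hosts_file"; then
            echo "ok       $name -> $ip"
        else
            echo "MISSING  $name -> $ip"
            status=1
        fi
    done <<EOF
192.168.1.100 linux1
192.168.2.100 int-linux1
192.168.1.101 linux2
192.168.2.101 int-linux2
EOF
    return $status
}

check_hosts /etc/hosts || echo "Fix /etc/hosts before continuing."
```

Run it on both linux1 and linux2; every line should start with "ok" before you move on.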

In the screenshots below, only node 1 (linux1) is shown. Be sure to apply the proper network settings on both nodes.



Figure 1: Network Configuration Screen, Node 1 (linux1)



Figure 2: Ethernet Device Screen, eth0 (linux1)



Figure 3: Ethernet Device Screen, eth1 (linux1)



Figure 4: Network Configuration Screen, /etc/hosts (linux1)
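
For readers who prefer to edit files directly rather than use the GUI, the interconnect settings for linux1 correspond roughly to an interface configuration file like the one below. This is a sketch, assuming the standard Red Hat /etc/sysconfig/network-scripts layout. I have not reproduced exactly what the Network Configuration tool writes, so treat the set of keys as illustrative.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth1 on linux1 (interconnect)
# Values are the example addresses used in this article.
DEVICE=eth1
BOOTPROTO=static        # fixed address, no DHCP
IPADDR=192.168.2.100
NETMASK=255.255.255.0
ONBOOT=yes              # bring the interface up at boot
```

On linux2 the same file would use IPADDR=192.168.2.101, and the eth0 files would carry the 192.168.1.x public addresses.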
 

Adjusting Network Settings

With Oracle 9.2.0.1 and above, Oracle uses UDP as the default protocol on Linux for interprocess communication (IPC), such as cache fusion buffer transfers between instances within the RAC cluster.

Oracle strongly recommends adjusting the default and maximum send buffer size (SO_SNDBUF socket option) to 256KB, and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256KB.

The receive buffers are used by TCP and UDP to hold received data until it is read by the application. With TCP, the receive buffer cannot overflow because the peer is not allowed to send data beyond the buffer size window. UDP has no such flow control, so datagrams are discarded if they don't fit in the socket receive buffer, and a fast sender can therefore overwhelm the receiver.

NOTE: The default and maximum window size can be changed in the /proc file system without reboot:

su - root

# Default setting in bytes of the socket receive buffer
sysctl -w net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer
sysctl -w net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
sysctl -w net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using 
# the SO_SNDBUF socket option
sysctl -w net.core.wmem_max=262144

You should make the above changes permanent by adding the following lines to the /etc/sysctl.conf file for each node in your RAC cluster:

net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
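
The value 262144 is simply 256KB expressed in bytes (256 x 1024). To confirm that the four lines made it into the file on each node, something like the following sketch can be used. The function name check_buffers is hypothetical, and the file path is passed as an argument so the check can be run against a copy.

```shell
#!/bin/sh
# 256 KB expressed in bytes, the value used for all four settings.
expected=$((256 * 1024))

# Check that a sysctl.conf-style file sets each buffer parameter to the
# expected value. Whitespace around "=" is tolerated and commented-out
# lines are ignored. check_buffers is an illustrative helper name.
check_buffers() {
    conf_file=$1
    status=0
    for key in net.core.rmem_default net.core.wmem_default \
               net.core.rmem_max net.core.wmem_max; do
        # Take the last matching line, since later entries override earlier ones.
        value=$(sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" \
                    "$conf_file" 2>/dev/null | tail -n 1)
        if [ "$value" = "$expected" ]; then
            echo "ok       $key = $value"
        else
            echo "MISSING  $key (found: ${value:-nothing}, want: $expected)"
            status=1
        fi
    done
    return $status
}

check_buffers /etc/sysctl.conf || echo "Update /etc/sysctl.conf and run 'sysctl -p'."
```

After editing the file, running sysctl -p as root loads the settings from /etc/sysctl.conf without a reboot.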

Listing 1
select event,
       total_waits,
       round(100 * (total_waits / sum_waits),2) pct_waits,
       time_wait_sec,
       round(100 * (time_wait_sec / greatest(sum_time_waited,1)),2)
       pct_time_waited,
       total_timeouts,
       round(100 * (total_timeouts / greatest(sum_timeouts,1)),2)
       pct_timeouts,
       average_wait_sec
from
(select event,
       total_waits,
       round((time_waited / 100),2) time_wait_sec,
       total_timeouts,
       round((average_wait / 100),2) average_wait_sec
from sys.v_$system_event
where event not in
('lock element cleanup',
 'pmon timer',
 'rdbms ipc message',
 'rdbms ipc reply',
 'smon timer',
 'SQL*Net message from client',
 'SQL*Net break/reset to client',
 'SQL*Net message to client',
 'SQL*Net more data from client',
 'dispatcher timer',
 'Null event',
 'parallel query dequeue wait',
 'parallel query idle wait - Slaves',
 'pipe get',
 'PL/SQL lock timer',
 'slave wait',
 'virtual circuit status',
 'WMON goes to sleep',
 'jobq slave wait',
 'Queue Monitor Wait',
 'wakeup time manager',
 'PX Idle Wait') AND
 event not like 'DFS%' AND
 event not like 'KXFX%'),
(select sum(total_waits) sum_waits,
        sum(total_timeouts) sum_timeouts,
        sum(round((time_waited / 100),2)) sum_time_waited
 from sys.v_$system_event
 where event not in
 ('lock element cleanup',
 'pmon timer',
 'rdbms ipc message',
 'rdbms ipc reply',
 'smon timer',
 'SQL*Net message from client',
 'SQL*Net break/reset to client',
 'SQL*Net message to client',
 'SQL*Net more data from client',
 'dispatcher timer',
 'Null event',
 'parallel query dequeue wait',
 'parallel query idle wait - Slaves',
 'pipe get',
 'PL/SQL lock timer',
 'slave wait',
 'virtual circuit status',
 'WMON goes to sleep',
 'jobq slave wait',
 'Queue Monitor Wait',
 'wakeup time manager',
 'PX Idle Wait') AND
 event not like 'DFS%' AND
 event not like 'KXFX%')
order by 4 desc, 1 asc
 
 
Listing 2
SELECT sid,
       username,
       event,
       total_waits,
       100 * round((total_waits / sum_waits),2) pct_of_total_waits,
       time_wait_sec,
       total_timeouts,
       average_wait_sec,
       max_wait_sec
 
FROM
(SELECT a.event,
       b.sid sid,
       decode (b.username,null,c.name,b.username) username,
       a.total_waits total_waits,
       round((a.time_waited / 100),2) time_wait_sec,
       a.total_timeouts total_timeouts,
       round((average_wait / 100),2)
       average_wait_sec,
       round((a.max_wait / 100),2) max_wait_sec
  FROM sys.v_$session_event a,
       sys.v_$session b,
       sys.v_$bgprocess c,
       sys.v_$process d
 WHERE a.event NOT IN
          ('lock element cleanup',
          'pmon timer',
          'rdbms ipc message',
          'smon timer',
          'SQL*Net message from client',
          'SQL*Net break/reset to client',
          'SQL*Net message to client',
          'SQL*Net more data from client',
          'dispatcher timer',
          'Null event',
          'parallel query dequeue wait',
          'parallel query idle wait - Slaves',
          'pipe get',
          'PL/SQL lock timer',
          'slave wait',
          'virtual circuit status',
          'WMON goes to sleep'
          )
   AND a.event NOT LIKE 'DFS%'
   AND a.event NOT LIKE 'KXFX%'
   AND a.sid = b.sid
   AND d.addr = b.paddr
   AND c.paddr (+) = b.paddr 
),
(select sum(total_waits) sum_waits
 FROM sys.v_$session_event a,
       sys.v_$session b
 WHERE a.event NOT IN
          ('lock element cleanup',
          'pmon timer',
          'rdbms ipc message',
          'smon timer',
          'SQL*Net message from client',
          'SQL*Net break/reset to client',
          'SQL*Net more data from client',
          'SQL*Net message to client',
          'dispatcher timer',
          'Null event',
          'parallel query dequeue wait',
          'parallel query idle wait - Slaves',
          'pipe get',
          'PL/SQL lock timer',
          'slave wait',
          'virtual circuit status',
          'WMON goes to sleep'
          )
   AND a.event NOT LIKE 'DFS%'
   AND a.event NOT LIKE 'KXFX%'
   AND a.sid = b.sid)
order by 6 desc, 1 asc
 
 

Obtain and Install a Proper Linux Kernel

Overview

The next step is to obtain and install a new Linux kernel that supports the use of IEEE1394 devices with multiple logins. In previous releases of this article, I included the steps to download a patched version of the Linux kernel and then compile it. Thanks to Oracle's Linux Projects development group, this is no longer a requirement. They provide a pre-compiled kernel for Red Hat Enterprise Linux 3.0 (which also works with Fedora) that can simply be downloaded and installed. The instructions for downloading and installing the kernel are included in this section. Before going into the details of how to perform these actions, however, let's take a moment to discuss the changes that are required in the new kernel.

While FireWire drivers already exist for Linux, they often do not support shared storage. Normally, when you log on to an OS, the OS associates the driver with a specific drive for that machine alone. This implementation simply will not work for our RAC configuration. The shared storage (our FireWire hard drive) needs to be accessed by more than one node. We need to enable the FireWire driver to provide nonexclusive access to the drive so that multiple servers (the nodes that comprise the cluster) will be able to access the same storage. This task is accomplished by removing, in the source code, the bit mask that identifies the machine during login. The result is nonexclusive access to the FireWire hard drive. All other nodes in the cluster log in to the same drive during their logon session, using the same modified driver, so they also have nonexclusive access to the drive.

I'm probably getting ahead of myself, but I want to cover several topics before diving into the details of installing our new Linux kernel. When we install our new Linux kernel (one that supports multiple logons to the FireWire drive), the system will detect and recognize the FireWire-attached drive as a SCSI device. You will be able to use standard OS tools to partition the disk, create a file system, and so on. For Oracle9i RAC, you must make partitions for all the files and bind raw devices to those partitions. This article will make use of Logical Volume Manager (LVM) to make all needed partitions (actually known as logical partitions) on the FireWire shared drive.

Our implementation describes a dual node cluster (each with a single processor), each server running Red Hat Linux Fedora Core 1. Keep in mind that the process of installing the patched Linux kernel will need to be performed on both Linux nodes. Red Hat Linux Fedora Core 1 includes kernel linux-2.4.22-1.2115.nptl; we will need to download the Oracle-supplied 2.4.21-9.0.1 Linux kernel from the following URL: http://oss.oracle.com/projects/firewire/files.

Perform the following procedures on both nodes in the cluster:

  1. Download one of the following files:

    kernel-2.4.21-9.0.1.ELorafw1.i686.rpm - for single processor

    - OR -

    kernel-smp-2.4.21-9.0.1.ELorafw1.i686.rpm - for multiple processors

  2. Make a backup of your GRUB configuration file:

    In most cases you will be using GRUB for your boot loader. Before actually installing the new kernel, be sure to back up your /etc/grub.conf file:

    # cp /etc/grub.conf /etc/grub.conf.original

  3. Install the new kernel, as user root:

    For a single processor:

    # rpm -ivh --force kernel-2.4.21-9.0.1.ELorafw1.i686.rpm

    - OR -

    For multiple processors:

    # rpm -ivh --force kernel-smp-2.4.21-9.0.1.ELorafw1.i686.rpm

    NOTE: Installing the new kernel using RPM will also update your GRUB or LILO configuration with the appropriate stanza. There is no need to add any new stanza to your boot loader configuration unless you want to have your old kernel image available.

    The following is a listing of my /etc/grub.conf file before and after the kernel install. As you can see, the install added another stanza for the 2.4.21-9.0.1.ELorafw1 kernel. If you want, you can change the default entry in the new file so that the new kernel is the one booted by default. By default, the installer keeps your old kernel as the default by setting default=1.

    Original /etc/grub.conf File for Fedora Core 1

    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE:  You have a /boot partition.  This means that
    #          all kernel and initrd paths are relative to /boot/, eg.
    #          root (hd0,0)
    #          kernel /vmlinuz-version ro root=/dev/hda3
    #          initrd /initrd-version.img
    #boot=/dev/hda
    default=0
    timeout=10
    splashimage=(hd0,0)/grub/splash.xpm.gz
    title Fedora Core (2.4.22-1.2115.nptl)
          root (hd0,0)
          kernel /vmlinuz-2.4.22-1.2115.nptl ro root=LABEL=/ rhgb
          initrd /initrd-2.4.22-1.2115.nptl.img

    Newly Configured /etc/grub.conf File for Fedora Core 1 After Kernel Install

    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE:  You have a /boot partition.  This means that
    #          all kernel and initrd paths are relative to /boot/, eg.
    #          root (hd0,0)
    #          kernel /vmlinuz-version ro root=/dev/hda3
    #          initrd /initrd-version.img
    #boot=/dev/hda
    default=0
    timeout=10
    splashimage=(hd0,0)/grub/splash.xpm.gz
    title Fedora Core (2.4.21-9.0.1.ELorafw1)
            root (hd0,0)
            kernel /vmlinuz-2.4.21-9.0.1.ELorafw1 ro root=LABEL=/ rhgb
            initrd /initrd-2.4.21-9.0.1.ELorafw1.img
    title Fedora Core (2.4.22-1.2115.nptl)
            root (hd0,0)
            kernel /vmlinuz-2.4.22-1.2115.nptl ro root=LABEL=/ rhgb
            initrd /initrd-2.4.22-1.2115.nptl.img
  4. Add module options:

    Add the following lines to /etc/modules.conf:

    options sbp2 sbp2_exclusive_login=0
    post-install sbp2 insmod sd_mod
    post-remove sbp2 rmmod sd_mod

    It is vital that the parameter sbp2_exclusive_login of the Serial Bus Protocol module (sbp2) be set to zero to allow multiple hosts to log in to and access the FireWire disk concurrently. The second line ensures that the SCSI disk driver module (sd_mod) is loaded as well, since sbp2 requires the SCSI layer. The core SCSI support module (scsi_mod) will be loaded automatically if sd_mod is loaded; there is no need to make a separate entry for it.

  5. Reboot machine

    Reboot your machine into the new kernel. Ensure the FireWire (IEEE 1394) PCI cards are plugged into the machine!

  6. Load the FireWire stack

    In most cases, the loading of the FireWire stack will already be configured in the /etc/rc.sysinit file. The commands that are contained within this file that are responsible for loading the FireWire stack are:

    # modprobe ohci1394
    # modprobe sbp2

    In older versions of Red Hat, this was not the case, and these commands had to be run manually or placed in a startup file. With Fedora Core 1 and higher, these commands are already included in the /etc/rc.sysinit file and run on each boot.
     
  7. Rescan SCSI bus

    In older versions of the kernel, I would need to run the rescan-scsi-bus.sh script in order to detect the FireWire drive. The purpose of this script was to create the SCSI entry for the node by using the following command:

    echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi

    With Fedora Core 1, the disk should be detected automatically.

  8. Check for SCSI Device

    After you have rebooted the machine, the kernel should automatically detect the disk as a SCSI device (/dev/sdXX). This section will provide several commands that should be run on both nodes in the cluster to ensure the FireWire drive was successfully detected.

    For this configuration, I performed the above procedures on both nodes at the same time. When complete, I shut down both machines, started linux1 first, and then linux2. The following commands and results are from my linux2 machine. Again, make sure that you run the following commands on both nodes to ensure both machines can log in to the shared drive.

    Let's first check to see that the FireWire adapter was successfully detected:

    # lspci
    00:00.0 Host bridge: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
    00:01.0 PCI bridge: Intel Corp. 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
    00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 01)
    00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 01)
    00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 01)
    00:1d.7 USB Controller: Intel Corp. 82801DB USB2 (rev 01)
    00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB PCI Bridge (rev 81)
    00:1f.0 ISA bridge: Intel Corp. 82801DB LPC Interface Controller (rev 01)
    00:1f.1 IDE interface: Intel Corp. 82801DB Ultra ATA Storage Controller (rev 01)
    00:1f.3 SMBus: Intel Corp. 82801DB/DBM SMBus Controller (rev 01)
    01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)
    02:00.0 Ethernet controller: Linksys Network Everywhere Fast Ethernet 10/100 model NC100 (rev 11)
    02:01.0 FireWire (IEEE 1394): Texas Instruments TSB12LV26 IEEE-1394 Controller (Link)
    02:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
    02:07.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)

    Second, let's check to see that the modules are loaded:

    # lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
    sd_mod                 13808   0
    sbp2                   20556   0
    scsi_mod              109864   3  [sg sd_mod sbp2]
    ohci1394               28904   0  (unused)
    ieee1394               63652   0  [sbp2 ohci1394]

    Third, let's make sure the disk was detected and an entry was made by the kernel:

    # cat /proc/scsi/scsi
    Attached devices:
    Host: scsi0 Channel: 00 Id: 00 Lun: 00
      Vendor: Maxtor   Model: OneTouch         Rev: 0200
      Type:   Direct-Access

    Now let's ensure the FireWire drive is accessible for multiple logins and shows a valid login:

    # dmesg | grep sbp2
    ieee1394: sbp2: Query logins to SBP-2 device successful
    ieee1394: sbp2: Maximum concurrent logins supported: 3
    ieee1394: sbp2: Number of active logins: 2
    ieee1394: sbp2: Logged into SBP-2 device
    ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]
    ieee1394: sbp2: Reconnected to SBP-2 device
    ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]

    From the above output, you can see that the FireWire drive we have can support concurrent logins by up to 3 servers. It is vital that you have a drive where the chipset supports concurrent access for all nodes within the RAC cluster.

  9. Troubleshoot SCSI Device Detection

    If you have trouble detecting the SCSI device with any of the procedures above, you can try reloading the modules:

    # modprobe -r sbp2
    # modprobe -r sd_mod
    # modprobe -r ohci1394
    # modprobe ohci1394
    # modprobe sd_mod
    # modprobe sbp2
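
The module check from step 8 can be scripted so that it is easy to repeat on both nodes. The following is only a minimal sketch: the names REQUIRED_MODULES and missing_modules are hypothetical, and the module list is simply the one used in the lsmod check above.

```shell
#!/bin/sh
# Sketch: report which of the FireWire/SCSI modules named in this article are
# not loaded. REQUIRED_MODULES and missing_modules are hypothetical names.
REQUIRED_MODULES="ieee1394 ohci1394 sbp2 scsi_mod sd_mod"

# Reads lsmod-style text on stdin and prints each required module that is absent.
missing_modules() {
    # The first column of lsmod output is the module name.
    mm_loaded=$(awk '{ print $1 }')
    for mm_mod in $REQUIRED_MODULES; do
        if ! printf '%s\n' "$mm_loaded" | grep -qx "$mm_mod"; then
            printf '%s\n' "$mm_mod"
        fi
    done
}
```

On a node, it would be used as `lsmod | missing_modules`; empty output means every module in the list is loaded.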

Create "oracle" User and Directories (both nodes)

Let's continue our example by creating the UNIX dba group and oracle userid along with all appropriate directories.

# mkdir /u01
# mkdir /u01/app

# groupadd -g 115 dba

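# useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" oracle
# passwd oracle

NOTE: Set the password with passwd; the useradd -p option expects an already encrypted password rather than plain text.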

NOTE: When you are setting the Oracle environment variables for each RAC node, be sure to assign each RAC node a unique Oracle SID!

For this example, I used:

  • linux1 : ORACLE_SID=orcl1
  • linux2 : ORACLE_SID=orcl2

NOTE: The Oracle Universal Installer (OUI) requires up to 400MB of free space in the /tmp directory.

You can check the available space in /tmp by running the following command:

# df -k /tmp
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3             36384656   6224240  28312140  19% /

If for some reason you do not have enough space in /tmp, you can temporarily create space in another file system and point your TEMP and TMPDIR variables to it for the duration of the install. Here are the steps to do this:

# su -
# mkdir /<AnotherFilesystem>/tmp
# chown root:root /<AnotherFilesystem>/tmp
# chmod 1777 /<AnotherFilesystem>/tmp
# export TEMP=/<AnotherFilesystem>/tmp     # used by Oracle
# export TMPDIR=/<AnotherFilesystem>/tmp   # used by Linux programs
                                           #   like the linker "ld"

When the installation of Oracle is complete, you can remove the temporary directory using the following:

# su -
# rmdir /<AnotherFilesystem>/tmp
# unset TEMP
# unset TMPDIR
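
The free-space check on /tmp can also be scripted. This is a minimal sketch under stated assumptions: free_kb and has_free_space are hypothetical helper names, and the threshold is the 400MB figure quoted in the note above.

```shell
#!/bin/sh
# Sketch: check that a directory has at least a given amount of free space.
# free_kb and has_free_space are hypothetical helper names.

# Print the available space (in KB) on the file system holding directory $1.
# df -P forces the one-line-per-file-system POSIX format; column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 KB available.
has_free_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

REQUIRED_KB=409600   # 400MB expressed in KB

if has_free_space /tmp "$REQUIRED_KB"; then
    echo "/tmp has enough free space for the installer"
else
    echo "/tmp is short on space; point TEMP and TMPDIR elsewhere"
fi
```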

After creating the "oracle" UNIX userid on both nodes, ensure that the environment is set up correctly by using the following .bash_profile:

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
      . ~/.bashrc
fi

alias ls="ls -FA"

# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/9.2.0

# Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
export ORACLE_SID=orcl1

export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
export LD_ASSUME_KERNEL=2.4.1
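
Because the only per-node difference in this profile is ORACLE_SID, one option is to derive it from the host name so the same file can be copied to both nodes. This is just a sketch, not part of the original setup: sid_for_host is a hypothetical helper, and it assumes host names that end in the node number (linux1, linux2) and the database name orcl used in this article.

```shell
#!/bin/sh
# Sketch: derive a per-node ORACLE_SID such as orcl1 or orcl2 from the host name.
# sid_for_host is a hypothetical helper; it assumes names like linux1, linux2.
sid_for_host() {
    sfh_host=${1%%.*}                            # drop any domain suffix
    sfh_num=${sfh_host#"${sfh_host%%[0-9]*}"}    # text from the first digit onward
    case "$sfh_num" in
        ''|*[!0-9]*)
            echo "cannot derive a node number from '$1'" >&2
            return 1 ;;
    esac
    printf 'orcl%s\n' "$sfh_num"
}

sid_for_host linux1
sid_for_host linux2
```

In .bash_profile this could replace the fixed assignment with something like `export ORACLE_SID=$(sid_for_host "$(hostname)")`, provided the host names follow that pattern.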

Creating Partitions on the Shared FireWire Storage Device (one node)

Overview

It is time to create the physical and logical volumes to be used by the Logical Volume Manager (LVM). (For a more detailed view of managing the LVM, see my article Managing Physical & Logical Volumes.) The following table lists the mappings of logical partition to tablespace that we will be accomplishing in this section of the document:

Logical Volume   RAW Volume      Symbolic Link                                        Tablespace / File Name       Tablespace / File Size  Partition Size
---------------  --------------  ---------------------------------------------------  ---------------------------  ----------------------  --------------
/dev/pv1/lvol1   /dev/raw/raw1   /u01/app/oracle/oradata/orcl/CMQuorumFile            Cluster Manager Quorum File  -                       5MB
/dev/pv1/lvol2   /dev/raw/raw2   /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile  Shared Configuration File    -                       100MB
/dev/pv1/lvol3   /dev/raw/raw3   /u01/app/oracle/oradata/orcl/spfileorcl.ora          Server Parameter File        -                       10MB
/dev/pv1/lvol4   /dev/raw/raw4   /u01/app/oracle/oradata/orcl/control01.ctl           Control File 1               -                       200MB
/dev/pv1/lvol5   /dev/raw/raw5   /u01/app/oracle/oradata/orcl/control02.ctl           Control File 2               -                       200MB
/dev/pv1/lvol6   /dev/raw/raw6   /u01/app/oracle/oradata/orcl/control03.ctl           Control File 3               -                       200MB
/dev/pv1/lvol7   /dev/raw/raw7   /u01/app/oracle/oradata/orcl/cwmlite01.dbf           CWMLITE                      50MB                    55MB
/dev/pv1/lvol8   /dev/raw/raw8   /u01/app/oracle/oradata/orcl/drsys01.dbf             DRSYS                        20MB                    25MB
/dev/pv1/lvol9   /dev/raw/raw9   /u01/app/oracle/oradata/orcl/example01.dbf           EXAMPLE                      250MB                   255MB
/dev/pv1/lvol10  /dev/raw/raw10  /u01/app/oracle/oradata/orcl/indx01.dbf              INDX                         100MB                   105MB
/dev/pv1/lvol11  /dev/raw/raw11  /u01/app/oracle/oradata/orcl/odm01.dbf               ODM                          50MB                    55MB
/dev/pv1/lvol12  /dev/raw/raw12  /u01/app/oracle/oradata/orcl/system01.dbf            SYSTEM                       800MB                   805MB
/dev/pv1/lvol13  /dev/raw/raw13  /u01/app/oracle/oradata/orcl/temp01.dbf              TEMP                         250MB                   255MB
/dev/pv1/lvol14  /dev/raw/raw14  /u01/app/oracle/oradata/orcl/tools01.dbf             TOOLS                        100MB                   105MB
/dev/pv1/lvol15  /dev/raw/raw15  /u01/app/oracle/oradata/orcl/undotbs01.dbf           UNDOTBS1                     400MB                   405MB
/dev/pv1/lvol16  /dev/raw/raw16  /u01/app/oracle/oradata/orcl/undotbs02.dbf           UNDOTBS2                     400MB                   405MB
/dev/pv1/lvol17  /dev/raw/raw17  /u01/app/oracle/oradata/orcl/users01.dbf             USERS                        100MB                   105MB
/dev/pv1/lvol18  /dev/raw/raw18  /u01/app/oracle/oradata/orcl/xdb01.dbf               XDB                          150MB                   155MB
/dev/pv1/lvol19  /dev/raw/raw19  /u01/app/oracle/oradata/orcl/perfstat01.dbf          PERFSTAT                     100MB                   105MB
/dev/pv1/lvol20  /dev/raw/raw20  /u01/app/oracle/oradata/orcl/redo01.log              REDO G1 / M1                 100MB                   105MB
/dev/pv1/lvol21  /dev/raw/raw21  /u01/app/oracle/oradata/orcl/redo02.log              REDO G2 / M1                 100MB                   105MB
/dev/pv1/lvol22  /dev/raw/raw22  /u01/app/oracle/oradata/orcl/redo03.log              REDO G3 / M1                 100MB                   105MB
/dev/pv1/lvol23  /dev/raw/raw23  /u01/app/oracle/oradata/orcl/orcl_redo2_2.log        REDO G4 / M1                 100MB                   105MB

Remove All Partitions on FireWire Shared Storage

In this example, I will be using the entire FireWire disk (no partitions), so I will use /dev/sda to create the physical and logical volumes. This is not the only way to create our LVM environment. We could also create a Linux LVM partition (type 8e) on the disk; if that LVM partition were the first partition created on the disk, we would then work with /dev/sda1. Before creating our physical and logical volumes, it is important to remove any existing partitions on the FireWire drive (since we will be using the entire disk) by using the fdisk command:

# fdisk /dev/sda
Command (m for help): p

Disk /dev/sda: 203.9 GB, 203927060480 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1     24791 199133676    c  Win95 FAT32 (LBA)

Command (m for help): d
Selected partition 1

Command (m for help): p

Disk /dev/sda: 203.9 GB, 203927060480 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Create Logical Volumes

The following set of commands performs the steps required to create logical volumes:

  1. Run the vgscan command (on all RAC nodes within the cluster) in order to create the /etc/lvmtab file.
  2. Use pvcreate to create a physical volume for use by the Logical Volume Manager (LVM).
  3. Use vgcreate to create a volume group for the drive or for the partition you want to use for RAW devices. Here we use the entire single drive. In our example (below), the command will allow up to 256 logical volumes and 256 physical volumes with a 128K extent size.
  4. Use lvcreate to create the logical volumes inside the volume group.

NOTE: As mentioned above, I needed to run the vgscan command on all nodes so that it could create the /etc/lvmtab file. This should be performed before running the commands below.

Put the following commands in a shell script, make it executable, and then run it as the "root" UNIX userid:

vgscan
pvcreate -d /dev/sda
vgcreate -l 256 -p 256 -s 128k /dev/pv1 /dev/sda
lvcreate -L 5m   /dev/pv1
lvcreate -L 100m /dev/pv1
lvcreate -L 10m  /dev/pv1
lvcreate -L 200m /dev/pv1
lvcreate -L 200m /dev/pv1
lvcreate -L 200m /dev/pv1
lvcreate -L 55m  /dev/pv1
lvcreate -L 25m  /dev/pv1
lvcreate -L 255m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 55m  /dev/pv1
lvcreate -L 805m /dev/pv1
lvcreate -L 255m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 405m /dev/pv1
lvcreate -L 405m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 155m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1

Using the script (above) will result in the creation of /dev/pv1/lvol1 - /dev/pv1/lvol23.
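
Since the 23 lvcreate calls differ only in size, the list can be generated from a single variable, which also makes it easy to total the space being allocated. The following is a minimal sketch that only prints the commands: VG, LV_SIZES_MB, and the two function names are hypothetical, while the sizes are the ones used in the script above.

```shell
#!/bin/sh
# Sketch: build the lvcreate commands from one list of sizes (in MB).
# VG, LV_SIZES_MB and the function names are hypothetical; nothing is executed.
VG=/dev/pv1
LV_SIZES_MB="5 100 10 200 200 200 55 25 255 105 55 805 255 105 405 405 105 155 105 105 105 105 105"

# Print one lvcreate command per size, in order (lvol1, lvol2, ...).
print_lvcreate_cmds() {
    for plc_size in $LV_SIZES_MB; do
        printf 'lvcreate -L %sm %s\n' "$plc_size" "$VG"
    done
}

# Print the total space (in MB) that the logical volumes will consume.
total_lv_mb() {
    tlm_total=0
    for tlm_size in $LV_SIZES_MB; do
        tlm_total=$((tlm_total + tlm_size))
    done
    printf '%s\n' "$tlm_total"
}

print_lvcreate_cmds
echo "# total: $(total_lv_mb) MB"
```

The total comes to 3970MB, or roughly 3.88GB, which is a useful cross-check against the summary line that lvscan prints once the volumes exist.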

I typically use the lvscan command to check the status of my logical volumes:

[root@linux2 root]# lvscan
lvscan -- ACTIVE            "/dev/pv1/lvol1" [5 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol2" [100 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol3" [10 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol4" [200 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol5" [200 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol6" [200 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol7" [55 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol8" [25 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol9" [255 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol10" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol11" [55 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol12" [805 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol13" [255 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol14" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol15" [405 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol16" [405 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol17" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol18" [155 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol19" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol20" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol21" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol22" [105 MB]
lvscan -- ACTIVE            "/dev/pv1/lvol23" [105 MB]
lvscan -- 23 logical volumes with 3.88 GB total in 1 volume group
lvscan -- 23 active logical volumes

Reboot All Nodes in RAC Cluster

After you have finished creating the partitions, it is recommended that you reboot all RAC nodes to make sure that all of the new partitions are recognized by the kernel on every node:

# su -
# reboot

IMPORTANT: Keep in mind that you will need to put a call to vgscan and then vgchange -a y in one of your startup scripts so that they are run at boot time on each machine in your RAC cluster. These two commands rebuild the volume manager database and then activate all volume groups. This document will provide all settings that should go into your /etc/rc.local script in order to set up each node within your Oracle9i RAC cluster.


Create RAW Bindings (both nodes)

NOTE: Several of the commands within this section will need to be performed on every node within the cluster every time that machine is booted. Details of these commands and instructions for placing them in a startup script are included in the section All Startup Commands for Each RAC Node.

In this section, I will provide the instructions for configuring raw devices on our FireWire shared storage to be used for all physical Oracle database files including the Cluster Manager Quorum File and the Shared Configuration File for srvctl.

At this point, we have already created the partitions required on our FireWire shared storage. We now need to bind all volumes to raw devices by using the raw command:

/usr/bin/raw /dev/raw/raw1 /dev/pv1/lvol1
/usr/bin/raw /dev/raw/raw2 /dev/pv1/lvol2
/usr/bin/raw /dev/raw/raw3 /dev/pv1/lvol3
/usr/bin/raw /dev/raw/raw4 /dev/pv1/lvol4
/usr/bin/raw /dev/raw/raw5 /dev/pv1/lvol5
/usr/bin/raw /dev/raw/raw6 /dev/pv1/lvol6
/usr/bin/raw /dev/raw/raw7 /dev/pv1/lvol7
/usr/bin/raw /dev/raw/raw8 /dev/pv1/lvol8
/usr/bin/raw /dev/raw/raw9 /dev/pv1/lvol9
/usr/bin/raw /dev/raw/raw10 /dev/pv1/lvol10
/usr/bin/raw /dev/raw/raw11 /dev/pv1/lvol11
/usr/bin/raw /dev/raw/raw12 /dev/pv1/lvol12
/usr/bin/raw /dev/raw/raw13 /dev/pv1/lvol13
/usr/bin/raw /dev/raw/raw14 /dev/pv1/lvol14
/usr/bin/raw /dev/raw/raw15 /dev/pv1/lvol15
/usr/bin/raw /dev/raw/raw16 /dev/pv1/lvol16
/usr/bin/raw /dev/raw/raw17 /dev/pv1/lvol17
/usr/bin/raw /dev/raw/raw18 /dev/pv1/lvol18
/usr/bin/raw /dev/raw/raw19 /dev/pv1/lvol19
/usr/bin/raw /dev/raw/raw20 /dev/pv1/lvol20
/usr/bin/raw /dev/raw/raw21 /dev/pv1/lvol21
/usr/bin/raw /dev/raw/raw22 /dev/pv1/lvol22
/usr/bin/raw /dev/raw/raw23 /dev/pv1/lvol23
/bin/chmod 600 /dev/raw/raw1
/bin/chmod 600 /dev/raw/raw2
/bin/chmod 600 /dev/raw/raw3
/bin/chmod 600 /dev/raw/raw4
/bin/chmod 600 /dev/raw/raw5
/bin/chmod 600 /dev/raw/raw6
/bin/chmod 600 /dev/raw/raw7
/bin/chmod 600 /dev/raw/raw8
/bin/chmod 600 /dev/raw/raw9
/bin/chmod 600 /dev/raw/raw10
/bin/chmod 600 /dev/raw/raw11
/bin/chmod 600 /dev/raw/raw12
/bin/chmod 600 /dev/raw/raw13
/bin/chmod 600 /dev/raw/raw14
/bin/chmod 600 /dev/raw/raw15
/bin/chmod 600 /dev/raw/raw16
/bin/chmod 600 /dev/raw/raw17
/bin/chmod 600 /dev/raw/raw18
/bin/chmod 600 /dev/raw/raw19
/bin/chmod 600 /dev/raw/raw20
/bin/chmod 600 /dev/raw/raw21
/bin/chmod 600 /dev/raw/raw22
/bin/chmod 600 /dev/raw/raw23
/bin/chown oracle:dba /dev/raw/raw1
/bin/chown oracle:dba /dev/raw/raw2
/bin/chown oracle:dba /dev/raw/raw3
/bin/chown oracle:dba /dev/raw/raw4
/bin/chown oracle:dba /dev/raw/raw5
/bin/chown oracle:dba /dev/raw/raw6
/bin/chown oracle:dba /dev/raw/raw7
/bin/chown oracle:dba /dev/raw/raw8
/bin/chown oracle:dba /dev/raw/raw9
/bin/chown oracle:dba /dev/raw/raw10
/bin/chown oracle:dba /dev/raw/raw11
/bin/chown oracle:dba /dev/raw/raw12
/bin/chown oracle:dba /dev/raw/raw13
/bin/chown oracle:dba /dev/raw/raw14
/bin/chown oracle:dba /dev/raw/raw15
/bin/chown oracle:dba /dev/raw/raw16
/bin/chown oracle:dba /dev/raw/raw17
/bin/chown oracle:dba /dev/raw/raw18
/bin/chown oracle:dba /dev/raw/raw19
/bin/chown oracle:dba /dev/raw/raw20
/bin/chown oracle:dba /dev/raw/raw21
/bin/chown oracle:dba /dev/raw/raw22
/bin/chown oracle:dba /dev/raw/raw23

NOTE: Keep in mind that the above bind steps will need to be done on each node within the RAC cluster on every startup. They will be placed in a startup script like /etc/rc.local.
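
The bind, chmod, and chown commands follow a strict raw1..raw23 pattern, so a loop can generate them. This is a sketch only: print_raw_bindings is a hypothetical helper that prints the commands rather than running them, and it groups the three commands per device instead of listing all binds first (the end result should be the same).

```shell
#!/bin/sh
# Sketch: generate the bind/chmod/chown commands for raw1..rawN.
# print_raw_bindings is a hypothetical helper; it only prints the commands.
print_raw_bindings() {
    prb_i=1
    while [ "$prb_i" -le "$1" ]; do
        printf '/usr/bin/raw /dev/raw/raw%s /dev/pv1/lvol%s\n' "$prb_i" "$prb_i"
        printf '/bin/chmod 600 /dev/raw/raw%s\n' "$prb_i"
        printf '/bin/chown oracle:dba /dev/raw/raw%s\n' "$prb_i"
        prb_i=$((prb_i + 1))
    done
}

print_raw_bindings 23
```

After reviewing the output, it could be piped to a root shell (`print_raw_bindings 23 | sh`) or pasted into the startup script.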

You can verify raw bindings by using the raw command:

# raw -qa
/dev/raw/raw1:  bound to major 58, minor 0
/dev/raw/raw2:  bound to major 58, minor 1
/dev/raw/raw3:  bound to major 58, minor 2
/dev/raw/raw4:  bound to major 58, minor 3
/dev/raw/raw5:  bound to major 58, minor 4
/dev/raw/raw6:  bound to major 58, minor 5
/dev/raw/raw7:  bound to major 58, minor 6
/dev/raw/raw8:  bound to major 58, minor 7
/dev/raw/raw9:  bound to major 58, minor 8
/dev/raw/raw10: bound to major 58, minor 9
/dev/raw/raw11: bound to major 58, minor 10
/dev/raw/raw12: bound to major 58, minor 11
/dev/raw/raw13: bound to major 58, minor 12
/dev/raw/raw14: bound to major 58, minor 13
/dev/raw/raw15: bound to major 58, minor 14
/dev/raw/raw16: bound to major 58, minor 15
/dev/raw/raw17: bound to major 58, minor 16
/dev/raw/raw18: bound to major 58, minor 17
/dev/raw/raw19: bound to major 58, minor 18
/dev/raw/raw20: bound to major 58, minor 19
/dev/raw/raw21: bound to major 58, minor 20
/dev/raw/raw22: bound to major 58, minor 21
/dev/raw/raw23: bound to major 58, minor 22

Create Symbolic Links From RAW Volumes (both nodes)

NOTE: Several of the commands within this section will need to be performed on every node within the cluster every time that machine is booted. Details of these commands and instructions for placing them in a startup script are included in the section All Startup Commands for Each RAC Node.

I generally create symbolic links from the RAW volumes to human-readable names to make file recognition easier. If you decide NOT to use symbolic links, then you will need to use the /dev/pv1/lvolX designations for the Oracle files you define when creating or maintaining tablespaces. For some people, dealing with the cryptic designations (e.g., /dev/pv1/lvol21) is simply too much trouble; it is much easier to work with human-readable names. These commands need to be issued once on each Linux server. I typically include them in the /etc/rc.local startup script. If you add tablespaces, a new logical volume, RAW binding, and link name should be added to the various files on all nodes.

mkdir /u01/app/oracle/oradata
mkdir /u01/app/oracle/oradata/orcl

ln -s /dev/raw/raw1  /u01/app/oracle/oradata/orcl/CMQuorumFile
ln -s /dev/raw/raw2  /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile
ln -s /dev/raw/raw3  /u01/app/oracle/oradata/orcl/spfileorcl.ora
ln -s /dev/raw/raw4  /u01/app/oracle/oradata/orcl/control01.ctl
ln -s /dev/raw/raw5  /u01/app/oracle/oradata/orcl/control02.ctl
ln -s /dev/raw/raw6  /u01/app/oracle/oradata/orcl/control03.ctl
ln -s /dev/raw/raw7  /u01/app/oracle/oradata/orcl/cwmlite01.dbf
ln -s /dev/raw/raw8  /u01/app/oracle/oradata/orcl/drsys01.dbf
ln -s /dev/raw/raw9  /u01/app/oracle/oradata/orcl/example01.dbf
ln -s /dev/raw/raw10 /u01/app/oracle/oradata/orcl/indx01.dbf
ln -s /dev/raw/raw11 /u01/app/oracle/oradata/orcl/odm01.dbf
ln -s /dev/raw/raw12 /u01/app/oracle/oradata/orcl/system01.dbf
ln -s /dev/raw/raw13 /u01/app/oracle/oradata/orcl/temp01.dbf
ln -s /dev/raw/raw14 /u01/app/oracle/oradata/orcl/tools01.dbf
ln -s /dev/raw/raw15 /u01/app/oracle/oradata/orcl/undotbs01.dbf
ln -s /dev/raw/raw16 /u01/app/oracle/oradata/orcl/undotbs02.dbf
ln -s /dev/raw/raw17 /u01/app/oracle/oradata/orcl/users01.dbf
ln -s /dev/raw/raw18 /u01/app/oracle/oradata/orcl/xdb01.dbf
ln -s /dev/raw/raw19 /u01/app/oracle/oradata/orcl/perfstat01.dbf
ln -s /dev/raw/raw20 /u01/app/oracle/oradata/orcl/redo01.log
ln -s /dev/raw/raw21 /u01/app/oracle/oradata/orcl/redo02.log
ln -s /dev/raw/raw22 /u01/app/oracle/oradata/orcl/redo03.log
ln -s /dev/raw/raw23 /u01/app/oracle/oradata/orcl/orcl_redo2_2.log

chown -R oracle:dba /u01/app/oracle/oradata
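
With 23 links per node, a quick consistency check can catch a mistyped target. The sketch below is an illustration only: link_points_to and check_links are hypothetical helpers, and the expected name/target pairs are supplied by the caller.

```shell
#!/bin/sh
# Sketch: verify that symbolic links point at the expected raw devices.
# link_points_to and check_links are hypothetical helper names.

# Succeed if $1 is a symbolic link whose target is exactly $2.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Read "link-name target" pairs on stdin and report any link under
# directory $1 that is missing or points somewhere else.
check_links() {
    cl_bad=0
    while read -r cl_name cl_target; do
        if ! link_points_to "$1/$cl_name" "$cl_target"; then
            echo "MISMATCH: $1/$cl_name should point to $cl_target"
            cl_bad=1
        fi
    done
    return "$cl_bad"
}
```

For example, `printf 'CMQuorumFile /dev/raw/raw1\nsystem01.dbf /dev/raw/raw12\n' | check_links /u01/app/oracle/oradata/orcl` prints nothing and returns success when both links are correct.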

Configuring the Linux Servers (both nodes)

NOTE: Several of the commands within this section will need to be performed on every node within the cluster every time that machine is booted. Details of these commands and instructions for placing them in a startup script are included in section All Startup Commands for Each RAC Node.

This section of the document focuses on configuring both Linux servers—getting each one prepared for the Oracle9i RAC installation.

Swap Space Considerations

  • Installing Oracle9i requires a minimum of 512MB of memory.
    (An inadequate amount of swap during the installation will cause the Oracle Universal Installer to either "hang" or "die")

     

  • To check the amount of memory / swap you have allocated, type either:

    # free

    - OR -

    # cat /proc/swaps

    - OR -

    # cat /proc/meminfo | grep MemTotal

     

  • If you have less than 512MB of memory (between your RAM and SWAP), you can add temporary swap space by creating a temporary swap file. This way you do not have to use a raw device or even more drastic, rebuild your system.

    As root, make a file that will act as additional swap space, let's say about 300MB:
    # dd if=/dev/zero of=tempswap bs=1k count=300000

    Now we should change the file permissions:
    # chmod 600 tempswap

    Finally we format the "partition" as swap and add it to the swap space:
    # mkswap tempswap
    # swapon tempswap
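
The RAM-plus-swap total can be computed directly from /proc/meminfo rather than added up by hand. This is a minimal sketch: mem_plus_swap_kb is a hypothetical helper, and the 512MB threshold is the figure given at the start of this section.

```shell
#!/bin/sh
# Sketch: add up RAM and swap and compare against the 512MB figure above.
# mem_plus_swap_kb is a hypothetical helper name.

# Print MemTotal + SwapTotal in KB, read from a meminfo-style file
# ($1, defaulting to /proc/meminfo).
mem_plus_swap_kb() {
    awk '/^(MemTotal|SwapTotal):/ { total += $2 } END { printf "%.0f\n", total }' \
        "${1:-/proc/meminfo}"
}

REQUIRED_KB=524288   # 512MB expressed in KB

if [ -r /proc/meminfo ]; then
    have_kb=$(mem_plus_swap_kb)
    if [ "$have_kb" -lt "$REQUIRED_KB" ]; then
        echo "Only ${have_kb} KB of RAM + swap; consider adding a temporary swap file"
    else
        echo "RAM + swap is ${have_kb} KB; no extra swap needed"
    fi
fi
```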
     

Setting Shared Memory

Shared memory allows processes to access common structures and data by placing them in a shared memory segment. This is the fastest form of interprocess communication (IPC) available—mainly due to the fact that no kernel involvement occurs when data is being passed between the processes. Data does not need to be copied between processes.

Oracle makes use of shared memory for its System Global Area (SGA), which is an area of memory shared by all Oracle background and foreground processes. Adequate sizing of the SGA is critical to Oracle performance because it is responsible for holding the database buffer cache, shared SQL, access paths, and so much more.

To determine all shared memory limits, use the following:

# ipcs -lm

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1

Setting SHMMAX

The SHMMAX parameter defines the maximum size (in bytes) for a shared memory segment. The Oracle SGA comprises shared memory, and it is possible that incorrectly setting SHMMAX could limit the size of the SGA. When setting SHMMAX, keep in mind that the size of the SGA should fit within one shared memory segment. An inadequate SHMMAX setting could result in the following:

ORA-27123: unable to attach to shared memory segment

You can determine the value of SHMMAX by performing the following:

# cat /proc/sys/kernel/shmmax
33554432

The default value for SHMMAX is 32MB. This is often too small to configure the Oracle SGA. I generally set the SHMMAX parameter to 2GB using one of the following methods:

  • You can alter the default setting for SHMMAX without rebooting the machine by making the changes directly to the /proc file system. This is the method that I use by placing the following into the /etc/rc.local startup file:
    # echo "2147483648" > /proc/sys/kernel/shmmax
  • You can also use the sysctl command to change the value of SHMMAX:
    # sysctl -w kernel.shmmax=2147483648
  • Lastly, you can make this change permanent by inserting the kernel parameter in the /etc/sysctl.conf startup file:
    # echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf

Setting SHMMNI

We now look at the SHMMNI parameter. This kernel parameter is used to set the maximum number of shared memory segments systemwide. The default value for this parameter is 4096. This value is sufficient and typically does not need to be changed.

You can determine the value of SHMMNI by performing the following:

# cat /proc/sys/kernel/shmmni
4096

Setting SHMALL

Finally, we look at the SHMALL shared memory kernel parameter. This parameter controls the total amount of shared memory (in pages) that can be used at one time on the system. In short, the value of this parameter should always be at least:

ceil(SHMMAX/PAGE_SIZE)

The default size of SHMALL is 2097152, which can be queried using the following command:

# cat /proc/sys/kernel/shmall
2097152

The default setting for SHMALL should be adequate for our Oracle9i RAC installation.

NOTE: The page size in Red Hat Linux on the i386 platform is 4,096 bytes. You can, however, use bigpages, which supports the configuration of larger memory page sizes.
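
To see why the default SHMALL is adequate, the ceil(SHMMAX/PAGE_SIZE) rule can be worked through for the 2GB SHMMAX used in this article. A minimal sketch follows; min_shmall is a hypothetical helper, and the fallback of 4096 is the i386 page size noted above.

```shell
#!/bin/sh
# Sketch: compute the smallest SHMALL (in pages) that covers a given SHMMAX.
# min_shmall is a hypothetical helper name.

# Print ceil($1 / $2), where $1 is SHMMAX in bytes and $2 is the page size in bytes.
min_shmall() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Ask the system for its page size; fall back to 4096 if getconf is unavailable.
PAGE_SIZE=$(getconf PAGESIZE 2>/dev/null || echo 4096)

min_shmall 2147483648 "$PAGE_SIZE"
```

With a 4,096-byte page, 2147483648 / 4096 works out to 524288 pages, which is well below the default SHMALL of 2097152.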

Setting Semaphores

Now that we have configured our shared memory settings, it is time to take care of configuring our semaphores. The best way to describe a semaphore is as a counter that is used to provide synchronization between processes (or threads within a process) for shared resources like shared memory. Semaphore sets are supported in System V where each one is a counting semaphore. When an application requests semaphores, it does so using "sets."

To determine all semaphore limits, use the following:

# ipcs -ls

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

You can also use the following command:

# cat /proc/sys/kernel/sem
250     32000   32      128

Setting SEMMSL

The SEMMSL kernel parameter is used to control the maximum number of semaphores per semaphore set.

Oracle recommends setting SEMMSL to the largest PROCESSES instance parameter setting in the init.ora file for all databases on the Linux system, plus 10. Oracle also recommends setting SEMMSL to a value of no less than 100.

Setting SEMMNI

The SEMMNI kernel parameter is used to control the maximum number of semaphore sets in the entire Linux system.

Oracle recommends setting the SEMMNI to a value of no less than 100.

 

Setting SEMMNS

The SEMMNS kernel parameter is used to control the maximum number of semaphores (not semaphore sets) in the entire Linux system.

Oracle recommends setting the SEMMNS to the sum of the PROCESSES instance parameter setting for each database on the system, adding the largest PROCESSES twice, and then finally adding 10 for each Oracle database on the system.

Use the following calculation to determine the maximum number of semaphores that can be allocated on a Linux system. It will be the lesser of:

SEMMNS  -or-  (SEMMSL * SEMMNI)
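Both rules can be sketched as shell arithmetic. The function names and the sample PROCESSES values below are assumptions for illustration only; substitute the settings of your own databases.

```shell
#!/bin/sh
# recommend_semmns (hypothetical helper): sum of all PROCESSES settings,
# plus the largest PROCESSES twice, plus 10 per database.
recommend_semmns() {
    sum=0
    largest=0
    count=0
    for p in "$@"; do
        sum=$((sum + p))
        if [ "$p" -gt "$largest" ]; then
            largest=$p
        fi
        count=$((count + 1))
    done
    echo $((sum + 2 * largest + 10 * count))
}

# max_semaphores (hypothetical helper): the lesser of SEMMNS and
# SEMMSL * SEMMNI.  Arguments: SEMMNS SEMMSL SEMMNI
max_semaphores() {
    product=$(($2 * $3))
    if [ "$1" -lt "$product" ]; then
        echo "$1"
    else
        echo "$product"
    fi
}

# Two illustrative databases: (150 + 100) + 2*150 + 10*2
recommend_semmns 150 100

# Default limits shown earlier: the lesser of 32000 and 250 * 128
max_semaphores 32000 250 128
```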

 

Setting SEMOPM

The SEMOPM kernel parameter is used to control the number of semaphore operations that can be performed per semop system call.

The semop system call (function) can perform operations on multiple semaphores with a single call. Because a semaphore set can contain at most SEMMSL semaphores, it is recommended to set SEMOPM equal to SEMMSL.

Oracle recommends setting the SEMOPM to a value of no less than 100.

Setting Semaphore Kernel Parameters

Finally, we see how to set all semaphore parameters using several methods. In the following, the only parameter I care about changing (raising) is SEMOPM. All other default settings should be sufficient for our example installation.

  • You can alter the default setting for all semaphore settings without rebooting the machine by making the changes directly to the /proc file system. This is the method that I use by placing the following into the /etc/rc.local startup file:
    # echo "250 32000 100 128" > /proc/sys/kernel/sem
  • You can also use the sysctl command to change the value of all semaphore settings:
    # sysctl -w kernel.sem="250 32000 100 128"
  • Finally, you can make this change permanent by inserting the kernel parameter in the /etc/sysctl.conf startup file:
    # echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
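One caveat with appending via ">>" is that running the command more than once leaves duplicate entries in the file. A possible way around this, sketched below, is to drop any existing line for the key before appending the new one. The helper name set_conf_value is an assumption, and the example works on a temporary file rather than the real /etc/sysctl.conf.

```shell
#!/bin/sh
# set_conf_value FILE KEY VALUE (hypothetical helper)
# Removes any existing "KEY=..." line from FILE, then appends KEY=VALUE,
# so repeated runs leave exactly one entry for the key.
set_conf_value() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp)
    # Escape dots so "kernel.sem" is matched literally by grep.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if [ -f "$file" ]; then
        # grep -v exits non-zero when nothing is left; that is fine here.
        grep -v "^[[:space:]]*$pattern[[:space:]]*=" "$file" > "$tmp" || true
    fi
    echo "$key=$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file.
demo=$(mktemp)
set_conf_value "$demo" kernel.sem "250 32000 32 128"
set_conf_value "$demo" kernel.sem "250 32000 100 128"
cat "$demo"
rm -f "$demo"
```

Against the real file, the call would be set_conf_value /etc/sysctl.conf kernel.sem "250 32000 100 128", run as root.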

 

Setting File Handles

When configuring our Red Hat Linux server, it is critical to ensure that the maximum number of file handles is sufficiently large. The setting for file handles denotes the number of open files that you can have on the Linux system.

Use the following command to determine the maximum number of file handles for the entire system:

# cat /proc/sys/fs/file-max
32768

Oracle recommends that the file handles for the entire system be set to at least 65536.

  • You can alter the default setting for the maximum number of file handles without rebooting the machine by making the changes directly to the /proc file system. This is the method that I use by placing the following into the /etc/rc.local startup file:
    # echo "65536" > /proc/sys/fs/file-max

     

  • You can also use the sysctl command to change the maximum number of file handles:
    # sysctl -w fs.file-max=65536

     

  • Finally, you can make this change permanent by inserting the kernel parameter in the /etc/sysctl.conf startup file:
    # echo "fs.file-max=65536" >> /etc/sysctl.conf

NOTE: You can query the current usage of file handles by using the following:

# cat /proc/sys/fs/file-nr
613     95      32768
The file-nr file displays three parameters:
  • Total allocated file handles
  • Currently used file handles
  • Maximum file handles that can be allocated
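As a small illustration, the three fields can be turned into a rough usage figure. This sketch follows the field meanings listed above; the helper name file_handle_usage is an assumption.

```shell
#!/bin/sh
# file_handle_usage (hypothetical helper): takes the three values from
# /proc/sys/fs/file-nr as one string and reports what percentage of the
# maximum has been allocated (integer division, rounded down).
file_handle_usage() {
    set -- $1
    echo "allocated=$1 used=$2 max=$3 percent_allocated=$(($1 * 100 / $3))"
}

# Example using the values shown above.
file_handle_usage "613 95 32768"

# On a live system you could instead run:
#   file_handle_usage "$(cat /proc/sys/fs/file-nr)"
```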

NOTE: If you need to increase the value in /proc/sys/fs/file-max, then make sure that the ulimit is set properly. Usually for 2.4.20 it is set to unlimited. Verify the ulimit setting by issuing the ulimit command:

# ulimit
unlimited

Configuring the hangcheck-timer Kernel Module

 

Oracle 9.0.1 and 9.2.0.1 used a userspace watchdog daemon called watchdogd to monitor the health of the cluster and to restart a RAC node in case of a failure. Starting with Oracle 9.2.0.2, however, this daemon has been deprecated in favor of a Linux kernel module named hangcheck-timer, which addresses availability and reliability problems much better. The hangcheck-timer module is loaded into the kernel and checks whether the system hangs: it sets a timer and checks that timer after a certain amount of time. A configurable threshold determines how long a hang is tolerated; if it is exceeded, the module reboots the machine. Although the hangcheck-timer module is not required for Oracle Cluster Manager operation, it is highly recommended by Oracle.

 

The hangcheck-timer.o Module

 

The hangcheck-timer module uses a kernel-based timer that periodically checks the system task scheduler to catch delays in order to determine the health of the system. If the system hangs or pauses, the timer resets the node. The hangcheck-timer module uses the Time Stamp Counter (TSC) CPU register, a counter that is incremented at each clock signal. The TSC offers much more accurate time measurements because this register is updated automatically by the hardware.

 

Much more information about the hangcheck-timer project can be found here.

 

Installing the hangcheck-timer.o Module

 

The hangcheck-timer module was normally shipped by Oracle; however, it is now included with Red Hat Linux AS starting with kernel version 2.4.9-e.12. If you followed the steps in the "Obtaining and Installing a proper Linux Kernel" section, the hangcheck-timer module is already included for you. Use the following to ensure that you have the module included:
# find /lib/modules -name "hangcheck-timer.o"
/lib/modules/2.4.21-9.0.1.ELorafw1/kernel/drivers/char/hangcheck-timer.o

 

Configuring and Loading the hangcheck-timer Module

 

There are two key parameters to the hangcheck-timer module.
  • hangcheck_tick: This parameter defines the period of time between checks of system health. The default value is 60 seconds. Oracle recommends setting it to 30 seconds.

     

  • hangcheck_margin: This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds. Oracle recommends setting it to 180 seconds.
These two parameters need to be coordinated with the MissCount parameter in the $ORACLE_HOME/oracm/admin/cmcfg.ora file for the Cluster Manager.

NOTE: The two hangcheck-timer module parameters indicate how long a RAC node must hang before it will reset the system. A node reset will occur when the following is true:

system hang time > (hangcheck_tick + hangcheck_margin)
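To make the condition concrete, here is a minimal sketch that evaluates it for a given hang duration. The function name node_would_reset is an assumption, and the hang times are made-up sample values.

```shell
#!/bin/sh
# node_would_reset (hypothetical helper)
# Arguments: hang time in seconds, hangcheck_tick, hangcheck_margin
# Prints "yes" when hang time > (hangcheck_tick + hangcheck_margin).
node_would_reset() {
    if [ "$1" -gt $(($2 + $3)) ]; then
        echo yes
    else
        echo no
    fi
}

# With tick=30 and margin=180 the threshold is 210 seconds.
node_would_reset 200 30 180
node_would_reset 211 30 180
```

With the recommended settings, a 200-second hang is tolerated while a 211-second hang is not.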
Now let's talk about how to load the module. You can load the module with the correct parameter settings manually by using the following:
# su -
# /sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
# grep Hangcheck /var/log/messages*
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel: Hangcheck: 
  starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds).
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel: Hangcheck: Using TSC.
Put the above "insmod" command in your /etc/rc.local file!

Although the manual method for loading the module (above) will work, we need a way to load the module with the correct parameters on every reboot of the node. We do this by making entries in the /etc/modules.conf file. Add the following line to the /etc/modules.conf file:

# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modules.conf
Now, to test the module, use the modprobe command. You can run the modprobe command to manually load the hangcheck-timer module with the configured parameters defined in the /etc/modules.conf file:
# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages*
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel: 
   Hangcheck: starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds).
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel:
   Hangcheck: Using TSC.

NOTE: You don't have to run modprobe after each reboot. The hangcheck-timer module will be loaded by the kernel (automatically) when needed.
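If you want to confirm the values the module actually picked up, the startup message can be parsed rather than read by eye. The sketch below assumes the message appears on a single line in the log (it is wrapped in the listing above only for display); the helper name hangcheck_settings is made up for this example.

```shell
#!/bin/sh
# hangcheck_settings (hypothetical helper): reads log lines on stdin and
# prints the tick and margin values from any hangcheck startup message.
hangcheck_settings() {
    sed -n 's/.*tick is \([0-9][0-9]*\) seconds, margin is \([0-9][0-9]*\) seconds.*/tick=\1 margin=\2/p'
}

# Example using the message shown above, joined onto one line.
echo "Apr 30 20:51:47 linux2 kernel: Hangcheck: starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds)." | hangcheck_settings

# Against the real logs you could run:
#   cat /var/log/messages* | hangcheck_settings
```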


Configure RAC Nodes for Remote Access

When running the Oracle Installer on a RAC node, it will use the rsh command to copy the Oracle software to all other nodes within the RAC cluster. The oracle UNIX account on the node running the Oracle Installer (runInstaller) must be trusted by all other nodes in your RAC cluster. This means that you should be able to run r* commands like rsh, rcp, and rlogin on this RAC node against other RAC nodes without a password. The rsh daemon validates users using the /etc/hosts.equiv file and the .rhosts file found in the user's (oracle's) home directory. Unfortunately, SSH is not supported.

First, let's make sure that we have the rsh RPMs installed on each node in the RAC cluster:

# rpm -q rsh rsh-server
rsh-0.17-19
rsh-server-0.17-19
From the above, we can see that we have the rsh and rsh-server installed.

NOTE: If rsh is not installed, run the following command:

# su -
# rpm -ivh rsh-0.17-5.i386.rpm rsh-server-0.17-5.i386.rpm

To enable the "rsh" service, the "disable" attribute in the /etc/xinetd.d/rsh file must be set to "no" and xinetd must be refreshed. This can be done by running the following commands:

# su -
# chkconfig rsh on
# chkconfig rlogin on
# service xinetd reload
Reloading configuration: [  OK  ]
To allow the "oracle" UNIX user account to be trusted among the RAC nodes, create the /etc/hosts.equiv file:
# su -
# touch /etc/hosts.equiv
# chmod 600 /etc/hosts.equiv
# chown root.root /etc/hosts.equiv
Now add all RAC nodes to the /etc/hosts.equiv file similar to the following example:
# cat /etc/hosts.equiv
+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle

Be sure that the /etc/hosts.equiv file exists on all nodes in your RAC cluster!

NOTE: In the above example, the second field permits only the oracle user account to run rsh commands on the specified nodes. For security reasons, the /etc/hosts.equiv file should be owned by root and the permissions should be set to 600. In fact, some systems will only honor the content of this file if the owner of this file is root and the permissions are set to 600.

NOTE: Before attempting to test your rsh command, ensure that you are using the correct version of rsh. By default, Red Hat Linux puts the Kerberos directories (/usr/kerberos/sbin and /usr/kerberos/bin) at the head of the $PATH variable. This will cause the Kerberos version of rsh to be executed.

I will typically rename the Kerberos version of rsh so that the normal rsh command is being used. Use the following:

# su -

# which rsh
/usr/kerberos/bin/rsh

# cd /usr/kerberos/bin
# mv rsh rsh.original

# which rsh
/usr/bin/rsh

You should now test your connections and run the rsh command against each RAC node. I will be using the node linux1 to perform the install.

# su - oracle

$ rsh int-linux1 ls -l /etc/hosts.equiv
-rw-------    1 root     root           68 May  2 14:45 /etc/hosts.equiv

$ rsh int-linux2 ls -l /etc/hosts.equiv
-rw-------    1 root     root           68 May  2 14:45 /etc/hosts.equiv
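With more than two nodes, the same check is easier to run as a loop. The following is only a sketch: the function name check_remote_nodes and the REMOTE_SH variable are assumptions, and it relies on the remote shell client returning a non-zero status when the connection is refused or a password would be required.

```shell
#!/bin/sh
# REMOTE_SH is an illustrative variable naming the remote shell client.
REMOTE_SH=${REMOTE_SH:-rsh}

# check_remote_nodes (hypothetical helper): runs "hostname" on every node
# given as an argument and reports OK or FAILED for each one.
# Returns non-zero if any node failed.
check_remote_nodes() {
    failed=0
    for node in "$@"; do
        if "$REMOTE_SH" "$node" hostname >/dev/null 2>&1; then
            echo "$node: OK"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, as the oracle user on the node that will run the installer:
#   check_remote_nodes int-linux1 int-linux2
```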

 


All Startup Commands for Each RAC Node (both nodes)

Up to this point, we have talked in great detail about the parameters and resources that will need to be configured on both nodes for our Oracle9i RAC configuration. In this section we will take a deep breath and recap those parameters, commands, and entries (in previous sections of this document) that need to happen on each node when the machine is booted.

In this section, I provide all of the commands, parameters, and entries that we have talked about so far that will need to be included into all startup scripts for each Linux node in the RAC cluster. For each of the startup files below, I have bolded the entries that should be included in each of the startup files in order to provide a successful RAC node.

File: /etc/modules.conf

All kernel parameters and modules that need to be configured.

alias eth0 tulip
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias ieee1394-controller ohci1394
alias sound-slot-0 cmpci
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-remove sbp2 rmmod sd_mod
post-install sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -L >/dev/null 2>&1 || :
pre-remove sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -S >/dev/null 2>&1 || :
alias eth1 8139too
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

File: /etc/sysctl.conf

We wanted to adjust the default and maximum send buffer size and default and maximum receive buffer size for our interconnect.

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144

File: /etc/hosts

 

All machine/IP entries for nodes in our RAC cluster.
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain   localhost
192.168.1.100   linux1
192.168.2.100   int-linux1
192.168.1.101   linux2
192.168.2.101   int-linux2
192.168.1.102   alex
192.168.1.105   bartman

File: /etc/hosts.equiv

 

Allow logins to each node as the oracle user account without the need for a password.
+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle

File: /etc/grub.conf

 

Determine which kernel to use when the node is booted.
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/hda3
#          initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Fedora Core (2.4.21-9.0.1.ELorafw1)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-9.0.1.ELorafw1 ro root=LABEL=/ rhgb
        initrd /initrd-2.4.21-9.0.1.ELorafw1.img
title Fedora Core (2.4.22-1.2115.nptl)
        root (hd0,0)
        kernel /vmlinuz-2.4.22-1.2115.nptl ro root=LABEL=/ rhgb
        initrd /initrd-2.4.22-1.2115.nptl.img
File: /etc/rc.local

 

These commands are responsible for binding volumes to raw devices, kernel setpoints, activating volume groups, creating symbolic links—all to prepare our shared storage for each node.
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

vgscan
vgchange -a y


# +---------------------------------------------------------+
# | SHARED MEMORY                                           |
# +---------------------------------------------------------+

echo "2147483648" > /proc/sys/kernel/shmmax
echo       "4096" > /proc/sys/kernel/shmmni


# +---------------------------------------------------------+
# | SEMAPHORES                                              |
# | ----------                                              |
# |                                                         |
# | SEMMSL_value  SEMMNS_value  SEMOPM_value  SEMMNI_value  |
# |                                                         |
# +---------------------------------------------------------+

echo "250 32000 100 128" > /proc/sys/kernel/sem


# +---------------------------------------------------------+
# | FILE HANDLES                                            |
# +---------------------------------------------------------+

echo "65536" > /proc/sys/fs/file-max


# +---------------------------------------------------------+
# | BIND ALL RAW DEVICES                                    |
# +---------------------------------------------------------+

/usr/bin/raw /dev/raw/raw1 /dev/pv1/lvol1
/usr/bin/raw /dev/raw/raw2 /dev/pv1/lvol2
/usr/bin/raw /dev/raw/raw3 /dev/pv1/lvol3
/usr/bin/raw /dev/raw/raw4 /dev/pv1/lvol4
/usr/bin/raw /dev/raw/raw5 /dev/pv1/lvol5
/usr/bin/raw /dev/raw/raw6 /dev/pv1/lvol6
/usr/bin/raw /dev/raw/raw7 /dev/pv1/lvol7
/usr/bin/raw /dev/raw/raw8 /dev/pv1/lvol8
/usr/bin/raw /dev/raw/raw9 /dev/pv1/lvol9
/usr/bin/raw /dev/raw/raw10 /dev/pv1/lvol10
/usr/bin/raw /dev/raw/raw11 /dev/pv1/lvol11
/usr/bin/raw /dev/raw/raw12 /dev/pv1/lvol12
/usr/bin/raw /dev/raw/raw13 /dev/pv1/lvol13
/usr/bin/raw /dev/raw/raw14 /dev/pv1/lvol14
/usr/bin/raw /dev/raw/raw15 /dev/pv1/lvol15
/usr/bin/raw /dev/raw/raw16 /dev/pv1/lvol16
/usr/bin/raw /dev/raw/raw17 /dev/pv1/lvol17
/usr/bin/raw /dev/raw/raw18 /dev/pv1/lvol18
/usr/bin/raw /dev/raw/raw19 /dev/pv1/lvol19
/usr/bin/raw /dev/raw/raw20 /dev/pv1/lvol20
/usr/bin/raw /dev/raw/raw21 /dev/pv1/lvol21
/usr/bin/raw /dev/raw/raw22 /dev/pv1/lvol22
/usr/bin/raw /dev/raw/raw23 /dev/pv1/lvol23
/bin/chmod 600 /dev/raw/raw1
/bin/chmod 600 /dev/raw/raw2
/bin/chmod 600 /dev/raw/raw3
/bin/chmod 600 /dev/raw/raw4
/bin/chmod 600 /dev/raw/raw5
/bin/chmod 600 /dev/raw/raw6
/bin/chmod 600 /dev/raw/raw7
/bin/chmod 600 /dev/raw/raw8
/bin/chmod 600 /dev/raw/raw9
/bin/chmod 600 /dev/raw/raw10
/bin/chmod 600 /dev/raw/raw11
/bin/chmod 600 /dev/raw/raw12
/bin/chmod 600 /dev/raw/raw13
/bin/chmod 600 /dev/raw/raw14
/bin/chmod 600 /dev/raw/raw15
/bin/chmod 600 /dev/raw/raw16
/bin/chmod 600 /dev/raw/raw17
/bin/chmod 600 /dev/raw/raw18
/bin/chmod 600 /dev/raw/raw19
/bin/chmod 600 /dev/raw/raw20
/bin/chmod 600 /dev/raw/raw21
/bin/chmod 600 /dev/raw/raw22
/bin/chmod 600 /dev/raw/raw23
/bin/chown oracle:dba /dev/raw/raw1
/bin/chown oracle:dba /dev/raw/raw2
/bin/chown oracle:dba /dev/raw/raw3
/bin/chown oracle:dba /dev/raw/raw4
/bin/chown oracle:dba /dev/raw/raw5
/bin/chown oracle:dba /dev/raw/raw6
/bin/chown oracle:dba /dev/raw/raw7
/bin/chown oracle:dba /dev/raw/raw8
/bin/chown oracle:dba /dev/raw/raw9
/bin/chown oracle:dba /dev/raw/raw10
/bin/chown oracle:dba /dev/raw/raw11
/bin/chown oracle:dba /dev/raw/raw12
/bin/chown oracle:dba /dev/raw/raw13
/bin/chown oracle:dba /dev/raw/raw14
/bin/chown oracle:dba /dev/raw/raw15
/bin/chown oracle:dba /dev/raw/raw16
/bin/chown oracle:dba /dev/raw/raw17
/bin/chown oracle:dba /dev/raw/raw18
/bin/chown oracle:dba /dev/raw/raw19
/bin/chown oracle:dba /dev/raw/raw20
/bin/chown oracle:dba /dev/raw/raw21
/bin/chown oracle:dba /dev/raw/raw22
/bin/chown oracle:dba /dev/raw/raw23

# +---------------------------------------------------------+
# | CREATE SYMBOLIC LINKS                                   |
# +---------------------------------------------------------+

# -p avoids an error on later boots, when the directories already exist
mkdir -p /u01/app/oracle/oradata/orcl

ln -s /dev/raw/raw1  /u01/app/oracle/oradata/orcl/CMQuorumFile
ln -s /dev/raw/raw2  /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile
ln -s /dev/raw/raw3  /u01/app/oracle/oradata/orcl/spfileorcl.ora
ln -s /dev/raw/raw4  /u01/app/oracle/oradata/orcl/control01.ctl
ln -s /dev/raw/raw5  /u01/app/oracle/oradata/orcl/control02.ctl
ln -s /dev/raw/raw6  /u01/app/oracle/oradata/orcl/control03.ctl
ln -s /dev/raw/raw7  /u01/app/oracle/oradata/orcl/cwmlite01.dbf
ln -s /dev/raw/raw8  /u01/app/oracle/oradata/orcl/drsys01.dbf
ln -s /dev/raw/raw9  /u01/app/oracle/oradata/orcl/example01.dbf
ln -s /dev/raw/raw10 /u01/app/oracle/oradata/orcl/indx01.dbf
ln -s /dev/raw/raw11 /u01/app/oracle/oradata/orcl/odm01.dbf
ln -s /dev/raw/raw12 /u01/app/oracle/oradata/orcl/system01.dbf
ln -s /dev/raw/raw13 /u01/app/oracle/oradata/orcl/temp01.dbf
ln -s /dev/raw/raw14 /u01/app/oracle/oradata/orcl/tools01.dbf
ln -s /dev/raw/raw15 /u01/app/oracle/oradata/orcl/undotbs01.dbf
ln -s /dev/raw/raw16 /u01/app/oracle/oradata/orcl/undotbs02.dbf
ln -s /dev/raw/raw17 /u01/app/oracle/oradata/orcl/users01.dbf
ln -s /dev/raw/raw18 /u01/app/oracle/oradata/orcl/xdb01.dbf
ln -s /dev/raw/raw19 /u01/app/oracle/oradata/orcl/perfstat01.dbf
ln -s /dev/raw/raw20 /u01/app/oracle/oradata/orcl/redo01.log
ln -s /dev/raw/raw21 /u01/app/oracle/oradata/orcl/redo02.log
ln -s /dev/raw/raw22 /u01/app/oracle/oradata/orcl/redo03.log
ln -s /dev/raw/raw23 /u01/app/oracle/oradata/orcl/orcl_redo2_2.log

chown -R oracle:dba /u01/app/oracle/oradata

/sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
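The 69 raw, chmod, and chown lines in the listing above follow one pattern, so they could also be generated by a loop. This is a sketch of that alternative, not the script used for this article: the function names and the DRY_RUN switch are assumptions, and it processes one device at a time instead of grouping the commands by type.

```shell
#!/bin/sh
# run (hypothetical helper): with DRY_RUN=1 the command is only printed,
# otherwise it is executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# bind_raw_devices (hypothetical helper): binds /dev/raw/rawN to
# /dev/pv1/lvolN for N = 1..$1, then sets permissions and ownership.
bind_raw_devices() {
    i=1
    while [ "$i" -le "$1" ]; do
        run /usr/bin/raw "/dev/raw/raw$i" "/dev/pv1/lvol$i"
        run /bin/chmod 600 "/dev/raw/raw$i"
        run /bin/chown oracle:dba "/dev/raw/raw$i"
        i=$((i + 1))
    done
}

# Preview the commands for the first two volumes without running them.
DRY_RUN=1
bind_raw_devices 2
```

In /etc/rc.local the call would be bind_raw_devices 23 with DRY_RUN left unset.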

Update Red Hat Linux System (Oracle Metalink Note #252217.1) (both nodes)

The following RPMs, all of which are available on the Red Hat Fedora Core 1 CDs, will need to be updated as per the steps described in Metalink Note #252217.1, "Requirements for Installing Oracle9iR2 on RHEL3."

All of these packages will need to be installed as the root user.

From Fedora Core 1 / Disk #1

# cd /mnt/cdrom/Fedora/RPMS
# rpm -Uvh libpng10-1.0.13-9.i386.rpm
From Fedora Core 1 / Disk #2
# cd /mnt/cdrom/Fedora/RPMS
# rpm -Uvh gnome-libs-1.4.1.2.90-35.i386.rpm
From Fedora Core 1 / Disk #3
# cd /mnt/cdrom/Fedora/RPMS
# rpm -Uvh compat-libstdc++-7.3-2.96.118.i386.rpm
# rpm -Uvh compat-libstdc++-devel-7.3-2.96.118.i386.rpm
# rpm -Uvh compat-db-4.0.14-2.i386.rpm
# rpm -Uvh compat-gcc-7.3-2.96.118.i386.rpm
# rpm -Uvh compat-gcc-c++-7.3-2.96.118.i386.rpm
# rpm -Uvh sysstat-4.0.7-5.i386.rpm      
# rpm -Uvh openmotif21-2.1.30-8.i386.rpm
# rpm -Uvh pdksh-5.2.14-23.i386.rpm
Set gcc296 and g++296 in PATH

 

Put gcc296 and g++296 first in $PATH variable by creating the following symbolic links:
# mv /usr/bin/gcc /usr/bin/gcc323
# mv /usr/bin/g++ /usr/bin/g++323
# ln -s /usr/bin/gcc296 /usr/bin/gcc
# ln -s /usr/bin/g++296 /usr/bin/g++
Check hostname

 

Make sure the hostname command returns a fully qualified host name by amending the /etc/hosts file if necessary:
# hostname
Install the 3006854 patch: The Oracle/Linux Patch 3006854 can be downloaded here.
# unzip p3006854_9204_LINUX.zip
# cd 3006854
# sh rhel3_pre_install.sh
Reboot the System

 

At this point, reboot all nodes within the RAC cluster before attempting to install the Oracle software.
# init 6
Download/Unpack the Oracle9i Installation Files

It is now time to download and extract the Oracle9i RDBMS software for Linux. As of March 26, 2004, Oracle includes the Oracle9i RDBMS software with the 9.2.0.4.0 patchset already included. This will save considerable time since the patchset does not have to be downloaded and installed.

  • Login as the newly created "oracle" user account. (su - oracle). Most of the actions throughout the rest of this document should be done as the "oracle" user account unless otherwise noted.
     
  • Download Oracle9i (9.2.0.4.0) for Linux x86 from OTN. (If you do not currently have an OTN account, you will need to create one. This is a FREE account!)
     
  • Save the following files to a temporary directory:
     
    • ship_9204_linux_disk1.cpio.gz    (538,906,295 bytes)
    • ship_9204_linux_disk2.cpio.gz    (632,756,922 bytes)
    • ship_9204_linux_disk3.cpio.gz    (296,127,243 bytes)
       
  • Run "gunzip <filename>" on all the files (e.g., gunzip ship_9204_linux_disk1.cpio.gz)
     
  • Extract the cpio archives with the command cpio -idmv < <filename> (e.g., cpio -idmv < ship_9204_linux_disk1.cpio)

     

    NOTE: Some browsers will uncompress the files but leave the extension the same (gz) when downloading. If the above steps do not work for you, try skipping the gunzip step and going directly to the cpio step without changing the filename (e.g., "cpio -idmv < ship_9204_linux_disk1.cpio.gz").
     
  • You should now have three directories called "Disk1, Disk2 and Disk3" containing the Oracle9i Installation files:

    /Disk1
    /Disk2
    /Disk3
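Because a browser may or may not have already uncompressed a download, it can help to test each file before choosing how to extract it. The sketch below uses gzip -t to check whether a file really is gzip-compressed and prints the matching extraction command. The helper name plan_unpack is an assumption, and piping gunzip -c into cpio is a substitute for the separate gunzip step described above.

```shell
#!/bin/sh
# plan_unpack (hypothetical helper): prints the command that would extract
# the given archive, depending on whether it is really gzip-compressed.
plan_unpack() {
    f=$1
    if gzip -t "$f" 2>/dev/null; then
        # Genuine gzip file: decompress on the fly and feed cpio.
        echo "gunzip -c $f | cpio -idmv"
    else
        # Already uncompressed despite the .gz extension.
        echo "cpio -idmv < $f"
    fi
}

# Usage, from the directory holding the three downloads:
#   for f in ship_9204_linux_disk1.cpio.gz \
#            ship_9204_linux_disk2.cpio.gz \
#            ship_9204_linux_disk3.cpio.gz; do
#       plan_unpack "$f"
#   done
```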


Install Oracle9i Cluster Manager

Introduction

At this point, all of the pre-installation steps should have been performed on all RAC nodes, and the nodes should have been rebooted.

This section of the document provides the instructions for installing and configuring the Oracle9i Cluster Manager (Node Monitor) oracm software on all RAC nodes. The runInstaller command only needs to be run from one of the RAC nodes. I will be running the Oracle Installer from linux1.

Create the Cluster Manager Quorum File

If we were using a Clustered File System, it would be appropriate at this time to create the Cluster Manager Quorum File. Because we are using raw devices, however, we simply need to recognize the raw partition we will be using for the Cluster Manager Quorum File and then create a symbolic link to the raw device.

For our example, we already created a 5MB RAW partition, named it /dev/raw/raw1 and created the symbolic link that pointed to it:

# su - oracle
$ ls -l oradata/orcl/CMQuorumFile
lrwxrwxrwx  1 oracle   dba    13 May  2 20:28 oradata/orcl/CMQuorumFile -> /dev/raw/raw1
Testing rsh

You should test your connections and run the rsh command against each RAC node. I will be using the node linux1 to perform the install.

# su - oracle

$ rsh int-linux1 hostname
linux1

$ rsh int-linux2 hostname
linux2

Installing Oracle9i Cluster Manager

To install the Oracle Cluster Manager software, simply navigate to Disk1 of the Oracle installation software and run the runInstaller command:

$ su - oracle
$ cd oracle_install/Disk1
$ ./runInstaller
Initializing Java Virtual Machine from /tmp/OraInstall2004-05-02_08-45-13PM/jre/bin/java. Please wait...
Screen Name / Response
  • Welcome Screen: Click "Next"
  • Inventory Location: Click "OK"
  • UNIX Group Name: Use "dba"
  • Root Script Window: Open another window, log in as the root user, and run "/tmp/orainstRoot.sh". When the script has completed, return to the dialog from the Oracle Installer and click Continue.
  • File Locations: Leave the "Source Path" at its default setting. For the Destination name, I like to use "Oracle9iRAC." You can leave the Destination path at its default value, which should be /u01/app/oracle/product/9.2.0.
  • Available Products: Select "Oracle Cluster Manager 9.2.0.4.0"
  • Public Node Information:
    • Public Node 1: linux1
    • Public Node 2: linux2
  • Private Node Information:
    • Private Node 1: int-linux1
    • Private Node 2: int-linux2
  • Quorum Disk Information: /u01/app/oracle/oradata/orcl/CMQuorumFile
  • Summary: Click "Install"

When the installation of Oracle Cluster Manager is complete, simply click the "Exit" button.

Configuring Oracle9i Cluster Manager (both nodes)

If this installation were using Oracle Cluster Manager below 9.2.0.2, we would need to configure the watchdog daemon. This is not necessary because the version we are using is 9.2.0.4.0. Starting with 9.2.0.2.0, Oracle has replaced the watchdog daemon with the hangcheck-timer kernel module. We will need to update some of those configuration files and this section describes those changes.

ADD the following line to the $ORACLE_HOME/oracm/admin/cmcfg.ora file:

KernelModuleName=hangcheck-timer
ADJUST the value of the MissCount parameter in the $ORACLE_HOME/oracm/admin/cmcfg.ora file based on the sum of the hangcheck_tick and hangcheck_margin values. The MissCount parameter must be set to a minimum value of 60 and it must be greater than the sum of hangcheck_tick + hangcheck_margin. In our example, hangcheck_tick + hangcheck_margin is 210. I will therefore set MissCount in $ORACLE_HOME/oracm/admin/cmcfg.ora to the value 210.
MissCount=210
Your $ORACLE_HOME/oracm/admin/cmcfg.ora file should now look similar to the following:
HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=1000
MissCount=210
PrivateNodeNames=int-linux1 int-linux2
PublicNodeNames=linux1 linux2
ServicePort=9998
CmDiskFile=/u01/app/oracle/oradata/orcl/CMQuorumFile
HostName=int-linux1
KernelModuleName=hangcheck-timer
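Since this file has to be edited on both nodes, a quick check of the two changed entries can catch a missed edit. The following is a rough sketch: the function names are assumptions, and it verifies only that KernelModuleName is set to hangcheck-timer and that MissCount meets the minimum of 60 mentioned above.

```shell
#!/bin/sh
# cmcfg_get FILE NAME (hypothetical helper): prints the value of the
# NAME=value entry in FILE (the last one if it appears more than once).
cmcfg_get() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# check_cmcfg FILE (hypothetical helper): verifies the two entries that
# this section changes.
check_cmcfg() {
    module=$(cmcfg_get "$1" KernelModuleName)
    miss=$(cmcfg_get "$1" MissCount)
    if [ "$module" != "hangcheck-timer" ]; then
        echo "KernelModuleName is not hangcheck-timer"
        return 1
    fi
    if ! [ "${miss:-0}" -ge 60 ] 2>/dev/null; then
        echo "MissCount is missing or below 60"
        return 1
    fi
    echo "cmcfg.ora looks consistent (MissCount=$miss)"
}

# Usage on each node, with the Oracle environment set:
#   check_cmcfg "$ORACLE_HOME/oracm/admin/cmcfg.ora"
```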
Starting and Stopping Oracle9i Cluster Manager (both nodes)

You should now start the Cluster Manager (CM) and Node Monitor oracm. To do this, run the following commands on all nodes in the RAC cluster:

# su -
# . ~oracle/.bash_profile    # Set Oracle environment
# $ORACLE_HOME/oracm/bin/ocmstart.sh
oracm </dev/null 2>&1 >/u01/app/oracle/product/9.2.0/oracm/log/cm.out &
After starting the Cluster Manager (CM) and Node Monitor, check to make sure it is running:
# ps -ef | grep oracm | grep -v 'grep'
root      5476     1  0 18:24 pts/1    00:00:00 oracm
root      5478  5476  0 18:24 pts/1    00:00:00 oracm
root      5479  5478  0 18:24 pts/1    00:00:00 oracm
root      5480  5478  0 18:24 pts/1    00:00:00 oracm
root      5481  5478  0 18:24 pts/1    00:00:00 oracm
root      5483  5478  0 18:24 pts/1    00:00:00 oracm
root      5484  5478  0 18:24 pts/1    00:00:00 oracm
root      5485  5478  0 18:24 pts/1    00:00:00 oracm
root      5486  5478  0 18:24 pts/1    00:00:00 oracm
root      5491  5478  0 18:24 pts/1    00:00:00 oracm
If you need to stop the oracm process, you have to kill it at the O/S level:
# su -

# pkill oracm
If you would like more information on administering the Oracle Cluster Manager, see Oracle9i Real Application Clusters Administration.

NOTE: If you only see one oracm process in the process table when you run the ps command, it is probably because you have the procps RPM installed. The ps command that comes with the procps RPM does not show a thread as a separate process in the ps output.

# rpm -qf /bin/ps
procps-2.0.17-1

Installing Oracle9i RAC

Required RPMs

 

Before attempting to install the Oracle9i Real Application Cluster 9.2.0.4.0 software (RAC software + database software), you need to ensure that the pdksh and ncurses4 RPMs are installed on all RAC nodes. If these packages are not installed, you will get the following error message when attempting to run the $ORACLE_HOME/root.sh file on each RAC node during the software installation:
...
error: failed dependencies:
        libncurses.so.4 is needed by orclclnt-nw_lssv.Build.71-1
error: failed dependencies:
        orclclnt = nw_lssv.Build.71-1 is needed by orcldrvr-nw_lssv.Build.71-1
error: failed dependencies:
        orclclnt = nw_lssv.Build.71-1 is needed by orclnode-nw_lssv.Build.71-1
        orcldrvr = nw_lssv.Build.71-1 is needed by orclnode-nw_lssv.Build.71-1
        libscsi.so is needed by orclnode-nw_lssv.Build.71-1
        libsji.so is needed by orclnode-nw_lssv.Build.71-1
error: failed dependencies:
        orclclnt = nw_lssv.Build.71-1 is needed by orclserv-nw_lssv.Build.71-1
        orclnode = nw_lssv.Build.71-1 is needed by orclserv-nw_lssv.Build.71-1
        /bin/ksh is needed by orclserv-nw_lssv.Build.71-1
package orclman-nw_lssv.Build.71-1 is already installed

**      Installation of LSSV did not succeed.  Please refer
**      to the Installation Guide at http://www.legato.com/LSSV
**      and contact Oracle customer support if necessary.
Before attempting the installation, check that the two required RPMs are installed by running the following command:
# rpm -q pdksh ncurses4
pdksh-5.2.14-23
ncurses4-5.0-12
If you need to install these RPMs, then locate the RPMs on the Red Hat Linux CDs and run the following:
# su -
# rpm -Uvh pdksh-5.2.14-23.i386.rpm ncurses4-5.0-12.i386.rpm
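To list only the packages that are missing, the query can be wrapped in a short loop. This sketch relies on "rpm -q" returning a non-zero status for a package that is not installed; the function name missing_rpms and the RPM_QUERY variable are assumptions for illustration.

```shell
#!/bin/sh
# RPM_QUERY is an illustrative variable holding the query command.
RPM_QUERY=${RPM_QUERY:-rpm -q}

# missing_rpms (hypothetical helper): prints the name of every package
# given as an argument that the query command reports as not installed.
missing_rpms() {
    for pkg in "$@"; do
        if ! $RPM_QUERY "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Usage on each RAC node (no output means both packages are present):
#   missing_rpms pdksh ncurses4
```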
Creating the Shared Configuration File for srvctl

If we were using a Clustered File System, it would be appropriate at this time to create the Shared Configuration File. Because we are using raw devices, however, we simply need to recognize the raw partition we will be using for the Shared Configuration File and then create a symbolic link to the raw device.

For our example, we already created a 100MB RAW partition, named it /dev/raw/raw2 and created the symbolic link that pointed to it:

# su - oracle
$ ls -l oradata/orcl/SharedSrvctlConfigFile
lrwxrwxrwx  1 oracle   dba    13 May  2 20:28 oradata/orcl/SharedSrvctlConfigFile -> /dev/raw/raw2

Removing system01.dbf Symbolic Link

Before starting the Oracle Universal Installer for Oracle9i RAC, you will need to remove the symbolic link you have for the SYSTEM tablespace (system01.dbf) on all nodes in the RAC cluster. After the Oracle installation is complete, we will be re-creating the symbolic link. To remove the symbolic link:

# su - oracle
$ rm oradata/orcl/system01.dbf
If you fail to remove this symbolic link, an error will be displayed before the installation can start.

Installing Oracle9i 9.2.0.4.0 Database Software with Oracle9i RAC

To install the Oracle9i 9.2.0.4.0 Database Software with Oracle9i RAC software, simply navigate to Disk1 of the Oracle installation software and run the runInstaller command:

$ su - oracle
$ cd oracle_install/Disk1
$ ./runInstaller
Initializing Java Virtual Machine from /tmp/OraInstall2004-05-03_06-51-06PM/jre/bin/java. Please wait...
Screen Name: Response
Welcome Screen: Click "Next"
Cluster Node Selection: Select (highlight) all RAC nodes by holding the Shift key and clicking each node with the left mouse button. If not all of the nodes in your RAC cluster are showing up, or if the Node Selection Screen does not appear, then the Oracle Cluster Manager (Node Manager) oracm is probably not running on all RAC nodes. For more information, see "Starting and Stopping Oracle9i Cluster Manager" under the "Installing Oracle9i Cluster Manager" section.
Inventory Location: Keep all defaults and click "OK"
Available Products: Select "Oracle9i Database 9.2.0.4.0" and click "Next"
Installation Types: Select "Enterprise Edition (2.84GB)" and click "Next"
Database Configuration: Select "Software Only" and click "Next"
Shared Configuration File Name: /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile
Summary: Click "Install"

Notes During the Oracle Installation

  • Errors during the "Link" phase (ins_oemagent.mk / ins_ctx.mk): You should not receive any errors during the Link phase. For those of you familiar with installing Oracle on Linux, errors during the Link phase are nothing new. I was surprised not to receive any errors during the Link phase of this install, as I did when installing Oracle 9.2.0.4.0 on Fedora Core 1. If you do receive any errors during the Link phase, you can read my article "Installing Oracle9i (9.2.0.4.0) on Red Hat Linux (Fedora Core 1)."
     
  • Performing remote operations (99%): While the installation dialog displays "Performing remote operations (99%)", you will see a command similar to the following running on the RAC nodes:
    $ ps -ef | grep cpio | grep -v 'grep'
    oracle  7902  7901  0 21:07 ?  00:00:00 bash -c /bin/sh -c cd /; cpio  -idmuc
    oracle  7910  7902 14 21:07 ?  00:00:09 cpio -idmuc
    If you see the above command running, it shows that the Oracle software is currently being installed (copied) to all RAC node(s).

    There are still reported bugs in the Oracle Installer that prevent Oracle (sometimes) from installing the software on all nodes in the RAC cluster. If the Installer hangs at "Performing remote operations (99%)" and the above BASH command is NOT running on the Oracle RAC nodes anymore, then you will need to abort the installation. If you continue to receive errors during this phase of the installation, the only workaround (at the time of this writing) would be to run runInstaller on all RAC nodes to install the software on each RAC node separately.

  • Running root.sh Script: When the "Link" phase is complete, you will be prompted to run the $ORACLE_HOME/root.sh script as the "root" user account.

     

    NOTE: Before running the root.sh script on any nodes, you will need to perform several manual actions:

    First, edit the root.sh script. Near the bottom of the script (I believe line # 249), if the line reads:

    if [ ! -f "$OPSCONFIG" ];then
    then change it to:
    if [ -f "$OPSCONFIG" ];then

    Second, create the srvConfig.loc file:

    # su -
    # mkdir -p /var/opt/oracle
    # touch /var/opt/oracle/srvConfig.loc
    # /u01/app/oracle/product/9.2.0/root.sh

    When running the root.sh script, ensure that you run it on ALL RAC servers before clicking "OK" in the Oracle installation dialog box.

  • Oracle Enterprise Manager Console: When the Oracle Enterprise Manager Console comes up, exit from the application. We will be creating the Oracle cluster database in a later section.
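The root.sh edit described in the note above can be applied with sed instead of a text editor, which is convenient when the same change has to be made on every node. This is a sketch only: patch_root_sh is a hypothetical helper, it matches the line by its content rather than by line number, and it keeps a copy of the untouched script with an .orig suffix so the change can be reviewed or undone:

```shell
# Change 'if [ ! -f "$OPSCONFIG" ];then' to 'if [ -f "$OPSCONFIG" ];then'
# in the given script, preserving the line's indentation.
# The original file is kept next to it as <script>.orig.
patch_root_sh() {
    script=$1
    cp -p "$script" "$script.orig" || return 1
    sed 's/^\([[:space:]]*\)if \[ ! -f "\$OPSCONFIG" ];then/\1if [ -f "$OPSCONFIG" ];then/' \
        "$script.orig" > "$script"
}

# Example (as root, on each node, before running root.sh):
#   patch_root_sh /u01/app/oracle/product/9.2.0/root.sh
#   diff /u01/app/oracle/product/9.2.0/root.sh.orig /u01/app/oracle/product/9.2.0/root.sh
```

If diff shows no difference, the line in your copy of root.sh does not read exactly as shown in the article and the script should be edited by hand.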
Post-Installation Steps
  1. Replace system01.dbf Symbolic Link

    Earlier in this section, you needed to remove the symbolic link on all nodes in the RAC cluster that pointed to system01.dbf, the datafile that will be used for the SYSTEM tablespace. Now that the installation is complete, you can re-create this symbolic link on all nodes in the RAC cluster:

    # su - oracle
    $ ln -s /dev/raw/raw12  /u01/app/oracle/oradata/orcl/system01.dbf
  2. Create Missing Directories Not Replicated on Remote Nodes

    After the Oracle9i RAC software installation, some directories may not get created. You should check for the following directories and if they do not exist, you should create them as the oracle UNIX user:

    # su - oracle
    
    # For Cluster Manager
    $ mkdir -p $ORACLE_HOME/oracm/log
    
    # For SQL*Net Listener
    $ mkdir -p $ORACLE_HOME/network/log
    $ mkdir -p $ORACLE_HOME/network/trace
    
    # For database instances
    $ mkdir -p $ORACLE_HOME/rdbms/log
    $ mkdir -p $ORACLE_HOME/rdbms/audit
    
    # For Oracle Intelligent Agent
    $ mkdir -p $ORACLE_HOME/network/agent/log
    $ mkdir -p $ORACLE_HOME/network/agent/reco
    
    # For Oracle HTTP Server (Apache)
    $ mkdir -p $ORACLE_HOME/Apache/Agent/logs
    $ mkdir -p $ORACLE_HOME/Apache/Jserv/logs
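The same directory list can be wrapped in a loop that reports only what it actually had to create, which makes it easy to see which node was incomplete. This is a sketch under the assumption that the list above is complete for your installation; make_rac_dirs is a hypothetical function name:

```shell
# Create any of the directories listed above that are missing under the
# given Oracle home, printing one line for each directory created.
make_rac_dirs() {
    home=$1
    for sub in oracm/log network/log network/trace rdbms/log rdbms/audit \
               network/agent/log network/agent/reco \
               Apache/Agent/logs Apache/Jserv/logs; do
        if [ ! -d "$home/$sub" ]; then
            mkdir -p "$home/$sub" && echo "created $home/$sub"
        fi
    done
}

# Example (as the oracle user, on each node):
#   make_rac_dirs "$ORACLE_HOME"
```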
  3. Initialize the Shared Configuration File

    Before attempting to initialize the Shared Configuration File, make sure that the Oracle Global Services daemon is NOT running, by using the following command:

    # su - oracle
    $ gsdctl stat
    GSD is not running on the local node
    If the daemon is running, then shut it down.

    Initialize the Shared Configuration File by running the following command on only one RAC node:

    # su - oracle
    $ srvconfig -init
    NOTE: If you receive a PRKR-1025 error when attempting to run the srvconfig -init command, check that you have a valid entry for "srvconfig_loc" in your /var/opt/oracle/srvConfig.loc file and that the file is owned by "oracle". This entry gets created by root.sh.

    If you receive a PRKR-1064 error when attempting to run the srvconfig -init command, then check whether the /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile file is accessible by all RAC nodes:

    $ cd ~oracle/oradata/orcl
    $ ls -l SharedSrvctlConfigFile
    lrwxrwxrwx  1 oracle  dba   13 May  2 20:17 SharedSrvctlConfigFile -> /dev/raw/raw2
  4. Start Oracle Global Services

    After initializing the Shared Configuration File, you will need to manually start the Oracle Global Services daemon (gsd) to ensure that it works. At this point in the installation, the Global Services daemon should be down. To confirm this, run the following command:

    # su - oracle
    $ gsdctl stat
    GSD is not running on the local node

    Let's manually start the Global Services daemon (gsd) by running the following command on all nodes in the RAC cluster:

    # su - oracle
    $ gsdctl start
    Successfully started GSD on local node
  5. Check Node Name and Node Number Mappings

    In most cases, the Oracle Global Services daemon (gsd) should start successfully on every node in the RAC cluster. Occasionally, however, the node name and node number mappings are not correct in the cmcfg.ora file on node 2. This does not happen very often, but it has happened to me on at least one occasion.

    If the node name and node number mappings are not correct, the problem will not show up until you attempt to run the Database Configuration Assistant (dbca), the assistant we will be using later to create our cluster database. The error reported by the DBCA will say something to the effect of "gsd daemon has not been started on node 2".

    To check that the node name and number mappings are correct on your cluster, run the following command on both your nodes:

    Listing for node1:
    $ lsnodes -n
    linux1     0
    linux2     1
    
    Listing for node2:
    $ lsnodes -n
    linux2     1
    linux1     0
    The above example shows that my node name to node number mappings are correct. If you run into the problem where the node name and node number are not mapped correctly, you should make the changes to node 2 in the cmcfg.ora file. Edit the file to change the ordering for the following entries:
    PrivateNodeNames=int-linux1 int-linux2
    PublicNodeNames=linux1 linux2
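What matters in the two listings is that each node name is paired with the same number on both nodes; the order of the lines may differ, as it does in the example above. A minimal sketch of that comparison follows. It assumes the output of lsnodes -n from each node has been saved to a file (the file names in the example are hypothetical), and the function names are illustrative only:

```shell
# Reduce a saved "lsnodes -n" listing to sorted "name number" pairs.
normalize_nodes() {
    awk 'NF == 2 { print $1, $2 }' "$1" | sort
}

# Succeed only if both listings contain the same name-to-number pairs,
# regardless of line order or column spacing.
same_mapping() {
    [ "$(normalize_nodes "$1")" = "$(normalize_nodes "$2")" ]
}

# Example:
#   lsnodes -n > /tmp/nodes.linux1      (on linux1)
#   lsnodes -n > /tmp/nodes.linux2      (on linux2, then copy to linux1)
#   same_mapping /tmp/nodes.linux1 /tmp/nodes.linux2 && echo "mappings agree"
```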
  6. Update Node Startup Script (/etc/rc.local)

    Throughout this article, I have been inserting all the commands in the file /etc/rc.local that should be run on each node for our Oracle9i RAC configuration.

    There are several more commands that should be put in at this time. Put the following commands at the end of your /etc/rc.local file:

    ...
    
    . ~oracle/.bash_profile
    rm -rf $ORACLE_HOME/oracm/log/*.ts
    $ORACLE_HOME/oracm/bin/ocmstart.sh
    
    su - oracle -c "gsdctl start"
    
    su - oracle -c "lsnrctl start"
    
  7. Reboot All Nodes

    Before attempting to create the Oracle cluster database, I would reboot all nodes within the cluster. I had problems at one time with the database creation (using DBCA), and rebooting all nodes after the Oracle9i RAC software installation and before creating the cluster database helped resolve the issue.

    This also provides a chance to ensure that the additions we made to our /etc/rc.local startup file are being run.

  8. Verify Node Configuration

    After rebooting each of the nodes in the RAC cluster, here is a list of commands I manually run to ensure that each node is configured correctly and the startup scripts are correctly running their required tasks.

    Keep in mind that you should run the following on each node in the RAC cluster:

    # su -
    
    # raw -qa
    /dev/raw/raw1:  bound to major 58, minor 0
    /dev/raw/raw2:  bound to major 58, minor 1
    /dev/raw/raw3:  bound to major 58, minor 2
    /dev/raw/raw4:  bound to major 58, minor 3
    /dev/raw/raw5:  bound to major 58, minor 4
    /dev/raw/raw6:  bound to major 58, minor 5
    /dev/raw/raw7:  bound to major 58, minor 6
    /dev/raw/raw8:  bound to major 58, minor 7
    /dev/raw/raw9:  bound to major 58, minor 8
    /dev/raw/raw10: bound to major 58, minor 9
    /dev/raw/raw11: bound to major 58, minor 10
    /dev/raw/raw12: bound to major 58, minor 11
    /dev/raw/raw13: bound to major 58, minor 12
    /dev/raw/raw14: bound to major 58, minor 13
    /dev/raw/raw15: bound to major 58, minor 14
    /dev/raw/raw16: bound to major 58, minor 15
    /dev/raw/raw17: bound to major 58, minor 16
    /dev/raw/raw18: bound to major 58, minor 17
    /dev/raw/raw19: bound to major 58, minor 18
    /dev/raw/raw20: bound to major 58, minor 19
    /dev/raw/raw21: bound to major 58, minor 20
    /dev/raw/raw22: bound to major 58, minor 21
    /dev/raw/raw23: bound to major 58, minor 22
    
    
    # lvscan
    lvscan -- ACTIVE            "/dev/pv1/lvol1" [5 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol2" [100 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol3" [10 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol4" [200 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol5" [200 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol6" [200 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol7" [55 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol8" [25 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol9" [255 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol10" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol11" [55 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol12" [805 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol13" [255 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol14" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol15" [405 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol16" [405 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol17" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol18" [155 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol19" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol20" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol21" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol22" [105 MB]
    lvscan -- ACTIVE            "/dev/pv1/lvol23" [105 MB]
    lvscan -- 23 logical volumes with 3.88 GB total in 1 volume group
    lvscan -- 23 active logical volumes
    
    
    $ ls -l ~oracle/oradata/orcl
    total 0
    lrwxrwxrwx 1 oracle dba 13 May 5 17:01 CMQuorumFile -> /dev/raw/raw1
    lrwxrwxrwx 1 oracle dba 13 May 5 17:01 control01.ctl -> /dev/raw/raw4
    lrwxrwxrwx 1 oracle dba 13 May 5 17:01 control02.ctl -> /dev/raw/raw5
    lrwxrwxrwx 1 oracle dba 13 May 5 17:01 control03.ctl -> /dev/raw/raw6
    lrwxrwxrwx 1 oracle dba 13 May 5 20:38 cwmlite01.dbf -> /dev/raw/raw7
    lrwxrwxrwx 1 oracle dba 13 May 5 20:38 drsys01.dbf -> /dev/raw/raw8
    lrwxrwxrwx 1 oracle dba 13 May 5 20:38 example01.dbf -> /dev/raw/raw9
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 indx01.dbf -> /dev/raw/raw10
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 odm01.dbf -> /dev/raw/raw11
    lrwxrwxrwx 1 oracle dba 14 May 5 17:01 orcl_redo2_2.log -> /dev/raw/raw23
    lrwxrwxrwx 1 oracle dba 14 May 5 17:01 perfstat01.dbf -> /dev/raw/raw19
    lrwxrwxrwx 1 oracle dba 14 May 5 17:01 redo01.log -> /dev/raw/raw20
    lrwxrwxrwx 1 oracle dba 14 May 5 17:01 redo02.log -> /dev/raw/raw21
    lrwxrwxrwx 1 oracle dba 14 May 5 17:01 redo03.log -> /dev/raw/raw22
    lrwxrwxrwx 1 oracle dba 13 May 5 17:01 SharedSrvctlConfigFile -> /dev/raw/raw2
    lrwxrwxrwx 1 oracle dba 13 May 5 17:01 spfileorcl.ora -> /dev/raw/raw3
    lrwxrwxrwx 1 oracle dba 14 May 5 18:58 system01.dbf -> /dev/raw/raw12
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 temp01.dbf -> /dev/raw/raw13
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 tools01.dbf -> /dev/raw/raw14 
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 undotbs01.dbf -> /dev/raw/raw15
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 undotbs02.dbf -> /dev/raw/raw16 
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 users01.dbf -> /dev/raw/raw17 
    lrwxrwxrwx 1 oracle dba 14 May 5 20:38 xdb01.dbf -> /dev/raw/raw18
    
    $ ps -ef | grep oracm | grep -v 'grep'
    root 5476 1 0 19:28 pts/1 00:00:00 oracm
    root 5478 5476 0 19:28 pts/1 00:00:00 oracm
    root 5479 5478 0 19:28 pts/1 00:00:00 oracm
    root 5480 5478 0 19:28 pts/1 00:00:00 oracm
    root 5481 5478 0 19:28 pts/1 00:00:00 oracm
    root 5483 5478 0 19:28 pts/1 00:00:00 oracm
    root 5484 5478 0 19:28 pts/1 00:00:00 oracm
    root 5485 5478 0 19:28 pts/1 00:00:00 oracm
    root 5486 5478 0 19:28 pts/1 00:00:00 oracm
    root 5491 5478 0 19:28 pts/1 00:00:00 oracm
    
    $ gsdctl stat
    GSD is running on the local node
    
    $ lsnodes -n
    linux1 0
    linux2 1
    
    $ srvctl status database -d orcl 
    PRKR-1007 : getting of cluster database orcl configuration failed, 
    PRKR-1001 : cluster database orcl does not exist 
    PRKO-2005 : Application error: Failure in getting Cluster Database Configuration for: orcl
NOTE: When we run the srvctl status database -d orcl command (above), we WANT to get the PRKR-1007, PRKR-1001, and PRKO-2005 errors. If we were to get the following results from the command:
Instance orcl1 is not running on node linux1
Instance orcl2 is not running on node linux2
then the Oracle installer created the orcl database. This database will need to be deleted within the DBCA BEFORE creating the Oracle clustered database.
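Two of the checks in step 8 reduce to simple text tests and can be scripted so that a node which came up incorrectly stands out. The sketch below reads command output from standard input; the function names are hypothetical, and the expected count of 23 bindings is taken from the raw -qa listing above:

```shell
# Count the bindings reported by "raw -qa" (output read from standard input).
count_raw_bindings() {
    grep -c 'bound to major'
}

# Classify "gsdctl stat" output (read from standard input).
# "GSD is not running ..." does not contain the phrase "GSD is running",
# so the two states cannot be confused.
gsd_state() {
    if grep -q 'GSD is running'; then
        echo running
    else
        echo stopped
    fi
}

# Example (raw as root, gsdctl as the oracle user):
#   [ "$(raw -qa | count_raw_bindings)" -eq 23 ] || echo "raw bindings missing"
#   [ "$(gsdctl stat | gsd_state)" = running ]   || echo "GSD is down"
```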
 

Create the Oracle Database

We will be using the Oracle Database Configuration Assistant (DBCA) to create our clustered database on the shared storage (FireWire) device.

NOTE: Keep in mind that on several occasions the Oracle Universal Installer created an Oracle database named orcl for me. You will need to delete this database BEFORE attempting to create your clustered database. To start the database creation process, run the following:

# su - oracle
$ dbca -datafileDestination /u01/app/oracle/oradata &
Screen Name: Response
Type of Database: Select "Oracle Cluster Database" and click "Next"
Operations: Select "Create a database" and click "Next"
Node Selection: Click the "Select All" button to the right. If not all of the nodes in your RAC cluster are showing up, or if the Node Selection Screen does not appear, then the Oracle Cluster Manager (Node Manager) oracm is probably not running on all RAC nodes. For more information, see "Starting and Stopping Oracle9i Cluster Manager" under the "Installing Oracle9i Cluster Manager" section.
Database Templates: Select "New Database" and click "Next"
Database Identification: Global Database Name: orcl; SID Prefix: orcl
Database Features: For your new database, you can keep all database features selected. I typically do. If you want to, however, you can clear any of the boxes to not install the feature in your new database. Click "Next" when finished.
Database Connection Options: Select "Dedicated Server Mode" and click "Next"
Initialization Parameters: Click "Next"
Database Storage: If you have followed this article and created all symbolic links, then the datafiles for all tablespaces should match the DBCA. I do, however, change the initial size for each tablespace. To do this, navigate through the tree for all tablespaces and change the value for the following tablespaces:
  • CWMLITE:    50MB
  • DRSYS:    20MB
  • EXAMPLE:    250MB
  • INDX:    100MB
  • ODM:    50MB
  • SYSTEM:    800MB
  • TEMP:    250MB
  • TOOLS:    100MB
  • UNDOTBS1:    400MB
  • UNDOTBS2:    400MB
  • USERS:    100MB
  • XDB:    150MB
If you need to, select appropriate files and then click "Next"
Creation Options: When you are ready to start the database creation process, click "Finish"
Summary: Click "OK". When prompted to "Perform Another Operation", click "No".
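A datafile on a raw device cannot be larger than the device, so it is worth checking the sizes entered in the Database Storage screen against the logical volumes. The sketch below does that arithmetic. The planned sizes are the values from the list above; the volume sizes are copied from the lvscan output shown earlier, under the assumption (not stated in this section) that /dev/raw/rawN is bound to lvolN. The function names are hypothetical:

```shell
# Tablespace name, planned size (MB), size of the backing volume (MB).
# Volume sizes assume rawN is bound to lvolN (raw7..raw18 -> lvol7..lvol18).
planned_sizes() {
    cat <<'EOF'
CWMLITE   50  55
DRSYS     20  25
EXAMPLE  250 255
INDX     100 105
ODM       50  55
SYSTEM   800 805
TEMP     250 255
TOOLS    100 105
UNDOTBS1 400 405
UNDOTBS2 400 405
USERS    100 105
XDB      150 155
EOF
}

# Print the spare space per tablespace and the total planned size.
# Exits non-zero if any planned size exceeds its volume.
tablespace_headroom() {
    awk '
        NF == 3 {
            printf "%-9s %4d MB planned, %4d MB volume, %d MB spare\n", $1, $2, $3, $3 - $2
            total += $2
            if ($3 < $2) bad++
        }
        END {
            printf "TOTAL %d MB\n", total
            exit (bad > 0)
        }
    '
}

# Example:
#   planned_sizes | tablespace_headroom
```

With the values above, every tablespace leaves 5 MB spare on its volume and the planned sizes add up to 2670 MB.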

Notes During the Oracle RAC Database Creation Process

  • ORA-29807 Error

    Within the "Creating data dictionary views" phase of the database creation process, you will receive an ORA-29807 error. If you look in the log file, you will see the following:

    drop operator XMLSequence
    *
    ERROR at line 1:
    ORA-29807: specified operator does not exist
    This is a known issue (Bug: 2686156) and can be ignored. To continue the database creation process, click the "Ignore" button.

  • ORA-01430 Error

    Within the "Adding Oracle Spatial" phase of the database creation process, you will receive an ORA-01430 error. If you look in the log file, you will see the following:

    (SDO_ROOT_MBR mdsys.sdo_geometry)
     *
    ERROR at line 2:
    ORA-01430: column being added already exists in table

    This is a known issue and can be ignored. To continue the database creation process, click the "Ignore" button.

When the DBCA has completed, you will have a fully functional Oracle RAC cluster running.
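Once the two expected errors have been dismissed, it is reasonable to confirm that nothing else went wrong during creation. The following sketch filters a creation log for ORA- errors other than the two described above. The function name is hypothetical, and the log is read from standard input because the location of the DBCA log files is not given in this article:

```shell
# Print ORA- error lines from a log (standard input), skipping the two
# errors this article describes as safe to ignore.
unexpected_ora_errors() {
    grep 'ORA-[0-9][0-9]*' | grep -v -e 'ORA-29807' -e 'ORA-01430'
}

# Example (replace the path with the log file you want to inspect):
#   unexpected_ora_errors < /path/to/creation.log
```

No output means the only ORA- errors in the log were the two known ones.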

 


Creating TNS Networking Files

listener.ora

You should not need to make any changes to your listener.ora file, which is created by the Oracle installer. All instances on the node will automatically register with the listener.

listener.ora

# LISTENER.ORA.LINUX1 Network Configuration File:
# /u01/app/oracle/product/9.2.0/network/admin/listener.ora.linux1
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
      )
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /u01/app/oracle/product/9.2.0)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (ORACLE_HOME = /u01/app/oracle/product/9.2.0)
      (SID_NAME = orcl1)
    )
  )
tnsnames.ora

Here is a copy of my tnsnames.ora file that I have configured for Transparent Application Failover (TAF). You can put this file on each node in the RAC cluster in the directory $ORACLE_HOME/network/admin.

tnsnames.ora

# TNSNAMES.ORA Network Configuration File:
# /u01/app/oracle/product/9.2.0/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

LISTENERS_ORCL =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
  )


LISTENER_ORCL1 =
  (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))


LISTENER_ORCL2 =
  (ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))

ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
      (LOAD_BALANCE = yes)
      (FAILOVER = yes)
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (TYPE = session)
        (METHOD = basic)
      )
    )
  )

ORCL1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (INSTANCE_NAME = orcl1)
    )
  )

ORCL2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (INSTANCE_NAME = orcl2)
    )
  )

EXTPROC_CONNECTION_DATA =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
    )
    (CONNECT_DATA =
      (SID = PLSExtProc)
      (PRESENTATION = RO)
    )
  )

INST1_HTTP =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = SHARED)
      (SERVICE_NAME = MODOSE)
      (PRESENTATION = http://HRService)
    )
  )
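Every HOST value in this file has to resolve on whichever machine uses it, so a quick way to list the distinct host names can save some hunting when a connection fails. This is a small sketch; tns_hosts is a hypothetical helper, and it assumes at most one HOST entry per line, as in the file above:

```shell
# List the distinct HOST values that appear in a tnsnames.ora-style file.
tns_hosts() {
    sed -n 's/.*(HOST *= *\([^)]*\)).*/\1/p' "$1" | sort -u
}

# Example:
#   for h in $(tns_hosts "$ORACLE_HOME/network/admin/tnsnames.ora"); do
#       ping -c 1 "$h" >/dev/null 2>&1 || echo "cannot reach $h"
#   done
```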

 


Verifying the RAC Cluster/Database Configuration

This section provides several commands and SQL queries that can be used to validate your Oracle9i RAC configuration.

--Look for Oracle Cluster Manager--

$ ps -ef | grep oracm | grep -v 'grep'
root      5476     1  0 19:28 pts/1    00:00:00 oracm
root      5478  5476  0 19:28 pts/1    00:00:00 oracm
root      5479  5478  0 19:28 pts/1    00:00:00 oracm
root      5480  5478  0 19:28 pts/1    00:00:00 oracm
root      5481  5478  0 19:28 pts/1    00:00:00 oracm
root      5483  5478  0 19:28 pts/1    00:00:00 oracm
root      5484  5478  0 19:28 pts/1    00:00:00 oracm
root      5485  5478  0 19:28 pts/1    00:00:00 oracm
root      5486  5478  0 19:28 pts/1    00:00:00 oracm
root      5491  5478  0 19:28 pts/1    00:00:00 oracm

--Look for the Oracle Global Services daemon--

$ gsdctl stat
GSD is running on the local node
 
--Using srvctl--


$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

$ srvctl config database -d orcl
linux1 orcl1 /u01/app/oracle/product/9.2.0
linux2 orcl2 /u01/app/oracle/product/9.2.0

--Query gv$instance--
    
SELECT
    inst_id
  , instance_number inst_no
  , instance_name inst_name
  , parallel
  , status
  , database_status db_status
  , active_state state
  , host_name host
FROM gv$instance
ORDER BY inst_id;

INST_ID  INST_NO INST_NAME  PAR STATUS  DB_STATUS   STATE   HOST
-------- -------- ---------- --- ------- ----------- ------- -------
       1        1 orcl1      YES OPEN    ACTIVE      NORMAL  linux1
       2        2 orcl2      YES OPEN    ACTIVE      NORMAL  linux2
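For a scripted health check, the srvctl status output shown above can be reduced to a single pass/fail result. This is a sketch under the assumption that the status lines keep the "Instance ... is running on node ..." wording shown in this section; all_instances_running is a hypothetical function name:

```shell
# Succeed only if there is at least one "Instance ..." line on standard
# input and none of them reports an instance that is not running.
all_instances_running() {
    awk '
        /^Instance / { seen++; if ($0 ~ / is not running /) down++ }
        END { exit !(seen > 0 && down == 0) }
    '
}

# Example (as the oracle user):
#   srvctl status database -d orcl | all_instances_running \
#       && echo "all instances up" || echo "at least one instance is down"
```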

 


Starting & Stopping the Cluster

This section details the commands necessary to start up and shut down the instances in your Oracle9i RAC cluster. Ensure that you are logged in as the "oracle" UNIX user:

# su - oracle
Starting the Cluster

Start up all registered instances:

$ srvctl start database -d orcl
Start up the orcl2 instance:
$ srvctl start instance -d orcl -i orcl2
 

Stopping the Cluster

Shut down all registered instances:

$ srvctl stop database -d orcl
Shut down the orcl2 instance using the immediate option:
$ srvctl stop instance -d orcl -i orcl2 -o immediate
Shut down the orcl2 instance using the abort option:
$ srvctl stop instance -d orcl -i orcl2 -o abort

 


Transparent Application Failover (TAF)

Overview

It is not uncommon for businesses to demand 99.99% or even 99.999% availability for their enterprise applications. Think about what it would take to hold downtime to less than an hour (99.99%), or to only a few minutes (99.999%), over an entire year. To answer many of these high availability requirements, businesses are investing in mechanisms that provide for automatic failover when one participating system fails. When considering the availability of the Oracle database, Oracle9i RAC provides a superior solution with its advanced failover mechanisms. Oracle9i RAC includes the required components, all working within a clustered configuration, to provide continuous availability: when one of the participating systems within the cluster fails, the users are automatically migrated to the other available systems.
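To put those availability percentages in concrete terms, the allowed downtime per year is simply the unavailable fraction multiplied by the number of minutes in a year. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is hypothetical):

```shell
# Minutes of downtime per 365-day year allowed by an availability percentage.
downtime_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 365 * 24 * 60 }'
}

downtime_minutes 99.99     # prints 52.56
downtime_minutes 99.999    # prints 5.26
```

In other words, "four nines" leaves a little under an hour of downtime per year and "five nines" leaves just over five minutes.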

A major component of Oracle9i RAC that is responsible for failover processing is the Transparent Application Failover (TAF) option. All database connections (and processes) that lose their connection are reconnected to another node within the cluster. The failover is completely transparent to the user.

This final section provides a short demonstration of how automatic failover works in Oracle9i RAC. Please note that a complete discussion of failover in Oracle9i RAC would be an article of its own. My intention here is to present a brief overview and example of how it works.

One important note before continuing is that TAF happens automatically within the OCI libraries. This means that your application (client) code does not need to change in order to take advantage of TAF. Certain configuration steps, however, will need to be done on the Oracle TNS file tnsnames.ora.

NOTE: Keep in mind that the Java thin client cannot participate in TAF because it never reads the tnsnames.ora file.

 

Setup tnsnames.ora File

Before demonstrating TAF, we need to configure a tnsnames.ora file on a non-RAC client machine (a Windows machine, if you have one lying around, will do). Ensure that you have Oracle RDBMS software installed. (Actually, you only need a client install of the Oracle software.)

Here is the entry I put into the %ORACLE_HOME%\network\admin\tnsnames.ora file on my Windows client machine in order to connect to the new Oracle clustered database:

...
ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
      (LOAD_BALANCE = yes)
      (FAILOVER = yes)
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (TYPE = session)
        (METHOD = basic)
      )
    )
  )
...
SQL Query to Check the Session's Failover Information

The following SQL query can be used to check a session's failover type, failover method, and if a failover has occurred. We will be using this query throughout this example.

COLUMN instance_name    FORMAT a13
COLUMN host_name        FORMAT a9
COLUMN failover_method  FORMAT a15
COLUMN failed_over      FORMAT a11

SELECT
    instance_name
  , host_name
  , NULL AS failover_type
  , NULL AS failover_method
  , NULL AS failed_over
FROM v$instance
UNION
SELECT
    NULL
  , NULL
  , failover_type
  , failover_method
  , failed_over
FROM v$session
WHERE username = 'SYSTEM';
Transparent Application Failover Demonstration

From our Windows machine (or other non-RAC client machine), log in to the clustered database (orcl) as the SYSTEM user:

C:\> sqlplus system/manager@orcl

SQL*Plus: Release 9.2.0.3.0 - Production on Mon May 10 21:17:07 2004

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production

SQL> SELECT
  2      instance_name
  3    , host_name
  4    , NULL AS failover_type
  5    , NULL AS failover_method
  6    , NULL AS failed_over
  7  FROM v$instance
  8  UNION
  9  SELECT
 10      NULL
 11    , NULL
 12    , failover_type
 13    , failover_method
 14    , failed_over
 15  FROM v$session
 16  WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl1         linux1
                        SESSION       BASIC           NO

SQL>
DO NOT log out of the above SQL*Plus session!

 

Now that we have run the query above, we should shut down the instance orcl1 on linux1 using the abort option. To perform this operation, we can use the srvctl command-line utility as follows:
# su - oracle
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

$ srvctl stop instance -d orcl -i orcl1 -o abort

$ srvctl status database -d orcl
Instance orcl1 is not running on node linux1
Instance orcl2 is running on node linux2
Now let's go back to our SQL session and rerun the SQL statement in the buffer:
SQL> SELECT
  2      instance_name
  3    , host_name
  4    , NULL AS failover_type
  5    , NULL AS failover_method
  6    , NULL AS failed_over
  7  FROM v$instance
  8  UNION
  9  SELECT
 10      NULL
 11    , NULL
 12    , failover_type
 13    , failover_method
 14    , failed_over
 15  FROM v$session
 16  WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl2         linux2
                        SESSION       BASIC           YES

SQL> exit
From the above demonstration, we can see that the session has now been failed over to instance orcl2 on linux2.
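If you want to repeat this test from a script rather than by reading the screen, the FAILED_OVER value can be picked out of the query output shown above. This is only a sketch: failed_over_flag is a hypothetical helper, it assumes the three-column session row layout shown in this demonstration, and the taf_check.sql file in the example is a hypothetical script containing the failover query:

```shell
# Print the FAILED_OVER value (YES or NO) from the failover query output
# read on standard input. Only the session row has exactly three fields
# ending in YES or NO; header, separator and instance rows are skipped.
failed_over_flag() {
    awk 'NF == 3 && ($3 == "YES" || $3 == "NO") { print $3 }'
}

# Example:
#   sqlplus -s system/manager@orcl @taf_check.sql | failed_over_flag
```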

 


Conclusion

Oracle RAC allows the DBA to configure a database solution with superior fault tolerance and load balancing. In this article, I have described an economical solution for setting up and configuring an inexpensive Oracle9i RAC Cluster using Red Hat Linux and FireWire technology. This RAC solution, which can be assembled for around $1,500, will provide you with a fully functional and stable Oracle9i RAC cluster for testing and development.

 


This article was originally published at Jeffrey Hunter's DBA/Development Web Site in a slightly different form.

 

Jeffrey Hunter (jhunter@iDevelopment.info) is an Oracle Certified Professional, Java Development Certified Professional, and author and currently works as a senior DBA. His work includes advanced performance tuning, Java programming, capacity planning, database security, and physical/logical database design in UNIX, Linux, and Windows NT environments.
 

 

 

 

 

   
