Build Your Own RAC Cluster on Linux and FireWire
by Jeffrey Hunter - OTN
- June 2004
Jeffrey
Hunter is the author of
Conducting the Java Job Interview and
Conducting the J2EE Job Interview
by Rampant TechPress
Learn how to set up and configure an Oracle Real Application Clusters
environment for less than $1,500 (for development and testing only)
Overview
One of the
most efficient ways to become familiar
with Oracle Real Application
Clusters (RAC) technology is to have
access to an actual Oracle RAC cluster.
In learning this new technology, you
will soon start to realize the benefits
Oracle RAC has to offer, such as fault
tolerance, new levels of security, load
balancing, and the ease of upgrading
capacity. The challenge, however, is the
price of the hardware required for a
typical production RAC configuration. A
small two-node cluster, for example, can
run anywhere from $10,000 to well over
$20,000. This cost would not even
include shared storage, the heart of a
production RAC environment.
For those who
simply want to become familiar with
Oracle RAC, this article provides a
low-cost alternative for configuring an
Oracle9i RAC system using
commercial off-the-shelf components and
downloadable software. The estimated
cost for this configuration could be
anywhere from $1,000 to $1,500. The
system will comprise a dual-node
cluster, both running Linux (Red Hat
Linux Fedora Core 1 in this example)
with a shared disk array based on
IEEE1394 (FireWire)
drive technology.
Please note
that this is not the only way to build a
low-cost Oracle9i RAC system. I
have seen other solutions that utilize
an implementation based on SCSI rather
than FireWire for shared storage. In
most cases, SCSI will cost more than our
FireWire solution: a typical SCSI
card is priced around $70, and an 80GB
external SCSI drive will cost
$700-$1,000. Keep in mind that some
motherboards may already include
built-in SCSI controllers.
It is
important to note that this
configuration should never be run
in a production environment. In a
production environment, fiber channel is
the technology of choice, since it is
the high-speed serial-transfer interface
that can connect systems and storage
devices in either point-to-point or
switched topologies. FireWire is able to
offer a low-cost alternative to fiber
channel for testing and development, but
it is not ready for production.
NOTE: At the time of
this writing, I had not verified that
these instructions will work with Oracle
Database 10g. I will be
providing a separate article in the next
several months on how to perform a
similar install using 10g.
Oracle9i
Real Application Clusters (RAC)
Introduction
Oracle Real
Application Clusters (RAC) is the
successor to Oracle Parallel Server
(OPS). RAC allows multiple instances to
access the same database (storage)
simultaneously. RAC provides fault
tolerance, load balancing, and
performance benefits by allowing the
system to scale out, and at the same
time since all nodes access the same
database, the failure of one instance
will not cause the loss of access to the
database.
At the heart
of Oracle RAC is a shared disk
subsystem. All nodes in the cluster must
be able to access all of the data, redo
log files, control files and parameter
files for all nodes in the cluster. The
data disks must be globally available in
order to allow all nodes to access the
database. Each instance has its own redo log
files, while the control files are shared by
all instances; the other nodes must be able to
access an instance's redo logs in order to
recover that instance in the event of a
system failure.
Not all
clustering solutions use shared storage.
Some vendors use an approach known as a
federated cluster, in which data is
spread across several machines rather
than shared by all. With Oracle RAC,
however, multiple nodes use the same set
of disks for storing data. With Oracle
RAC, the data, redo log, control, and
archived log files reside on shared
storage on raw-disk devices or on a
clustered file system. Oracle's approach
to clustering leverages the collective
processing power of all the nodes in the
cluster and at the same time provides
failover security.
Although it is
not absolutely necessary, Oracle
recommends that you install the Oracle
Cluster File System (OCFS). OCFS makes
disk management much easier by
creating the same file system on all the
nodes. Without OCFS, you will have to
create all partitions manually. (NOTE:
This article does not go into the
details of installing or utilizing OCFS,
but rather uses all manual methods for
creating partitions and binding raw
devices to those partitions.)
One of the
main reasons why I do not use the Oracle
Cluster File System for Red Hat Linux is
that OCFS comes in the form of RPMs. All
the RPM modules and the precompiled
modules are tied to the Red Hat
Enterprise Linux AS ($1,200)
kernel-naming standard and will not load
in the supplied 2.4.20 linked kernel.
The biggest
difference between Oracle RAC and OPS is
the addition of Cache Fusion. With OPS, a
request for data from one node to
another required the data to be written
to disk first; only then could the
requesting node read that data. With Cache
Fusion, data is passed between instances
along with locks.
Pre-configured
Oracle9i RAC solutions are
available from vendors such as Dell, IBM
and HP for production environments. This
article, however, focuses on putting
together your own Oracle9i RAC
environment for development and testing
by using Linux servers and a low-cost
shared disk solution: FireWire.
What software
is necessary for RAC? Does it have a
separate installation CD to order?
RAC is
contained within the Oracle9i
Database Enterprise Edition. (Oracle
recently announced that RAC is now
available in Oracle Database 10g
Standard Edition as well.) If you
install Oracle9i Enterprise
Edition onto a cluster, and the Oracle
Universal Installer (OUI) recognizes the
cluster, you will be provided the option
of installing RAC. Most UNIX platforms
require an OSD installation for the
necessary clusterware. For Intel
platforms (Linux and Windows), Oracle
provides the OSD software within the
Oracle9i Enterprise Edition
release.
Shared
Storage Overview
Today,
fiber channel is one of the most popular
solutions for shared storage. As
mentioned earlier, fiber channel is a
high-speed serial-transfer interface
that is used to connect systems and
storage devices in either point-to-point
or switched topologies. Protocols
supported by fiber channel include SCSI
and IP. Fiber channel configurations can
support as many as 127 nodes and have a
throughput of up to 2.12 gigabits per
second. Fiber channel, however, is very
expensive. The fiber channel switch
alone can run as much as $1,000. This
does not even include the fiber channel
storage array and high-end drives, which
can reach prices of about $300 for a
36GB drive. A typical basic fiber channel
setup, including fiber channel cards
for the servers, costs
roughly $5,000, which does not include
the cost of the servers that make up the
cluster.
A less
expensive alternative to fiber-channel
is SCSI. SCSI technology provides
acceptable performance for shared
storage, but for administrators and
developers who are accustomed to
GPL-based Linux prices, even SCSI can
come in over budget, at around $1,000 to
$2,000 for a two-node cluster.
Another
popular solution is the Sun NFS (Network
File System). It can be used for shared
storage but only if you are using a
network appliance or something similar.
Specifically, you need servers that
guarantee direct I/O over NFS.
FireWire
Technology
Developed by
Apple Computer and Texas Instruments,
FireWire is a cross-platform
implementation of a high-speed serial
data bus. With its high bandwidth, long
distances (up to 100 meters in length)
and high-powered bus, FireWire is being
used in applications such as digital
video (DV), professional audio, hard
drives, high-end digital still cameras
and home entertainment devices. Today,
FireWire operates at transfer rates of
up to 800 megabits per second, while
next-generation FireWire calls for a
theoretical bit rate of 1,600 Mbps and
then up to a staggering 3,200 Mbps.
That's 3.2 gigabits per second. This
speed will make FireWire indispensable
for transferring massive data files and
for even the most demanding video
applications, such as working with
uncompressed high-definition (HD) video
or multiple standard-definition (SD)
video streams.
The following
chart shows speed comparisons of the
various types of disk interface. For
each interface, I provide the maximum
transfer rates in kilobits (kb),
kilobytes (KB), megabits (Mb), and
megabytes (MB) per second. As you can
see, the capabilities of IEEE1394
compare very favorably with other
available disk interface technologies.
| Disk Interface | Speed |
| Serial | 115 kb/s (0.115 Mb/s) |
| Parallel (standard) | 115 KB/s (0.115 MB/s) |
| USB 1.1 | 12 Mb/s (1.5 MB/s) |
| Parallel (ECP/EPP) | 3.0 MB/s |
| IDE | 3.3 - 16.7 MB/s |
| ATA | 3.3 - 66.6 MB/s |
| SCSI-1 | 5 MB/s |
| SCSI-2 (Fast SCSI / Fast Narrow SCSI) | 10 MB/s |
| Fast Wide SCSI (Wide SCSI) | 20 MB/s |
| Ultra SCSI (SCSI-3 / Fast-20 / Ultra Narrow) | 20 MB/s |
| Ultra IDE | 33 MB/s |
| Wide Ultra SCSI (Fast Wide 20) | 40 MB/s |
| Ultra2 SCSI | 40 MB/s |
| IEEE1394(b) | 100 - 400 Mb/s (12.5 - 50 MB/s) |
| USB 2.x | 480 Mb/s (60 MB/s) |
| Wide Ultra2 SCSI | 80 MB/s |
| Ultra3 SCSI | 80 MB/s |
| Wide Ultra3 SCSI | 160 MB/s |
| FC-AL Fiber Channel | 100 - 400 MB/s |
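The chart mixes megabits (Mb) and megabytes (MB), so it can help to convert between them when comparing rows. The following is a minimal sketch of that arithmetic, assuming 8 bits per byte and ignoring protocol overhead; the function name mbit_to_mbyte is my own and is not part of any tool used in this article.

```shell
#!/bin/sh
# Convert a rate in megabits per second to megabytes per second.
# Assumes 8 bits per byte and no protocol overhead.
# $1 = rate in Mb/s
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 8 }'
}

mbit_to_mbyte 12     # USB 1.1   -> prints 1.5
mbit_to_mbyte 400    # IEEE 1394 -> prints 50
mbit_to_mbyte 480    # USB 2.x   -> prints 60
```

These results match the MB/s figures shown in parentheses in the chart.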
Hardware &
Costs
The hardware used to
build our example Oracle9i RAC
environment consists of two Linux
servers and components that can be
purchased at any local computer store or
over the Internet.
Server 1 (linux1)
| Component | Cost |
| Dell Dimension XPS D266 Computer: 266MHz Pentium II, 384MB RAM, 60GB internal HD, CD-ROM and floppy | $400 |
| 2 - Ethernet LAN cards: Linksys 10/100 Mbps (to public network); Linksys 10/100 Mbps (used for interconnect to linux2) | $20 + $20 |
| 1 - FireWire card: SIIG, Inc. 3-Port 1394 I/O Card. Note: Cards with chipsets made by VIA or TI are known to work. | $30 |

Server 2 (linux2)
| Component | Cost |
| Pentium IV Computer: 1.8GHz Pentium IV, 300W power supply, 512MB RAM, 40GB internal HD, 32MB AGP video card, CD-ROM and floppy | $600 |
| 2 - Ethernet LAN cards: Linksys 10/100 Mbps (to public network); Linksys 10/100 Mbps (used for interconnect to linux1) | $20 + $20 |
| 1 - FireWire card: Belkin FireWire 3-Port 1394 PCI Card. Note: Cards with chipsets made by VIA or TI are known to work. | $40 |

Miscellaneous Components
| Component | Cost |
| FireWire hard drive: Maxtor One Touch 200GB USB 2.0 / FireWire External Hard Drive (see the note following this table) | $270 |
| 1 - Extra FireWire cable: Belkin 6-pin to 6-pin 1394 Cable | $15 |
| 1 - Ethernet hub or switch: Linksys EtherFast 10/100 5-port Ethernet Switch (used for interconnect int-linux1 / int-linux2) | $40 |
| 4 - Network cables: Category 5e patch cable (connect linux1 to public network); Category 5e patch cable (connect linux2 to public network); Category 5e patch cable (connect linux1 to interconnect Ethernet switch); Category 5e patch cable (connect linux2 to interconnect Ethernet switch) | $5 + $5 + $5 + $5 |
| Total | $1,495 |

Note on the FireWire hard drive: Ensure that the FireWire drive you
purchase supports multiple logins. If the drive has a chipset that does
not allow concurrent access by more than one server, the disk and its
partitions can only be seen by one server at a time. Disks with the
Oxford 911 chipset are known to work. Here are the details about the
disk that I purchased for this test:
Vendor: Maxtor
Model: OneTouch
Mfg. Part No. or KIT No.: A01A200 or A01A250
Capacity: 200GB or 250GB
Cache Buffer: 8MB
Spin Rate: 7200 RPM
"Combo" Interface: IEEE 1394 and SBP-2 compliant (100 to 400 Mbits/sec)
plus USB 2.0 and USB 1.1 compatible
A Brief Walk
Through the Process
Before
presenting the details of building our
Oracle9i RAC system, I thought
it would be beneficial to take a brief
walk through the steps involved in
building the environment. (See Figure
1.)
Our
implementation describes a dual node
cluster (each with a single processor),
each server running Red Hat Linux Fedora
Core 1. Note that most of the tasks
within this document will need to be
performed on both servers. I will
indicate at the beginning of each
section whether or not the task(s)
should be performed on both nodes.
1. Install Red Hat Linux / Fedora
Core 1 (on both nodes)
For this example configuration,
you will be installing Red Hat
Linux (Fedora Core 1) on both
nodes that make up the RAC
cluster.
2. Configure network settings
(on both nodes)
After installing the Red Hat
Linux software on both nodes,
you will then need to configure
the network on both nodes. This
includes configuring the public
network as well as the
interconnect for the cluster.
You should also adjust the
default and maximum send buffer
size settings for the
interconnect for better
performance when using cache
fusion buffer transfers between
instances. These settings will
be put in your /etc/sysctl.conf
file.
3. Obtain and Install a proper
Linux Kernel (on both nodes)
In this section, we will be
downloading and installing a new
Linux kernel, one that supports
multiple logins to the FireWire
storage device. The kernel can
be downloaded from Oracle's
Linux Projects development
group at http://oss.oracle.com. Once
the new kernel is installed,
there are several configuration
steps required in order to load the
FireWire stack.
4. Create UNIX oracle
user account (dba
group) (on both nodes)
We will then create an Oracle
UNIX user id on all nodes within
the RAC cluster. This section
also provides an example login
script (.bash_profile)
that can be used to set all
required environment variables
for the oracle user.
5. Create Partitions on the
Shared FireWire Storage Device
(run once only from a single
node)
This is where we create the
physical and logical volumes
using Logical Volume Manager
(LVM). Instructions will be
provided on how to remove all
partitions from our FireWire
drive and then how to use LVM to
create all of our logical
partitions.
6. Create RAW Bindings (on both
nodes)
After creating our logical
partitions, we need to configure
raw devices on our FireWire
shared storage to be used for
all physical Oracle database
files.
7. Create Symbolic Links From
RAW Volumes (on both nodes)
It is helpful to create symbolic
links from the RAW volumes to
human readable names to make
file recognition easier.
Although this step is optional,
it is highly recommended.
8. Configuring the Linux Servers
(on both nodes)
This section will detail the
steps involved to configure both
Linux machines in order to
prepare them for an Oracle9i
RAC install.
9. Configuring the
hangcheck-timer Kernel Module
(on both nodes)
Oracle9i RAC uses a
kernel module called the
hangcheck-timer to monitor
the health of the cluster and to
restart a RAC node in case of a
failure. This section explains
the steps required to configure
the hangcheck-timer kernel
module. Although the
hangcheck-timer module is not
required for Oracle Cluster
Manager operation, it is highly
recommended by Oracle.
10. Configuring RAC Nodes for
Remote Access (on both nodes)
When installing Oracle9i
RAC, the Oracle Installer will
use the rsh
command to copy the Oracle
software to all other nodes
within the RAC cluster. Included
in this section are the
instructions for configuring all
nodes within your RAC cluster to
run r* commands like
rsh, rcp, and
rlogin on a RAC node
against other RAC nodes without
a password.
11. Configuring a Machine
Startup Script (on both nodes)
Up to this point, we have talked
in great detail about the
parameters and resources that
will need to be configured on
both nodes for our Oracle9i
RAC configuration. This section
will take a breather and recap
those parameters and commands
(in previous sections of this
document) that need to happen on
each node when the machine is
cycled. Although there are
several ways to do this, I
simply provide a listing of the
commands that you can put into a
startup script (for example,
/etc/rc.local) to set up all
required resources (disks,
memory, etc.) each time the
machine is booted. Other startup
scripts are included within this
section in order to provide a
check as to whether you have
updated all required scripts
when each machine in the cluster
is booted.
12. Update Red Hat Linux System
(on both nodes)
There are several RPMs that will
need to be applied to all nodes
within the RAC cluster in
preparation for the Oracle
install. All the RPMs are
included on the CDs for Fedora
Core 1, plus I also put links to
the files from this article.
After applying all of the RPMs,
you will then need to apply
Oracle/Linux Patch 3006854.
After applying all required
patches, you should reboot all
nodes within the RAC cluster.
13. Download / Unpack the
Oracle9i Installation
Files (from a single node)
This section includes the steps
to download and unpack the
Oracle9i software
distribution. The software can
be downloaded from
http://otn.oracle.com.
14. Install Oracle9i
Cluster Manager (from a single
node)
Installing Oracle9i RAC
is a two-step process: (1)
install the Oracle9i
Cluster Manager and (2) install
the Oracle9i RDBMS
software. In this section, we
will go through the steps to
install, configure and start the
Oracle Cluster Manager software.
Keep in mind that the
installation of Oracle Cluster
Manager only needs to be
performed on one of the nodes
(the installation process will
rsh the files out to
all other nodes contained within
the cluster), but
configuring and starting the
Cluster Manager needs to be
performed on both nodes.
15. Install Oracle9i
RAC (only needs to be performed
from a single node)
After installing Oracle Cluster
Manager, it is time to install
the RAC software. This section
covers many of the tasks
involved in installing the software
as well as many
post-installation tasks that should
be performed before creating the
Oracle cluster database.
16. Create the Oracle Database
(from a single node)
After all the software has been
installed, we will now use the
Oracle Database Configuration
Assistant (DBCA) to create our
clustered database on the shared
storage (FireWire) device.
17. Creating TNS Networking
Files (on both nodes)
This section simply provides an
example listing of my
listener.ora and
tnsnames.ora files. These
will need to be configured for
each node in the RAC cluster.
The Oracle Installer and Oracle
Database Configuration Assistant
do a great job in keeping these
files up to date. I do, however,
like to make a few changes to
the tnsnames.ora file.
18. Verify the RAC Cluster /
Database Configuration (on both
nodes)
After the Oracle Database
Configuration Assistant has
finished creating the
clustered database, you should
have a fully functional Oracle9i
RAC cluster running. This
section provides several
commands and SQL queries that can be
used to validate your Oracle9i
RAC configuration.
19. Starting & Stopping the
Cluster (from a single node)
Examples will be given in this
section on how to start and stop
the cluster. This includes how
to fully bring up or down the
entire cluster, along with
examples of how to bring up and
shutdown individual instances
within the cluster.
20. Transparent Application
Failover (TAF) (on one or both
nodes)
Now that we have our cluster up
and running, this section
provides an example on how to
test the Transparent Application
Failover features of Oracle9i
RAC. I will demonstrate how
session failover works and how to
set up your TNS configuration to
take advantage of TAF.
Install Red
Hat Linux (Fedora Core 1)
After procuring the
required hardware, it is time to start
the configuration process. The first
step in the process is to install the
Red Hat Linux Fedora Core 1 software on
both servers.
NOTE:
This article does not provide detailed
instructions for installing Red Hat
Linux Fedora Core 1. For the purpose of
this article, I chose to perform a
Custom installation and then "Install
Everything" when prompted for which
products to install. Documentation for
installing Red Hat Linux can be found at
http://www.redhat.com/docs/manuals/.
Configure
Network Settings
Configuring Public and Private Network
Let's start our
Oracle RAC Linux configuration by
ensuring the correct network
configuration. In our two-node example,
we will need to configure the network on
both nodes.
The easiest
way to configure network settings in
Red Hat Linux is via the Network
Configuration program. This application can be
started from the command line as the
"root" user as follows:
# su -
# /usr/bin/redhat-config-network &
NOTE:
Do not use DHCP for the interconnects;
they need fixed (static) IP addresses!
Using the
Network Configuration application, you
will need to configure both NIC devices
as well as the /etc/hosts file.
Both of these tasks can be completed
using the Network Configuration GUI.
Notice that the /etc/hosts
settings are the same for both nodes.
Our example
configuration will use the following
settings:
Server 1 (linux1)
| Device | IP Address | Subnet | Purpose |
| eth0 | 192.168.1.100 | 255.255.255.0 | Connects linux1 to the public network |
| eth1 | 192.168.2.100 | 255.255.255.0 | Connects linux1 (interconnect) to linux2 (int-linux2) |
/etc/hosts
127.0.0.1 localhost loopback
192.168.1.100 linux1
192.168.2.100 int-linux1
192.168.1.101 linux2
192.168.2.101 int-linux2

Server 2 (linux2)
| Device | IP Address | Subnet | Purpose |
| eth0 | 192.168.1.101 | 255.255.255.0 | Connects linux2 to the public network |
| eth1 | 192.168.2.101 | 255.255.255.0 | Connects linux2 (interconnect) to linux1 (int-linux1) |
/etc/hosts
127.0.0.1 localhost loopback
192.168.1.100 linux1
192.168.2.100 int-linux1
192.168.1.101 linux2
192.168.2.101 int-linux2
In the screenshots
below, only node 1 (linux1) is shown.
Be sure to make all the proper network
settings on both nodes.
Figure 1: Network
Configuration Screen, Node 1 (linux1)
Figure 2:
Ethernet Device Screen, eth0 (linux1)
Figure 3:
Ethernet Device Screen, eth1 (linux1)
Figure 4: Network
Configuration Screen, /etc/hosts
(linux1)
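Since the /etc/hosts entries must be identical on both nodes, a quick check after using the GUI can catch a missed entry. The following is only a sketch under the assumption that you use the same four addresses and names as this example configuration; the function names hosts_has_entry and check_cluster_hosts are hypothetical helpers of my own, not part of Red Hat Linux.

```shell
#!/bin/sh
# Succeed if a hosts file maps the given address to the given name.
# $1 = hosts file, $2 = IP address, $3 = host name
hosts_has_entry() {
    awk -v ip="$2" -v name="$3" '
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Check the four cluster entries used in this example configuration.
# $1 = hosts file to inspect (normally /etc/hosts)
check_cluster_hosts() {
    rc=0
    while read -r ip name; do
        if hosts_has_entry "$1" "$ip" "$name"; then
            echo "ok       $ip $name"
        else
            echo "MISSING  $ip $name"
            rc=1
        fi
    done <<EOF
192.168.1.100 linux1
192.168.2.100 int-linux1
192.168.1.101 linux2
192.168.2.101 int-linux2
EOF
    return $rc
}
```

On each node this could be run as check_cluster_hosts /etc/hosts; a non-zero return status means at least one entry is missing. It only inspects the file and does not test that the interfaces are actually reachable.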
Adjusting
Network Settings
With Oracle
9.2.0.1 and above, Oracle uses UDP as
the default protocol on Linux for
interprocess communication (IPC), such
as cache fusion buffer transfers
between instances within the RAC
cluster.
Oracle
strongly suggests adjusting the default
and maximum send buffer size (SO_SNDBUF
socket option) to 256KB, and the default
and maximum receive buffer size (SO_RCVBUF
socket option) to 256KB.
The receive
buffers are used by TCP and UDP to hold
received data until it is read by the
application. With TCP, the receive buffer
cannot overflow because the peer is not
allowed to send data beyond the buffer size
window. UDP has no such window, so datagrams
are discarded if they don't fit in the
socket receive buffer, and a fast sender
can overwhelm the receiver.
NOTE:
The default and maximum window size can
be changed in the /proc file
system without reboot:
su - root
# Default setting in bytes of the socket receive buffer
sysctl -w net.core.rmem_default=262144
# Default setting in bytes of the socket send buffer
sysctl -w net.core.wmem_default=262144
# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
sysctl -w net.core.rmem_max=262144
# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
sysctl -w net.core.wmem_max=262144
You should make the
above changes permanent by adding the
following lines to the
/etc/sysctl.conf file for each node
in your RAC cluster:
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
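After editing /etc/sysctl.conf (or running the sysctl -w commands), it may be worth confirming that the values are actually in effect on each node. The following is a minimal sketch that reads the same settings from the /proc file system; the function names check_param and check_rac_buffers are hypothetical helpers of my own, and 262144 is simply the 256KB value used above.

```shell
#!/bin/sh
# Compare one kernel parameter against the recommended value.
# $1 = parameter name, $2 = expected value, $3 = value actually in effect
check_param() {
    if [ "$3" = "$2" ]; then
        echo "ok        $1 = $3"
    else
        echo "MISMATCH  $1 = $3 (expected $2)"
        return 1
    fi
}

# Check all four socket buffer settings on the current node by reading
# the corresponding files under /proc/sys/net/core.
check_rac_buffers() {
    rc=0
    for p in rmem_default wmem_default rmem_max wmem_max; do
        check_param "net.core.$p" 262144 \
            "$(cat "/proc/sys/net/core/$p" 2>/dev/null)" || rc=1
    done
    return $rc
}
```

Running check_rac_buffers on each node prints one line per setting and returns non-zero if any value differs from 262144.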
Listing 1
select event,
total_waits,
round(100 * (total_waits / sum_waits),2) pct_waits,
time_wait_sec,
round(100 * (time_wait_sec /
greatest(sum_time_waited,1)),2)
pct_time_waited,
total_timeouts,
round(100 * (total_timeouts /
greatest(sum_timeouts,1)),2)
pct_timeouts,
average_wait_sec
from
(select event,
total_waits,
round((time_waited / 100),2) time_wait_sec,
total_timeouts,
round((average_wait / 100),2) average_wait_sec
from sys.v_$system_event
where event not in
('lock element cleanup',
'pmon timer',
'rdbms ipc message',
'rdbms ipc reply',
'smon timer',
'SQL*Net message from client',
'SQL*Net break/reset to client',
'SQL*Net message to client',
'SQL*Net more data from client',
'dispatcher timer',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'slave wait',
'virtual circuit status',
'WMON goes to sleep',
'jobq slave wait',
'Queue Monitor Wait',
'wakeup time manager',
'PX Idle Wait') AND
event not like 'DFS%' AND
event not like 'KXFX%'),
(select sum(total_waits) sum_waits,
sum(total_timeouts) sum_timeouts,
sum(round((time_waited / 100),2)) sum_time_waited
from sys.v_$system_event
where event not in
('lock element cleanup',
'pmon timer',
'rdbms ipc message',
'rdbms ipc reply',
'smon timer',
'SQL*Net message from client',
'SQL*Net break/reset to client',
'SQL*Net message to client',
'SQL*Net more data from client',
'dispatcher timer',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'slave wait',
'virtual circuit status',
'WMON goes to sleep',
'jobq slave wait',
'Queue Monitor Wait',
'wakeup time manager',
'PX Idle Wait') AND
event not like 'DFS%' AND
event not like 'KXFX%')
order by 4 desc, 1 asc
Listing 2
SELECT sid,
username,
event,
total_waits,
100 * round((total_waits / sum_waits),2)
pct_of_total_waits,
time_wait_sec,
total_timeouts,
average_wait_sec,
max_wait_sec
FROM
(SELECT a.event,
b.sid sid,
decode (b.username,null,c.name,b.username) username,
a.total_waits total_waits,
round((a.time_waited / 100),2) time_wait_sec,
a.total_timeouts total_timeouts,
round((average_wait / 100),2)
average_wait_sec,
round((a.max_wait / 100),2) max_wait_sec
FROM sys.v_$session_event a,
sys.v_$session b,
sys.v_$bgprocess c,
sys.v_$process d
WHERE a.event NOT IN
('lock element cleanup',
'pmon timer',
'rdbms ipc message',
'smon timer',
'SQL*Net message from client',
'SQL*Net break/reset to client',
'SQL*Net message to client',
'SQL*Net more data from client',
'dispatcher timer',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'slave wait',
'virtual circuit status',
'WMON goes to sleep'
)
AND a.event NOT LIKE 'DFS%'
AND a.event NOT LIKE 'KXFX%'
AND a.sid = b.sid
AND d.addr = b.paddr
AND c.paddr (+) = b.paddr
),
(select sum(total_waits) sum_waits
FROM sys.v_$session_event a,
sys.v_$session b
WHERE a.event NOT IN
('lock element cleanup',
'pmon timer',
'rdbms ipc message',
'smon timer',
'SQL*Net message from client',
'SQL*Net break/reset to client',
'SQL*Net more data from client',
'SQL*Net message to client',
'dispatcher timer',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'slave wait',
'virtual circuit status',
'WMON goes to sleep'
)
AND a.event NOT LIKE 'DFS%'
AND a.event NOT LIKE 'KXFX%'
AND a.sid = b.sid)
order by 6 desc, 1 asc
Obtain and Install a Proper Linux Kernel
Overview
The next step is to
obtain and install a new Linux kernel that
supports the use of IEEE1394 devices with
multiple logins. In previous releases of
this article, I included the steps to
download a patched version of the Linux
kernel and then compile it. Thanks to
Oracle's Linux Projects development group,
this is no longer a requirement. They
provide a pre-compiled kernel for Red Hat
Enterprise Linux 3.0 (which also works with
Fedora) that can simply be downloaded and
installed. The instructions for downloading
and installing the kernel are included in
this section. Before going into the details
of how to perform these actions, however,
let's take a moment to discuss the changes
that are required in the new kernel.
While FireWire
drivers already exist for Linux, they often
do not support shared storage.
Normally, when you log on to an OS, the OS
associates the driver with a specific drive
for that machine alone. This implementation
simply will not work for our RAC
configuration. The shared storage (our
FireWire hard drive) needs to be accessed by
more than one node. We need to enable the
FireWire driver to provide nonexclusive
access to the drive so that multiple
servers (the nodes that comprise the cluster)
will be able to access the same storage.
This task is accomplished by removing, in the
driver source code, the bit mask that
identifies the machine during login. This
results in nonexclusive access to the FireWire
hard drive. All other nodes in the cluster
log in to the same drive during their logon
session, using the same modified driver, so
they too have nonexclusive access to
the drive.
I'm probably
getting ahead of myself, but I want to cover
several topics before diving into the
details of installing our new Linux kernel.
When we install our new Linux kernel (one
that supports multiple logons to the
FireWire drive) the system will detect and
recognize the FireWire attached drive as a
SCSI device. You will be able to use
standard OS tools to partition the disk,
create a file system, and so on. For Oracle9i
RAC, you must make partitions for all the
files and bind raw devices to those
partitions. This article will make use of
Logical Volume Manager (LVM) to create all
needed partitions (referred to as
logical partitions) on the FireWire
shared drive.
Our implementation
describes a dual node cluster (each with a
single processor), each server running Red
Hat Linux Fedora Core 1. Keep in mind that
the process of installing the patched Linux
kernel will need to be performed on
both Linux nodes. Red Hat Linux
Fedora Core 1 includes kernel
linux-2.4.22-1.2115.nptl; we will need to
download the Oracle-supplied 2.4.21-9.0.1
Linux kernel from the following URL:
http://oss.oracle.com/projects/firewire/files.
Perform the
following procedures on both nodes in the
cluster:
- Download one of the following files:
kernel-2.4.21-9.0.1.ELorafw1.i686.rpm
- for single processor
- OR -
kernel-smp-2.4.21-9.0.1.ELorafw1.i686.rpm
- for multiple processors
- Make a backup of your GRUB
configuration file:
In most cases you
will be using GRUB for your boot loader.
Before actually installing the new
kernel, be sure to back up your
/etc/grub.conf file:
# cp /etc/grub.conf /etc/grub.conf.original
- Install the new kernel, as user
root:
For a single processor:
# rpm -ivh --force kernel-2.4.21-9.0.1.ELorafw1.i686.rpm
- OR -
For multiple processors:
# rpm -ivh --force kernel-smp-2.4.21-9.0.1.ELorafw1.i686.rpm
NOTE: Installing the
new kernel using RPM will also update
your grub or lilo configuration with the
appropriate stanza. There is no need to
add any new stanza to your boot loader
configuration unless you want to have
your old kernel image available.
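If you script the install across both nodes, the choice between the two packages can be made from the processor count. The following is only a sketch: the function names pick_kernel_rpm and count_processors are my own, and it assumes /proc/cpuinfo lists one "processor" line per CPU. Note that a uniprocessor kernel may report a single processor even on multi-CPU hardware, so treat the count as a hint rather than a guarantee.

```shell
#!/bin/sh
# Print the Oracle-supplied kernel package that matches a processor count.
# $1 = number of processors
pick_kernel_rpm() {
    if [ "$1" -gt 1 ]; then
        echo "kernel-smp-2.4.21-9.0.1.ELorafw1.i686.rpm"
    else
        echo "kernel-2.4.21-9.0.1.ELorafw1.i686.rpm"
    fi
}

# Count the processors reported by the running kernel.
count_processors() {
    grep -c '^processor' /proc/cpuinfo
}
```

As root, the install command could then be written as rpm -ivh --force "$(pick_kernel_rpm "$(count_processors)")", run from the directory that holds the downloaded file.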
The following
is a listing of my /etc/grub.conf
file before and then after the kernel
install. As you can see, the install
added another stanza for the
2.4.21-9.0.1.ELorafw1 kernel.
If you want, you can change the default entry
in the new file so that the new kernel
will be the one booted by default. By
default, the installer keeps your old
kernel as the default by setting
default=1 (stanzas are numbered from zero,
so default=0 selects the first stanza in
the file).
Original
/etc/grub.conf File for Fedora Core
1
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda3
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Fedora Core (2.4.22-1.2115.nptl)
root (hd0,0)
kernel /vmlinuz-2.4.22-1.2115.nptl ro root=LABEL=/ rhgb
initrd /initrd-2.4.22-1.2115.nptl.img
Newly Configured
/etc/grub.conf File for Fedora
Core 1 After Kernel Install
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda3
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Fedora Core (2.4.21-9.0.1.ELorafw1)
root (hd0,0)
kernel /vmlinuz-2.4.21-9.0.1.ELorafw1 ro root=LABEL=/ rhgb
initrd /initrd-2.4.21-9.0.1.ELorafw1.img
title Fedora Core (2.4.22-1.2115.nptl)
root (hd0,0)
kernel /vmlinuz-2.4.22-1.2115.nptl ro root=LABEL=/ rhgb
initrd /initrd-2.4.22-1.2115.nptl.img
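Because the default entry is a zero-based index into the title stanzas, it is easy to misread which kernel will actually boot. The following is a minimal sketch of a check, assuming a grub.conf laid out like the listings above; the function name grub_default_title is hypothetical and not part of GRUB.

```shell
# Print the title of the stanza that the "default=" line points at.
# Hypothetical helper: pass the path to a grub.conf-style file.
grub_default_title() {
    awk '
        BEGIN       { want = 0 }
        /^default=/ { split($0, kv, "="); want = kv[2] + 0 }
        /^title /   { titles[n++] = substr($0, 7) }   # text after "title "
        END         { print titles[want] }
    ' "$1"
}
```

Running grub_default_title /etc/grub.conf before rebooting shows which kernel the default entry currently selects.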
- Add module options:
Add the following lines to
/etc/modules.conf:
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-remove sbp2 rmmod sd_mod
It is vital
that the sbp2_exclusive_login parameter
of the Serial Bus Protocol module (sbp2)
be set to zero to allow multiple hosts to
log in to and access the FireWire disk
concurrently. The second line ensures
that the SCSI disk driver module (sd_mod)
is loaded as well, since sbp2
requires the SCSI layer. The core SCSI
support module (scsi_mod) will
be loaded automatically if sd_mod
is loaded, so there is no need to make a
separate entry for it.
- Reboot machine
Reboot your machine
into the new kernel. Ensure the FireWire
(IEEE 1394) PCI cards are plugged into
the machine!
- Load the firewire stack
In most cases, the
loading of the FireWire stack will
already be configured in the
/etc/rc.sysinit file. The commands
that are contained within this file that
are responsible for loading the FireWire
stack are:
# modprobe ohci1394
# modprobe sbp2
In older versions of Red Hat, this was
not the case and these commands would
have to be manually run or put within a
startup file. With Fedora Core 1 and
higher, these commands are already put
within the /etc/rc.sysinit file
and run on each boot.
- Rescan SCSI bus
In older versions of
the kernel, I would need to run the
rescan-scsi-bus.sh script in order
to detect the FireWire drive. The
purpose of this script was to create the
SCSI entry for the node by using the
following command:
echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi
With Fedora Core 1,
the disk should be detected
automatically.
- Check for SCSI Device
After you have
rebooted the machine, the kernel should
automatically detect the disk as a SCSI
device (/dev/sdXX). This
section will provide several commands
that should be run on both nodes in the
cluster to ensure the FireWire drive was
successfully detected.
For this
configuration, I performed the
above procedures on both nodes at the
same time. When complete, I shut down
both machines, started linux1
first, and then linux2. The
following commands and results are from
my linux2 machine. Again, make
sure that you run the following commands
on both nodes to ensure both machines can
log in to the shared drive.
Let's first
check to see that the FireWire adapter
was successfully detected:
# lspci
00:00.0 Host bridge: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
00:01.0 PCI bridge: Intel Corp. 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 01)
00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 01)
00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 01)
00:1d.7 USB Controller: Intel Corp. 82801DB USB2 (rev 01)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corp. 82801DB LPC Interface Controller (rev 01)
00:1f.1 IDE interface: Intel Corp. 82801DB Ultra ATA Storage Controller (rev 01)
00:1f.3 SMBus: Intel Corp. 82801DB/DBM SMBus Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)
02:00.0 Ethernet controller: Linksys Network Everywhere Fast Ethernet 10/100 model NC100 (rev 11)
02:01.0 FireWire (IEEE 1394): Texas Instruments TSB12LV26 IEEE-1394 Controller (Link)
02:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
02:07.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
Second, let's
check to see that the modules are
loaded:
# lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
sd_mod 13808 0
sbp2 20556 0
scsi_mod 109864 3 [sg sd_mod sbp2]
ohci1394 28904 0 (unused)
ieee1394 63652 0 [sbp2 ohci1394]
Third, let's make
sure the disk was detected and an entry
was made by the kernel:
# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: Maxtor Model: OneTouch Rev: 0200
Type: Direct-Access
Now let's ensure
the FireWire drive is accessible for
multiple logins and shows a valid login:
# dmesg | grep sbp2
ieee1394: sbp2: Query logins to SBP-2 device successful
ieee1394: sbp2: Maximum concurrent logins supported: 3
ieee1394: sbp2: Number of active logins: 2
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]
ieee1394: sbp2: Reconnected to SBP-2 device
ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]
From the above
output, you can see that the FireWire
drive we have can support concurrent
logins by up to 3 servers. It is vital
that you have a drive where the chipset
supports concurrent access for all nodes
within the RAC cluster.
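The login limit can also be checked mechanically. The following is a minimal sketch, assuming kernel messages in the format shown above; the function name sbp2_supports_nodes is hypothetical.

```shell
# Read sbp2 kernel messages on stdin and succeed only if the drive
# reports room for at least the requested number of concurrent logins.
sbp2_supports_nodes() {
    local nodes=$1 max
    # Keep the last reported limit in case the message appears repeatedly.
    max=$(sed -n 's/.*Maximum concurrent logins supported: \([0-9][0-9]*\).*/\1/p' | tail -n 1)
    [ -n "$max" ] && [ "$max" -ge "$nodes" ]
}
```

For the two-node cluster in this article, it could be used as: dmesg | grep sbp2 | sbp2_supports_nodes 2 && echo "drive can serve both nodes"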
- Troubleshoot SCSI Device Detection
If you are having
trouble detecting the SCSI device with
any of the procedures above,
you can try the following:
# modprobe -r sbp2
# modprobe -r sd_mod
# modprobe -r ohci1394
# modprobe ohci1394
# modprobe sd_mod
# modprobe sbp2
Create "oracle"
User and Directories (both nodes)
Let's continue our
example by creating the UNIX dba
group and oracle userid along with
all appropriate directories.
# mkdir /u01
# mkdir /u01/app
# groupadd -g 115 dba
# useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" oracle
# passwd oracle
NOTE:
When you are setting the Oracle environment
variables for each RAC node, be sure to
assign each RAC node a unique Oracle SID!
For this example,
I used:
- linux1 :
ORACLE_SID=orcl1
- linux2 :
ORACLE_SID=orcl2
NOTE:
The Oracle Universal Installer (OUI)
requires up to 400MB of free space in the
/tmp directory.
You can check the
available space in /tmp by running
the following command:
# df -k /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda3 36384656 6224240 28312140 19% /
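The comparison against the 400MB figure can be scripted rather than read off by eye. The following is a minimal sketch, assuming df is run with -kP so that each file system is reported on a single line; the function name enough_free_space is hypothetical.

```shell
# Succeed if the file system described by "df -kP" output on stdin has
# at least the required number of megabytes available (default 400).
enough_free_space() {
    local need_mb=${1:-400}
    # Line 2 is the data line; column 4 is the available space in KB.
    awk -v need="$need_mb" 'NR == 2 { ok = ($4 >= need * 1024) } END { exit !ok }'
}
```

Usage: df -kP /tmp | enough_free_space 400 || echo "not enough room in /tmp"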
If for some reason
you do not have enough space in /tmp,
you can temporarily create space in another
file system and point your TEMP and
TMPDIR to it for the duration of
the install. Here are the steps to do this:
# su -
# mkdir /<AnotherFilesystem>/tmp
# chown root.root /<AnotherFilesystem>/tmp
# chmod 1777 /<AnotherFilesystem>/tmp
# export TEMP=/<AnotherFilesystem>/tmp # used by Oracle
# export TMPDIR=/<AnotherFilesystem>/tmp # used by Linux programs
# like the linker "ld"
When the installation
of Oracle is complete, you can remove the
temporary directory using the following:
# su -
# rmdir /<AnotherFilesystem>/tmp
# unset TEMP
# unset TMPDIR
After creating the "oracle"
UNIX userid on both nodes, ensure that the
environment is set up correctly by using the
following .bash_profile:
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
alias ls="ls -FA"
# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/9.2.0
# Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
export ORACLE_SID=orcl1
export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
export LD_ASSUME_KERNEL=2.4.1
Creating
Partitions on the Shared FireWire Storage
Device (one node)
Overview
It is time to create the
physical and logical volumes to be used by
the Logical Volume Manager (LVM). (For a
more detailed view of managing the LVM, see
my article
Managing Physical & Logical Volumes.)
The following table maps each logical
volume to its raw device, its symbolic link,
and the tablespace or file it will hold:
| Logical Volume | RAW Volume | Symbolic Link | Tablespace / File Name | Tablespace / File Size | Partition Size |
| /dev/pv1/lvol1 | /dev/raw/raw1 | /u01/app/oracle/oradata/orcl/CMQuorumFile | Cluster Manager Quorum File | - | 5MB |
| /dev/pv1/lvol2 | /dev/raw/raw2 | /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile | Shared Configuration File | - | 100MB |
| /dev/pv1/lvol3 | /dev/raw/raw3 | /u01/app/oracle/oradata/orcl/spfileorcl.ora | Server Parameter File | - | 10MB |
| /dev/pv1/lvol4 | /dev/raw/raw4 | /u01/app/oracle/oradata/orcl/control01.ctl | Control File 1 | - | 200MB |
| /dev/pv1/lvol5 | /dev/raw/raw5 | /u01/app/oracle/oradata/orcl/control02.ctl | Control File 2 | - | 200MB |
| /dev/pv1/lvol6 | /dev/raw/raw6 | /u01/app/oracle/oradata/orcl/control03.ctl | Control File 3 | - | 200MB |
| /dev/pv1/lvol7 | /dev/raw/raw7 | /u01/app/oracle/oradata/orcl/cwmlite01.dbf | CWMLITE | 50MB | 55MB |
| /dev/pv1/lvol8 | /dev/raw/raw8 | /u01/app/oracle/oradata/orcl/drsys01.dbf | DRSYS | 20MB | 25MB |
| /dev/pv1/lvol9 | /dev/raw/raw9 | /u01/app/oracle/oradata/orcl/example01.dbf | EXAMPLE | 250MB | 255MB |
| /dev/pv1/lvol10 | /dev/raw/raw10 | /u01/app/oracle/oradata/orcl/indx01.dbf | INDX | 100MB | 105MB |
| /dev/pv1/lvol11 | /dev/raw/raw11 | /u01/app/oracle/oradata/orcl/odm01.dbf | ODM | 50MB | 55MB |
| /dev/pv1/lvol12 | /dev/raw/raw12 | /u01/app/oracle/oradata/orcl/system01.dbf | SYSTEM | 800MB | 805MB |
| /dev/pv1/lvol13 | /dev/raw/raw13 | /u01/app/oracle/oradata/orcl/temp01.dbf | TEMP | 250MB | 255MB |
| /dev/pv1/lvol14 | /dev/raw/raw14 | /u01/app/oracle/oradata/orcl/tools01.dbf | TOOLS | 100MB | 105MB |
| /dev/pv1/lvol15 | /dev/raw/raw15 | /u01/app/oracle/oradata/orcl/undotbs01.dbf | UNDOTBS1 | 400MB | 405MB |
| /dev/pv1/lvol16 | /dev/raw/raw16 | /u01/app/oracle/oradata/orcl/undotbs02.dbf | UNDOTBS2 | 400MB | 405MB |
| /dev/pv1/lvol17 | /dev/raw/raw17 | /u01/app/oracle/oradata/orcl/users01.dbf | USERS | 100MB | 105MB |
| /dev/pv1/lvol18 | /dev/raw/raw18 | /u01/app/oracle/oradata/orcl/xdb01.dbf | XDB | 150MB | 155MB |
| /dev/pv1/lvol19 | /dev/raw/raw19 | /u01/app/oracle/oradata/orcl/perfstat01.dbf | PERFSTAT | 100MB | 105MB |
| /dev/pv1/lvol20 | /dev/raw/raw20 | /u01/app/oracle/oradata/orcl/redo01.log | REDO G1 / M1 | 100MB | 105MB |
| /dev/pv1/lvol21 | /dev/raw/raw21 | /u01/app/oracle/oradata/orcl/redo02.log | REDO G2 / M1 | 100MB | 105MB |
| /dev/pv1/lvol22 | /dev/raw/raw22 | /u01/app/oracle/oradata/orcl/redo03.log | REDO G3 / M1 | 100MB | 105MB |
| /dev/pv1/lvol23 | /dev/raw/raw23 | /u01/app/oracle/oradata/orcl/orcl_redo2_2.log | REDO G4 / M1 | 100MB | 105MB |
Remove All
Partitions on FireWire Shared Storage
In this example, I will
be using the entire FireWire disk (no
partitions), so the physical and logical
volumes will be created on /dev/sda.
This is not the only way to create our LVM
environment. We could also create a Linux
LVM partition (type 8e) on the disk; if
that were the first partition on the disk,
we would then work with /dev/sda1
instead. Because we are using the entire
disk, it is important to remove any
existing partitions on the FireWire drive
before creating the physical and logical
volumes. Use the fdisk command:
# fdisk /dev/sda
Command (m for help): p
Disk /dev/sda: 203.9 GB, 203927060480 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 24791 199133676 c Win95 FAT32 (LBA)
Command (m for help): d
Selected partition 1
Command (m for help): p
Disk /dev/sda: 203.9 GB, 203927060480 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
Create Logical
Volumes
The following set
of commands performs the steps required to
create logical volumes:
- Run the
vgscan command (on all RAC nodes
within the cluster) in order to
create the /etc/lvmtab file.
- Use
pvcreate to create a physical
volume for use by the Logical Volume
Manager (LVM).
- Use
vgcreate to create a volume group
for the drive or for the partition you
want to use for RAW devices. Here we use
the entire single drive. In our example
(below), the command will allow up to 256
logical volumes and 256 physical
volumes, with a 128K extent size.
- Use
lvcreate to create the logical
volumes inside the volume group.
NOTE:
As mentioned above, I needed to run the
vgscan command on all nodes so that it
could create the /etc/lvmtab file.
This should be performed before running the
commands below.
Put the following
commands in a shell script, make the
script executable, and then run it as
the "root" UNIX userid:
vgscan
pvcreate -d /dev/sda
vgcreate -l 256 -p 256 -s 128k /dev/pv1 /dev/sda
lvcreate -L 5m /dev/pv1
lvcreate -L 100m /dev/pv1
lvcreate -L 10m /dev/pv1
lvcreate -L 200m /dev/pv1
lvcreate -L 200m /dev/pv1
lvcreate -L 200m /dev/pv1
lvcreate -L 55m /dev/pv1
lvcreate -L 25m /dev/pv1
lvcreate -L 255m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 55m /dev/pv1
lvcreate -L 805m /dev/pv1
lvcreate -L 255m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 405m /dev/pv1
lvcreate -L 405m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 155m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
lvcreate -L 105m /dev/pv1
Using the script
(above) will result in the creation of
/dev/pv1/lvol1 - /dev/pv1/lvol23.
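Since the 23 lvcreate calls differ only in size, the list can be generated from the partition sizes in the table. The following is a minimal sketch that only prints the commands so they can be reviewed first; the function name print_lvcreate_commands is hypothetical, and the sizes are copied from the script above.

```shell
# Partition sizes in MB, in table order (lvol1 .. lvol23).
sizes="5 100 10 200 200 200 55 25 255 105 55 805 255 105 405 405 105 155 105 105 105 105 105"

# Print (rather than run) one lvcreate command per size.
print_lvcreate_commands() {
    local vg=$1 size
    shift
    for size in "$@"; do
        echo "lvcreate -L ${size}m $vg"
    done
}

# $sizes is intentionally unquoted so it splits into one argument per size.
print_lvcreate_commands /dev/pv1 $sizes
```

Printing instead of executing is a deliberate choice here: the output can be compared against the table and then piped to sh as root once it looks right.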
I typically use
the lvscan command to check the
status of my logical volumes:
[root@linux2 root]# lvscan
lvscan -- ACTIVE "/dev/pv1/lvol1" [5 MB]
lvscan -- ACTIVE "/dev/pv1/lvol2" [100 MB]
lvscan -- ACTIVE "/dev/pv1/lvol3" [10 MB]
lvscan -- ACTIVE "/dev/pv1/lvol4" [200 MB]
lvscan -- ACTIVE "/dev/pv1/lvol5" [200 MB]
lvscan -- ACTIVE "/dev/pv1/lvol6" [200 MB]
lvscan -- ACTIVE "/dev/pv1/lvol7" [55 MB]
lvscan -- ACTIVE "/dev/pv1/lvol8" [25 MB]
lvscan -- ACTIVE "/dev/pv1/lvol9" [255 MB]
lvscan -- ACTIVE "/dev/pv1/lvol10" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol11" [55 MB]
lvscan -- ACTIVE "/dev/pv1/lvol12" [805 MB]
lvscan -- ACTIVE "/dev/pv1/lvol13" [255 MB]
lvscan -- ACTIVE "/dev/pv1/lvol14" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol15" [405 MB]
lvscan -- ACTIVE "/dev/pv1/lvol16" [405 MB]
lvscan -- ACTIVE "/dev/pv1/lvol17" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol18" [155 MB]
lvscan -- ACTIVE "/dev/pv1/lvol19" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol20" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol21" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol22" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol23" [105 MB]
lvscan -- 23 logical volumes with 3.88 GB total in 1 volume group
lvscan -- 23 active logical volumes
Reboot All Nodes in
RAC Cluster
After you have finished
creating the logical volumes, it is
recommended that you reboot all RAC nodes
to make sure that the new volumes
are recognized by the kernel on every
node:
# su -
# reboot
IMPORTANT:
Keep in mind that you will need to put a
call to vgscan and then
vgchange -a y in one of your startup
scripts so that they are run at boot time
for each machine in your RAC cluster.
vgscan rebuilds the volume manager
database, and vgchange -a y then activates
all volume groups. This document will provide
all settings that should go into your
/etc/rc.local script in order to set up
each node within your Oracle9i RAC
cluster.
Create RAW
Bindings (both nodes)
NOTE:
Several of the commands within this section
will need to be performed on every node
within the cluster every time that machine
is booted. Details of these commands and
instructions for placing them in a startup
script are included in the section
All Startup Commands for Each RAC Node.
In this section, I
will provide the instructions for
configuring raw devices on our FireWire
shared storage to be used for all physical
Oracle database files including the Cluster
Manager Quorum File and the Shared
Configuration File for srvctl.
At this point, we
have already created the logical volumes
required on our FireWire shared storage. We
now need to bind each volume to a raw device
by using the raw command:
/usr/bin/raw /dev/raw/raw1 /dev/pv1/lvol1
/usr/bin/raw /dev/raw/raw2 /dev/pv1/lvol2
/usr/bin/raw /dev/raw/raw3 /dev/pv1/lvol3
/usr/bin/raw /dev/raw/raw4 /dev/pv1/lvol4
/usr/bin/raw /dev/raw/raw5 /dev/pv1/lvol5
/usr/bin/raw /dev/raw/raw6 /dev/pv1/lvol6
/usr/bin/raw /dev/raw/raw7 /dev/pv1/lvol7
/usr/bin/raw /dev/raw/raw8 /dev/pv1/lvol8
/usr/bin/raw /dev/raw/raw9 /dev/pv1/lvol9
/usr/bin/raw /dev/raw/raw10 /dev/pv1/lvol10
/usr/bin/raw /dev/raw/raw11 /dev/pv1/lvol11
/usr/bin/raw /dev/raw/raw12 /dev/pv1/lvol12
/usr/bin/raw /dev/raw/raw13 /dev/pv1/lvol13
/usr/bin/raw /dev/raw/raw14 /dev/pv1/lvol14
/usr/bin/raw /dev/raw/raw15 /dev/pv1/lvol15
/usr/bin/raw /dev/raw/raw16 /dev/pv1/lvol16
/usr/bin/raw /dev/raw/raw17 /dev/pv1/lvol17
/usr/bin/raw /dev/raw/raw18 /dev/pv1/lvol18
/usr/bin/raw /dev/raw/raw19 /dev/pv1/lvol19
/usr/bin/raw /dev/raw/raw20 /dev/pv1/lvol20
/usr/bin/raw /dev/raw/raw21 /dev/pv1/lvol21
/usr/bin/raw /dev/raw/raw22 /dev/pv1/lvol22
/usr/bin/raw /dev/raw/raw23 /dev/pv1/lvol23
/bin/chmod 600 /dev/raw/raw1
/bin/chmod 600 /dev/raw/raw2
/bin/chmod 600 /dev/raw/raw3
/bin/chmod 600 /dev/raw/raw4
/bin/chmod 600 /dev/raw/raw5
/bin/chmod 600 /dev/raw/raw6
/bin/chmod 600 /dev/raw/raw7
/bin/chmod 600 /dev/raw/raw8
/bin/chmod 600 /dev/raw/raw9
/bin/chmod 600 /dev/raw/raw10
/bin/chmod 600 /dev/raw/raw11
/bin/chmod 600 /dev/raw/raw12
/bin/chmod 600 /dev/raw/raw13
/bin/chmod 600 /dev/raw/raw14
/bin/chmod 600 /dev/raw/raw15
/bin/chmod 600 /dev/raw/raw16
/bin/chmod 600 /dev/raw/raw17
/bin/chmod 600 /dev/raw/raw18
/bin/chmod 600 /dev/raw/raw19
/bin/chmod 600 /dev/raw/raw20
/bin/chmod 600 /dev/raw/raw21
/bin/chmod 600 /dev/raw/raw22
/bin/chmod 600 /dev/raw/raw23
/bin/chown oracle:dba /dev/raw/raw1
/bin/chown oracle:dba /dev/raw/raw2
/bin/chown oracle:dba /dev/raw/raw3
/bin/chown oracle:dba /dev/raw/raw4
/bin/chown oracle:dba /dev/raw/raw5
/bin/chown oracle:dba /dev/raw/raw6
/bin/chown oracle:dba /dev/raw/raw7
/bin/chown oracle:dba /dev/raw/raw8
/bin/chown oracle:dba /dev/raw/raw9
/bin/chown oracle:dba /dev/raw/raw10
/bin/chown oracle:dba /dev/raw/raw11
/bin/chown oracle:dba /dev/raw/raw12
/bin/chown oracle:dba /dev/raw/raw13
/bin/chown oracle:dba /dev/raw/raw14
/bin/chown oracle:dba /dev/raw/raw15
/bin/chown oracle:dba /dev/raw/raw16
/bin/chown oracle:dba /dev/raw/raw17
/bin/chown oracle:dba /dev/raw/raw18
/bin/chown oracle:dba /dev/raw/raw19
/bin/chown oracle:dba /dev/raw/raw20
/bin/chown oracle:dba /dev/raw/raw21
/bin/chown oracle:dba /dev/raw/raw22
/bin/chown oracle:dba /dev/raw/raw23
NOTE:
Keep in mind that the above bind
steps will need to be done for each node
within the RAC cluster on each startup. It
will be placed in a startup script like
/etc/rc.local.
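Because raw device N always maps to logical volume N, the bind, chmod and chown lines above can be produced by a loop. The following is a minimal sketch that prints the commands for review instead of running them; the function name print_raw_bindings is hypothetical.

```shell
# Print the bind, chmod and chown commands for raw1 .. rawN,
# pairing /dev/raw/rawN with /dev/pv1/lvolN as in the listing above.
print_raw_bindings() {
    local count=$1 i=1
    while [ "$i" -le "$count" ]; do
        echo "/usr/bin/raw /dev/raw/raw$i /dev/pv1/lvol$i"
        echo "/bin/chmod 600 /dev/raw/raw$i"
        echo "/bin/chown oracle:dba /dev/raw/raw$i"
        i=$((i + 1))
    done
}

print_raw_bindings 23
```

The commands come out grouped per device rather than per command type, which has the same end result. If a tablespace is added later, only the count needs to change.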
You can verify raw
bindings by using the raw command:
# raw -qa
/dev/raw/raw1: bound to major 58, minor 0
/dev/raw/raw2: bound to major 58, minor 1
/dev/raw/raw3: bound to major 58, minor 2
/dev/raw/raw4: bound to major 58, minor 3
/dev/raw/raw5: bound to major 58, minor 4
/dev/raw/raw6: bound to major 58, minor 5
/dev/raw/raw7: bound to major 58, minor 6
/dev/raw/raw8: bound to major 58, minor 7
/dev/raw/raw9: bound to major 58, minor 8
/dev/raw/raw10: bound to major 58, minor 9
/dev/raw/raw11: bound to major 58, minor 10
/dev/raw/raw12: bound to major 58, minor 11
/dev/raw/raw13: bound to major 58, minor 12
/dev/raw/raw14: bound to major 58, minor 13
/dev/raw/raw15: bound to major 58, minor 14
/dev/raw/raw16: bound to major 58, minor 15
/dev/raw/raw17: bound to major 58, minor 16
/dev/raw/raw18: bound to major 58, minor 17
/dev/raw/raw19: bound to major 58, minor 18
/dev/raw/raw20: bound to major 58, minor 19
/dev/raw/raw21: bound to major 58, minor 20
/dev/raw/raw22: bound to major 58, minor 21
/dev/raw/raw23: bound to major 58, minor 22
Create Symbolic
Links From RAW Volumes (both nodes)
NOTE:
Several of the commands within this section
will need to be performed on every node
within the cluster every time that machine
is booted. Details of these commands and
instructions for placing them in a startup
script are included in the section
All Startup Commands for Each RAC Node.
I generally create
symbolic links from the RAW volumes to
human-readable names to make file
recognition easier. If you decide NOT to use
symbolic links, you will need to use the
/dev/raw/rawN designations for
the Oracle files you define when creating or
maintaining tablespaces. For some people,
dealing with cryptic designations (e.g.
/dev/raw/raw21) is simply too much
trouble; it is much easier to work with
human-readable names. These commands will
need to be issued once on each Linux server.
I typically include them in the
/etc/rc.local startup script. If you
add tablespaces, a new logical volume, RAW
binding, and link name should be added to the
various files on all nodes.
mkdir /u01/app/oracle/oradata
mkdir /u01/app/oracle/oradata/orcl
ln -s /dev/raw/raw1 /u01/app/oracle/oradata/orcl/CMQuorumFile
ln -s /dev/raw/raw2 /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile
ln -s /dev/raw/raw3 /u01/app/oracle/oradata/orcl/spfileorcl.ora
ln -s /dev/raw/raw4 /u01/app/oracle/oradata/orcl/control01.ctl
ln -s /dev/raw/raw5 /u01/app/oracle/oradata/orcl/control02.ctl
ln -s /dev/raw/raw6 /u01/app/oracle/oradata/orcl/control03.ctl
ln -s /dev/raw/raw7 /u01/app/oracle/oradata/orcl/cwmlite01.dbf
ln -s /dev/raw/raw8 /u01/app/oracle/oradata/orcl/drsys01.dbf
ln -s /dev/raw/raw9 /u01/app/oracle/oradata/orcl/example01.dbf
ln -s /dev/raw/raw10 /u01/app/oracle/oradata/orcl/indx01.dbf
ln -s /dev/raw/raw11 /u01/app/oracle/oradata/orcl/odm01.dbf
ln -s /dev/raw/raw12 /u01/app/oracle/oradata/orcl/system01.dbf
ln -s /dev/raw/raw13 /u01/app/oracle/oradata/orcl/temp01.dbf
ln -s /dev/raw/raw14 /u01/app/oracle/oradata/orcl/tools01.dbf
ln -s /dev/raw/raw15 /u01/app/oracle/oradata/orcl/undotbs01.dbf
ln -s /dev/raw/raw16 /u01/app/oracle/oradata/orcl/undotbs02.dbf
ln -s /dev/raw/raw17 /u01/app/oracle/oradata/orcl/users01.dbf
ln -s /dev/raw/raw18 /u01/app/oracle/oradata/orcl/xdb01.dbf
ln -s /dev/raw/raw19 /u01/app/oracle/oradata/orcl/perfstat01.dbf
ln -s /dev/raw/raw20 /u01/app/oracle/oradata/orcl/redo01.log
ln -s /dev/raw/raw21 /u01/app/oracle/oradata/orcl/redo02.log
ln -s /dev/raw/raw22 /u01/app/oracle/oradata/orcl/redo03.log
ln -s /dev/raw/raw23 /u01/app/oracle/oradata/orcl/orcl_redo2_2.log
chown -R oracle:dba /u01/app/oracle/oradata
Configuring the
Linux Servers (both nodes)
NOTE:
Several of the commands within this section
will need to be performed on every node
within the cluster every time that machine
is booted. Details of these commands and
instructions for placing them in a startup
script are included in section
All Startup Commands for Each RAC Node.
This section of the
document focuses on configuring both Linux
servers—getting each one prepared for the
Oracle9i RAC installation.
Swap Space
Considerations
- Installing Oracle9i requires
a minimum of 512MB of memory.
(An inadequate amount of swap during
the installation will cause the Oracle
Universal Installer to either "hang" or
"die")
- To check the amount of memory / swap
you have allocated, type either:
# free
- OR -
# cat /proc/swaps
- OR -
# cat /proc/meminfo | grep MemTotal
- If you have less than 512MB of
memory (between your RAM and SWAP), you
can add temporary swap space by creating
a temporary swap file. This way you do
not have to use a raw device or even
more drastic, rebuild your system.
As root, make a file that will act as
additional swap space, let's say about
300MB:
# dd if=/dev/zero of=tempswap bs=1k count=300000
Now we should change the file
permissions:
# chmod 600 tempswap
Finally we format the file as
swap and add it to the swap space:
# mkswap tempswap
# swapon tempswap
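To compare the combined RAM and swap against the 512MB figure without adding numbers by hand, the totals can be summed from /proc/meminfo. The following is a minimal sketch, assuming the MemTotal and SwapTotal lines are reported in kB; the function name ram_plus_swap_mb is hypothetical.

```shell
# Sum MemTotal and SwapTotal (both in kB) from /proc/meminfo-style
# input on stdin and print the combined total in whole megabytes.
ram_plus_swap_mb() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { print int(kb / 1024) }'
}
```

Usage: ram_plus_swap_mb < /proc/meminfo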
Setting Shared
Memory
Shared memory
allows processes to access common structures
and data by placing them in a shared memory
segment. This is the fastest form of
interprocess communication (IPC)
available—mainly due to the fact that no
kernel involvement occurs when data is being
passed between the processes. Data does not
need to be copied between processes.
Oracle makes use
of shared memory for its System Global Area
(SGA), which is an area of memory shared by
all Oracle background and foreground processes.
Adequate sizing of the SGA is critical to
Oracle performance because it is responsible
for holding the database buffer cache,
shared SQL, access paths, and so much more.
To determine all
shared memory limits, use the following:
# ipcs -lm
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
Setting SHMMAX
The SHMMAX
parameter defines the maximum size (in
bytes) for a shared memory segment. The
Oracle SGA comprises shared memory and it is
possible that incorrectly setting SHMMAX
could limit the size of the SGA. When
setting SHMMAX, keep in mind that
the size of the SGA should fit within one
shared memory segment. An inadequate
SHMMAX setting could result in the
following:
ORA-27123: unable to attach to shared memory segment
You can determine the
value of SHMMAX by performing the
following:
# cat /proc/sys/kernel/shmmax
33554432
The default value for
SHMMAX is 32MB. This is often too
small to configure the Oracle SGA. I
generally set the SHMMAX parameter
to 2GB using one of the following
methods:
- You can alter the default setting
for SHMMAX without rebooting
the machine by making the changes
directly to the /proc file
system. This is the method that I use by
placing the following into the
/etc/rc.local startup file:
# echo "2147483648" > /proc/sys/kernel/shmmax
- You can also use the sysctl
command to change the value of
SHMMAX:
# sysctl -w kernel.shmmax=2147483648
- Lastly, you can make this change
permanent by inserting the kernel
parameter in the /etc/sysctl.conf
startup file:
# echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
Setting SHMMNI
We now look at the
SHMMNI parameter. This kernel
parameter is used to set the maximum number
of shared memory segments systemwide. The
default value for this parameter is 4096.
This value is sufficient and typically does
not need to be changed.
You can determine
the value of SHMMNI by performing
the following:
# cat /proc/sys/kernel/shmmni
4096
Setting SHMALL
Finally, we look at the
SHMALL shared memory kernel
parameter. This parameter controls the total
amount of shared memory (in pages) that can
be used at one time on the system. In short,
the value of this parameter should always be
at least:
ceil(SHMMAX/PAGE_SIZE)
The default size of
SHMALL is 2097152 and can be queried
using the following command:
# cat /proc/sys/kernel/shmall
2097152
The default setting
for SHMALL should be adequate for
our Oracle9i RAC installation.
NOTE:
The page size in Red Hat Linux on the
i386 platform is 4,096 bytes. You can,
however, use bigpages, which supports
the configuration of larger memory page
sizes.
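The ceil(SHMMAX/PAGE_SIZE) rule can be worked through with integer arithmetic. The following is a minimal sketch, assuming the 4,096-byte page size mentioned above as the default; the function name min_shmall_pages is hypothetical.

```shell
# Smallest SHMALL (in pages) that can hold one segment of SHMMAX bytes:
# ceil(shmmax / page_size), computed without floating point.
min_shmall_pages() {
    local shmmax=$1 page_size=${2:-4096}
    echo $(( (shmmax + page_size - 1) / page_size ))
}

min_shmall_pages 2147483648   # 2GB SHMMAX -> prints 524288
```

With SHMMAX at 2GB the minimum is 524288 pages, which is below the default SHMALL of 2097152, consistent with leaving SHMALL unchanged.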
Setting Semaphores
Now that we have
configured our shared memory settings, it is
time to take care of configuring our
semaphores. The best way to describe a
semaphore is as a counter that is used to
provide synchronization between processes
(or threads within a process) for shared
resources like shared memory. Semaphore sets
are supported in System V where each one is
a counting semaphore. When an application
requests semaphores, it does so using
"sets."
To determine all
semaphore limits, use the following:
# ipcs -ls
------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
You can also use the
following command:
# cat /proc/sys/kernel/sem
250 32000 32 128
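The four numbers in /proc/sys/kernel/sem are unlabeled; matching them against the ipcs output above gives the order SEMMSL, SEMMNS, SEMOPM, SEMMNI. The following is a minimal sketch that prints them with names attached; the function name describe_sem is hypothetical.

```shell
# Label the four whitespace-separated values read from stdin, in the
# order used by /proc/sys/kernel/sem: SEMMSL SEMMNS SEMOPM SEMMNI.
describe_sem() {
    local semmsl semmns semopm semmni
    read -r semmsl semmns semopm semmni
    echo "SEMMSL=$semmsl SEMMNS=$semmns SEMOPM=$semopm SEMMNI=$semmni"
}
```

Usage: describe_sem < /proc/sys/kernel/sem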
Setting SEMMSL
The SEMMSL
kernel parameter is used to control the
maximum number of semaphores per semaphore
set.
Oracle recommends
setting SEMMSL to the largest
PROCESSES instance parameter setting in the
init.ora file for all databases on
the Linux system plus 10. Also, Oracle
recommends setting SEMMSL to a
value of no less than 100.
Setting SEMMNI
The SEMMNI
kernel parameter is used to control the
maximum number of semaphore sets in the
entire Linux system.
Oracle recommends
setting the SEMMNI to a value of no
less than 100.
Setting SEMMNS
The SEMMNS
kernel parameter is used to control the
maximum number of semaphores (not semaphore
sets) in the entire Linux system.
Oracle recommends
setting the SEMMNS to the sum of
the PROCESSES instance parameter setting for
each database on the system, adding the
largest PROCESSES twice, and then finally
adding 10 for each Oracle database on the
system.
Use the following
calculation to determine the maximum number
of semaphores that can be allocated on a
Linux system. It will be the lesser of:
SEMMNS -or- (SEMMSL * SEMMNI)
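Both rules above reduce to simple arithmetic. The following is a minimal sketch; the function names are hypothetical, and the PROCESSES values in the usage note (150 and 100) are made-up examples, not settings from this installation.

```shell
# Recommended SEMMNS: the sum of PROCESSES over all databases, plus the
# largest PROCESSES value twice, plus 10 for each database.
recommended_semmns() {
    local sum=0 max=0 count=0 p
    for p in "$@"; do
        sum=$((sum + p))
        if [ "$p" -gt "$max" ]; then max=$p; fi
        count=$((count + 1))
    done
    echo $((sum + 2 * max + 10 * count))
}

# Maximum allocatable semaphores: the lesser of SEMMNS and SEMMSL * SEMMNI.
max_semaphores() {
    local semmsl=$1 semmns=$2 semmni=$3 product
    product=$((semmsl * semmni))
    if [ "$semmns" -lt "$product" ]; then
        echo "$semmns"
    else
        echo "$product"
    fi
}
```

For two databases with PROCESSES of 150 and 100, recommended_semmns 150 100 gives 250 + 300 + 20 = 570. With the default limits shown earlier, max_semaphores 250 32000 128 gives 32000, since 250 * 128 is also 32000.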
Setting SEMOPM
The SEMOPM
kernel parameter is used to control the
number of semaphore operations that can be
performed per semop system call.
The semop
system call (function) provides the ability
to perform operations on multiple semaphores
with one semop system call. Because a
semaphore set can contain at most
SEMMSL semaphores, it is
recommended to set
SEMOPM equal to SEMMSL.
Oracle recommends
setting the SEMOPM to a value of no
less than 100.
Setting
Semaphore Kernel Parameters
Finally, we see how to
set all semaphore parameters using several
methods. In the following, the only
parameter I care about changing (raising) is
SEMOPM. All other default settings
should be sufficient for our example
installation.
- You can alter the default setting
for all semaphore settings without
rebooting the machine by making the
changes directly to the /proc
file system. This is the method that I
use by placing the following into the
/etc/rc.local startup file:
# echo "250 32000 100 128" > /proc/sys/kernel/sem
- You can also use the sysctl
command to change the value of all
semaphore settings:
# sysctl -w kernel.sem="250 32000 100 128"
- Finally, you can make this change
permanent by inserting the kernel
parameter in the /etc/sysctl.conf
startup file:
# echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
Setting File
Handles
When configuring
our Red Hat Linux server, it is critical to
ensure that the maximum number of file
handles is sufficiently large. The setting
for file handles denotes the number of open
files that you can have on the Linux system.
Use the following
command to determine the maximum number of
file handles for the entire system:
# cat /proc/sys/fs/file-max
32768
Oracle recommends that
the file handles for the entire system be
set to at least 65536.
- You can alter the default setting
for the maximum number of file handles
without rebooting the machine by making
the changes directly to the /proc
file system. This is the method that I
use by placing the following into the
/etc/rc.local startup file:
# echo "65536" > /proc/sys/fs/file-max
- You can also use the sysctl
command to change the value of
file-max:
# sysctl -w fs.file-max=65536
- Finally, you can make this change
permanent by inserting the kernel
parameter in the /etc/sysctl.conf
startup file:
# echo "fs.file-max=65536" >> /etc/sysctl.conf
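One caveat with appending via "echo ... >>" is that running it twice leaves duplicate entries in /etc/sysctl.conf. As a sketch only, the hypothetical helper below (set_sysctl_conf is my own name) replaces an existing entry for the key instead of appending a second one. It assumes simple "key=value" or "key = value" lines.

```shell
#!/bin/sh
# Set "key=value" in a sysctl.conf-style file without creating duplicates.
# set_sysctl_conf is a hypothetical helper, not a standard command.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp="$file.tmp.$$"
    # Escape dots so the key is matched literally by grep.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    touch "$file" || return 1
    # Keep every line except an existing assignment of this key.
    grep -v "^[[:space:]]*$pattern[[:space:]]*=" "$file" > "$tmp" || true
    echo "$key=$value" >> "$tmp"
    # Overwrite in place so ownership and permissions are preserved.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, "set_sysctl_conf fs.file-max 65536" (run as root) has the same effect as the echo command above the first time, and is harmless when repeated.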
NOTE: You can query the
current usage of file handles by using the
following:
# cat /proc/sys/fs/file-nr
613 95 32768
The file-nr file
displays three parameters:
- Total allocated file handles
- Currently used file handles
- Maximum file handles that can be
allocated
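To make the three file-nr numbers easier to read, the sketch below labels them and reports how much of the maximum is allocated. The function name is hypothetical, and the field meanings follow the description above for this kernel; other kernel versions may define the middle field differently.

```shell
#!/bin/sh
# Label the three fields of /proc/sys/fs/file-nr and show the
# allocated share of the maximum as a whole-number percentage.
# file_handle_usage is a hypothetical helper name.
file_handle_usage() {
    # $1: optional string to parse; defaults to the live kernel value.
    nr_line=${1:-$(cat /proc/sys/fs/file-nr)}
    set -- $nr_line
    echo "allocated=$1 used=$2 max=$3 allocated_pct=$(( $1 * 100 / $3 ))"
}
```

With the sample output shown above, file_handle_usage "613 95 32768" reports an allocated share of 1 percent (integer arithmetic rounds down).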
NOTE:
If you need to increase the value in
/proc/sys/fs/file-max, then make sure
that the ulimit is set properly. Usually for
2.4.20 it is set to unlimited. Verify the
ulimit setting by issuing the
ulimit command:
# ulimit
unlimited
Configuring the
hangcheck-timer Kernel Module
Oracle 9.0.1 and 9.2.0.1 used a userspace
watchdog daemon called watchdogd to monitor
the health of the cluster and to restart a
RAC node in case of a failure. Starting with
Oracle 9.2.0.2, however, this daemon has
been replaced by a Linux kernel module
named hangcheck-timer, which addresses
availability and reliability problems much
better. The hangcheck-timer is loaded into
the kernel and checks whether the system
hangs. It sets a timer and checks the timer
after a certain amount of time. There is a
configurable threshold that, if exceeded,
causes the module to reboot the machine.
Although the hangcheck-timer module is not
required for Oracle Cluster Manager
operation, it is highly recommended by
Oracle.
The hangcheck-timer.o Module
The hangcheck-timer module uses a
kernel-based timer that periodically checks
the system task scheduler to catch delays in
order to determine the health of the system.
If the system hangs or pauses, the timer
resets the node. The hangcheck-timer module
uses the Time Stamp Counter (TSC) CPU
register, which is a counter that is
incremented at each clock signal. The TSC
offers much more accurate time measurements
since this register is updated by the
hardware automatically.
Much more information about the
hangcheck-timer project can be found
here.
Installing the hangcheck-timer.o
Module
The hangcheck-timer was originally shipped by
Oracle; however, this module is now included
with Red Hat Linux AS starting with kernel
version 2.4.9-e.12. If you
followed the steps in the "Obtaining and
Installing a proper Linux Kernel" section, the
hangcheck-timer is already included for you.
Use the following to ensure that you have
the module included:
# find /lib/modules -name "hangcheck-timer.o"
/lib/modules/2.4.21-9.0.1.ELorafw1/kernel/drivers/char/hangcheck-timer.o
Configuring and Loading the
hangcheck-timer Module
There are two key parameters to the
hangcheck-timer module.
- hangcheck_tick: This
parameter defines the period of time
between checks of system health. The
default value is 60 seconds. Oracle
recommends setting it to 30 seconds.
- hangcheck_margin:
This parameter defines the maximum hang
delay that should be tolerated before
hangcheck-timer resets the RAC node. It
defines the margin of error in seconds.
The default value is 180 seconds. Oracle
recommends setting it to 180 seconds.
These two parameters
need to be coordinated with the
MissCount parameter in the
$ORACLE_HOME/oracm/admin/cmcfg.ora file
for the Cluster Manager.
NOTE:
The two hangcheck-timer module
parameters indicate how long a RAC node must
hang before it will reset the system. A node
reset will occur when the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
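The reset condition above is simple arithmetic, sketched here in shell so the threshold for a given pair of settings can be checked quickly. Both function names are hypothetical.

```shell
#!/bin/sh
# Threshold in seconds after which hangcheck-timer resets the node:
# hangcheck_tick + hangcheck_margin.
hang_threshold() {
    echo $(( $1 + $2 ))
}

# Succeeds (exit status 0) when a hang of $1 seconds exceeds the
# threshold for tick=$2 and margin=$3.
will_reset() {
    [ "$1" -gt "$(hang_threshold "$2" "$3")" ]
}
```

With the recommended values, hang_threshold 30 180 prints 210, so a hang must last longer than 210 seconds before the node is reset.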
Now let's talk about
how to load the module. You can load the
module with the correct parameter settings
manually by using the following:
# su -
# /sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
# grep Hangcheck /var/log/messages*
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel: Hangcheck:
starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds).
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel: Hangcheck: Using TSC.
Put the above
"insmod" command in your
/etc/rc.local file!
Although the
manual method for loading the module (above)
will work, we need a way to load the module
with the correct parameters on every reboot
of the node. We do this by making entries in
the /etc/modules.conf file. Add the
following line to the /etc/modules.conf
file:
# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modules.conf
Now, to test the
module, use the modprobe command.
You can run the modprobe command to
manually load the hangcheck-timer
module with the configured parameters
defined in the /etc/modules.conf
file:
# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages*
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel:
Hangcheck: starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds).
/var/log/messages.1:Apr 30 20:51:47 linux2 kernel:
Hangcheck: Using TSC.
NOTE:
You don't have to run modprobe
after each reboot. The hangcheck-timer
module will be loaded by the kernel
(automatically) when needed.
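If you want to confirm that the module is actually loaded without searching the log files, a check against /proc/modules is one option. This is a sketch under the assumption that the module appears there by name; the function name is my own, and the pattern accepts both the hyphen and underscore spellings because some kernels list the module with an underscore.

```shell
#!/bin/sh
# Succeeds when the hangcheck-timer module is listed as loaded.
# hangcheck_loaded is a hypothetical helper name.
hangcheck_loaded() {
    # $1: optional file to inspect; defaults to the kernel's module list.
    modules_file=${1:-/proc/modules}
    grep -q '^hangcheck[-_]timer[[:space:]]' "$modules_file"
}
```

Typical use would be: hangcheck_loaded && echo "hangcheck-timer is loaded".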
Configure RAC
Nodes for Remote Access
When running the
Oracle Installer on a RAC node, it will use
the rsh command to copy the Oracle
software to all other nodes within the RAC
cluster. The oracle UNIX account on
the node running the Oracle Installer (runInstaller)
must be trusted by all other nodes in your
RAC cluster. This means that you should be
able to run r* commands like rsh,
rcp, and rlogin on this
RAC node against other RAC nodes without a
password. The rsh daemon validates users
using the /etc/hosts.equiv file and
the .rhosts file found in the
user's (oracle's) home directory.
Unfortunately, SSH is not supported.
First, let's make
sure that we have the rsh RPMs
installed on each node in the RAC cluster:
# rpm -q rsh rsh-server
rsh-0.17-19
rsh-server-0.17-19
From the above, we can see that the
rsh and rsh-server packages are
installed.
NOTE: If rsh is
not installed, run the following command:
# su -
# rpm -ivh rsh-0.17-5.i386.rpm rsh-server-0.17-5.i386.rpm
To enable the "rsh"
service, the "disable" attribute in the
/etc/xinetd.d/rsh file must be set to "no"
and xinetd must be refreshed. This
can be done by running the following
commands:
# su -
# chkconfig rsh on
# chkconfig rlogin on
# service xinetd reload
Reloading configuration: [ OK ]
To allow the "oracle"
UNIX user account to be trusted among the
RAC nodes, create the /etc/hosts.equiv file:
# su -
# touch /etc/hosts.equiv
# chmod 600 /etc/hosts.equiv
# chown root.root /etc/hosts.equiv
Now add all RAC nodes
to the /etc/hosts.equiv file
similar to the following example:
# cat /etc/hosts.equiv
+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle
Be sure that the
/etc/hosts.equiv file exists on all
nodes in your RAC cluster!
NOTE:
In the above example, the second field
permits only the oracle user account to run
rsh commands on the specified
nodes. For security reasons, the
/etc/hosts.equiv file should be owned
by root and the permissions should
be set to 600. In fact, some
systems will only honor the content of this
file if the owner of this file is root
and the permissions are set to 600.
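Since the same file has to exist on every node, it can help to generate its contents from a node list rather than typing it by hand. The sketch below only prints the lines; gen_hosts_equiv is a hypothetical name, and redirecting the output into /etc/hosts.equiv (as root) is left to you.

```shell
#!/bin/sh
# Print one "+node user" line per node, in the format shown above.
# gen_hosts_equiv is a hypothetical helper name.
gen_hosts_equiv() {
    user=$1
    shift
    for node in "$@"; do
        echo "+$node $user"
    done
}
```

For this article's cluster, "gen_hosts_equiv oracle linux1 linux2 int-linux1 int-linux2" prints the four lines listed in the example file.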
NOTE:
Before attempting to test your rsh
command, ensure that you are using the
correct version of rsh. By default,
Red Hat Linux puts the Kerberos directories
(for example, /usr/kerberos/bin)
at the head of the $PATH variable.
This will cause the Kerberos version of
rsh to be executed.
I will typically rename
the Kerberos version of rsh so that
the normal rsh command is being
used. Use the following:
# su -
# which rsh
/usr/kerberos/bin/rsh
# cd /usr/kerberos/bin
# mv rsh rsh.original
# which rsh
/usr/bin/rsh
You should now test your
connections and run the rsh command
against each RAC node. I will be using the
node linux1 to perform the install.
# su - oracle
$ rsh int-linux1 ls -l /etc/hosts.equiv
-rw------- 1 root root 68 May 2 14:45 /etc/hosts.equiv
$ rsh int-linux2 ls -l /etc/hosts.equiv
-rw------- 1 root root 68 May 2 14:45 /etc/hosts.equiv
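With more than a couple of nodes, the same test can be run in a loop. The following is a sketch, run as the oracle user; check_rsh_nodes and the RSH_CMD override are my own names, and the check relies on rsh returning a non-zero status when the connection is refused, which I am assuming for your rsh client.

```shell
#!/bin/sh
# Try a trivial remote command on each node and report the result.
# check_rsh_nodes and RSH_CMD are hypothetical names for illustration.
check_rsh_nodes() {
    rsh_cmd=${RSH_CMD:-rsh}
    failed=0
    for node in "$@"; do
        # stdin is redirected so rsh cannot swallow input meant for the loop.
        if "$rsh_cmd" "$node" hostname </dev/null >/dev/null 2>&1; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return $failed
}
```

For example: check_rsh_nodes int-linux1 int-linux2. The function returns non-zero if any node failed, so it can gate the start of the installer.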
All Startup Commands
for Each RAC Node (both nodes)
Up to this point, we have talked in great
detail about the parameters and resources
that will need to be configured on both
nodes for our Oracle9i RAC
configuration. In this section we will take
a deep breath and recap those parameters,
commands, and entries (in previous sections
of this document) that need to happen on
each node when the machine is booted.
In this section, I provide all of the
commands, parameters, and entries that we
have talked about so far that will need to
be included into all startup scripts for
each Linux node in the RAC cluster. For each
of the startup files below, I have bolded
the entries that should be included in each
of the startup files in order to provide a
successful RAC node.
File:
/etc/modules.conf
All kernel parameters and modules that
need to be configured.
alias eth0 tulip
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias ieee1394-controller ohci1394
alias sound-slot-0 cmpci
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-remove sbp2 rmmod sd_mod
post-install sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -L >/dev/null 2>&1 || :
pre-remove sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -S >/dev/null 2>&1 || :
alias eth1 8139too
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
File:
/etc/sysctl.conf
We wanted to adjust the default and
maximum send buffer size and default and
maximum receive buffer size for our
interconnect.
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144
# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144
# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
net.core.rmem_max=262144
# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144
File: /etc/hosts
All machine/IP entries for nodes in our RAC
cluster.
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.1.100 linux1
192.168.2.100 int-linux1
192.168.1.101 linux2
192.168.2.101 int-linux2
192.168.1.102 alex
192.168.1.105 bartman
File: /etc/hosts.equiv
Allow logins to each node as the oracle
user account without the need for a
password.
+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle
File: /etc/grub.conf
Determine which kernel to use when the node
is booted.
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda3
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Fedora Core (2.4.21-9.0.1.ELorafw1)
root (hd0,0)
kernel /vmlinuz-2.4.21-9.0.1.ELorafw1 ro root=LABEL=/ rhgb
initrd /initrd-2.4.21-9.0.1.ELorafw1.img
title Fedora Core (2.4.22-1.2115.nptl)
root (hd0,0)
kernel /vmlinuz-2.4.22-1.2115.nptl ro root=LABEL=/ rhgb
initrd /initrd-2.4.22-1.2115.nptl.img
File: /etc/rc.local
These commands are responsible for binding
volumes to raw devices, kernel setpoints,
activating volume groups, creating symbolic
links—all to prepare our shared storage for
each node.
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
vgscan
vgchange -a y
# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+
echo "2147483648" > /proc/sys/kernel/shmmax
echo "4096" > /proc/sys/kernel/shmmni
# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+
echo "256 32000 100 128" > /proc/sys/kernel/sem
# +---------------------------------------------------------+
# | FILE HANDLES |
# +---------------------------------------------------------+
echo "65536" > /proc/sys/fs/file-max
# +---------------------------------------------------------+
# | BIND ALL RAW DEVICES |
# +---------------------------------------------------------+
/usr/bin/raw /dev/raw/raw1 /dev/pv1/lvol1
/usr/bin/raw /dev/raw/raw2 /dev/pv1/lvol2
/usr/bin/raw /dev/raw/raw3 /dev/pv1/lvol3
/usr/bin/raw /dev/raw/raw4 /dev/pv1/lvol4
/usr/bin/raw /dev/raw/raw5 /dev/pv1/lvol5
/usr/bin/raw /dev/raw/raw6 /dev/pv1/lvol6
/usr/bin/raw /dev/raw/raw7 /dev/pv1/lvol7
/usr/bin/raw /dev/raw/raw8 /dev/pv1/lvol8
/usr/bin/raw /dev/raw/raw9 /dev/pv1/lvol9
/usr/bin/raw /dev/raw/raw10 /dev/pv1/lvol10
/usr/bin/raw /dev/raw/raw11 /dev/pv1/lvol11
/usr/bin/raw /dev/raw/raw12 /dev/pv1/lvol12
/usr/bin/raw /dev/raw/raw13 /dev/pv1/lvol13
/usr/bin/raw /dev/raw/raw14 /dev/pv1/lvol14
/usr/bin/raw /dev/raw/raw15 /dev/pv1/lvol15
/usr/bin/raw /dev/raw/raw16 /dev/pv1/lvol16
/usr/bin/raw /dev/raw/raw17 /dev/pv1/lvol17
/usr/bin/raw /dev/raw/raw18 /dev/pv1/lvol18
/usr/bin/raw /dev/raw/raw19 /dev/pv1/lvol19
/usr/bin/raw /dev/raw/raw20 /dev/pv1/lvol20
/usr/bin/raw /dev/raw/raw21 /dev/pv1/lvol21
/usr/bin/raw /dev/raw/raw22 /dev/pv1/lvol22
/usr/bin/raw /dev/raw/raw23 /dev/pv1/lvol23
/bin/chmod 600 /dev/raw/raw1
/bin/chmod 600 /dev/raw/raw2
/bin/chmod 600 /dev/raw/raw3
/bin/chmod 600 /dev/raw/raw4
/bin/chmod 600 /dev/raw/raw5
/bin/chmod 600 /dev/raw/raw6
/bin/chmod 600 /dev/raw/raw7
/bin/chmod 600 /dev/raw/raw8
/bin/chmod 600 /dev/raw/raw9
/bin/chmod 600 /dev/raw/raw10
/bin/chmod 600 /dev/raw/raw11
/bin/chmod 600 /dev/raw/raw12
/bin/chmod 600 /dev/raw/raw13
/bin/chmod 600 /dev/raw/raw14
/bin/chmod 600 /dev/raw/raw15
/bin/chmod 600 /dev/raw/raw16
/bin/chmod 600 /dev/raw/raw17
/bin/chmod 600 /dev/raw/raw18
/bin/chmod 600 /dev/raw/raw19
/bin/chmod 600 /dev/raw/raw20
/bin/chmod 600 /dev/raw/raw21
/bin/chmod 600 /dev/raw/raw22
/bin/chmod 600 /dev/raw/raw23
/bin/chown oracle:dba /dev/raw/raw1
/bin/chown oracle:dba /dev/raw/raw2
/bin/chown oracle:dba /dev/raw/raw3
/bin/chown oracle:dba /dev/raw/raw4
/bin/chown oracle:dba /dev/raw/raw5
/bin/chown oracle:dba /dev/raw/raw6
/bin/chown oracle:dba /dev/raw/raw7
/bin/chown oracle:dba /dev/raw/raw8
/bin/chown oracle:dba /dev/raw/raw9
/bin/chown oracle:dba /dev/raw/raw10
/bin/chown oracle:dba /dev/raw/raw11
/bin/chown oracle:dba /dev/raw/raw12
/bin/chown oracle:dba /dev/raw/raw13
/bin/chown oracle:dba /dev/raw/raw14
/bin/chown oracle:dba /dev/raw/raw15
/bin/chown oracle:dba /dev/raw/raw16
/bin/chown oracle:dba /dev/raw/raw17
/bin/chown oracle:dba /dev/raw/raw18
/bin/chown oracle:dba /dev/raw/raw19
/bin/chown oracle:dba /dev/raw/raw20
/bin/chown oracle:dba /dev/raw/raw21
/bin/chown oracle:dba /dev/raw/raw22
/bin/chown oracle:dba /dev/raw/raw23
# +---------------------------------------------------------+
# | CREATE SYMBOLIC LINKS |
# +---------------------------------------------------------+
mkdir -p /u01/app/oracle/oradata/orcl
ln -s /dev/raw/raw1 /u01/app/oracle/oradata/orcl/CMQuorumFile
ln -s /dev/raw/raw2 /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile
ln -s /dev/raw/raw3 /u01/app/oracle/oradata/orcl/spfileorcl.ora
ln -s /dev/raw/raw4 /u01/app/oracle/oradata/orcl/control01.ctl
ln -s /dev/raw/raw5 /u01/app/oracle/oradata/orcl/control02.ctl
ln -s /dev/raw/raw6 /u01/app/oracle/oradata/orcl/control03.ctl
ln -s /dev/raw/raw7 /u01/app/oracle/oradata/orcl/cwmlite01.dbf
ln -s /dev/raw/raw8 /u01/app/oracle/oradata/orcl/drsys01.dbf
ln -s /dev/raw/raw9 /u01/app/oracle/oradata/orcl/example01.dbf
ln -s /dev/raw/raw10 /u01/app/oracle/oradata/orcl/indx01.dbf
ln -s /dev/raw/raw11 /u01/app/oracle/oradata/orcl/odm01.dbf
ln -s /dev/raw/raw12 /u01/app/oracle/oradata/orcl/system01.dbf
ln -s /dev/raw/raw13 /u01/app/oracle/oradata/orcl/temp01.dbf
ln -s /dev/raw/raw14 /u01/app/oracle/oradata/orcl/tools01.dbf
ln -s /dev/raw/raw15 /u01/app/oracle/oradata/orcl/undotbs01.dbf
ln -s /dev/raw/raw16 /u01/app/oracle/oradata/orcl/undotbs02.dbf
ln -s /dev/raw/raw17 /u01/app/oracle/oradata/orcl/users01.dbf
ln -s /dev/raw/raw18 /u01/app/oracle/oradata/orcl/xdb01.dbf
ln -s /dev/raw/raw19 /u01/app/oracle/oradata/orcl/perfstat01.dbf
ln -s /dev/raw/raw20 /u01/app/oracle/oradata/orcl/redo01.log
ln -s /dev/raw/raw21 /u01/app/oracle/oradata/orcl/redo02.log
ln -s /dev/raw/raw22 /u01/app/oracle/oradata/orcl/redo03.log
ln -s /dev/raw/raw23 /u01/app/oracle/oradata/orcl/orcl_redo2_2.log
chown -R oracle:dba /u01/app/oracle/oradata
/sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
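The raw-device section of the /etc/rc.local file above repeats the same three commands for each of the 23 volumes. As an alternative sketch (not the form used in this article's files), the same commands can be produced by a loop. The function name bind_raw_devices and its "apply" argument are hypothetical; by default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Bind /dev/raw/rawN to /dev/pv1/lvolN for N = 1..count and set the
# permissions and ownership used in this article.
# Default mode prints the commands; pass "apply" as $2 to run them.
bind_raw_devices() {
    count=$1
    mode=${2:-print}
    i=1
    while [ "$i" -le "$count" ]; do
        for cmd in "/usr/bin/raw /dev/raw/raw$i /dev/pv1/lvol$i" \
                   "/bin/chmod 600 /dev/raw/raw$i" \
                   "/bin/chown oracle:dba /dev/raw/raw$i"; do
            if [ "$mode" = "apply" ]; then
                $cmd || return 1
            else
                echo "$cmd"
            fi
        done
        i=$(( i + 1 ))
    done
}
```

Running "bind_raw_devices 23" prints all 69 commands; "bind_raw_devices 23 apply" (as root) would execute them. The commands are grouped per device rather than per command type, which should not change the end result.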
Update Red Hat
Linux System (Oracle Metalink Note
#252217.1) (both nodes)
The following RPMs, all of which are
available on the Red Hat Fedora Core 1 CDs,
will need to be updated as per the steps
described in Metalink Note #252217.1,
"Requirements for Installing Oracle9iR2
on RHEL3."
All of these packages will need to be
installed as the root user.
From Fedora
Core 1 / Disk #1
# cd /mnt/cdrom/Fedora/RPMS
# rpm -Uvh libpng10-1.0.13-9.i386.rpm
From Fedora Core
1 / Disk #2
# cd /mnt/cdrom/Fedora/RPMS
# rpm -Uvh gnome-libs-1.4.1.2.90-35.i386.rpm
From Fedora Core
1 / Disk #3
# cd /mnt/cdrom/Fedora/RPMS
# rpm -Uvh compat-libstdc++-7.3-2.96.118.i386.rpm
# rpm -Uvh compat-libstdc++-devel-7.3-2.96.118.i386.rpm
# rpm -Uvh compat-db-4.0.14-2.i386.rpm
# rpm -Uvh compat-gcc-7.3-2.96.118.i386.rpm
# rpm -Uvh compat-gcc-c++-7.3-2.96.118.i386.rpm
# rpm -Uvh sysstat-4.0.7-5.i386.rpm
# rpm -Uvh openmotif21-2.1.30-8.i386.rpm
# rpm -Uvh pdksh-5.2.14-23.i386.rpm
Set gcc296 and
g++296 in PATH
Make gcc296 and g++296 the compilers found
first in the $PATH variable by renaming the
existing binaries and creating
the following symbolic links:
# mv /usr/bin/gcc /usr/bin/gcc323
# mv /usr/bin/g++ /usr/bin/g++323
# ln -s /usr/bin/gcc296 /usr/bin/gcc
# ln -s /usr/bin/g++296 /usr/bin/g++
Check hostname
Make sure the hostname command
returns a fully qualified host name by
amending the /etc/hosts file if
necessary:
# hostname
Install the 3006854 patch: The Oracle/Linux
Patch 3006854 can be downloaded
here.
# unzip p3006854_9204_LINUX.zip
# cd 3006854
# sh rhel3_pre_install.sh
Reboot the System
At this point, reboot
all nodes within the RAC cluster before
attempting to install the Oracle software.
# init 6
Download/Unpack the Oracle9i
Installation Files
It is now time to
download and extract the Oracle9i
RDBMS software for Linux. As of March
26, 2004, Oracle includes the Oracle9i
RDBMS software with the 9.2.0.4.0
patchset already included. This will
save considerable time since the
patchset does not have to be downloaded
and installed.
- Log in as the newly created
"oracle" user account (su -
oracle). Most of the actions
throughout the rest of this document
should be done as the "oracle" user
account unless otherwise noted.
- Download Oracle9i (9.2.0.4.0)
for Linux x86 from OTN. (If you
do not currently have an OTN
account, you will need to create
one. This is a FREE account!)
- Save the following files to a
temporary directory:
  - ship_9204_linux_disk1.cpio.gz
  (538,906,295 bytes)
  - ship_9204_linux_disk2.cpio.gz
  (632,756,922 bytes)
  - ship_9204_linux_disk3.cpio.gz
  (296,127,243 bytes)
- Run "gunzip <filename>" on all
the files (e.g., gunzip
ship_9204_linux_disk1.cpio.gz).
- Extract the cpio archives with
the command cpio -idmv <
<filename> (e.g., cpio -idmv
< ship_9204_linux_disk1.cpio).
NOTE: Some browsers
will uncompress the files but leave
the extension the same (gz) when
downloading. If the above steps do
not work for you, try skipping the
gunzip step and going directly to the
cpio step without
changing the filename (e.g.,
"cpio -idmv <
ship_9204_linux_disk1.cpio.gz").
- You should now have three
directories called "Disk1,
Disk2 and Disk3"
containing the Oracle9i
Installation files:
/Disk1
/Disk2
/Disk3
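The gunzip and cpio steps, including the case from the NOTE where the browser has already uncompressed a file but kept the .gz extension, can be combined into one helper. This is a sketch only: the function names are hypothetical, and it assumes gzip and cpio are on the PATH.

```shell
#!/bin/sh
# Succeeds when the file really is gzip-compressed, whatever its name.
is_gzipped() {
    gzip -t "$1" 2>/dev/null
}

# Extract one downloaded archive into the current directory.
# Compressed files are decompressed on the fly; files that were already
# uncompressed by the browser are fed to cpio directly.
unpack_archive() {
    if is_gzipped "$1"; then
        gzip -dc "$1" | cpio -idmv
    else
        cpio -idmv < "$1"
    fi
}
```

A loop such as "for n in 1 2 3; do unpack_archive ship_9204_linux_disk$n.cpio.gz; done" would then handle all three files. Unlike gunzip, this leaves the downloaded files in place.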
Install
Oracle9i Cluster Manager
Introduction
At this point, all of the
pre-installation steps should have been
performed on all RAC nodes, and the nodes
should have been rebooted.
This section of the document provides
the instructions for installing and
configuring the Oracle9i Cluster
Manager (Node Monitor) oracm
software on all RAC nodes. The
runInstaller command only needs to
be run from one of the RAC nodes. I will
be running the Oracle Installer from
linux1.
Create the
Cluster Manager Quorum File
If we were using a Clustered File
System, it would be appropriate at this
time to create the Cluster Manager
Quorum File. Because we are using raw
devices, however, we simply need to
recognize the raw partition we will be
using for the Cluster Manager Quorum
File and then create a symbolic link to
the raw device.
For our example, we already created a
5MB RAW partition, named it
/dev/raw/raw1 and created the
symbolic link that pointed to it:
# su - oracle
$ ls -l oradata/orcl/CMQuorumFile
lrwxrwxrwx 1 oracle dba 13 May 2 20:28 oradata/orcl/CMQuorumFile -> /dev/raw/raw1
Testing
rsh
You should test your
connections and run the rsh
command against each RAC node. I will be
using the node linux1 to
perform the install.
# su - oracle
$ rsh int-linux1 hostname
linux1
$ rsh int-linux2 hostname
linux2
Installing
Oracle9i Cluster Manager
To install the
Oracle Cluster Manager software, simply
navigate to Disk1 of the Oracle
installation software and run the
runInstaller command:
$ su - oracle
$ cd oracle_install/Disk1
$ ./runInstaller
Initializing Java Virtual Machine from /tmp/OraInstall2004-05-02_08-45-13PM/jre/bin/java. Please wait...
| Screen Name | Response |
| Welcome Screen | Click "Next" |
| Inventory Location | Click "OK" |
| UNIX Group Name | Use "dba" |
| Root Script Window | Open another window, login as the root userid, and run "/tmp/orainstRoot.sh". When the script has completed, return to the dialog from the Oracle Installer and hit Continue. |
| File Locations | Leave the "Source Path" at its default setting. For the Destination name, I like to use "Oracle9iRAC." You can leave the Destination path at its default value, which should be /u01/app/oracle/product/9.2.0. |
| Available Products | Select "Oracle Cluster Manager 9.2.0.4.0" |
| Public Node Information | Public Node 1: linux1; Public Node 2: linux2 |
| Private Node Information | Private Node 1: int-linux1; Private Node 2: int-linux2 |
| Quorum Disk Information | /u01/app/oracle/oradata/orcl/CMQuorumFile |
| Summary | Click "Install" |
When the installation of Oracle
Cluster Manager is complete, simply
click the "Exit" button.
Configuring
Oracle9i Cluster Manager (both
nodes)
If this installation were using
Oracle Cluster Manager below 9.2.0.2, we
would need to configure the watchdog
daemon. This is not necessary because
the version we are using is 9.2.0.4.0.
Starting with 9.2.0.2.0, Oracle has
replaced the watchdog daemon
with the hangcheck-timer kernel
module. We will need to update some of
those configuration files and this
section describes those changes.
ADD the following line to the
$ORACLE_HOME/oracm/admin/cmcfg.ora
file:
KernelModuleName=hangcheck-timer
ADJUST the value of the MissCount
parameter in the
$ORACLE_HOME/oracm/admin/cmcfg.ora
file based on the sum of the
hangcheck_tick and
hangcheck_margin values. The
MissCount parameter must be set to
a minimum value of 60 and it must be
greater than the sum of
hangcheck_tick + hangcheck_margin.
In our example, hangcheck_tick +
hangcheck_margin is 210. I will
therefore set MissCount in
$ORACLE_HOME/oracm/admin/cmcfg.ora
to the value 210.
MissCount=210
Your
$ORACLE_HOME/oracm/admin/cmcfg.ora
file should now look similar to the
following:
HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=1000
MissCount=210
PrivateNodeNames=int-linux1 int-linux2
PublicNodeNames=linux1 linux2
ServicePort=9998
CmDiskFile=/u01/app/oracle/oradata/orcl/CMQuorumFile
HostName=int-linux1
KernelModuleName=hangcheck-timer
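Because MissCount has to stay coordinated with the two hangcheck-timer parameters, a small check can catch a mismatch after someone edits either side. This is a sketch with a hypothetical function name. Note one assumption: the comparison accepts MissCount equal to the sum, mirroring the example value used in this article; change -ge to -gt on the last line if you prefer a strictly greater value.

```shell
#!/bin/sh
# Report MissCount from a cmcfg.ora file against tick + margin.
# Usage: check_misscount <cmcfg.ora> <hangcheck_tick> <hangcheck_margin>
# check_misscount is a hypothetical helper name.
check_misscount() {
    cfg=$1
    tick=$2
    margin=$3
    # Take the numeric value of the last MissCount line in the file.
    miss=$(sed -n 's/^[[:space:]]*MissCount[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cfg" | tail -n 1)
    [ -n "$miss" ] || { echo "MissCount not set in $cfg"; return 2; }
    sum=$(( tick + margin ))
    echo "MissCount=$miss hangcheck_tick+hangcheck_margin=$sum"
    # Minimum of 60, and not below the sum (assumption: equality allowed).
    [ "$miss" -ge 60 ] && [ "$miss" -ge "$sum" ]
}
```

For the file above, "check_misscount $ORACLE_HOME/oracm/admin/cmcfg.ora 30 180" prints both numbers and returns success.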
Starting and
Stopping Oracle9i Cluster Manager
(both nodes)
You should now start the Cluster
Manager (CM) and Node Monitor oracm.
To do this, run the following commands
on all nodes in the RAC cluster:
# su -
# . ~oracle/.bash_profile # Set Oracle environment
# $ORACLE_HOME/oracm/bin/ocmstart.sh
oracm </dev/null 2>&1 >/u01/app/oracle/product/9.2.0/oracm/log/cm.out &
After starting the Cluster Manager (CM)
and Node Monitor, check to make sure it
is running:
# ps -ef | grep oracm | grep -v 'grep'
root 5476 1 0 18:24 pts/1 00:00:00 oracm
root 5478 5476 0 18:24 pts/1 00:00:00 oracm
root 5479 5478 0 18:24 pts/1 00:00:00 oracm
root 5480 5478 0 18:24 pts/1 00:00:00 oracm
root 5481 5478 0 18:24 pts/1 00:00:00 oracm
root 5483 5478 0 18:24 pts/1 00:00:00 oracm
root 5484 5478 0 18:24 pts/1 00:00:00 oracm
root 5485 5478 0 18:24 pts/1 00:00:00 oracm
root 5486 5478 0 18:24 pts/1 00:00:00 oracm
root 5491 5478 0 18:24 pts/1 00:00:00 oracm
If you need to stop the oracm
process, you have to kill it at the O/S
level:
# su -
# pkill oracm
If you would like more information on
administering the Oracle Cluster
Manager, see
Oracle9i Real Application Clusters
Administration.
NOTE:
If you only see one oracm
process in the process table when you
run the ps command, it is
probably because you have the procps
RPM installed. The ps command
that comes with the procps RPM does not
show a thread as a separate process in
the ps output.
# rpm -qf /bin/ps
procps-2.0.17-1
Installing
Oracle9i RAC
Required RPMs
Before attempting to install the Oracle9i
Real Application Cluster 9.2.0.4.0
software (RAC software + database
software), you need to ensure
that the pdksh and ncurses4
RPMs are installed on all RAC nodes. If
these packages are not installed, you
will get the following error message
when attempting to run the
$ORACLE_HOME/root.sh file on each
RAC node during the software
installation:
...
error: failed dependencies:
libncurses.so.4 is needed by orclclnt-nw_lssv.Build.71-1
error: failed dependencies:
orclclnt = nw_lssv.Build.71-1 is needed by orcldrvr-nw_lssv.Build.71-1
error: failed dependencies:
orclclnt = nw_lssv.Build.71-1 is needed by orclnode-nw_lssv.Build.71-1
orcldrvr = nw_lssv.Build.71-1 is needed by orclnode-nw_lssv.Build.71-1
libscsi.so is needed by orclnode-nw_lssv.Build.71-1
libsji.so is needed by orclnode-nw_lssv.Build.71-1
error: failed dependencies:
orclclnt = nw_lssv.Build.71-1 is needed by orclserv-nw_lssv.Build.71-1
orclnode = nw_lssv.Build.71-1 is needed by orclserv-nw_lssv.Build.71-1
/bin/ksh is needed by orclserv-nw_lssv.Build.71-1
package orclman-nw_lssv.Build.71-1 is already installed
** Installation of LSSV did not succeed. Please refer
** to the Installation Guide at http://www.legato.com/LSSV
** and contact Oracle customer support if necessary.
Before attempting the installation,
check that the two required RPMs are
installed by running the following
command:
# rpm -q pdksh ncurses4
pdksh-5.2.14-23
ncurses4-5.0-12
If you need to install these RPMs, then
locate the RPMs on the Red Hat Linux CDs
and run the following:
# su -
# rpm -Uvh pdksh-5.2.14-23.i386.rpm ncurses4-5.0-12.i386.rpm
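To run the same check on every node without reading the output by eye, the query can be wrapped so that it prints only the packages that are missing. This is a sketch; missing_rpms and the RPM_QUERY override are hypothetical names, and it relies on "rpm -q" returning a non-zero status for a package that is not installed.

```shell
#!/bin/sh
# Print the name of each listed package that is not installed.
# missing_rpms and RPM_QUERY are hypothetical names for illustration.
missing_rpms() {
    query=${RPM_QUERY:-rpm -q}
    for pkg in "$@"; do
        $query "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

Running "missing_rpms pdksh ncurses4" prints nothing when both packages are present.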
Creating the
Shared Configuration File for srvctl
If we were using a Clustered File
System, it would be appropriate at
this time to create the Shared
Configuration File. Because we are
using raw devices, however, we simply
need to recognize the raw partition we
will be using for the Shared
Configuration File and then create a symbolic
link to the raw device.
For our example, we already created a
100MB RAW partition, named it
/dev/raw/raw2 and created the
symbolic link that pointed to it:
# su - oracle
$ ls -l oradata/orcl/SharedSrvctlConfigFile
lrwxrwxrwx 1 oracle dba 13 May 2 20:28 oradata/orcl/SharedSrvctlConfigFile -> /dev/raw/raw2
Removing
system01.dbf Symbolic Link
Before starting the
Oracle Universal Installer for Oracle9i
RAC, you will need to remove the
symbolic link you have for the SYSTEM
tablespace (system01.dbf)
on all nodes in the RAC
cluster. After the Oracle installation
is complete, we will be re-creating the
symbolic link. To remove the symbolic
link:
# su - oracle
$ rm oradata/orcl/system01.dbf
If you fail to remove this symbolic
link, an error will be
displayed before the installation can
start.
Installing
Oracle9i 9.2.0.4.0 Database
Software with Oracle9i RAC
To install the
Oracle9i 9.2.0.4.0 Database
Software with Oracle9i RAC
software, simply navigate to Disk1
of the Oracle installation software and
run the runInstaller command:
$ su - oracle
$ cd oracle_install/Disk1
$ ./runInstaller
Initializing Java Virtual Machine from /tmp/OraInstall2004-05-03_06-51-06PM/jre/bin/java. Please wait...
| Screen Name | Response |
| Welcome Screen | Click "Next" |
| Cluster Node Selection | Select (highlight) all RAC nodes by using the shift key and clicking each node with the left mouse button. If all of the nodes in your RAC cluster are not showing up, or if the Node Selection Screen does not appear, then the Oracle Cluster Manager (Node Manager) oracm is probably not running on all RAC nodes. For more information, see "Starting and Stopping Oracle9i Cluster Manager" under the "Installing Oracle9i Cluster Manager" section. |
| Inventory Location | Keep all defaults and click "OK" |
| Available Products | Select "Oracle9i Database 9.2.0.4.0" and click "Next" |
| Installation Types | Select "Enterprise Edition (2.84GB)" and click "Next" |
| Database Configuration | Select "Software Only" and click "Next" |
| Shared Configuration File Name | /u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile |
| Summary | Click "Install" |
Notes During the
Oracle Installation
- Errors during the "Link" phase (ins_oemagent.mk
/ ins_ctx.mk): You should
not receive any errors during the
Link phase. For those of you
familiar with installing Oracle on
Linux, receiving errors during the
link phase should not be anything
new to you. I was surprised to not
receive any errors during the link
phase of this install, as I did when
installing Oracle 9.2.0.4.0 on
Fedora Core 1. If you do receive any
errors during the linking phase, you
can read my article "Installing
Oracle9i (9.2.0.4.0) on Red Hat
Linux (Fedora Core 1)."
- Performing remote operations
(99%): When the installation
dialog displays "Performing
remote operations (99%)", you
will see a command similar to the
following running on the RAC
nodes:
$ ps -ef | grep cpio | grep -v 'grep'
oracle 7902 7901 0 21:07 ? 00:00:00 bash -c /bin/sh -c cd /; cpio -idmuc
oracle 7910 7902 14 21:07 ? 00:00:09 cpio -idmuc
If you see the above command
running, it shows that the Oracle
software is currently being
installed (copied) to all RAC
node(s).
There are still reported bugs in
the Oracle Installer that prevent
Oracle (sometimes) from installing
the software on all nodes in the RAC
cluster. If the Installer hangs at "Performing
remote operations (99%)" and
the above BASH command is NOT
running on the Oracle RAC nodes
anymore, then you will need to abort
the installation. If you continue to
receive errors during this phase of
the installation, the only
workaround (at the time of this
writing) would be to run
runInstaller on all RAC nodes
to install the software on each RAC
node separately.
- Running root.sh Script:
When the "Link" phase is complete,
you will be prompted to run the
$ORACLE_HOME/root.sh script as
the "root" user account.
NOTE: Before running the root.sh
script on any nodes, you will need
to perform several manual actions:
First, edit the root.sh script
near the bottom (I
believe line # 249); if the line
reads:
if [ ! -f "$OPSCONFIG" ];then
then change it to:
if [ -f "$OPSCONFIG" ];then
Second, create the
srvConfig.loc file:
# su -
# mkdir -p /var/opt/oracle
# touch /var/opt/oracle/srvConfig.loc
# /u01/app/oracle/product/9.2.0/root.sh
When running the root.sh
script, ensure that you run it on
ALL RAC servers before clicking "OK"
in the Oracle installation dialog
box.
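If you would rather not edit root.sh by hand on every node, the one-line change described above could be scripted with sed. The sketch below is an assumption about how you might do that, not part of the Oracle tooling: the function name is hypothetical, and it keeps a .bak copy of the original file so the change can be reviewed or undone.

```shell
# Hypothetical helper: flips the OPSCONFIG test in root.sh as described above.
# A backup of the original is kept alongside it with a .bak suffix.
patch_root_sh() {
    local script="$1"
    sed -i.bak \
        's/if \[ ! -f "\$OPSCONFIG" ];then/if [ -f "$OPSCONFIG" ];then/' \
        "$script"
}

# Example use as root (path taken from this article):
#   patch_root_sh /u01/app/oracle/product/9.2.0/root.sh
#   grep -n 'OPSCONFIG' /u01/app/oracle/product/9.2.0/root.sh
```

If the line in your copy of root.sh already reads the second way, the sed expression matches nothing and the file content is left unchanged.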
- Oracle Enterprise Manager
Console: When the Oracle Enterprise
Manager Console comes up, exit from
the application. We will be creating
the Oracle cluster database in a
later section.
Post
Installation Step
- Replace system01.dbf
Symbolic Link
Earlier in this section, you
needed to remove the symbolic link
on all nodes in the RAC cluster that
pointed to system01.dbf, the
datafile that will be used for the
SYSTEM tablespace. Now that the
installation is complete, you can
re-create this symbolic link on
all nodes in the RAC cluster:
# su - oracle
$ ln -s /dev/raw/raw12 /u01/app/oracle/oradata/orcl/system01.dbf
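Since the link has to be re-created on every node, it is worth confirming that each one really points at the raw device. A minimal sketch, assuming a hypothetical helper name and the paths used in this article:

```shell
# Hypothetical helper: succeeds only if $1 is a symbolic link whose
# target is exactly $2.
link_points_to() {
    local link="$1" expected="$2"
    [ -L "$link" ] && [ "$(readlink "$link")" = "$expected" ]
}

# Example use on each node (paths taken from this article):
#   link_points_to /u01/app/oracle/oradata/orcl/system01.dbf /dev/raw/raw12 \
#       && echo "system01.dbf link OK"
```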
- Create Missing Directories Not
Replicated on Remote Nodes
After the Oracle9i RAC
software installation, some
directories may not get created. You
should check for the following
directories and if they do not
exist, you should create them as the
oracle UNIX user:
# su - oracle
# For Cluster Manager
$ mkdir -p $ORACLE_HOME/oracm/log
# For SQL*Net Listener
$ mkdir -p $ORACLE_HOME/network/log
$ mkdir -p $ORACLE_HOME/network/trace
# For database instances
$ mkdir -p $ORACLE_HOME/rdbms/log
$ mkdir -p $ORACLE_HOME/rdbms/audit
# For Oracle Intelligent Agent
$ mkdir -p $ORACLE_HOME/network/agent/log
$ mkdir -p $ORACLE_HOME/network/agent/reco
# For Oracle HTTP Server (Apache)
$ mkdir -p $ORACLE_HOME/Apache/Agent/logs
$ mkdir -p $ORACLE_HOME/Apache/Jserv/logs
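The same list of directories can be checked in a loop, which also reports which ones were actually missing on a given node. This is a sketch only; the function name is hypothetical and the directory list is copied from the commands above.

```shell
# Hypothetical helper: creates any of the directories listed above that are
# missing under the given Oracle home, and prints the ones it had to create.
ensure_oracle_dirs() {
    local home="$1" sub
    for sub in oracm/log network/log network/trace rdbms/log rdbms/audit \
               network/agent/log network/agent/reco \
               Apache/Agent/logs Apache/Jserv/logs; do
        if [ ! -d "$home/$sub" ]; then
            mkdir -p "$home/$sub" && echo "created $home/$sub"
        fi
    done
}

# Example use as the oracle user on each node:
#   ensure_oracle_dirs "$ORACLE_HOME"
```

A node that prints nothing already had every directory in place.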
- Initialize the Shared
Configuration File
Before attempting to initialize the
Shared Configuration File, make sure
that the Oracle Global Services
daemon is NOT running, by using the
following command:
# su - oracle
$ gsdctl stat
GSD is not running on the local node
If the daemon is running, then shut
it down with gsdctl stop.
Initialize the Shared
Configuration File by running the
following command on only one RAC
node:
# su - oracle
$ srvconfig -init
NOTE: If you
receive a PRKR-1025 error
when attempting to run the
srvconfig -init command, check
that you have a valid entry for "srvconfig_loc"
in your
/var/opt/oracle/srvConfig.loc
file and that the file is owned by "oracle".
This entry is created by the
root.sh script.
If you receive a PRKR-1064
error when attempting to run the
srvconfig -init command, then
check that the
/u01/app/oracle/oradata/orcl/SharedSrvctlConfigFile
file is accessible from all RAC nodes:
$ cd ~oracle/oradata/orcl
$ ls -l SharedSrvctlConfigFile
lrwxrwxrwx 1 oracle dba 13 May 2 20:17 SharedSrvctlConfigFile -> /dev/raw/raw2
- Start Oracle Global Services
After initializing the Shared
Configuration File, you will need to
manually start the Oracle Global
Services daemon (gsd) to
ensure that it works. At this point
in the installation, the Global
Services daemon should be down. To
confirm this, run the following
command:
# su - oracle
$ gsdctl stat
GSD is not running on the local node
Let's manually start the Global
Services daemon (gsd) by
running the following command on
all nodes in the RAC cluster:
# su - oracle
$ gsdctl start
Successfully started GSD on local node
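The stat-then-start sequence above can be combined so that the daemon is only started when it is reported as down. The sketch below assumes the status messages look exactly like the ones shown in this article; the function name is hypothetical.

```shell
# Hypothetical helper: interprets the text printed by "gsdctl stat".
# Succeeds when the status line reports a running daemon.
gsd_is_running() {
    case "$1" in
        *"GSD is running"*) return 0 ;;
        *)                  return 1 ;;
    esac
}

# Example use as the oracle user on each node:
#   if ! gsd_is_running "$(gsdctl stat 2>&1)"; then
#       gsdctl start
#   fi
```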
- Check Node Name and Node Number
Mappings
In most cases, the Oracle Global
Services daemon (gsd)
should start successfully on all
nodes in the RAC cluster.
There are cases, however, where
the node name and node
number mappings are not correct
in the cmcfg.ora file on
node 2. This does not happen very
often, but it has happened to me on
at least one occasion.
If the node name and
node number mappings are not
correct, it will not show up until
you attempt to run the Database
Configuration Assistant (dbca)—the
assistant we will be using later to
create our cluster database. The
error reported by the DBCA will say
something to the effect, "gsd daemon
has not been started on node 2".
To check that the node name and
number mappings are correct on your
cluster, run the following command
on both your nodes:
Listing for node1:
$ lsnodes -n
linux1 0
linux2 1
Listing for node2:
$ lsnodes -n
linux2 1
linux1 0
The above example shows that my node
name to node number mappings are
correct. If you run into the problem
where the node name and node number
are not mapped correctly, you should
make the changes to node 2 in the
cmcfg.ora file. Edit the
file to change the ordering for the
following entries:
PrivateNodeNames=int-linux1 int-linux2
PublicNodeNames=linux1 linux2
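Because the two listings can legitimately appear in a different order, comparing them by eye is error-prone. One way to compare them, sketched below, is to normalize and sort both listings first. The function name is hypothetical, and treating "same name-to-number pairs on both nodes" as the definition of a correct mapping is an assumption drawn from the example above.

```shell
# Hypothetical helper: compares two "lsnodes -n" listings, ignoring line order
# and column spacing. Succeeds when both listings contain the same
# name-to-number pairs.
mappings_match() {
    local a b
    a=$(printf '%s\n' "$1" | awk 'NF == 2 { print $1, $2 }' | sort)
    b=$(printf '%s\n' "$2" | awk 'NF == 2 { print $1, $2 }' | sort)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example use: save the output of "lsnodes -n" from each node into a file,
# copy both files to one machine, then run:
#   mappings_match "$(cat node1.txt)" "$(cat node2.txt)" && echo "mappings agree"
```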
- Update Node Startup Script (/etc/rc.local)
Throughout this article, I have
been inserting all the commands in
the file /etc/rc.local that
should be run on each node for our
Oracle9i RAC configuration.
There are several more commands
that should be put in at this time.
Put the following commands at the
end of your /etc/rc.local
file:
...
. ~oracle/.bash_profile
rm -rf $ORACLE_HOME/oracm/log/*.ts
$ORACLE_HOME/oracm/bin/ocmstart.sh
su - oracle -c "gsdctl start"
su - oracle -c "lsnrctl start"
- Reboot All Nodes
Before attempting to create the
Oracle cluster database, I would
reboot all nodes within the cluster.
I had problems at one time with the
database creation (using DBCA), and
rebooting all nodes after the
Oracle9i RAC software
installation and before the creation
of the cluster database
helped resolve the issue.
This also provides a chance to
ensure that the additions we
made to our /etc/rc.local
startup file are being run.
- Verify Node Configuration
After rebooting each of the nodes
in the RAC cluster, here is a list
of commands I manually run to ensure
that each node is configured
correctly and the startup scripts
are correctly running their required
tasks.
Keep in mind that you should run
the following on each node in the
RAC cluster:
# su -
# raw -qa
/dev/raw/raw1: bound to major 58, minor 0
/dev/raw/raw2: bound to major 58, minor 1
/dev/raw/raw3: bound to major 58, minor 2
/dev/raw/raw4: bound to major 58, minor 3
/dev/raw/raw5: bound to major 58, minor 4
/dev/raw/raw6: bound to major 58, minor 5
/dev/raw/raw7: bound to major 58, minor 6
/dev/raw/raw8: bound to major 58, minor 7
/dev/raw/raw9: bound to major 58, minor 8
/dev/raw/raw10: bound to major 58, minor 9
/dev/raw/raw11: bound to major 58, minor 10
/dev/raw/raw12: bound to major 58, minor 11
/dev/raw/raw13: bound to major 58, minor 12
/dev/raw/raw14: bound to major 58, minor 13
/dev/raw/raw15: bound to major 58, minor 14
/dev/raw/raw16: bound to major 58, minor 15
/dev/raw/raw17: bound to major 58, minor 16
/dev/raw/raw18: bound to major 58, minor 17
/dev/raw/raw19: bound to major 58, minor 18
/dev/raw/raw20: bound to major 58, minor 19
/dev/raw/raw21: bound to major 58, minor 20
/dev/raw/raw22: bound to major 58, minor 21
/dev/raw/raw23: bound to major 58, minor 22
# lvscan
lvscan -- ACTIVE "/dev/pv1/lvol1" [5 MB]
lvscan -- ACTIVE "/dev/pv1/lvol2" [100 MB]
lvscan -- ACTIVE "/dev/pv1/lvol3" [10 MB]
lvscan -- ACTIVE "/dev/pv1/lvol4" [200 MB]
lvscan -- ACTIVE "/dev/pv1/lvol5" [200 MB]
lvscan -- ACTIVE "/dev/pv1/lvol6" [200 MB]
lvscan -- ACTIVE "/dev/pv1/lvol7" [55 MB]
lvscan -- ACTIVE "/dev/pv1/lvol8" [25 MB]
lvscan -- ACTIVE "/dev/pv1/lvol9" [255 MB]
lvscan -- ACTIVE "/dev/pv1/lvol10" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol11" [55 MB]
lvscan -- ACTIVE "/dev/pv1/lvol12" [805 MB]
lvscan -- ACTIVE "/dev/pv1/lvol13" [255 MB]
lvscan -- ACTIVE "/dev/pv1/lvol14" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol15" [405 MB]
lvscan -- ACTIVE "/dev/pv1/lvol16" [405 MB]
lvscan -- ACTIVE "/dev/pv1/lvol17" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol18" [155 MB]
lvscan -- ACTIVE "/dev/pv1/lvol19" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol20" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol21" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol22" [105 MB]
lvscan -- ACTIVE "/dev/pv1/lvol23" [105 MB]
lvscan -- 23 logical volumes with 3.88 GB total in 1 volume group
lvscan -- 23 active logical volumes
$ ls -l ~oracle/oradata/orcl
total 0
lrwxrwxrwx 1 oracle dba 13 May 5 17:01 CMQuorumFile -> /dev/raw/raw1
lrwxrwxrwx 1 oracle dba 13 May 5 17:01 control01.ctl -> /dev/raw/raw4
lrwxrwxrwx 1 oracle dba 13 May 5 17:01 control02.ctl -> /dev/raw/raw5
lrwxrwxrwx 1 oracle dba 13 May 5 17:01 control03.ctl -> /dev/raw/raw6
lrwxrwxrwx 1 oracle dba 13 May 5 20:38 cwmlite01.dbf -> /dev/raw/raw7
lrwxrwxrwx 1 oracle dba 13 May 5 20:38 drsys01.dbf -> /dev/raw/raw8
lrwxrwxrwx 1 oracle dba 13 May 5 20:38 example01.dbf -> /dev/raw/raw9
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 indx01.dbf -> /dev/raw/raw10
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 odm01.dbf -> /dev/raw/raw11
lrwxrwxrwx 1 oracle dba 14 May 5 17:01 orcl_redo2_2.log -> /dev/raw/raw23
lrwxrwxrwx 1 oracle dba 14 May 5 17:01 perfstat01.dbf -> /dev/raw/raw19
lrwxrwxrwx 1 oracle dba 14 May 5 17:01 redo01.log -> /dev/raw/raw20
lrwxrwxrwx 1 oracle dba 14 May 5 17:01 redo02.log -> /dev/raw/raw21
lrwxrwxrwx 1 oracle dba 14 May 5 17:01 redo03.log -> /dev/raw/raw22
lrwxrwxrwx 1 oracle dba 13 May 5 17:01 SharedSrvctlConfigFile -> /dev/raw/raw2
lrwxrwxrwx 1 oracle dba 13 May 5 17:01 spfileorcl.ora -> /dev/raw/raw3
lrwxrwxrwx 1 oracle dba 14 May 5 18:58 system01.dbf -> /dev/raw/raw12
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 temp01.dbf -> /dev/raw/raw13
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 tools01.dbf -> /dev/raw/raw14
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 undotbs01.dbf -> /dev/raw/raw15
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 undotbs02.dbf -> /dev/raw/raw16
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 users01.dbf -> /dev/raw/raw17
lrwxrwxrwx 1 oracle dba 14 May 5 20:38 xdb01.dbf -> /dev/raw/raw18
$ ps -ef | grep oracm | grep -v 'grep'
root 5476 1 0 19:28 pts/1 00:00:00 oracm
root 5478 5476 0 19:28 pts/1 00:00:00 oracm
root 5479 5478 0 19:28 pts/1 00:00:00 oracm
root 5480 5478 0 19:28 pts/1 00:00:00 oracm
root 5481 5478 0 19:28 pts/1 00:00:00 oracm
root 5483 5478 0 19:28 pts/1 00:00:00 oracm
root 5484 5478 0 19:28 pts/1 00:00:00 oracm
root 5485 5478 0 19:28 pts/1 00:00:00 oracm
root 5486 5478 0 19:28 pts/1 00:00:00 oracm
root 5491 5478 0 19:28 pts/1 00:00:00 oracm
$ gsdctl stat
GSD is running on the local node
$ lsnodes -n
linux1 0
linux2 1
$ srvctl status database -d orcl
PRKR-1007 : getting of cluster database orcl configuration failed,
PRKR-1001 : cluster database orcl does not exist
PRKO-2005 : Application error: Failure in getting Cluster Database Configuration for: orcl
NOTE: When we run the
srvctl status database -d orcl
command (above), we WANT to get the
PRKR-1007, PRKR-1001, and PRKO-2005
errors. If we were to get the following
results from the command:
Instance orcl1 is not running on node linux1
Instance orcl2 is not running on node linux2
then the Oracle installer created the
orcl database. This database will need
to be deleted within the DBCA BEFORE
creating the Oracle clustered database.
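The distinction described in this note can be tested mechanically. The sketch below assumes the error text matches what is shown above; the function name is hypothetical.

```shell
# Hypothetical helper: interprets the text printed by
# "srvctl status database -d orcl" before the cluster database is created.
# Succeeds when PRKR-1001 is present, meaning no orcl database is registered.
database_not_registered() {
    case "$1" in
        *PRKR-1001*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example use as the oracle user:
#   if database_not_registered "$(srvctl status database -d orcl 2>&1)"; then
#       echo "OK: no orcl database yet"
#   else
#       echo "orcl already exists; delete it in the DBCA first"
#   fi
```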
Create the
Oracle Database
We will be using the Oracle Database
Configuration Assistant (DBCA) to create
our clustered database on the shared
storage (FireWire) device.
NOTE: Keep in mind
that on several occasions the
Oracle Universal Installer created an
Oracle database named orcl for me.
You will need to delete this database
BEFORE attempting to create your
clustered database.
To start the
database creation process, run the
following:
# su - oracle
$ dbca -datafileDestination /u01/app/oracle/oradata &
| Screen Name |
Response |
| Type of Database |
Select "Oracle Cluster
Database" and click "Next" |
| Operations |
Select "Create a database"
and click "Next" |
| Node Selection |
Click the "Select All"
button to the right. If all of
the nodes in your RAC cluster
are not showing up, or if the
Node Selection Screen does not
appear, then the Oracle Cluster
Manager (Node Manager) oracm
is probably not running on all
RAC nodes. For more information,
see
Starting and Stopping Oracle9i
Cluster Manager under the
"Installing Oracle9i
Cluster Manager" section. |
| Database Templates |
Select "New Database" and
click "Next" |
| Database
Identification |
Global Database Name:
orcl
SID Prefix: orcl |
| Database Features |
For your new database, you
can keep all database features
selected. I typically do. If you
want to, however, you can clear
any of the boxes to not install
the feature in your new
database.
Click "Next" when finished. |
| Database Connection
Options |
Select "Dedicated Server
Mode" and click "Next" |
| Initialization
Parameters |
Click "Next" |
| Database Storage |
If you have followed this
article and created all symbolic
links, then the datafiles for
all tablespaces should match the
DBCA. I do, however, change the
initial size for each
tablespace. To do this,
navigate through the navigation
tree for all tablespaces and
change the value for the
following tablespaces:
- CWMLITE: 50MB
- DRSYS: 20MB
- EXAMPLE: 250MB
- INDX: 100MB
- ODM: 50MB
- SYSTEM: 800MB
- TEMP: 250MB
- TOOLS: 100MB
- UNDOTBS1: 400MB
- UNDOTBS2: 400MB
- USERS: 100MB
- XDB: 150MB
If you need to, select
appropriate files and then click
"Next" |
| Creation Options |
Click here for a snapshot of
the options I used to create my
cluster database
When you are ready to start
the database creation process,
click "Finish" |
| Summary |
Click "OK" |
When prompted to "Perform Another
Operation", click "No".
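The initial sizes listed under Database Storage are each 5MB smaller than a logical volume reported by lvscan earlier in this section. The sketch below checks that arithmetic. Pairing each tablespace with a volume this way is an assumption inferred from the symbolic link listing (for example, system01.dbf points to raw12, and lvol12 is 805 MB), and the function name is hypothetical.

```shell
# Hypothetical check: each input line is
#   tablespace  initial_size_mb  volume_size_mb
# Volume sizes are copied from the lvscan listing in this article; the pairing
# with tablespaces is an assumption based on the symbolic link listing.
check_tablespace_fit() {
    local name size volume status=0
    while read -r name size volume; do
        [ -z "$name" ] && continue
        # Flag any tablespace whose initial size is not smaller than its volume.
        if [ "$size" -ge "$volume" ]; then
            echo "$name: $size MB does not fit in $volume MB"
            status=1
        fi
    done
    return "$status"
}

check_tablespace_fit <<'EOF' && echo "all tablespaces fit"
CWMLITE   50  55
DRSYS     20  25
EXAMPLE  250 255
INDX     100 105
ODM       50  55
SYSTEM   800 805
TEMP     250 255
TOOLS    100 105
UNDOTBS1 400 405
UNDOTBS2 400 405
USERS    100 105
XDB      150 155
EOF
```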
Notes During the
Oracle RAC Database Creation Process
- ORA-29807 Error
Within the "Creating data
dictionary views" phase of the
database creation process, you will
receive an ORA-29807 error. If you
look in the log file, you will see
the following:
drop operator XMLSequence
*
ERROR at line 1:
ORA-29807: specified operator does not exist
This is a known issue (Bug: 2686156)
and can be ignored. To continue the
database creation process, click the
"Ignore" button.
- ORA-01430 Error
Within the "Adding Oracle
Spatial" phase of the database
creation process, you will receive
an ORA-01430 error. If you look in
the log file, you will see the
following:
(SDO_ROOT_MBR mdsys.sdo_geometry)
*
ERROR at line 2:
ORA-01430: column being added already exists in table
This is a known issue and can be
ignored. To continue the database
creation process, click the "Ignore"
button.
When the DBCA has completed, you will
have a fully functional Oracle RAC
cluster running.
Creating TNS
Networking Files
listener.ora
You should not need to make any
changes to your listener.ora
file, which is created by the Oracle
installer. All instances on the node
will automatically register with the
listener.
listener.ora
# LISTENER.ORA.LINUX1 Network Configuration File:
# /u01/app/oracle/product/9.2.0/network/admin/listener.ora.linux1
# Generated by Oracle configuration tools.
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
)
)
)
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /u01/app/oracle/product/9.2.0)
(PROGRAM = extproc)
)
(SID_DESC =
(ORACLE_HOME = /u01/app/oracle/product/9.2.0)
(SID_NAME = orcl1)
)
)
tnsnames.ora
Here is a copy of my tnsnames.ora
file that I have configured for
Transparent Application Failover (TAF).
You can put this file on each node in
the RAC cluster in the directory
$ORACLE_HOME/network/admin.
tnsnames.ora
# TNSNAMES.ORA Network Configuration File:
# /u01/app/oracle/product/9.2.0/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.
LISTENERS_ORCL =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
)
LISTENER_ORCL1 =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
LISTENER_ORCL2 =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
ORCL =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
(LOAD_BALANCE = yes)
(FAILOVER = yes)
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
(FAILOVER_MODE =
(TYPE = session)
(METHOD = basic)
)
)
)
ORCL1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = orcl)
(INSTANCE_NAME = orcl1)
)
)
ORCL2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = orcl)
(INSTANCE_NAME = orcl2)
)
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
INST1_HTTP =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = SHARED)
(SERVICE_NAME = MODOSE)
(PRESENTATION = http://HRService)
)
)
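Every entry in this file refers to the RAC nodes by host name, so each name has to resolve on whatever machine the file is installed on. As a sketch, the host names can be pulled out of the file for checking. The function name is hypothetical, and the sed expression assumes one HOST clause per line, as in the listing above.

```shell
# Hypothetical helper: lists the distinct HOST values found in a
# tnsnames.ora-style file, one per line. Assumes one HOST clause per line.
list_tns_hosts() {
    sed -n 's/.*(HOST *= *\([^)]*\)).*/\1/p' "$1" | sort -u
}

# Example use on a Linux node:
#   for h in $(list_tns_hosts "$ORACLE_HOME/network/admin/tnsnames.ora"); do
#       getent hosts "$h" >/dev/null || echo "cannot resolve $h"
#   done
```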
Verifying the
RAC Cluster/Database Configuration
This section provides several
commands and SQL queries that can be
used to validate your Oracle9i
RAC configuration.
--Look for Oracle Cluster Manager--
$ ps -ef | grep oracm | grep -v 'grep'
root 5476 1 0 19:28 pts/1 00:00:00 oracm
root 5478 5476 0 19:28 pts/1 00:00:00 oracm
root 5479 5478 0 19:28 pts/1 00:00:00 oracm
root 5480 5478 0 19:28 pts/1 00:00:00 oracm
root 5481 5478 0 19:28 pts/1 00:00:00 oracm
root 5483 5478 0 19:28 pts/1 00:00:00 oracm
root 5484 5478 0 19:28 pts/1 00:00:00 oracm
root 5485 5478 0 19:28 pts/1 00:00:00 oracm
root 5486 5478 0 19:28 pts/1 00:00:00 oracm
root 5491 5478 0 19:28 pts/1 00:00:00 oracm
--Look for the Oracle Global Services daemon--
$ gsdctl stat
GSD is running on the local node
--Using srvctl--
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2
$ srvctl config database -d orcl
linux1 orcl1 /u01/app/oracle/product/9.2.0
linux2 orcl2 /u01/app/oracle/product/9.2.0
Query gv$instance
SELECT
inst_id
, instance_number inst_no
, instance_name inst_name
, parallel
, status
, database_status db_status
, active_state state
, host_name host
FROM gv$instance
ORDER BY inst_id;
INST_ID INST_NO INST_NAME PAR STATUS DB_STATUS STATE HOST
-------- -------- ---------- --- ------- ----------- ------- -------
1 1 orcl1 YES OPEN ACTIVE NORMAL linux1
2 2 orcl2 YES OPEN ACTIVE NORMAL linux2
Starting &
Stopping the Cluster
This section details the
commands used to start up and
shut down the instances in your Oracle9i
RAC cluster. Ensure that you are logged
in as the "oracle" UNIX user:
# su - oracle
Starting the
Cluster
Start up all registered instances:
$ srvctl start database -d orcl
Start up the orcl2 instance:
$ srvctl start instance -d orcl -i orcl2
Stopping the
Cluster
Shut down all registered instances:
$ srvctl stop database -d orcl
Shut down the orcl2 instance using
the immediate option:
$ srvctl stop instance -d orcl -i orcl2 -o immediate
Shut down the orcl2 instance using
the abort option:
$ srvctl stop instance -d orcl -i orcl2 -o abort
Transparent
Application Failover (TAF)
Overview
It is not uncommon for businesses to
demand 99.99% or even 99.999%
availability for their enterprise
applications. Think about what it would
take to keep downtime to no more than
about 53 minutes (99.99%) or even about
5 minutes (99.999%) during the year. To
answer many of these high
availability requirements, businesses
are investing in mechanisms that provide
for automatic failover when one
participating system fails. When
considering the availability of the
Oracle database, Oracle9i RAC
provides a superior solution with its
advanced failover mechanisms. Oracle9i
RAC includes the required components
that all work within a clustered
configuration responsible for providing
continuous availability: when one of the
participating systems fails within the
cluster, the users are automatically
migrated to the other available systems.
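To put those availability percentages in concrete terms, the downtime they allow can be computed directly. The sketch below uses a hypothetical function name and assumes a 365-day year.

```shell
# Prints the downtime allowed per 365-day year, in minutes, for a given
# availability percentage.
downtime_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.1f\n", (100 - pct) / 100 * 365 * 24 * 60 }'
}

downtime_minutes 99.99
downtime_minutes 99.999
```

The two calls print 52.6 and 5.3, so "four nines" leaves under an hour of downtime per year and "five nines" only a few minutes.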
A major component of Oracle9i
RAC that is responsible for failover
processing is the Transparent
Application Failover (TAF) option. All
database connections (and processes)
that lose connections are reconnected
to another node within the cluster. The
failover is completely transparent to
the user.
This final section provides a short
demonstration on how automatic failover
works in Oracle9i RAC. Please
note that a complete discussion on
failover in Oracle9i RAC would be
an article of its own. My intention
here is to present a brief overview and
example of how it works.
One important note before continuing
is that TAF happens automatically within
the OCI libraries. This means that your
application (client) code does not need
to change in order to take advantage of
TAF. Certain configuration steps,
however, will need to be done on the
Oracle TNS file tnsnames.ora.
NOTE: Keep in mind
that the Java thin client cannot
participate in TAF since it
never reads the tnsnames.ora
file.
Setup
tnsnames.ora File
Before demonstrating TAF, we need to
configure a tnsnames.ora file
on a non-RAC client machine (a spare
Windows machine will do). Ensure
that the machine has Oracle software
installed. (A client install of the
Oracle software is all you need.)
Here is the entry I put into the
%ORACLE_HOME%\network\admin\tnsnames.ora
file on my Windows client machine in
order to connect to the new Oracle
clustered database:
...
ORCL =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = linux2)(PORT = 1521))
(LOAD_BALANCE = yes)
(FAILOVER = yes)
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
(FAILOVER_MODE =
(TYPE = session)
(METHOD = basic)
)
)
)
...
SQL Query to
Check the Session's Failover Information
The following SQL query can be used
to check a session's failover type,
failover method, and if a failover has
occurred. We will be using this query
throughout this example.
COLUMN instance_name FORMAT a13
COLUMN host_name FORMAT a9
COLUMN failover_method FORMAT a15
COLUMN failed_over FORMAT a11
SELECT
instance_name
, host_name
, NULL AS failover_type
, NULL AS failover_method
, NULL AS failed_over
FROM v$instance
UNION
SELECT
NULL
, NULL
, failover_type
, failover_method
, failed_over
FROM v$session
WHERE username = 'SYSTEM';
Transparent
Application Failover Demonstration
From our Windows (or other non-RAC)
client machine, log in to the clustered
database (orcl) as the
SYSTEM user:
C:\> sqlplus system/manager@orcl
SQL*Plus: Release 9.2.0.3.0 - Production on Mon May 10 21:17:07 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
SQL> SELECT
2 instance_name
3 , host_name
4 , NULL AS failover_type
5 , NULL AS failover_method
6 , NULL AS failed_over
7 FROM v$instance
8 UNION
9 SELECT
10 NULL
11 , NULL
12 , failover_type
13 , failover_method
14 , failed_over
15 FROM v$session
16 WHERE username = 'SYSTEM';
INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl1 linux1
SESSION BASIC NO
SQL>
DO NOT logout
of the above SQL*Plus session!
Now that we have run the query above,
we should shut down the instance
orcl1 on linux1 using the
abort option. To perform this
operation, we can use the srvctl
command-line utility as follows:
# su - oracle
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2
$ srvctl stop instance -d orcl -i orcl1 -o abort
$ srvctl status database -d orcl
Instance orcl1 is not running on node linux1
Instance orcl2 is running on node linux2
Now let's go back to our SQL session and
rerun the SQL statement in the buffer:
SQL> SELECT
2 instance_name
3 , host_name
4 , NULL AS failover_type
5 , NULL AS failover_method
6 , NULL AS failed_over
7 FROM v$instance
8 UNION
9 SELECT
10 NULL
11 , NULL
12 , failover_type
13 , failover_method
14 , failed_over
15 FROM v$session
16 WHERE username = 'SYSTEM';
INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl2 linux2
SESSION BASIC YES
SQL> exit
From the above demonstration, we can see
that the session has now
failed over to instance orcl2
on linux2.
Conclusion
Oracle RAC allows the DBA to configure a
database solution with superior fault
tolerance and load balancing. In this
article, I have described an economical
way to set up and configure
an Oracle9i RAC
cluster using Red Hat Linux and FireWire
technology. This RAC solution, which can
be assembled for around $1,500, will
provide you with a fully functional and
stable Oracle9i RAC cluster for
testing and development.
This article was originally published at Jeffrey Hunter's
DBA/Development Web Site in a
slightly different form.
Jeffrey Hunter (jhunter@iDevelopment.info)
is an Oracle Certified Professional,
Java Development Certified Professional,
and author who currently works as a
senior DBA. His work includes advanced
performance tuning, Java programming,
capacity planning, database security,
and physical/logical database design in
UNIX, Linux, and Windows NT
environments.