Excerpt by Rampant Author Porus Homi
Havewala, author of
Oracle Enterprise Manager Grid Control by Rampant TechPress.
In this article I will offer an overview of the architecture
used to achieve this high scalability in Grid Control. This kind
of information will be useful for customers that are
contemplating the use of Grid Control but need guidance about
properly architecting their solutions.
Suppose a DBA
team, or its management, decides to implement Grid Control for a
VLDB system. The normal tendency would be to use a test or
development server to install the product, be it on a flavor of
Unix, Linux, or Windows. This means all OEM Grid Control
components (the current release at the time of writing being
Release 4) are placed on a single server. This includes the
repository database, Oracle Management Service (OMS), and the EM
agent.
Then, OEM Agents would be installed by either the
push or pull method, on a few other development and test
database servers. After the DBA team experiments with the
functionality of Grid Control, it would likely tentatively
decide to install an agent on a production server for the first
time.
Let's say eventually management decides to move
the whole shebang of Grid Control to production, but it now
makes the mistake of assuming that what works for a few
development servers would also work for production. It
authorizes the DBA team to install OEM Grid Control on a
production server, again a single server. The team installs all
the components again on a single server, perhaps sharing the
Grid Control install with a production or test database. This is
followed by OEM agents being installed on all the production and
test database servers pointing back to the Grid Control server.
As the Grid Control workload gradually increases, as
more and more databases are managed by more DBAs, as more and
more monitoring is performed, as Grid Control is used more and
more for RMAN backups, Data Guard setup and monitoring, cloning
of databases and homes and so on, the Grid Control system grinds
to a halt.
We need to understand the Grid Control
internals. The main working component of Grid Control, the
engine as it were, is OMS.
This is a J2EE application
deployed on Oracle Application Server 10g; the member components
are the Oracle HTTP Server, the Oracle Application Server
Containers for Java (OC4J), and the OracleAS Web Cache.
Therefore, the OMS in effect runs on a reduced version of Oracle
Application Server itself.
At the Unix server level, we
see a Unix process that is the actual OC4J_EM process. This is
also seen when the opmnctl command is executed:
./opmnctl status

Processes in Instance: EnterpriseManager0.GridMgt001.in.mycompany.com
-------------------+--------------------+-------+---------
ias-component      | process-type       |   pid | status
-------------------+--------------------+-------+---------
WebCache           | WebCacheAdmin      |  2071 | Alive
WebCache           | WebCache           |  2099 | Alive
OC4J               | OC4J_EM            | 27705 | Alive
OC4J               | home               |   N/A | Down
dcm-daemon         | dcm-daemon         |   N/A | Down
LogLoader          | logloaderd         |   N/A | Down
HTTP_Server        | HTTP_Server        |  2072 | Alive
A small digression at this stage: Since the OMS runs on
Oracle Application Server, you can control it like you would do
with Application Server: use the EM Application Server control,
or at the command line use opmnctl (Oracle Process Management
Notification Control), or dcmctl (Distributed Configuration
Management Control).
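As a rough illustration of how the status listing above could be checked from a script, here is a minimal Python sketch. It assumes the pipe-separated layout shown in the listing; the function names parse_opmnctl_status and is_alive are hypothetical helpers and not part of any Oracle tool.

```python
# Sample text copied from the opmnctl status listing in this article.
SAMPLE = """\
-------------------+--------------------+-------+---------
ias-component | process-type | pid | status
-------------------+--------------------+-------+---------
WebCache | WebCacheAdmin | 2071 | Alive
WebCache | WebCache | 2099 | Alive
OC4J | OC4J_EM | 27705 | Alive
OC4J | home | N/A | Down
dcm-daemon | dcm-daemon | N/A | Down
LogLoader | logloaderd | N/A | Down
HTTP_Server | HTTP_Server | 2072 | Alive
"""


def parse_opmnctl_status(text):
    """Return one dict per process row found in the status text."""
    rows = []
    for line in text.splitlines():
        # Data rows have exactly four pipe-separated fields;
        # the dashed separator lines contain no pipes and are skipped.
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 4:
            continue
        component, process_type, pid, status = fields
        if component == "ias-component":  # header row
            continue
        rows.append({
            "component": component,
            "process_type": process_type,
            "pid": int(pid) if pid.isdigit() else None,  # "N/A" becomes None
            "status": status,
        })
    return rows


def is_alive(rows, process_type):
    """True if the named process type is reported as Alive."""
    return any(
        row["process_type"] == process_type and row["status"] == "Alive"
        for row in rows
    )


rows = parse_opmnctl_status(SAMPLE)
print("OC4J_EM alive:", is_alive(rows, "OC4J_EM"))
```

In practice the text would come from running opmnctl on the management server; the sketch only reads a pasted copy of the output.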
Thus, OC4J_EM is only a single Unix
process with its own PID. The memory used by this process is
also limited; it is set in the file $ORACLE_HOME/opmn/conf/opmn.xml.
You could perhaps increase the memory used by the process but it
remains just a single process. We can imagine the one process
being used for managing numerous databases and servers—to
perform various tasks such as Data Guard setups, cloning, and so
on—and understand why such a setup will simply not scale.
Obviously, if the database itself were to run on a single
process, with the db writer, the log writer, the archiver, and
numerous other process functions being performed by a single
process, then the database would become less efficient and
scalable. This is the primary reason why, if all Grid Control
components are placed on a single server, only limited
scalability will be achieved: you would be limited to one
OC4J_EM process with its own limits of memory and processor
speed. If the OC4J_EM process were to reach the limits of its
memory under heavy load, and the process were to slow down or
not respond, then other DBAs would not be able to log in to the
Grid Control Console for their own database management work.
Placing Grid Control components on a single server is not
recommended in production, nor is sharing it with a
production or test database on the same server. Grid Control
needs its own server, and it needs its own set of servers in a
properly architected solution. It is recommended that some time
be spent to plan the Grid Control site being contemplated for
production.
Senior management should be convinced of the need for this
initial study and should approve the budget for the solution,
and the work should then be scoped out and performed as a
professional project, since Grid Control is an enterprise
solution and not a minor tool to deploy on a DBA workstation.
OEM Internals
OEM Grid Control
is drastically different from previous incarnations of
Enterprise Manager. In the past, Enterprise Manager was not so
scalable, simply because it was not N-tiered. The oldest avatar
was Server Manager, which was a PC executable utility.
When Grid Control was created, the internal architecture was
drastically altered to the N-tier model. Oracle's vision is
broadly N-tier, which is in line with and also sets the
direction for modern IT thought. Grid Control became the three
components mentioned previously, and because the main engine,
the OMS, now runs on the application server as an OC4J
application, it instantly became scalable.
Herein lies the secret of the immense scalability of Grid
Control. The boundaries were broken, and horizontal scaling was
opened to the EM world. The more OMS servers you add to the EM
site, the more targets you can manage.
The Right Architecture for very large OEM
Our real-life
large site implementation example will illustrate this concept
more clearly. At the foundation of the implementation,
industry-standard and open architecture can be utilized, such as
Linux servers.
There is no need to deploy powerful expensive servers (beefy
beasts that typically have 24 or more CPUs and 32GB or more
memory). Smaller 4 CPU machines with 8 GB memory are being used,
since the intention is to scale horizontally and not vertically.
The "Free Space" mentioned in the specification table is for
the Oracle software, such as the Oracle Database Home, the
Oracle Management Service Home, and the Agent Home. It does not
include the database, which will be placed on either a SAN or a
NAS (NetApp filer). The database space requirement for the EM
Repository would be approximately 60 to 70GB, with an equal
amount of space reserved for the Flash Recovery Area, where all
archive logs and RMAN backups will be stored. Oracle recommends
database backups to disk (the Flash Recovery Area), so that fast
disk-based recovery is possible.
Even with a large number of targets being monitored and
managed, the database size rarely goes above 60 to 70GB
with out-of-the-box functionality. A new feature of Grid Control
is that the EM repository database (10g) manages itself so far
as space is concerned, in the sense that it performs rollups of
metric data at predetermined intervals. Hence the metric data
that is being collected continuously from the targets does not
drastically increase the database size. On the other hand, it is
possible to manually create extra metrics for monitoring, and
this may lead to an increase in the database size greater than
this example figure.
During the installation phase, the Full Grid Control software
is installed first of all on one of the servers, using the Grid
Control installation CDs. This is done by selecting the
"Enterprise Manager 10g Grid Control Using a New Database"
installation type. This server becomes the repository server
since the repository database is created on this machine. Being
a full install, an OMS and EM Agent are also installed on the
same repository server. (You can ignore the OMS at this point:
more on this later.)
Next, an additional OMS is installed
on each of the other servers. This is done using the same Grid
Control installation CDs, but selecting the "Additional Management
Service" installation type. During the installation of the
additional service, you are asked to point at an existing
repository, so point to the repository database on the first
server. The repository database must be up and running at this
stage with a successful installation of the repository in the
Sysman schema.
In the process of the Additional
Management Service installation type, only the management
service (OMS) and the EM agent will be installed. This is
completed on three or more additional servers, which now
become the management server pool.
The repository
database server can be complemented with a standby database
server using Oracle Data Guard, or optionally an Oracle RAC
cluster on multiple nodes if there is a requirement to scale
the repository database performance horizontally. But a noteworthy
point is that in Grid Control, the performance requirement is
not so much on the database side, but more on the management
server side. The highest scalability is achieved on the
management servers since the OC4J_EM is where the bulk of the
Grid Control work is performed. This is the reason why the
architecture should include three or more management servers
that are load balanced for a large Grid Control setup.
Load balancing the pool of management servers forms an integral
part of this architecture. A hardware load balancer, such as a
Big IP Application Switch Load Balancer from F5 Networks, can be
used for this purpose. (This company's flagship product is the
BIG-IP network appliance. The network appliance was originally a
network load balancer, but now also offers more functionality
such as access control and application security.)
The
load balancer is set up with its own IP address and domain name,
for example gridcentral.in.mycompany.com. The load balancer in
turn points to the IP addresses of the three management servers.
When a service request is received at the IP address or domain
name of the load balancer, on a port that is configured at the
balancer level, the balancer distributes the incoming request to
any of the three simultaneously active management servers in its
pool, at the port specified.
Grid Control uses various ports for different purposes. For
example, there is a certain port used for the Console logons,
and a different port used for the Agent uploads of target metric
data. The Big IP must be set up for all these ports so that load
balancing occurs for Grid Control Console logons as well as for
Agent uploads of target metric data.
An additional
benefit is that this would give excellent redundancy to the Grid
Control system. If one of the management servers were to stop
functioning for any reason, such as could occur under heavy
load, the OC4J_EM process may need to be restarted using
opmnctl. Thus one of the management servers can be taken out of service,
while the other active management servers continue to service
requests as distributed by the Big-IP load balancer.
The load balancer automatically ignores the non-reachable IP
(discovered to be so by its own monitors, which check the pool
members on an ongoing basis, at predetermined intervals). So,
failure of any of the existing management server instances
simply results in the load balancer directing all subsequent
service requests to the active surviving instances. When the Big
IP monitor detects that the node is back on line, the node or
service is automatically added back into the pool.
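The pool behaviour described above can be sketched in a few lines of Python. This is only an illustration: round-robin selection is an assumption (the text says only that requests go to any active member), and the Pool class and its method names are hypothetical, not part of Big IP.

```python
class Pool:
    """Toy model of a load balancer pool with up/down members."""

    def __init__(self, members):
        self.members = list(members)
        self.up = set(self.members)   # members currently passing the monitor
        self._next = 0                # index of the next round-robin candidate

    def mark_down(self, member):
        """Called when the monitor finds the member unreachable."""
        self.up.discard(member)

    def mark_up(self, member):
        """Called when the monitor sees the member back on line."""
        if member in self.members:
            self.up.add(member)

    def pick(self):
        """Return the next available member, skipping any that are down."""
        for _ in range(len(self.members)):
            candidate = self.members[self._next]
            self._next = (self._next + 1) % len(self.members)
            if candidate in self.up:
                return candidate
        raise RuntimeError("no pool member available")


pool = Pool(["GridMgt001", "GridMgt002", "GridMgt003"])
pool.mark_down("GridMgt002")               # e.g. OC4J_EM is being restarted
print([pool.pick() for _ in range(4)])     # only the surviving members appear
pool.mark_up("GridMgt002")                 # node rejoins the pool automatically
```

The point of the sketch is that clients never need to know which management server is down; they keep addressing the balancer, and only the pool membership changes.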
Software load balancing could be used instead
of hardware load balancing. This is a simple solution that uses
software, such as network domain names, to route requests to the
three management servers. The hardware solution is more
expensive, but it is recommended since it is a more powerful
solution. A hardware load balancer responsible for load
balancing as well as failover capabilities should form an
integral part of the total architecture solution, making the
solution much more robust and flexible.
To manage the Big IP load balancers, internal IPs must be
assigned to both the primary and the standby load balancers, and
a floating IP address must be assigned which points to either
the primary or standby load balancer depending on which balancer
is active. You would then manage the load balancer via the
floating IP using the URL as listed in the table below. This is
the Big IP management utility or Web console. Log in to this
console using the Admin password or the Support password. (New
users can be created in the Big IP web console with read-only
rights if required.)
The Big IP root password is used for logging in at the Linux
level using SSH. The balancer runs Linux but with a reduced
command set shell. This is the command line interface (CLI) of
Big IP. Commands are slightly different from normal Linux; for
example, in the CLI, the command "bigtop" is used to monitor the load
balancer.
The internal IPs and Floating IP are illustrated
in the following table (each IP address is shown as
nnn.nnn.nnn.nn but is implicitly unique):
Hostname    IP Address      Description          Big IP Management URL
GridBal001  nnn.nnn.nnn.nn  Unit 1 IP Address    https:///bigipgui/bigconf.cgi
GridBal002  nnn.nnn.nnn.nn  Unit 2 IP Address    https:///bigipgui/bigconf.cgi
GridBal003  nnn.nnn.nnn.nn  Floating IP Address  https:///bigipgui/bigconf.cgi
Of the two load balancer units GridBal001 and GridBal002,
either unit could be active (actually handling the load
balancing). Typically the two units will have 3 addresses
associated with them: Unit 1 IP, Unit 2 IP, Floating IP. The
Floating IP is a shared IP address and will only "exist" on the
unit that is active at that time.
The other servers in the Grid Control configuration are
illustrated by the following table:
Hostname    IP Address      Description
GridMgt001  nnn.nnn.nnn.nn  Management Server One (OMS 1)
GridMgt002  nnn.nnn.nnn.nn  Management Server Two (OMS 2)
GridMgt003  nnn.nnn.nnn.nn  Management Server Three (OMS 3)
GridMgt100  nnn.nnn.nnn.nn  Virtual Management Server (Virtual OMS)
GridDb001   nnn.nnn.nnn.nn  Database Server One (DBS 1) (Primary or RAC node)
GridDb002   nnn.nnn.nnn.nn  Database Server Two (DBS 2) (Standby or RAC node)
For the purposes of load balancing, Big IP uses
the concepts of virtual servers, pools, associated nodes
(members) and rules to guide the load balancing. A virtual OMS
server is set up at the Big IP level with its own IP address,
which in turn points to a pool of Oracle management servers with
their own IP addresses. Therefore the outside world has merely
to point to the virtual OMS server's IP address or domain name,
for both Grid Console logons and Agent uploads from multiple
targets. The pool of Oracle Management servers is set up using
the IP address:port combination, which means you can have one
pool set up for Grid Console logons, and another pool set up for
Agent uploads to the OMS.
Two new pools were created, EMAgentUploads and EMConsoles.
Each pool has the three OMS nodes (the three active ones; however,
you could add a node that is still being set up and keep it as
"forced down" in Big IP so it won't be monitored). The difference
between the pools is at the port level. The pool EMAgentUploads
is using port 4889 for Agent uploads, and the pool EMConsoles is
using port 7777 for console access (7777 is the default port for
Oracle Web Cache).
At the pool level, Big IP also allows
you to define the persistence (stickiness), that is, whether
subsequent service requests from the same client should be routed
to the same pool member or not. While
Grid Console logons do not require stickiness (we do not care if
the console uses a different OMS each time the DBA connects), it
was decided that agent uploads could benefit from this
stickiness. The pools were modified accordingly and "simple
persistence" was set up for the agent uploads pool, but none for
the console logons pool.
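To make the idea of persistence concrete, the following Python sketch models a pool that remembers which member served a given client. It assumes that persistence is keyed on the client's source IP address and expires after an idle timeout; the StickyPool class, the 180-second timeout and the client addresses are all hypothetical values chosen for illustration.

```python
class StickyPool:
    """Toy model of a pool with source-address persistence."""

    def __init__(self, members, timeout_seconds):
        self.members = list(members)
        self.timeout = timeout_seconds
        self._next = 0
        self._table = {}  # client_ip -> (member, time of last request)

    def _round_robin(self):
        member = self.members[self._next]
        self._next = (self._next + 1) % len(self.members)
        return member

    def route(self, client_ip, now):
        """Return the member for this request; `now` is a time in seconds."""
        entry = self._table.get(client_ip)
        if entry is not None and now - entry[1] <= self.timeout:
            member = entry[0]             # still sticky: reuse the same OMS
        else:
            member = self._round_robin()  # new or expired: choose afresh
        self._table[client_ip] = (member, now)
        return member


# Agent uploads pool with persistence (hypothetical timeout and agent IPs).
uploads = StickyPool(["GridMgt001", "GridMgt002", "GridMgt003"], 180)
first = uploads.route("10.0.0.5", now=0)
again = uploads.route("10.0.0.5", now=60)    # same agent, within the timeout
other = uploads.route("10.0.0.6", now=61)    # a different agent
print(first == again, other)
```

A console pool without persistence would simply skip the lookup table and choose a member for every request.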
Two new Virtual OMS servers
were created, the first using port 4889 for agent uploads using
the EMAgentUploads pool, and the second using port 7777 for the
Web Cache EM Console using the EMConsoles pool. Both virtual
servers are using the same reserved IP address (but the ports
are different).
Big IP Monitors that continuously inspect the status of pool
members can also be set up. One such monitor, EMMon, was set up
using the send string of "GET /em/upload" and the receive rule
of "Http XML File receiver", as per the Enterprise
Manager Advanced Configuration Guide.
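The same check can be approximated outside the load balancer, for example when testing a new management server before adding it to a pool. The Python sketch below is an assumption-laden illustration: it uses plain HTTP on port 4889 (the agent upload port mentioned earlier), and the function names probe_oms and response_matches are hypothetical.

```python
import http.client

SEND_PATH = "/em/upload"                   # from the monitor's send string
RECEIVE_RULE = "Http XML File receiver"    # from the monitor's receive rule


def response_matches(body):
    """True if the response body contains the expected receive string."""
    return RECEIVE_RULE in body


def probe_oms(host, port=4889, timeout=5.0):
    """Send GET /em/upload to one management server and check the reply.

    Returns False on any connection or protocol error, which is how a
    monitor would treat an unreachable pool member.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", SEND_PATH)
        response = conn.getresponse()
        body = response.read().decode("utf-8", errors="replace")
        return response_matches(body)
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

For example, probe_oms("GridMgt001.in.mycompany.com") would report whether that server answers the upload URL in the expected way; the host name here simply follows the naming used in this article.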
Now, when the
corporate network alias "gridcentral.in.mycompany.com" is
switched to point to the virtual OMS server GridMgt100, the Big
IP load balancer starts being used by production.
A
point to note is that the initial changes, seen as successful at
the Big IP management console, were not effective at the URL
level (the URLs didn't work) until the Big IP was failed over to
its standby and back again. Any configuration changes performed
on the active load balancer should be propagated to the standby
load balancer. This is done in the Big-IP configuration utility:
go to Redundant Properties and click on Synchronize
Configuration. This makes the standby balancer configuration
the same as the active, including all pools, virtual servers,
and rules, so the standby will be ready to take over the load
balancing in the event of a failover.
Another notable
point is that when changing the admin password, because the
admin user is configured as the configsync user, you must change
the password to match on the peer controller in order for
configsync to work.
It is also possible to manually fail
over. Before any failover to the standby Big IP, it is
recommended to mirror all connections. However, be aware that
this setting has a CPU performance hit. This is selected under
the properties of the virtual server (Mirror Connections).
It was
noted that a management server had been installed on the Grid
Control Repository server during the initial install. Since the
management server function has been separated from the
repository function in this architecture, it is not recommended
to use the extra management server that has been installed on
the repository server. Simply dedicate that server to the
repository. For this purpose, only the three stand-alone
management servers were placed in the Big IP load balancer
pools.
The extra management server is a Java process
that runs on the repository server and takes up memory and
processing power, so it may be a good idea to use opmnctl on
this server and shut down the management server (OC4J_EM). Or, if
Unix reboot scripts are being written that start up the OMS,
Agent, and Database on the servers whenever there is a reboot,
simply leave out starting the OMS in the case of the repository
server. Just start the Listener, the Database, and then the
Agent. On the other management servers, start the OMS and the
Agent.
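The startup order per server role can be captured in a small lookup, sketched here in Python. The role names "repository" and "management" and the function startup_plan are hypothetical; a real reboot script would map each component name to the appropriate Oracle command for that site.

```python
# Components to start, in order, for each server role described above.
STARTUP_ORDER = {
    "repository": ["listener", "database", "agent"],  # OMS deliberately left out
    "management": ["oms", "agent"],
}


def startup_plan(role):
    """Return the ordered list of components to start for a server role."""
    try:
        return list(STARTUP_ORDER[role])
    except KeyError:
        raise ValueError(f"unknown server role: {role!r}") from None


for role in ("repository", "management"):
    print(role, "->", ", ".join(startup_plan(role)))
```

Keeping the order in one table makes it obvious that the repository server never starts the OMS, which is the design choice argued for in this section.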