|
Transportable Tablespaces
Transportable tablespaces are
now portable across platforms,
making data publication quicker
and easier. Plus, external table
downloads make the task of data
movement with transformation
simpler and faster.
How do
you move data from one database
to another? Of the several
methods, one in particular
stands out: transportable
tablespaces. In this approach,
you take a set of
self-contained, read-only
tablespaces, export only the
metadata, copy the datafiles of
those tablespaces at the OS
level to the target platform,
and import the metadata into the
data dictionary—a process known
as plugging.
OS file
copy is generally much faster
than other traditional means of
data movement such as
export/import or SQL*Loader.
However, in Oracle9i
Database and below, a
restriction limits its
usefulness to only a few cases
in which both the target and
source database run on the same
OS platform—you can't transport
tablespaces between Solaris and
HP-UX, for example.
In
Oracle Database 10g,
this restriction has
disappeared: you can now
transport tablespaces between
platforms as long as the OS byte
orders are identical...
How do
you know which operating systems
follow which byte order? Instead
of guessing or having to search
the internet, simply issue the
query:
SQL> select * from v$transportable_platform order by platform_id;
Suppose
you want to transport a
tablespace USERS from a host
machine SRC1, running Linux on
Intel Architecture to machine
TGT1, running Microsoft Windows.
Both the source and target
platforms are of little endian
type. The datafile for the
tablespace USERS is
users_01.dbf. You would follow
an approach similar to the
following.
- Make the tablespace READ
ONLY:
alter tablespace users read only;
- Export the tablespace.
From the OS prompt, issue:
exp tablespaces=users transport_tablespace=y file=exp_ts_users.dmp
The file exp_ts_users.dmp
contains only metadata—not
the contents of the
tablespace USERS—so it will
be very small.
- Copy the files
exp_ts_users.dmp and
users_01.dbf to the host
TGT1. If you were using FTP,
you would specify the binary
option.
- Plug the tablespace into
the database. From the OS
command prompt, you would
issue:
imp tablespaces=users transport_tablespace=y file=exp_ts_users.dmp datafiles='users_01.dbf'
After
Step 4, the target database will
have a tablespace named USERS
and the contents of the
tablespace will be available...
This
capability is particularly
useful in data warehouses where
smaller, subject-oriented data
marts are often populated from
the warehouse after a refresh.
With 10g, these data
marts can now be placed in
smaller, cheaper machines, such
as Intel boxes running Linux,
with the data warehouse server
on a larger enterprise-class
machine. In essence, with
transportable tablespaces, you
can now make better use of
various hardware and OS mixes.
Across Differing Endianness of
Platforms
If the
platforms are of different
endianness, how will you achieve
transferability? As I explained
earlier, the byte order of the
target machine, if different
than the source, will read the
data file incorrectly, making
the mere copying of the data
files impossible. But don't lose
heart; help is available from
the Oracle 10g RMAN
utility, which supports the
conversion of datafiles from one
byte order to another.
In the
above example, if the host SRC1
runs on Linux (little endian)
and the target host TGT1 runs on
HP-UX (big endian), you need to
introduce another step between
Steps 3 and 4 for conversion.
Using RMAN, you would convert
the datafile from Linux to HP-UX
format on the source machine
SRC1 (assuming you have made the
tablespace read only):
RMAN> convert tablespace users
2> to platform 'HP-UX (64-bit)'
3> format='/home/oracle/rman_bkups/%N_%f';
This step
produces a file in the standard
RMAN file format <tablespace_name>_<absolute_datafile_no>
in the directory /home/oracle/rman_bkups.
Note that the datafile for
tablespace USERS is not touched;
rather, a new file is created
for HP-UX. Now this file can be
copied over to the target
system, and the rest of the
steps are easy.
This
RMAN conversion command is quit
powerful. In the form given
above, it can create the
datafiles in sequence. For a
tablespace containing multiple
datafiles, you can instruct
conversion to run in parallel.
To do so, you would add a clause
to the above command:
parallelism = 4
which
creates four RMAN channels, with
each one working on a datafile.
However, a more useful approach
is to convert a large number of
tablespaces in one step, which
is where parallelism can really
help. Here we are converting two
tablespaces, USERS and MAINTS,
to HP-UX:
RMAN> convert tablespace users, maints
2> to platform 'HP-UX (64-bit)'
3> format='/home/oracle/rman_bkups/%N_%f'
4> parallelism = 5;
In the
above examples, the converted
file names are difficult to
decipher and tie to the original
files (for instance, file
users01.dbf becomes USERS_4).
Instead, you can use the other
format for naming data files.
This process is similar to that
for renaming data files in Data
Guard. You could use:
RMAN> convert tablespace users
2> to platform 'HP-UX (64-bit)'
3> db_file_name_convert '/usr/oradata/dw10/dw10','/home/oracle/rman_bkups'
4> ;
which preserves the file names after conversion. If you change to directory /home/oracle/rman_bkups, you will see the files users01.dbf and users02.dbf, corresponding to the original files in the same names.
In the
above cases, we converted the
files on the source platform.
However, you can do that on the
target platform as well. For
example, you can copy file
users01.dbf to host TGT1 running
HP-UX and then convert the file
to HP-UX format with:
RMAN> convert
2> datafile '/usr/oradata/dw10/dw10/users01.dbf'
3> format '/home/oracle/rman_bkups/%N_%f'
4> ;
This
approach will create a file in
the format specified in the
directory. But why would
you want to convert the
datafiles on the target
platform, exactly? One reason
could be shorter downtime, which
requires the tablespaces to be
READ ONLY state only for the
duration of the copy to the
target host. You could
triple-mirror the datafile, make
the tablespace read only, break
the third mirror, and
immediately make the tablespace
read/write. This third mirror
could then be mounted on the
target system and converted at
leisure. This arrangement
minimizes the duration for which
the tablespace must remain read
only.
Another
reason could be performance. The
OLTP database may be under a
constant load and using the RMAN
convert operation may strain the
system more than desired.
Instead, the conversion can be
offloaded to the data warehouse
server, where more CPUs are
usually available for parallel
operations.
Both
these problems have been
addressed in 10g. Now
you can unload a table to a
portable format quickly using
the external table creation
process. Step 1 above becomes
this simple SQL:
create directory dump_dir as '/home/oracle/dump_dir';
create table trans_dump
organization external
(
type oracle_datapump
default directory dump_dir
location ('trans_dump.dmp')
)
as
select * from trans
/
This
command creates a file named
trans_dump.dmp in the directory
/home/oracle/dump_dir. This file
is not exactly ASCII text; the
metadata is plain text but the
actual data is in raw format.
However, this file is portable
across operating systems,
similar to the export dump
file—but unlike export, the
download of the data is
extremely fast. You would copy
this file to the data warehouse
server and create the external
table in the same manner as
before, but this time
substituting this file as the
source.
So what
are the differences between
older data transfer mechanisms
and this one? There are several.
First, you can create a portable
file extremely quickly without
writing any complex SQL,
selecting columns of the table,
and so on. Second, you can use
this file as an input for the
external table, making it
possible to view the data as a
regular table and load that data
into other tables after data
manipulation. You can also
enhance the performance of the
data download to this external
table as shown below.
create table trans_dump
organization external
(
type oracle_datapump
default directory dump_dir
location ('trans_dump.dmp')
)
parallel 2
as
select * from trans
/
This
command creates the same file,
only in parallel. You should do
that to take advantage of
multiple host CPUs, if
available. In addition to going
parallel, you can also download
the table to multiple external
files as shown below.
create table trans_dump
organization external
(
type oracle_datapump
default directory dump_dir
location ('trans_dump_1.dmp','trans_dump_2.dmp')
)
parallel 4
as
select * from trans
/
This
command creates two files
trans_dump_1.dmp and
trans_dump_2.dmp, instead of
only one. This approach is
helpful in spreading files
across many physical devices or
controllers to reduce
I/O-related waits.
Conclusion
By
allowing tablespaces to be
transportable across platforms,
10g offers a powerful
solution for data warehouse data
movements. Coupled with External
Table download, this feature
bridges the gap between source
and target databases for data
publication—whether it's an
OLTP, data warehouse, or data
mart database—and allows you to
make appropriate platform
choices for particular types of
applications.
Furthermore, by making
transportable tablespaces
viable, 10g makes data
refreshes quicker and more
frequent so that analyzed data
is available to end users
sooner. This capability can also
be used to publish data via
offline media to different
databases, regardless of their
host systems. Using external
table downloads the utility to
move large quantities of data as
an ETL tool is finally available
to the end user. |