Journey into Partitioning Methods
Arup Nanda
Partitioning, the best thing that happened to Oracle since
sliced bread — perhaps that trite analogy has a grain of truth
in this case. Sliced bread gave mankind the ability to tackle
the monolithic loaf by breaking it into bite-sized pieces, not
unlike what partitioning does for our Oracle tables. The slices
hold patties, tomatoes, cheese, and all the relishes you may
want to fit in — analogous to data in the table. However, a
sandwich made from the slices is still a logical entity — just
like a table. If a slice gets soggy, all you need to do is to
replace it with a fresh one — sounds familiar to an operation
called partition exchange, doesn't it? It's therefore no wonder
that partitioning was embraced so quickly by the DBA community.
Implementation, however, was not so swift; there were doubts
about Cost Based Optimizer (CBO), and since partitioning
required it, most users shunned the new technology. With the
gradual improvement and widespread acceptance of CBO,
partitioning has come to adorn not just data warehouses, but
also OLTP databases.
Previously in DBAZine, I have explored some pitfalls and
potential problems that you should be aware of while designing
and maintaining a partitioning scheme in my articles, "Perils
and Pitfalls in Partitioning,"
Part
1 and
Part
2. After you have decided to move to partitioning, the next
logical question is how to convert the existing tables.
There are various methods to achieve the results, from the plain
and simple export/import to the sophisticated online
redefinition in Oracle9i. Each method is unique in some or other
way — making it perfect for your specific situation. This
article explains them all and their relative merits and
drawbacks so that you can decide on their suitability at your
site.
Apart from the already documented and known techniques to
convert existing tables, this article also explains another
means developed by the author, which is undocumented. In case
the table has to be available for changes during the conversion
process, the article describes a way to allow that.
Example Setup
For the sake of illustration, let's assume the existing
table, named MYTAB, as follows:
COL1 NUMBER(10)
COL2 VARCHAR2(200)
COL3 NUMBER(10)
COL4 DATE
The table has an index IN_MYTAB on COL2 and a
check constraint CK_MYTAB on COL4. We need to
convert this table to a range partitioned one on column COL1.
Each partition of the table should be in a different tablespace.
The index should be LOCAL with tablespaces defined
separately.
Deciding Factors
Before starting with any particular conversion approach,
think about the exact criteria that will temper your decision.
The criteria that you have to consider are:
- Availability: Does the table have
to be available while being converted? Or, can it be
easily shut off from users for the duration? Can the
table be available for read-only operation while
being converted? If there can be shut-off time for
the table, how long is that? The degree of
availability needs affects the choice of a proper
method.
- Quiet Time: Some databases are
equally loaded all over the 24-hour period, 7 days a
week. However, most databases follow a pattern in
which a specific spread of time — e.g., Sunday
nights on a non-month-end date — may be a period of
little activity. (This does not mean no activity,
just lessened load on the database.) Does your
database have such a window of quiet time? If so,
for how long and how often?
- Space: Do you have extra space in
the database to spare? If so, how much and where? In
the case of a typical Storage Area Network (SAN)
system with virtualization in place, you may have a
huge filesystem with plenty of room to spare. In
this case, location does not make a difference; but
in for a system in which the filesystems are
carefully defined for tables keeping in mind their
access patterns, the location where the free space
is available makes a difference. You also have to
consider whether the space available can be added to
the database while taking away a similar chunk from
the database elsewhere. And, is this temporary space
available on some temporary filesystem where a
database file cannot be created (e.g. '/tmp') but
sufficient space is available there for doing some
other work?
- Redo Generation: Some operations
are designed to minimize redo. Large redo
information generation means a larger number of
archived log files, which may overwhelm your tape
backup system. Of course, if your database runs in
NOARCHIVELOG mode, this lessens the problem,
but your production system is most likely in
ARCHIVELOG mode.
- Rollback Segment: How big are
rollback segments in your databases? This is the
second most important factor in the selection of a
conversion method. Some techniques rely on reading
the source tables. Due to the read consistency
requirement of the Oracle database engine, the
database must provide the image of the block that
was present when the query started. This past image
is constructed from the data stored in rollback
segment. If the query is long, the rollback segment
may eventually run out of extents to store the past
image data. When the long running query tries to get
the data from the segment, it faces an error called
ORA-1555 Snapshot Too Old. Even if the RBSs are
huge, there is a good chance that the query will
face the ORA-1555 error if the run time is too long.
- Special Datatypes: Some datatypes
such as LONGs in your table may make some conversion
approaches inapplicable. If you have such columns in
your table, you need to rethink your strategy.
Bear in mind these important factors while evaluating the
following methods for your particular situation.
Method 1: Direct Creation
In this method, you would simply create a table as
partitioned for the first time by selecting from the target
table. For example, you would issue
CREATE TABLE MYTAB_NEW
PARTITION BY RANGE (COL1) (
PARTITION P1 VALUES LESS THAN (101) TABLESPACE TS_P1,
PARTITION P2 VALUES LESS THAN (201) TABLESPACE TS_P2,
PARTITION P3 VALUES LESS THAN (301) TABLESPACE TS_P3,
PARTITION PM VALUES LESS THAN (MAXVALUE) TABLESPACE TS_PM
)
NOLOGGING
AS
SELECT * FROM MYTAB;
This creates a new table named MYTAB_NEW as a partitioned
table and with all the data from the table MYDATA. The use of
NOLOGGING at the end makes sure the data is created with minimum
redo. If necessary, you can also use the PARALLEL clause to
create this table in parallel.
After the table is created, drop the original table and
rename this new table to the original table name.
DROP TABLE MYTAB;
RENAME MYTAB_NEW TO MYTAB;
Then you can create the indexes and constraints on this table
to match what existed earlier.
Advantages
- The operation is quite simple.
- Use of NOLOGGING ensures minimal redo
generation.
- Parallel Query can be used to speed up
processing.
Disadvantages
- The space requirement is significantly high.
There should be enough space for both the source and
target tables at any given point. This could be a
serious problem if your database is strained for
space. Since the probable reason you are trying to
make a table partitioned is that it is large, this
problem can be quite challenging.
- Space is required in a filesystem in which the
database files can be created, not just in any
filesystem. For instance, you may have a large /tmp
filesystem, but that can't be used in this
conversion method. In the case of Oracle Parallel
Server or Real Application Cluster, the space should
be available in the shared storage area, not in the
local area.
- There is a need for rollback segments. While
creating the table as select from another table,
Oracle uses the snapshot of the table at the
beginning of the query. This creates considerable
load on the rollback segment space. If you are
dealing with a huge table, the rollback segment
space may not be sufficient; worse, in an active
system, the create table may produce an ORA-1555
Snapshot Too Old error.
- The constraints, indexes, and triggers must be
created separately after the new table is created.
Indexes and constraints, like primary keys that also
create indexes, may require a long time to be
created. This adds to the overall elapsed time. The
new table must be analyzed after being created, too.
High space, time, and resource consumption makes the Direct
Creation option less attractive if you are dealing with medium
to big databases. However, for smaller databases, the simplicity
of the option makes it quite attractive.
Method 2: Direct Load Insert
In this method, the partitioned table is created empty. This
can be easily done by running the listing provided for Method 1,
but adding the line, WHERE 1 = 2 to the very end of the
script after SELECT * FROM MYTAB line. By adding this
line, the table is created but the data is not loaded, since the
condition 1 = 2 is never satisfied.
After the table is created, the data is loaded from the
source table using
INSERT /*+ APPEND */ INTO MYTAB_NEW
SELECT * FROM MYTAB;
This use of hint APPEND above, called direct load insert,
loads the table data after the high-water mark of the target
table, and therefore generates minimal redo information. Redo
Log generation is comparable to the first method, but
considerably less time is required since the data loading
patterns are different. In a Direct Load operation, the rows are
not inserted as regular inserts, unlike an INSERT statement or
IMPORT. Rather, the data blocks are formatted and placed
directly over the high-water mark. Since this operation does not
require that an insert statement be processed, the SQL buffer is
completely bypassed, for significant savings. However, since the
source table is still read, the problem of large rollback
segment usage remains, and that could cause ORA-1555 errors.
After the table is created, the original table is dropped and
the new table is renamed as the original table. After that, all
the indexes, constraints, and so on, can be created in the newly
created table.
Advantages
- Loading the target is quite fast because the
APPEND option is used (this loads data above the
high-water mark).
- It is a relatively simple operation.
- Parallel query servers can be used, which will
enhance the speed of the operation.
Disadvantages
- Free space equal to at least the total size of
the table must be available. In medium to large
databases, this may mean that a substantial amount
of size is required, which may be difficult to
obtain, even temporarily.
- As in the case of the previously discussed
methods, the space must be available in a filesystem
where database files can be created.
- Since read consistency on the source table is
still necessary, there is a high rollback segment
requirement, which may lead to ORA-1555 errors.
- The total conversion process time required is
still high, when the index and constraint creation
tasks are added. Local index partitions can't be
exploited.
The relative simplicity of this approach makes it an
attractive option, but the high rollback segment requirement
makes it unsuitable for large databases. If your database is
medium size, your storage array has some room to spare and
needed downtime can be obtained, so this method is strongly
recommended.
Method 3: Export/Import
The previous methods are efficient, but require space inside
the database. If space is available in some filesystem like
/tmp, which cannot be added to the database, export the data
from the source table MYTAB to a dump file in the filesystem
using the export utility. (Using the option DIRECT=Y and
setting RECORDSIZE and BUFFER parameters to proper
values will speed up the process.) Then drop the table MYTAB,
create a partitioned table MYTAB in its place, and import the
data back.
Another variation of the process is exporting parts of the
table that correspond to the data in the partitions using a
where clause in the export utility, then compressing the dump
file to make room for the other parts. While importing, a single
dump file can be uncompressed, imported to a single partition,
and then dropped.
Advantages
- The biggest advantage of this method is that
significantly less space is required. Export
typically consumes less space than the source table.
But the need for space in a regular filesystem like
/tmp, and not in the database, makes the process
attractive.
- If needed, the export dump file can be created
on a tape, an optical drive, a NAS, or even in a
separate system, using a UNIX pipe. This flexibility
may open up several other avenues for space.
- If a table contains LONG columns, this is one of
the only two possible options.
Disadvantages
- Imports cannot be tweaked to minimize
logging, unlike the direct loader option. Therefore,
the redo generation and consequently, the archived
log generation, are quite high.
- The rollback segment requirement is also quite
high. This can be lessened somehow by importing
partition by partition, and by committing with
buffer sizes in the import command line parameter.
- This makes the table unavailable for the whole
operation.
- Since the table is dropped and recreated, all
other operations, such as creation of indexes, have
to be performed later, and this may consume time.
- The import utility does not take advantage of
parallel query servers.
Method 4: SQL*Loader
Instead of using the direct load insert, you could use
SQL*Loader with the DIRECT option to load the data, too.
To do this, create a flat file from the source table using
SQL*Plus and load the data into the table. With the DIRECT
option, the table data is loaded after the high-water mark, as
in the case of a direct load insert; this generates minimal redo
information. In addition, since the load is from a static file,
there is no rollback requirement to hold the "before" image copy
of the source data. This dramatically reduces the rollback
segment use and eliminates the rollback segment overflow and
ORA-1555 errors.
In Oracle 9i, external tables can be used for such
operations. Besides the static file requirement, the advantages
and disadvantages to this option are identical to those for the
direct load insert method. However, since SQL*Loader must read
from a flat file, the extra step of creating a flat file is
necessary, and that might make the process slightly more
complicated. Oracle does not provide a utility to dump the
contents into a flat file; you must write your own (however,
there are several scripts for this available on the Internet, so
this does not pose a real problem). The use of a static file as
a source mandates that the tables remain unchanged until the
conversion process is complete, affecting availability. On the
positive side, though, the loading can be done partition by
partition, so the space requirement may be somewhat less than
that required for other methods.
Method 5: Individual Partition Creation
This is a more complex but controllable operation. In this
method, you would create small tables resembling the partitions
of the target tablespace. Using the previously cited example,
issue the following statements:
CREATE TABLE MYTAB_P1
TABLESPACE TS_P1
NOLOGGING
AS
SELECT *
FROM MYTAB
WHERE COL1 < 101;
This creates a table with data identical to what would be
present in the partition P1 of the target table. Next, create
index and constraints of the table:
CREATE INDEX IN_MYTAB_P1
ON MYTAB_P1 (COL2)
TABLESPACE INDX_P1
NOLOGGING;
If necessary, use parallelism to speed up the process. Next,
analyze the table to make sure the data has proper
partition-level statistics. Creating the indexes and generating
stats at this stage are not mandatory; that can be put off until
later.
Create similar tables corresponding partitions, naming then
MYTAB_P2, MYTAB_P3, and so on, and create the indexes and
constraints accordingly. When all these tables are ready, create
a target table using the table creation script. Then, exchange
the partitions of the target table with the corresponding source
tables. For instance, you would issue the following:
ALTER TABLE MYTAB
EXCHANGE PARTITION P1
WITH TABLE MYTAB_P1
INCLUDING INDEX
WITHOUT VALIDATION;
This operation replaces the empty partition P1 of the table
with the standalone table MYTAB_P1. The standalone index
IN_MYTAB_P1 also replaces the corresponding partition of the
local index IN_MYTAB. And the stats gathered on the standalone
table are transferred to the partition of the target table. In
reality, the partition and the table data are not exchanged;
pointers inside the data dictionary are simply updated to point
to the appropriate objects. Since no data is actually moved, the
redo generation is negligible.
Any global indexes can now be formed on the newly produced
table. The constraints of the old table are now placed on the
new table with the NOVALIDATE option. (If you have not
yet created the local indexes, you can do so at this point.) You
can produce the indexes with the NOLOGGING option and by
using the parallel operation to decrease elapsed time.
Advantages
- Creation of individual partitions is separately
controlled, and sometimes offers significant
advantages. For example, in a sales table, the rows
related to past sales may be static, so they can be
converted to part tables at any time. The more
current and changing data could then be transferred
at a quieter time, ensuring minimal downtime for
operations. With the other methods cited, the entire
table is converted as a whole; finding a longer
quiet and down time may be impossible.
- Since you can form part tables and indexes in
parallel, the process is actually a little faster if
sufficient free space and CPU power is available.
- If space is a problem, the partitions can be
done one by one, increasing the elapsed time, but
minimizing the need for space.
Disadvantages
- This option requires space. If the entire table
is converted as a whole, the space requirement is
double the size of the table, which can be
substantial. Even if the partitions are done one
after the other, the size necessity is still high —
double the size of the biggest partition size.
- There is also the possibility of ORA-1555 errors
in case of large tables.
Apart from the already documented and known partitioning
methods, this article also explains another method developed by
the author, which is undocumented. In case the table has to be
available for changes during the conversion process, the article
describes a way to facilitate that.
Previously, we explored five ways to convert a table into a
partitioned table. In this article, we will continue with the
sixth one.
Method 6: Online Redefinition
Oracle 9i has introduced a new package, dbms_redefinition, to
reorganize tables online. A table can be changed without taking
away access from the users. Under the hood, dbms_redefinition
uses the Materialized View technology to convert the tables.
This feature can be exploited to convert a non-partitioned table
to a partitioned one. There are several restrictions on how this
package can be used, however; for example, the table should not
have a snapshot log (or materialized view log) defined on it,
should not have columns of datatype BFILE, LONG, and so on.
It is much easier to convert a table using online
redefinition when the table has a primary key. If there is no
primary key, some non-default parameters must be passed to
complete the definition process. To better explain the general
concepts, I'll assume that the table has no primary key.
First, check to see whether the table can be reorganized
using dbms_redefinition by issuing the following:
begin
dbms_redefinition.can_redef_table (
'SCOTT',
'MYTAB',
dbms_redefinition.cons_use_rowid);
end;
If the table has a primary key, use the following syntax
instead.
begin
dbms_redefinition.can_redef_table (
'SCOTT',
'MYTAB');
end;
Note the use of the fixed value
dbms_redefinition.cons_use_rowid, which instructs the package to
use only the rowid as the mechanism to test. If the table can be
reorganized with the package, the above command does not return
anything. If there is a reason why dbms_redefinition can't be
used to reorganize the table, the package throws out a
self-explanatory error.
Assuming that the redefinition can be done, the next step is
to create the target table as follows:
CREATE TABLE MYTAB_NEW
PARTITION BY RANGE (COL1)
(
PARTITION P1 VALUES LESS THAN (101) TABLESPACE TS_P1,
PARTITION P2 VALUES LESS THAN (201) TABLESPACE TS_P2,
PARTITION P3 VALUES LESS THAN (301) TSBALESPACE TS_P3,
PARTITION PM VALUES LESS THAN (MAXVALUE) TABLESPACE TS_PM
)
AS
SELECT * FROM MYTAB
WHERE 1=2;
This creates an empty table in the form we want it to be.
Next, it's time to start the redefinition process. Issue the
following:
begin
dbms_redefinition.start_redef_table (
'SCOTT',
'MYTAB',
'MYTAB_NEW',
null,
dbms_redefinition.cons_use_rowid
);
end;
Note the use of NULL as the fourth parameter. This parameter
is for mapping columns of the original table to the new table,
if they are different. Since we are not changing the column
names, this parameter is left to null. The last parameter
instructs the package that the table does not have a primary key
and the rowid should be used to redefine. If there is a primary
key, you can leave this parameter as the default value
dbms_redefinition.cons_use_pk.
Behind the scenes, Oracle creates several objects to
facilitate the conversion — for instance, a materialized view
log is created on the table MYTAB, a trigger is created to
populate the log, a materialized view is created using the
prebuilt table MYTAB_NEW, and so on. (Note: Since a materialized
view log is created on the table MYTAB, the table should not
have a view log before the process starts.)
While Oracle creates these objects, the source table is fully
accessible and the changes are captured in the MV log. From time
to time, you should use the following command to synchronize the
table and materialized view.
begin
dbms_redefinition.sync_interim_table (
'SCOTT',
'MYTAB',
'MYTAB_NEW'
);
end;
When the sync-up operations are complete and you have reached
a relatively quite time, finish off the synchronization process.
begin
dbms_redefinition.finish_redef_table (
'SCOTT',
'MYTAB',
'MYTAB_NEW'
);
end;
This operation does the final sync-up of the materialized
view with the main table, drops the MV logs, etc. and renames
the main table to the target table. While doing the final step,
it holds a lock on the table momentarily, restricting access.
That is why you should do it at a relatively quiet time. Since
the finish process also does a sync up, that step can be
shortened if the sync ups are done prior to calling the
procedure. That is the reason for calling the
sync_interim_table() procedure several times. The interim table
MYTAB_NEW can then be dropped.
Advantages of Online Redefinition
The immediate advantage of this option is clear: — it can be
done online. Apart from the brief moment when the final
synchronization is done, the table MYTAB is fully accessible.
Online redefinition sounds great, but should you
automatically use it if you have Oracle9i? Not necessarily.
Disadvantages of Online Redefinition
The biggest disadvantage is the space requirement. You'll
need space equivalent to twice the size of the entire table,
plus space for the MV logs, and so on. So, space requirements
for online redefinition might be a constraint in some cases.
Another restriction placed on the table is that it should not
have an MV log. This can render many tables ineligible for
online redefinition, particularly in DW and replication
environments, where there is the greatest need to convert to a
partitioned table.
If you are dealing with a large table, the MV log can be
substantially large, affecting performance. And since these are
created in your default tablespace, which may also contain the
parent table, sometimes there may be contention at the disk
level for access.
Typically, with large snapshots, the efficient method for
conversion is to create a table first using direct load insert,
then to use that table to create the snapshot with the "prebuilt
table" clause. Since the snapshot is created automatically by
Oracle and the user has no control over it, the initial
population cannot take advantage of such load reducing features.
Therefore, the initial load could severely strain resources like
rollback segments, temporary tablespaces, and so on. And the
rollback segment usage may trigger ORA-1555 errors.
To summarize, if you use the online redefinition method of
conversion
- you can't take advantage of the capability to
convert the table partition by partition, and
- you must have Oracle9i.
So, this method should only be used when the database
absolutely cannot be inaccessible at all. However, if you choose
this option, make sure that you have plenty of disk space and
rollback segment space, and try to do it during a relatively
quiet time.
Method 7: Split-Split
This method is not documented by Oracle, but was devised by
the author when all of the other methods were not viable because
of the constraints of the system. All of the previously
described methods require enough space to be available to hold
at least the biggest partition in the system. While that
generally might not be a problem, under certain circumstances,
the space may not be available to achieve the objective.
However, the split-split method uses no additional space.
First, create the partitioned table with only one partition,
the maxvalue partition. Next, split this partition at the
lowest boundary point and repeat the process until all
the partitions are created. Let's see how this is done using the
same example we cited previously.
First, create a table called MYTAB_NEW as follows:
create table MYPART_NEW
(
col1 number,
col2 varchar2(10),
col3 char(2)
)
NOLOGGING
partition by range (col1)
(
partition pm values less than (maxvalue)
tablespace TS_PM_DATA
);
Note that, even though our target table has several
partitions, only the maximum value partition has been defined in
this table. Although the table is defined, it does not contain
any significant space at this time. The table is empty and
contains only the minimum number of extents as defined for the
tablespace TS_PM_DATA.
Next, create the indexes and constraints as seen in the table
MYTAB:
CREATE INDEX IN_PART
ON PART (COL2) LOCAL
NOLOGGING;
ALTER TABLE PART
ADD CONSTRAINT CK_PART_01
CHECK (COL3 IS NOT NULL);
Next, we will exchange the table MYTAB with this partition
ALTER TABLE PART EXCHANGE PARTITION PM
WITH TABLE NOPART INCLUDING INDEXES;
This statement swaps the table's partition PM with the table
MYTAB. The contents of MYTAB are now in the PM partition, and
the MYTAB table is empty. Since this operation merely changes
the data dictionary and doesn't physically move data, it doesn't
generate redo and is extremely quick. The clause INCLUDING
INDEXES swaps the indexes too,(i.e., the partition of the local
index IN_MYTAB now contains the index information).
Next, split this single partition, starting with the lowest
boundary (partition P1).
ALTER TABLE PART SPLIT PARTITION PM AT (101)
INTO (PARTITION P1, PARTITION PM);
This command creates a new partition called P1, and moves the
rows with a COL1 value of less than 101 into this from PM. Since
the table is defined as NOLOGGING, this doesn't generate much
redo. After this operation, the partition PM contains data for
the partitions other than P1. You should repeat this splitting
process, but now with P2 in mind:
ALTER TABLE PART SPLIT PARTITION PM AT (201)
INTO (PARTITION P2, PARTITION PM);
This should be repeated until the partition PM is split up to
P4, to the last partition but one. Since the index is defined as
LOCAL, it will have been split, too, along with the table
partition splitting command.
At the completion of the process, you will have a table
called MYTAB_NEW with all the data from MYTAB and with the same
indexes and constraints. You should now drop the table MYTAB and
rename the table MYTAB_NEW to MYTAB so that applications will be
able to access this table. Also, you should restore the
privileges associated with NOPART to PART.
However, renaming table does not rename the constraints or
indexes. Although applications may not be affected by the new
name of the index and constraints, it may be necessary to change
the names to avoid confusion. The names are changed by the
following statements:
ALTER INDEX IN_PART RENAME TO IN_NOPART;
ALTER TABLE NOPART RENAME CONSTRAINT
CK_PART_01 TO CK_NOPART_01;
The latter command is available in Oracle9i only. If you are
using V8i and cannot do this, you could drop the constraint from
the MYTAB table first, then re-create the constraint with
NOVALIDATE option.
A slight variation of this method is to create the index and
constraints at the very end of the process. This makes the
splitting process quite fast, since there is no splitting of the
indexes.
Advantages
By eliminating the need to have two copies of the data
simultaneously, no additional space for the tables is consumed;
the space is carved out from the original table. In
space-starved environments, this could be a plus.
Also, when the data is moved from one partition to the other,
a little but not-so-insignificant side effect is that the data
is defragmented.
Finally, if the tablespaces for all the partitions are the
same as that of the original table, then the splitting does not
physically move data between files. This significantly reduces
redo generation and load on the I/O subsystem.
Disadvantages
The partition split operation physically moves the data from
one partition to the other. This is extremely time and resource
consuming as compared to direct load insert method. Redo
generation is minimized, not suppressed completely and the size
of the redo is significantly larger than in case of the direct
load insert method.
It needs the table to be inaccessible for the entire duration
of the operation, even for read only tasks.
Method 8: Copy
This approach uses the SQL*Plus command COPY to copy rows
from one table to the other. Usually, COPY is used to copy rows
across databases over db links, but this option can be used in
this case as well.
First, create the target table MYTAB_NEW with the full
partitioning structure, then copy the rows from the source table
MYTAB.
A word of caution here: the command COPY has few friends at
Oracle Corporation. There's been no enhancements since Oracle8
and may be deprecated in future releases. At least in 9i, this
command is available.
COPY FROM SCOTT/TIGER@MYDB -
INSERT MYTAB_NEW -
USING SELECT * FROM MYTAB
At this time, several important points are worth noting.
First, COPY is a command in SQL*Plus, like CONNECT, SHOW, and so
on; it's not a SQL command like SELECT. Therefore, you must use
it inside SQL*Plus only and not inside a stored procedure.
Second, note the use of a hyphen at the end of each
line. Since COPY is a SQL*Plus command, the command is expected
to end after a carriage return. However, since our command is
actually in three lines, we have indicated the continuation of
the line using the hyphen.
So, what does COPY provide that the other methods don't? It
is the support for LONG datatype. If your tables contain LONG
datatype, you can't use Create Table As Select (CTAS) or
Direct Load Insert to copy the table rows. COPY is the
only option in addition to Export/Import.
By default, COPY commits only after the end of the process.
However, this can be controlled using variables inside the
SQL*Plus environment:
- ARRAYSIZE : This number determines how many rows
will be fetched in a single attempt.
- COPYCOMMIT: This number determines after how
many batches (each of ARRAYSIZE number of rows) the
commit should occur.
These variables are set by the SET command in SQL*Plus. So,
if ARRAYSIZE is 1000 and COPYCOMMIT is 100, a commit occurs at
the end of every 100,000 records. Unfortunately, the transaction
cannot be made to produce less redo data or suppress generation
of undo data, unlike NOLOGGING or Direct Load Insert.
Nevertheless, by making a judicious decision on these two
parameters, a commit frequency can be obtained in such a way
that the rollback segments are not filled up, and the excessive
commits don't happen either.
After the table rows are copied, you can drop the original
table, rename the table MYTAB_NEW to MYTAB, and then recreate
the indexes and constraints.
Advantages
The only key advantage for the COPY method is its support for
LONG. Obviously, if your table does not have this datatype, this
is no longer an advantage.
Tables can be converted partition by partition. Temporary
part tables can be created and populated as per the code below,
and the space requirement will be less than a full table size.
COPY FROM SCOTT/TIGER@MYDB -
INSERT MYTAB_NEW -
USING SELECT * FROM MYTAB -
WHERE COL2 BETWEEN 101 AND 200
Note the use of a predicate after the USING clause, for
filtering the rows.
Disadvantages
The only supported datatypes are DATE, NUMBER, CHAR,
VARCHAR2, and LONG. If you have LOBs or other Oracle object
types, you can't use the COPY method.
Other disadvantages include the fact that
- the redo generation can't be suppressed
- the rollback segment requirement is huge, so
ORA-1555 errors are possible
- the space requirement is twice the table size or
at least twice that of the biggest partition
These disadvantages may make the COPY method seem worthless,
but in key cases in which some LONG columns are present, this
option will be useful.
Conversion in an Active Database
All these methods described above, with the exception of the
Oracle9i Online Redefinition, rely on stopping all transactions
during the conversion to a partitioned table. If the
transactions are not stopped, the changes that occur from start
to finish of the process are lost when the target table is
renamed to the original table. Either the tablespaces should be
made read-only, or the database should be brought down and
brought up in read-only mode before starting the conversion
process. This also alleviates the problem of running out of
Rollback Segment space as well as the likelihood of ORA-1555
errors.
However, if it is not possible to stop the updates to this
table during conversion, the setup needs to be modified. First,
identify the primary key columns of the table. If the table does
not have any primary key, it is advisable to establish some
surrogate keys to identify a row uniquely. Make sure that the
columns identified are not nullable. In the following example,
assume the column is COL1, then create a table to hold changes
to the table:
CREATE TABLE MYTAB_CHANGES
(
CHANGE_TYPECHAR(1) NOT NULL,
COL1_VALUE NUMBER(10) NOT NULL,
CHANGE_DATEDATE NOT NULL
);
Define a trigger on the table MYTAB as follows to capture
changes to the table MYTAB:
CREATE OR REPLACE TRIGGER TR_MYTAB_CHANGES
AFTER INSERT, DELETE or UPDATE ON MYTAB
FOR EACH ROW
BEGIN
IF INSERTING THEN
INSERT INTO MYTAB_CHANGES
VALUES ('I',:NEW.COL1,SYSDATE);
ELSIF DELETING THEN
INSERT INTO MYTAB_CHANGES
VALUES ('D',:NEW.COL1,SYSDATE);
ELSIF UPDATING THEN
INSERT INTO MYTAB_CHANGES
VALUES ('U',:NEW.COL1,SYSDATE);
ELSE
RAISE_APPLICATION_ERROR (-20001,'Wrong Operation');
END IF;
END;
This trigger places the changes (new records, deleted
records, and updates) to the changes table. After the operation
of creating the target table is over, further changes are to be
disallowed, and the target table is updated from the join of
this changes table and the main source table. (Note: the code
segment given below is for demonstration purpose only; you
should provide an extensive error handling facility before
deploying in production.)
BEGIN
FOR CHGREC IN (
SELECT * FROM MYTAB_CHANGES
ORDER BY CHANGE_DATE) LOOP
IF (CHGREC.CHANGE_TYPE = 'I') THEN
BEGIN
INSERT INTO MYTAB_NEW
SELECT * FROM MYTAB
WHERE COL1 = CHGREC.COL1_VALUE;
EXCEPTION
WHEN DUP_VALUE_ON_IND THEN
NULL;
END;
IF (CHGREC.CHANGE_TYPE = 'D') THEN
BEGIN
DELETE MYTAB_NEW
WHERE COL1 = CHGREC.COL1_VALUE;
IF (SQL%NOTFOUND) THEN
NULL;
END IF;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
END;
IF (CHGREC.CHANGE_TYPE = 'I') THEN
BEGIN
DELETE MYTAB_NEW
WHERE COL1 = CHGREC.COL1_VALUE;
INSERT INTO MYTAB_NEW
SELECT * FROM MYTAB
WHERE COL1 = CHGREC.COL1_VALUE;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
WHEN OTHERS THEN
RAISE;
END;
END IF;
END IF;
END LOOP;
END;
This script will update the rows of MYTAB_NEW with the data
from the changes table. After this process is over, the data in
the table MYTAB_NEW will be identical to the data in the table
MYTAB.
Conclusion
Converting the table from partitioned to non-partitioned
structure is a process that consumes considerable time and
effort, and DBAs managing different sites have different
constraints to deal with when making this executing this
process.
|