|
Get a good, fast look at your data
with materialized views.
Users at Acme Bank were complaining that
typical and repeated queries were taking too long to execute.
Because these queries are executed over and over by many users,
any improvement in response time is bound to help.
The DBAs at Acme Bank have tuned the most
commonly used queries, all the needed indexes are present, and
no further SQL tuning is going to make any difference to the
performance of these queries.
The Acme Bank DBAs' solution: Use
materialized views (MVs).
This article discusses how to plan for
MVs, how to set up and confirm different MV capabilities, how to
automatically generate the scripts to create MVs, how to make
query rewrite (QR) available, and how to make sure that QR gets
used.
From Queries to Views
The following is a common query at Acme
Bank:
SELECT acc_type, SUM(cleared_bal) totbal
FROM accounts
GROUP BY acc_type;
And the following is an MV,
mv_bal,
for this query:
CREATE OR REPLACE MATERIALIZED VIEW mv_bal
REFRESH ON DEMAND AS
SELECT acc_type, SUM(cleared_bal) totbal
FROM accounts
GROUP BY acc_type;
MVs are segments similar to tables, in
which the output of queries is stored in the database.
Now suppose a user wants to get the total
of all account balances for the account type 'C' and issues the
following query:
SELECT SUM(cleared_bal)
FROM accounts
WHERE acc_type = 'C';
Because the
mv_bal
MV already contains the totals by account type, the user could
have gotten this information directly from the MV, by issuing
the following:
SELECT totbal
FROM mv_bal
WHERE acc_type = 'C';
This query against the mv_bal MV would
have returned results much more quickly than the query against
the accounts
table. Running a query against the MV will be faster than
running the original query, because querying the MV does not
query the source tables.
But there is another difference between
querying the source tables and the MV. When a user executes a
query against the source tables, the results return the
current
data. Querying the MV, however, often returns the data
as of the time the MV was created.
And because the MV is not updated when the data in the source
tables is, it is bound to get out of sync.
To keep the data in sync, the MV is
refreshed from time to time, either manually or automatically.
There are two ways to refresh data in MVs. In one of them, the
MV is completely wiped clean and then repopulated with data from
the source tables—a process known as
complete refresh.
In some cases, however, when the source tables may have changed
very little, it is possible to refresh the MV only for changed
records on the source tables—a process known as
fast refresh.
To use fast refresh, however, you must have created the MV as
fast-refreshable. Because it updates only changed records, fast
refresh is faster than complete refresh. (See the
Oracle Database Data Warehousing
Guide for more information on
refreshing MVs.)
To make sure that users will query MVs at
Acme Bank, even if they don't know anything about the MVs, the
bank uses the QR feature. With QR, the database engine can
rewrite the original query from the user to use an MV,
automatically,
so that the user need not be aware of all the MVs in place. In
some cases, such as in an OLTP system, it may not be desirable
to query from an MV that may not be current with the data in the
source tables. QR can be disabled in such cases—either
databasewide, for a specific session, or just for specific MVs.
The DBA can enable QR for a session or
for the entire system, by setting the
QUERY_REWRITE_INTEGRITY
parameter to true. For any MV to be used in a QR, the MV must
have been built to include the
ENABLE QUERY REWRITE
clause. For example, the following creates the
acc_mgr_view
MV with the QR feature from the complex Acme Bank query shown in
Listing 1:
CREATE OR REPLACE MATERIALIZED VIEW
acc_mgr_view
ENABLE QUERY REWRITE AS
<the query shown in Listing 1>
/
Code Listing 1:
Original query
select acc_mgr_id,
acc_type_desc,
decode
(a.sub_acc_type,null,'?', sub_acc_type_desc)
sub_acc_type_desc,
sum(cleared_bal) tot_cleared_bal,
sum(uncleared_bal) tot_uncleared_bal,
avg(cleared_bal) avg_cleared_bal,
avg(uncleared_bal) avg_uncleared_bal,
sum(cleared_bal+uncleared_bal) tot_total_bal,
avg(cleared_bal+uncleared_bal) avg_total_bal,
min(cleared_bal+uncleared_bal) min_total_bal
from balances b,
accounts a,
acc_types at,
sub_acc_types sat
where a.acc_no = b.acc_no
and at.acc_type = a.acc_type
and sat.sub_acc_type (+) = a.sub_acc_type
group by acc_mgr_id, acc_type_desc,
decode
(a.sub_acc_type,null,'?', sub_acc_type_desc)
Estimating Space
Converting the query in Listing 1 to an
MV is easy, but there are several questions to answer before
creating the MV, including how much space it will occupy and how
many rows it will contain.
The
DBMS_MVIEW
package in Oracle Database 10g will help get these
answers. To estimate the size of the proposed MV, the script in
Listing 2 calls
DBMS_MVIEW.ESTIMATE_MVIEW_SIZE.
Code Listing 2:
Estimating MV size
set serveroutput on size 999999
declare
l_num_rows number;
l_num_bytes number;
l_stmt varchar2(2000);
begin
l_stmt := 'select acc_mgr_id,
<the query shown in Listing 1>
(a.sub_acc_type,null,''?'', sub_acc_type_desc)';
dbms_mview.estimate_mview_size
(
stmt_id => 'Est1',
select_clause => l_stmt,
num_rows => l_num_rows,
num_bytes => l_num_bytes
);
dbms_output.put_line('Number of rows = '||l_num_rows);
dbms_output.put_line('Size (bytes) = '||l_num_bytes);
end;
/
Number of rows = 2829000
Size (bytes) = 667644000
The output in Listing 2 estimates that
the MV will contain 2,829,000 rows and that it's going to be
about 667MB in size. These are approximations made from
optimizer statistics gathered on the source tables earlier, and
the exact values may be different. But it helps plan for space
and determine in which tablespace to place this MV.
Checking Capabilities
The
DBMS_MVIEW.EXPLAIN_MVIEW
procedure checks the features and capabilities of an MV before
it is created and writes the results to a table named
MV_CAPABILITIES_TABLE.
First, the DBA creates this table, by running the utlxmv.sql
script in the rdbms/admin directory under Oracle Home.
Listing 3 uses the
DBMS_MVIEW.EXPLAIN_MVIEW
procedure to see what type of operations the proposed MV can be
used for.
After the first part of Listing 3
populates the results in
MV_CAPABILITIES_TABLE,
the second part (after the --
Now check the capabilities
comment) goes on to select from that table, as shown in the
SELECT ROWNUM,
CAPABILITY_NAME, ... query.
To aid in the explanation of the output, the query uses
ROWNUM
to denote line numbers.
Code Listing 3:
Checking capabilities
declare
l_stmt varchar2(2000);
begin
l_stmt := 'select acc_mgr_id,
<the query shown in Listing 1>
(a.sub_acc_type,null,''?'', sub_acc_type_desc)';
dbms_mview.explain_mview
(
stmt_id => 'MV_Tune1',
mv => l_stmt
);
end;
/
--
-- Now check the capabilities
--
SELECT ROWNUM, CAPABILITY_NAME, POSSIBLE, MSGTXT, RELATED_TEXT
FROM mv_capabilities_table
WHERE STATEMENT_ID = 'Est1'
AND CAPABILITY_NAME LIKE 'REFRESH%'
ORDER BY SEQ
/
-- Output
--
LN CAPABILITY_NAME P MSGTXT RELATED_TEXT
--- -------------------------------------- -- --------------------------------- -------------------
1 REFRESH_COMPLETE Y
2 REFRESH_FAST N
3 REFRESH_FAST_AFTER_INSERT N agg(expr) requires AVG_CLEARED_BAL
correspondingCOUNT(expr)
function
4 REFRESH_FAST_AFTER_INSERT N one or more joins present
in mv
5 REFRESH_FAST_AFTER_INSERT N GROUP BY clause in mv
6 REFRESH_FAST_AFTER_INSERT N aggregate function in mv
7 REFRESH_FAST_AFTER_INSERT N the detail table does not ARUP.SUB_ACC_TYPES
not have a materialized
view log
8 REFRESH_FAST_AFTER_ONETAB_DML N SUM(expr) without TOT_TOTAL_BAL
COUNT(expr)
9 REFRESH_FAST_AFTER_ONETAB_DML N SUM(expr) without TOT_UNCLEARED_BAL
COUNT(expr)
10 REFRESH_FAST_AFTER_ONETAB_DML N SUM(expr) without TOT_CLEARED_BAL
COUNT(expr)
11 REFRESH_FAST_AFTER_ONETAB_DML N see the reason why
REFRESH_FAST_AFTER_INSERT
is disabled
12 REFRESH_FAST_AFTER_ANY_DML N see the reason why
REFRESH_FAST_AFTER_ONETAB_DML
is disabled
13 REFRESH_FAST_PCT N PCT is not possible on any
of the detail tables in the
materialized view
The results in Listing 3 tell the story.
The CAPABILITY_NAME
column lists the different refresh capabilities, and the column
POSSIBLE (P)
shows via a simple Y
(yes) or N
(no) whether that capability is possible in this MV. The MV is
capable of being completely refreshed, as shown in line 1, but
it's not capable of being fast-refreshed, as shown in line 2.
Why not? The DBA goes down the list to see why the
refresh-related features are not possible. Line 3 indicates that
the fast refresh is not possible because the MV does not have a
COUNT()
expression in the query where aggregation functions such as
SUM()
are used. If an MV uses aggregation functions, then
COUNT()
must be in the query to make it fast-refreshable. The column
RELATED_TEXT
in the output shows which columns in the MV are being referred
to in the MSGTXT
column. The result
agg(expr)—short for
aggregation(expression)—in
MSGTXT
in line 3 refers to the column
AVG_CLEARED_BAL
in the MV. Because the
AVG(CLEARED_BAL)
clause is used in the query to build the MV, there must
additionally be a
COUNT(CLEARED_BAL) or
COUNT(*)
expression to make the MV fast-refreshable.
Additionally, fast refreshes require the
creation of materialized view logs on the base table. These logs
record the changes occurring in the base tables. During fast
refresh, these logs are read by the refresh process to determine
the changes to apply to the MV. In line 7, the
MSGTXT
column shows the
detail table does not have a materialized view log
and the RELATED_TEXT
column shows the table on which MV logs are not present—ARUP.SUB_ACC_TYPES.
Generating Scripts
After the Acme DBA uses the information
in Listing 3 to make the planned MV capable of fast refresh,
it's time to generate the file that will generate the MV. The
procedure in Listing 4 specifies a task name and passes it and
the SQL query (in Listing 1) to the
DBMS_ADVISOR.TUNE_MVIEW
procedure to generate all necessary SQL statements to create the
MV. After the DBA runs the script in Listing 4, running the
following script generates the recommendations of the DBMS
Advisor:
CREATE DIRECTORY TMP_DIR AS '/tmp'
/
BEGIN
DBMS_ADVISOR.CREATE_FILE (
DBMS_ADVISOR.GET_TASK_SCRIPT
('MV_Tune_Task1'),
'TMP_DIR', 'mv_tune_task1.sql'
);
END;
Code Listing 4:
Generating scripts
declare
l_stmt varchar2(2000);
l_task_name varchar2(30);
begin
l_task_name := 'MV_Tune_Task1';
l_stmt := 'create materialized view acc_mgr_view
enable query rewrite as
... <the query shown in Listing 1>...';
dbms_advisor.tune_mview
(
task_name => l_task_name,
mv_create_stmt => l_stmt
);
end;
/
This creates a file named
mv_tune_task1.sql in the /tmp directory that contains all the
SQL statements needed to create the MV logs on all the necessary
base tables and create the MV with all the appropriate
parameters to make it fast-refreshable. Running the
mv_tune_task1.sql file in SQL*Plus will create all the necessary
objects. The mv_tune_task1.sql file is available as
Listing 8.
Confirming Query
Rewrite
Listing 5 contains another popular user
query at Acme Bank; it uses column, table, and function
information similar to that of the query in the
acc_mgr_view
MV. Listing 5 also uses the
AUTOTRACE
feature of SQL*Plus to provide information on whether QR occurs
when the query runs. The
AUTOTRACE
output clearly shows that the query uses all the source tables,
not the materialized view,
acc_mgr_view.
Why?
Code Listing 5:
Confirming query rewrite
SQL> alter session set query_rewrite_enabled = true;
Session altered.
SQL> set autotrace traceonly explain
SQL> select acc_type_desc, sum(cleared_bal)
2 from balances b, accounts a, acc_types at
3 where a.acc_no = b.acc_no
4 and at.acc_type = a.acc_type
5 group by acc_type_desc;
Execution Plan
--------------------------------------------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=533 Card=4 Bytes=112)
1 0 SORT (GROUP BY) (Cost=533 Card=4 Bytes=112)
2 1 HASH JOIN (Cost=523 Card=50000 Bytes=1400000)
3 2 HASH JOIN (Cost=146 Card=50000 Bytes=850000)
4 3 TABLE ACCESS (FULL) OF 'ACC_TYPES' (TABLE) (Cost=3 Card=4 Bytes=40)
5 3 TABLE ACCESS (FULL) OF 'ACCOUNTS' (TABLE) (Cost=140 Card=100000 Bytes=700000)
6 2 TABLE ACCESS (FULL) OF 'BALANCES' (TABLE) (Cost=106 Card=100000 Bytes=1100000)
Running the utlxrw.sql script in $ORACLE_HOME/rdbms/admin
sets up a table named
REWRITE_TABLE,
whose results will help determine why the optimizer did not use
QR on the query in Listing 5. Running the
DBMS_VIEW.EXPLAIN_REWRITE
procedure, shown in Listing 6, populates the
REWRITE_TABLE
table with the findings. Querying the
MESSAGE
column in
REWRITE_TABLE provides the
reason why:
SQL> SELECT MESSAGE FROM
REWRITE_TABLE;
QSM-01110: a lossy join in MV,
ACC_MGR_VIEW, between tables, ACCOUNTS and SUB_ACC_TYPES, not found in query
Code Listing 6:
Checking for query rewrite
truncate table rewrite_table
/
declare
l_stmt varchar2(2000);
l_task_name varchar2(30);
begin
l_stmt := 'select
acc_type_desc,
sub_acc_type,
sum(cleared_bal)
from
balances b,
accounts a,
acc_types at
where
a.acc_no = b.acc_no
and
at.acc_type = a.acc_type
group by
acc_type_desc, sub_acc_type';
dbms_mview.explain_rewrite
(
query => l_stmt,
mv => 'ACC_MGR_VIEW',
statement_id => 'MV_Explain_RW1'
);
end;
/
commit
/
select message from rewrite_table
/
The result shows that QR will not occur
because the joins used in the MV and in the query are different.
In the MV, four tables—ACCOUNTS,
BALANCES, ACC_TYPES, and
ACC_SUB_TYPES—are
joined. However, the new query joins only three tables, leaving
SUB_ACC_TYPES,
which is reported in the
MESSAGE
column. The absence of this join would have meant inaccurate
results if QR were used, so the optimizer decided not to use it.
Rewriting the query in Listing 5 to
include another predicate,
AND SAT.ACC_SUB_TYPE =
A.ACC_SUB_TYPE, and place the
table ACC_SUB_TYPE
SAT in the
FROM
clause should make QR possible. This revised query is in Listing
7. After the DBA passes the revised query to the procedure in
Listing 6, examining the
REWRITE_TABLE
provides the good news:
SQL> SELECT MESSAGE
FROM REWRITE_TABLE;
QSM-01033: query rewritten with
materialized view, ACC_MGR_VIEW
Code Listing 7:
Rewrite of Listing 5, confirming query rewrite
SQL> set autotrace traceonly explain
SQL> select acc_type_desc,
2 sum(cleared_bal)
3 from balances b,
4 accounts a,
5 acc_types at,
6 sub_acc_types sat,
7 where a.acc_no = b.acc_no
8 and at.acc_type = a.acc_type
9 and sat.sub_acc_type = a.sub_acc_type
10 group by acc_type_desc
SQL> /
Execution Plan
--------------------------------------------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=6 Card=4 Bytes=64)
1 0 SORT (GROUP BY) (Cost=6 Card=4 Bytes=64)
2 1 MAT_VIEW REWRITE ACCESS (FULL) OF 'ACC_MGR_VIEW' (MAT_VIEW REWRITE) (Cost=5 Card=1600 Bytes=25600)
Running the revised query in Listing 7
confirms that the optimizer rewrites the query and generates the
execution plan, also included in Listing 7. This plan shows that
the ACC_MGR_VIEW
MV is used instead of the base tables, even though the user
specified them in the query. QR is indeed happening.
Code Listing 8:
Generated script for creating MV and MV logs
Rem SQL Access Advisor: Version 10.1.0.1 - Production
Rem
Rem Username: ARUP
Rem Task: MV_Tune_Task1
Rem Execution date:
Rem
set feedback 1
set linesize 80
set trimspool on
set tab off
set pagesize 60
whenever sqlerror CONTINUE
CREATE MATERIALIZED VIEW LOG ON
"ARUP"."ACCOUNTS"
WITH ROWID, SEQUENCE("ACC_NO","ACC_TYPE","SUB_ACC_TYPE","ACC_MGR_ID")
INCLUDING NEW VALUES;
ALTER MATERIALIZED VIEW LOG FORCE ON
"ARUP"."ACCOUNTS"
ADD ROWID, SEQUENCE("ACC_NO","ACC_TYPE","SUB_ACC_TYPE","ACC_MGR_ID")
INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW LOG ON
"ARUP"."BALANCES"
WITH ROWID, SEQUENCE("ACC_NO","CLEARED_BAL","UNCLEARED_BAL")
INCLUDING NEW VALUES;
ALTER MATERIALIZED VIEW LOG FORCE ON
"ARUP"."BALANCES"
ADD ROWID, SEQUENCE("ACC_NO","CLEARED_BAL","UNCLEARED_BAL")
INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW LOG ON
"ARUP"."ACC_TYPES"
WITH ROWID, SEQUENCE("ACC_TYPE","ACC_TYPE_DESC")
INCLUDING NEW VALUES;
ALTER MATERIALIZED VIEW LOG FORCE ON
"ARUP"."ACC_TYPES"
ADD ROWID, SEQUENCE("ACC_TYPE","ACC_TYPE_DESC")
INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW LOG ON
"ARUP"."SUB_ACC_TYPES"
WITH ROWID, SEQUENCE("SUB_ACC_TYPE","SUB_ACC_TYPE_DESC")
INCLUDING NEW VALUES;
ALTER MATERIALIZED VIEW LOG FORCE ON
"ARUP"."SUB_ACC_TYPES"
ADD ROWID, SEQUENCE("SUB_ACC_TYPE","SUB_ACC_TYPE_DESC")
INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW ARUP.ACC_MGR_VIEW
REFRESH FAST WITH ROWID
ENABLE QUERY REWRITE
AS SELECT ARUP.SUB_ACC_TYPES.SUB_ACC_TYPE_DESC C1, ARUP.ACC_TYPES.ACC_TYPE_DESC
C2, ARUP.ACCOUNTS.ACC_MGR_ID C3, ARUP.ACCOUNTS.SUB_ACC_TYPE C4, SUM("ARUP"."BALANCES"."CLEARED_BAL")
M1, COUNT("ARUP"."BALANCES"."CLEARED_BAL") M2, SUM("ARUP"."BALANCES"."UNCLEARED_BAL")
M3, COUNT("ARUP"."BALANCES"."UNCLEARED_BAL") M4, SUM(("ARUP"."BALANCES"."UNCLEARED_BAL"
+ "ARUP"."BALANCES"."CLEARED_BAL")) M5, COUNT(("ARUP"."BALANCES"."UNCLEARED_BAL"
+ "ARUP"."BALANCES"."CLEARED_BAL")) M6, COUNT(*) M7 FROM ARUP.SUB_ACC_TYPES,
ARUP.ACC_TYPES, ARUP.BALANCES, ARUP.ACCOUNTS WHERE ARUP.ACCOUNTS.SUB_ACC_TYPE
= ARUP.SUB_ACC_TYPES.SUB_ACC_TYPE AND ARUP.ACCOUNTS.ACC_TYPE = ARUP.ACC_TYPES.ACC_TYPE
AND ARUP.ACCOUNTS.ACC_NO = ARUP.BALANCES.ACC_NO GROUP BY ARUP.SUB_ACC_TYPES.SUB_ACC_TYPE_DESC,
ARUP.ACC_TYPES.ACC_TYPE_DESC, ARUP.ACCOUNTS.ACC_MGR_ID, ARUP.ACCOUNTS.SUB_ACC_TYPE;
whenever sqlerror EXIT SQL.SQLCODE
begin
dbms_advisor.mark_recommendation('MV_Tune_Task1',1,'IMPLEMENTED');
end;
/
Preventing Runaway
Queries
As demonstrated in Listing 5, if the user
writes the query a little differently, the optimizer does not
rewrite the query to select from the MV and the query selects
from the base tables directly. The user may think that the query
is rewritten and that the results are coming from the MV, but in
fact, the query is not rewritten and the results are coming from
the source tables, extending the response time significantly.
The
REWRITE_OR_ERROR
hint in the SQL query will make sure the query is rewritten to
select from the MV or at least inform the user if that does not
happen. The following adds the
REWRITE_OR_ERROR
hint to the SQL query in which QR did not occur (Listing 5):
SELECT /*+ REWRITE_OR_ERROR */
SELECT acc_type_desc,
sub_acc_type_desc,
...
The query returns the following error
message if it does not rewrite:
ORA-30393: a query block in the
statement did not rewrite
Conclusion
Oracle provides tools for creating,
managing, and tuning MVs in the
DBMS_MVIEW
package. Acme Bank is seeing increased query performance, using
QR whenever possible and using error messages from queries that
do not rewrite to further optimize its use of MVs and QR.
Arup Nanda
(arup@proligence.com)
is the manager of Database Systems at Starwood Hotels and
Resorts in White Plains, New York. He is the recipient of
Oracle Magazine's
2003 DBA of the Year award and the coauthor of
Oracle Privacy Security Auditing,
from Rampant Press (rampant-books.com), 2003. |