Oracle Consulting Oracle Training Oracle Support Development
Home
Catalog
Oracle Books
SQL Server Books
IT Books
Job Interview Books
eBooks
Rampant Horse Books
911 Series
Pedagogue Books
SQL Server Books
Oracle Software
image
image
Write for Rampant
Publish with Rampant
Rampant News
Rampant Authors
Rampant Staff
  Phone
  252-431-0050
Oracle News
Oracle Forum
Oracle Tips
Articles by our Authors
Press Releases
image
image

Oracle 11g Books

Oracle tuning

Oracle training

Oracle support

Remote Oracle

STATSPACK Viewer

    Privacy Policy

 

 
  Understanding Oracle SQL Optimization
 
ORACLE MAGAZINE
By Kimberly Floss

Improvements in the Oracle Database 10g Optimizer make it even more valuable for tuning.

Since its introduction in Oracle7, the cost-based optimizer (cost-based optimizer) has become more valuable and relevant with each new release of the Oracle database while its counterpart, the rule-based optimizer (RBO), has become increasingly less so. The difference between the two optimizers is relatively clear: The cost-based optimizer chooses the best path for your queries, based on what it knows about your data and by leveraging Oracle database features such as bitmap indexes, function-based indexes, hash joins, index-organized tables, and partitioning, whereas the RBO just follows established rules (heuristics). With the release of Oracle Database 10g, the RBO's obsolescence is official and the cost-based optimizer has been significantly improved yet again.

For starters, the cost-based optimizer now has a special tuning mode that is used by many other Oracle Database 10g features, including the SQL Tuning Advisor and SQL Access Advisor. One of the artifacts produced by the SQL tuning mode is a SQL profile—auxiliary statistics specific to a given statement—that can be persisted to the data dictionary and used at runtime (during normal cost-based optimizer operations) to improve SQL performance. SQL profiles are particularly useful for packaged applications (or any application for which you cannot access the source code).

SQL profiles use sampling to collect additional information; partial execution techniques to verify and adjust cost estimates; and execution history information for the SQL statement to modify parameter settings, if needed. . . .

In addition, however, Oracle has made some fundamental changes to the normal behavior of the cost-based optimizer, including changes to the cost model and the transformation engine. Let's take a look at these changes in the context of processing a SQL statement.

Oracle Database 10g cost-based optimizer Built on New Transformation Framework

SQL queries submitted to the Oracle database engine first run through the parser, which checks syntax and analyzes semantics. The result of this run-through is a set of query blocks that is sent to the optimizer, which comprises three main functional subsystems—specifically, the query transformation engine, the estimator, and the plan generator.

The query transformation engine applies heuristics- or rules-based algorithms to the query blocks. Some common transformations include select-join, group-by, and distinct-view merging and materialized view rewrite . .

More important, however, the cost model itself has changed. In Oracle9i Database, the cost-based optimizer used an I/O cost model that evaluated everything primarily in terms of single block reads, largely ignoring CPU costs (or using constants to estimate CPU costs). In Oracle Database 10g, the cost model now includes system statistics that realistically capture CPU and I/O on your system with respect to the Oracle workload.
 

Oracle Database 10g cost-based optimizer Uses a New Cost Model

In Oracle Database 10g, the default cost model is CPU plus I/O, in which the cost unit is time: The cost-based optimizer estimates execution time for a query by estimating the number of I/O operations, the type of I/O operations, and the number of CPU cycles the database will perform while executing the query. These estimates depend on the existence of system statistics, which the cost-based optimizer uses to convert the number of CPU cycles and the number of I/Os into execution time. (Note that some operations, such as bitmap merge, sort, or hash joins, may not require any I/O but that all operations include a CPU component.)

Oracle Database 10g gathers two types of system statistics—statistics captured without a workload (noworkload) and statistics captured with a workload. Noworkload statistics capture I/O system performance—average I/O seek time and transfer speed—and CPU speed. When gathering noworkload statistics, the cost-based optimizer issues sample reads of different sizes from the database's datafiles; it times every read and then uses statistical methods to compute average seek time and transfer speed. This takes from a few seconds to a few minutes. The cost-based optimizer computes CPU speed in millions of cycles per second.

Workload statistics make the cost-based optimizer aware of the workload. The system statistics captured during workload conditions identify whether the system is I/O- or CPU-bound; the cost-based optimizer uses the data to adjust the cost of the plans accordingly. To gather workload statistics, execute these commands at the start and the conclusion of your workload:

 

dbms_stats.gather_system_stats(gathering_mode=>'start')
...
dbms_stats.gather_system_stats(gathering_mode=>'stop')

 

You can see the values obtained by querying sys.aux_stats$, as follows:

You can set the parameter to other values (3, 4, 5, and up to 10) to enable dynamic sampling on analyzed tables. Dynamic sampling level 3 enables the selectivity computation of complex predicates, whose selectivity would be have a "guesstimate," and level 4 enables the combined selectivity computation of all predicates on a single table, thus capturing any data correlations in its selectivity estimate. Levels 5 and up increase the sample size, thus improving the accuracy of the result from dynamic sampling.

(If you disable dynamic sampling (by setting this parameter to 0) and the optimizer doesn't have statistics available during the estimation phase, it will use default values to cost-out the estimates for rows and other objects that it will need to obtain.

db_multiblock_read_count—The optimizer uses mbrc (one of the system statistics) instead of db_multiblock_read_count parameter during query compilation to compute costs for table scan and fast full index scans. In Oracle Database 10g, the default value of this parameter is 8 (which is equal to an I/O size of 8 * std-block-size). However, the default I/O size for query execution is automatically determined by the I/O layer, and is set appropriately for the system configuration, relying on workload system statistics for an accurate value of the multiblock read size.

A large number of query blocks and interdependence of the transformations themselves add to the complexity of any cost-based transformation strategy. With Oracle Database 10g, the transformation engine has been redesigned as an extensible framework that can support new algorithms as needed, to ensure that Oracle can always quickly select the optimal transformation.

In addition, the Oracle Database 10g transformation framework supports several new state space search algorithms that ensure that the optimizer finds the best plan quickly, even when transformations are complex.

The estimator measures cardinality (the number of rows); selectivity (the fraction of rows from a row set filtered by predicate); and cost, defined as resources (disk I/O, CPU usage, and memory) of the various operations (scanning a table or joining tables together, for instance) that will yield the data. The accuracy of these estimates depends on statistics—the object statistics and now, in Oracle Database 10g, system statistics. SQL profiles specific to any query are also used at this point to effectively estimate costs.

The plan generator recombines the query in various ways, constructing alternative plans by using different join methods and join orders. The more table joins in the original query, the more alternative plans are generated and the longer the time it may take the cost-based optimizer to find the "best" plan. Conversely, if there are few joins in a particular query or the joins are in the best order to begin with, the cost-based optimizer will find the optimal plan quickly.

Ultimately, the cost-based optimizer picks the best plan—the one with the lowest estimated cost relative to other plans. But precisely how the cost-based optimizer measures the cost of processing any SQL statement—the cost model—is influenced by many factors, including several initialization parameters, starting with the optimizer_mode. The optimizer_mode sets the overall performance goal for the cost-based optimizer in terms of throughput (the minimal time to return all rows) versus response time (the minimal time to return the first rows).

 

SQL> select sname, pname, pval1
from sys.aux_stats$;
SNAME       	PNAME      PVAL1
-------------	---------  -------
SYSSTATS_INFO   STATUS  
SYSSTATS_INFO   DSTART
SYSSTATS_INFO   DSTOP
SYSSTATS_INFO   FLAGS	    1
SYSSTATS_MAIN   CPUSPEEDNW  502.005
SYSSTATS_MAIN   IOSEEKTIM   10
SYSSTATS_MAIN   IOTFRSPEED  4096
SYSSTATS_MAIN   SREADTIM    7.618
SYSSTATS_MAIN   MREADTIM    14.348
SYSSTATS_MAIN   CPUSPEED    507
SYSSTATS_MAIN   MBRC 	    6  
SYSSTATS_MAIN 	MAXTHR      32768   
SYSSTATS_MAIN 	SLAVETHR      

13 rows selected.

CPUSPEEDNW, IOSEEKTIM, and IOTFRSPEED are noworkload statistics; SREADTIM, MREADTIM, CPUSPEED, MBRC, MAXTHR, and SLAVETHR represent workload statistics.

If both workload and noworkload statistics are available, the optimizer uses workload statistics.

  • SREADTIM—single block read time (milliseconds)—is the average time Oracle takes to read a single block.
  • MREADTIM—multiblock read time (milliseconds)—is the average time taken to read sequentially.
  • MBRC—multiblock read count—is the blocks, on average, read during multiblock sequential reads. The cost-based optimizer uses MBRC instead of the db_ multiblock_read_count parameter during query optimization to compute costs for table and fast full index scans.
  • MAXTHR—maximum I/O system throughput—is captured only if the database runs parallel queries.
  • SLAVETHR—maximum slave I/O throughput—is captured only if the database runs parallel queries.
Another benefit of the Oracle Database 10g cost model is that it enables the cost-based optimizer to use other techniques to create the fastest plan possible—for example, by reordering predicates in a query.

Because the cost-based optimizer knows the costs associated with a given query in terms of CPU and because the main ingredient in the cost of a predicate is CPU, the cost-based optimizer can calculate costs with alternative predicate orderings and reorganize the predicates in the most efficient order.  

For example, assume that for the following select statement, b < 0 returns 1 row and a > 0 returns all rows in table t:

select * from t 
where a > 0 and b < 0;

If predicate order is a > 0 and b < 0, then the database will evaluate b < 0 for every row. However, if the order is b < 0 and a > 0, then the predicate a > 0 will be evaluated just once.

The algorithms for ordering predicates consider the costs of various combinations of the predicate order and selectivity when determining the best order. This feature provides considerable improvement, especially for nonselective and expensive predicates.

Conclusion

Oracle's cost-based optimizer is a strategic subsystem of Oracle Database. Given that the cost-based optimizer improves with each new release of Oracle Database, you'll want to consider the ramifications that changes to the cost-based optimizer may have on your Oracle database applications.


Kimberly Floss specializes in Oracle performance tuning and SQL tuning techniques and is also the author of Oracle SQL Tuning and cost-based optimizer Internals from Rampant TechPress.

 

   

Oracle® is the registered trademark of Oracle Corporation. SQL Server® is the registered trademark of Microsoft Corporation.
Many of the designations used by computer vendors to distinguish their products are claimed as Trademarks