Oracle Cost Based Optimizer
By: Divya
CONTENTS
Introduction
Rule Based Optimizer
Cost Based Optimizer
Cost Based Optimizer and Database Statistics
Analyze Statement
DBMS_UTILITY
DBMS_STATS
Scheduling Stats
Transferring Stats
Behavior Adjustments
Optimizer_index_cost_adj
Optimizer_index_caching
Optimizer_max_permutations
Db_file_multiblock_read_count
Parallel_automatic_tuning
Hash_area_size
Sort_area_size
Update Statistics
Update Statistics with BRCONNECT
Internal Rules for Update Statistics
Steps for Running Optimizer Script
Whenever a valid SQL statement is processed, Oracle has to decide how to retrieve the necessary data. This decision can be made using one of two methods: the Rule Based Optimizer (RBO) or the Cost Based Optimizer (CBO).
You can use the ANALYZE command and the DBMS_STATS package to generate statistics about the objects in your database. The generated statistics include the number of rows in a table and the number of distinct keys in an index. Based on these statistics, the CBO evaluates the cost of the available execution paths and selects the path with the lowest relative cost. If you use the CBO, you need to make sure that you analyze the data frequently enough for the statistics to accurately reflect the data within your database. If a query references both tables that have been analyzed and tables that have not, the CBO estimates values for the missing statistics, and it may choose an inappropriate execution path. To improve performance, you should use either the RBO or the CBO consistently throughout your database. Since the CBO adapts to changes in data volumes and data distribution, you should favour its use.
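In practice the choice between the two optimizers is usually made with the OPTIMIZER_MODE parameter, which is not covered in the text above; a minimal sketch of switching it at session level in releases that still support the rule-based settings:

-- CHOOSE uses the CBO whenever statistics are available, otherwise the RBO
ALTER SESSION SET optimizer_mode = CHOOSE;
-- RULE forces the rule-based optimizer
ALTER SESSION SET optimizer_mode = RULE;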
Analyze Statement
The ANALYZE statement can be used to gather statistics for a specific table, index or cluster. Once you have analyzed an object, you can query the statistics-related columns of the data dictionary views to see the values generated. When analyzing, you can scan the full object (via the COMPUTE STATISTICS clause) or part of the object (via the ESTIMATE STATISTICS clause). In general, you can gather adequate statistics by analyzing 10 to 20 percent of an object. The statistics can be computed exactly, or estimated based on a specific number of rows or a percentage of rows:

ANALYZE TABLE employees COMPUTE STATISTICS;
ANALYZE INDEX employees_pk COMPUTE STATISTICS;
ANALYZE TABLE employees ESTIMATE STATISTICS SAMPLE 100 ROWS;
ANALYZE TABLE employees ESTIMATE STATISTICS SAMPLE 15 PERCENT;
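To see the values that were gathered, the statistics columns of the dictionary views can be queried; a minimal sketch using the EMPLOYEES table and EMPLOYEES_PK index from the examples above:

-- Table-level statistics recorded by ANALYZE / DBMS_STATS
SELECT table_name, num_rows, blocks, avg_row_len, last_analyzed
FROM   user_tables
WHERE  table_name = 'EMPLOYEES';

-- Index-level statistics
SELECT index_name, distinct_keys, leaf_blocks, clustering_factor
FROM   user_indexes
WHERE  index_name = 'EMPLOYEES_PK';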
DBMS_UTILITY
The DBMS_UTILITY package can be used to gather statistics for a whole schema or database. Both methods follow the same format as the ANALYZE statement:

EXEC DBMS_UTILITY.analyze_schema('SCOTT','COMPUTE');
EXEC DBMS_UTILITY.analyze_schema('SCOTT','ESTIMATE', estimate_rows => 100);
EXEC DBMS_UTILITY.analyze_schema('SCOTT','ESTIMATE', estimate_percent => 15);
EXEC DBMS_UTILITY.analyze_database('COMPUTE');
EXEC DBMS_UTILITY.analyze_database('ESTIMATE', estimate_rows => 100);
EXEC DBMS_UTILITY.analyze_database('ESTIMATE', estimate_percent => 15);
DBMS_STATS
The DBMS_STATS package was introduced in Oracle 8i and is Oracle's preferred method of gathering object statistics. It is a replacement for the ANALYZE command, and is the recommended method as of Oracle 9i. Oracle lists a number of benefits to using it, including parallel execution, long-term storage of statistics and transfer of statistics between servers. Once again, it follows a similar format to the other methods:

EXEC DBMS_STATS.gather_database_stats;
EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15);
EXEC DBMS_STATS.gather_schema_stats('SCOTT');
EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15);
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES');
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15);
EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK');
EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK', estimate_percent => 15);

This package also gives you the ability to delete statistics:

EXEC DBMS_STATS.delete_database_stats;
EXEC DBMS_STATS.delete_schema_stats('SCOTT');
EXEC DBMS_STATS.delete_table_stats('SCOTT', 'EMPLOYEES');
EXEC DBMS_STATS.delete_index_stats('SCOTT', 'EMPLOYEES_PK');
Scheduling Stats
Scheduling the gathering of statistics using DBMS_JOB is the easiest way to make sure they are always up to date:

SET SERVEROUTPUT ON
DECLARE
  l_job NUMBER;
BEGIN
  DBMS_JOB.submit(l_job, 'BEGIN DBMS_STATS.gather_schema_stats(''SCOTT''); END;', SYSDATE, 'SYSDATE + 1');
  COMMIT;
  DBMS_OUTPUT.put_line('Job: ' || l_job);
END;
/

The above code sets up a job to gather statistics for SCOTT at the current time every day. You can list the current jobs on the server using the DBA_JOBS and DBA_JOBS_RUNNING views. A job can be removed as follows:

EXEC DBMS_JOB.remove(X);
COMMIT;

Where X is the number of the job to be removed.
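A minimal sketch of checking the scheduled job through the DBA_JOBS view mentioned above:

SELECT job, what, next_date, next_sec, broken
FROM   dba_jobs;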
Transferring Stats
It is possible to transfer statistics between servers, allowing consistent execution plans between servers with varying amounts of data. First the statistics must be collected into a statistics table. In the following examples the statistics for the APPSCHEMA user are collected into a new table, STATS_TABLE, which is owned by DBASCHEMA:

SQL> EXEC DBMS_STATS.create_stat_table('DBASCHEMA','STATS_TABLE');
SQL> EXEC DBMS_STATS.export_schema_stats('APPSCHEMA','STATS_TABLE',NULL,'DBASCHEMA');

This table can then be transferred to another server using your preferred method (Export/Import, SQL*Plus COPY etc.) and the stats imported into the data dictionary as follows:
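The import statement itself is missing from the text above; a sketch of the final step on the target server, assuming the same schema and table names as in the export example:

SQL> EXEC DBMS_STATS.import_schema_stats('APPSCHEMA','STATS_TABLE',NULL,'DBASCHEMA');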
Behavior Adjustments
There are some things that the CBO cannot detect, which is where the DBA comes in. The types of SQL statements, the speed of the disks and the load on the CPUs, all affect the "best" execution plan for a SQL statement. For example, the best execution plan at 4:00 A.M. when 16 CPUs are idle may be quite different from the same query at 3:00 P.M. when the system is 90 percent utilized. Despite the name "Oracle", the CBO is not psychic, and Oracle can never know, a priori, the exact load on the Oracle system. Hence the Oracle professional must adjust the CBO behavior periodically. Most Oracle professionals make these behavior adjustments using the instance-wide CBO behavior parameters such as optimizer_index_cost_adj and optimizer_index_caching.
However, Oracle does not recommend changing the default values for many of these CBO settings because the changes can affect the execution plans for thousands of SQL statements. Here are some of the major adjustable parameters that influence the behavior of the CBO:

Optimizer_index_cost_adj
Optimizer_index_caching
Optimizer_max_permutations
Db_file_multiblock_read_count
Parallel_automatic_tuning
Hash_area_size
Sort_area_size
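Before adjusting anything, it is worth checking the values currently in effect; a minimal sketch using the V$PARAMETER view for the parameters listed above:

SELECT name, value, isdefault
FROM   v$parameter
WHERE  name IN ('optimizer_index_cost_adj', 'optimizer_index_caching',
                'db_file_multiblock_read_count', 'hash_area_size', 'sort_area_size');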
Optimizer_index_cost_adj
This parameter alters the costing algorithm for access paths involving indexes. The smaller the value, the cheaper the cost of index access.
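The parameter can be changed at session level for testing; the value below is purely illustrative (the default is 100):

-- Values below 100 make index access look cheaper to the CBO
ALTER SESSION SET optimizer_index_cost_adj = 25;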
Optimizer_index_caching
This is the parameter that tells Oracle how much of your index is likely to be in the RAM data buffer cache. The setting for optimizer_index_caching affects the CBO's decision to use an index for a table join (nested loops), or to favor a full-table scan.
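Again a session-level sketch; the value (a percentage between 0 and 100) is illustrative:

-- Tell the CBO to assume that around 90% of index blocks are already cached
ALTER SESSION SET optimizer_index_caching = 90;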
Optimizer_max_permutations
This controls the maximum number of table join permutations allowed before the CBO is forced to pick a table join order. For a six-way table join, Oracle must evaluate 6-factorial, or 720, possible join orders for the tables.
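Where the parameter is still exposed (it became a hidden parameter in later releases), it can be adjusted per session; the value below is only illustrative (the default is 80,000 in Oracle 8i and 2,000 in 9i):

-- Cap the number of join orders the CBO evaluates, trading plan quality for parse time
ALTER SESSION SET optimizer_max_permutations = 2000;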
Db_file_multiblock_read_count
When set to a high value, the CBO recognizes that scattered (multi-block) reads may be less expensive than sequential reads. This makes the CBO friendlier to full-table scans.
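A session-level sketch; a suitable value depends on block size and I/O capability, so the number here is only an example:

-- Larger values lower the relative cost of full-table scans
ALTER SESSION SET db_file_multiblock_read_count = 32;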
Parallel_automatic_tuning
When set "on", full-table scans are parallelized. Because parallel full-table scans are very fast, the CBO will give a higher cost to index access, and be friendlier to full-table scans.
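This is a static parameter, so it cannot be changed with ALTER SESSION; a sketch of setting it in the server parameter file instead, assuming an spfile is in use (Oracle 9i or later):

-- Takes effect at the next instance restart
ALTER SYSTEM SET parallel_automatic_tuning = TRUE SCOPE = SPFILE;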
UPDATE STATISTICS
The Oracle cost-based optimizer (CBO) uses the statistics to optimize access paths when retrieving data for queries. If the statistics are out of date, the CBO might generate inappropriate access paths (such as using the wrong index), resulting in poor performance. By running update statistics regularly, you make sure that the database statistics are up to date, thereby improving database performance. You can update statistics using one of the following methods:

Using the DBA Planning Calendar in the Computing Center Management System (CCMS)
Using BRCONNECT
The DBA Planning Calendar uses the BRCONNECT commands. It is recommended to use this approach because you can easily schedule update statistics to run automatically at specified intervals (for example, weekly).
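For reference, a typical manual BRCONNECT call for update statistics looks like the sketch below; the connection method and option values are illustrative and should be adapted to the local installation (-u / uses the standard OPS$ database connection, -c suppresses confirmation prompts, -f stats selects the update statistics function, and -t all processes all tables):

brconnect -u / -c -f stats -t all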
BRCONNECT supports update statistics for the following:

Partitioned tables, except where partitioned tables are explicitly excluded by setting the active flag in the DBSTATC table to I. For more information, see SAP Note 424243.
InfoCube tables for the SAP Business Information Warehouse (SAP BW)

BRCONNECT performs update statistics using a two-step approach:

Checks each table to see if the statistics are out of date
If required, updates the statistics on the table immediately after the check
1. BRCONNECT determines the working set of tables and indexes to be checked and updated. To do this, it uses:

The options -t|-table and -e|-exclude, as described in -f stats (these options take priority)
The stats_table and stats_exclude parameters

2. If the working set contains pool, cluster or other tables that have the ACTIVE flag in the DBSTATC table set to N or R, BRCONNECT immediately deletes the statistics for these tables, because they negatively affect database performance.

3. BRCONNECT checks statistics for the remaining tables in the working set, including tables that have the ACTIVE flag in the DBSTATC table set to A or P, as follows:

If the table has the MONITORING attribute set, BRCONNECT reads the number of inserted, deleted, and updated rows from the DBA_TAB_MODIFICATIONS view (available from Oracle 8.1 onwards).
Otherwise, BRCONNECT uses the standard method (see the table below) to update statistics by using the unique index.

The method and sample size used are determined in the following order of priority:

The method and sample defined for the table in the DBSTATC table (highest priority)
The method and sample from the options -m|-method or -s|-sample of -f stats (the options take priority), or the stats_method and stats_sample_size parameters
The default method and sample (lowest priority)

The following table describes the default method and sample size:

Number of rows in table            Analysis method    Sample size
Rows < 10,000                      C                  -
10,000 <= Rows < 100,000           E                  P30
100,000 <= Rows < 1,000,000        E                  P10
1,000,000 <= Rows < 10,000,000     E                  P3
10,000,000 <= Rows                 E                  P1
Analysis method C means compute the statistics exactly. Analysis method E means estimate the statistics using the sample size specified. For example, "E P10" means that BRCONNECT takes an estimated sample using 10% of the rows. For the CH, CX, EH, and EX methods, histograms are created. For the CI, CX, EI and EX methods, the structure of the indexes is validated in addition to collecting statistics.

4. BRCONNECT uses the number of new rows for each table in the working set, as derived in the previous step, to see if either of the following is true:

Number of new rows is greater than or equal to number of old rows * (100 + threshold) / 100
Number of new rows is less than or equal to number of old rows * 100 / (100 + threshold)

The standard threshold is 50, but the value defined in -f stats -change or the stats_change_threshold parameter is used if specified. With the default threshold of 50, for example, a table last analyzed at 1,000,000 rows is flagged for update once it grows to 1,500,000 rows or more, or shrinks to roughly 666,667 rows or fewer (1,000,000 * 100 / 150).
5. BRCONNECT immediately updates statistics after the check for the following tables:

Tables where either of the conditions in the previous step is true
Tables from the DBSTATC table with either of the following values: ACTIVE field U, or ACTIVE field R or N and USE field A (relevant for the application monitor)
6. BRCONNECT writes the results of update statistics to the DBSTATTORA table and also, for tables with the DBSTATC history flag or usage type A, to the DBSTATHORA table.

7. For tables with update statistics using methods EI, EX, CI, or CX, BRCONNECT validates the structure of all associated indexes and writes the results to the DBSTATIORA table and also, for tables with the DBSTATC history flag or usage type A, to the DBSTAIHORA table.

8. BRCONNECT immediately deletes the statistics that it created in this procedure for tables with the ACTIVE flag set to N or R in the DBSTATC table.
Steps for Running Optimizer Script

The optimizer script takes around 30 minutes to complete; the final step confirms that the optimizer script has completed successfully.