Cost Based Optimizer – 1 of 2
Hotsos Enterprises, Ltd.
Grapevine, Texas
Oracle. Performance. Now.
[email_address]
Agenda
- Cost Based Optimizer and its impact on performance
- Data Points Collected by the Statistics Gathering Process
- The 10053 Trace File
- Case Studies
Cost Based Optimizer
Cost Based Optimizer (CBO)
- The CBO is, in reality, a complex decision-making piece of software
- It uses several database initialization parameters
  - These are listed in the 10053 trace file
- It uses several session-level initialization parameters
  - Parameters set at the session level override the database initialization parameters
- It uses statistics about the objects
  - These statistics are discussed later
- It uses hints given to the optimizer
- It uses statistics about the system (CPU, disk, etc.)
- It uses all this information to decide on the "best way" to execute the statement and generate an execution plan
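A quick way to see which parameter values the optimizer is actually working with, without generating a 10053 trace, is to query the optimizer environment views available in 10g and later. This is a sketch added here for illustration, not part of the original deck; the sid value is a placeholder:

-- Optimizer environment for the instance as a whole
select name, value, isdefault
from   v$sys_optimizer_env
order  by name;

-- Parameters overridden at the session level (sid 123 is a placeholder)
select name, value
from   v$ses_optimizer_env
where  sid = 123
and    isdefault = 'NO';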
The CBO will be part of your life if you keep working with Oracle.
The cost-based query optimizer (CBO)…
- Uses data from a variety of sources
- Estimates the costs of several execution plans
- Chooses the plan it estimates to be the least expensive
Characteristics
- Adapts to changing circumstances
- Frustrating if you don't know what it considers as input
- Works great if you know how to use it
- But produces very poor results if you lie to it
- The only query optimizer supported by Oracle Corporation from release 10 onward
The cost-based query optimizer chooses the plan that it computes as having the lowest estimated cost.
- Don't assume the following are identical:
  - the CBO's estimated cost of an execution plan
  - the actual cost of an execution plan
- The CBO's cost estimate can be imperfect
  - Are your CBO inputs perfect?
- The CBO isn't perfect, but by 9.2 it's almost always good enough
- Without properly collected statistics, the CBO will:
  - use the RBO if no statistics exist on any object in the statement
  - use default statistics if statistics exist for some objects in the statement but not others
  - use dynamic sampling to generate statistics (depending on the parameter setting and the Oracle version)
Cost Based Optimizer
Execution plan changes can result in profoundly different application performance.
[Figure: three plots of performance versus size change – (A) table size change, (B) device latency change, (C) execution plan change. Type C performance changes are the most profound; see Editor's Note 2.]
Recap
- The CBO is a complex piece of software
- It uses several data points to calculate the cost of each candidate execution plan and chooses the plan with the lowest cost
- It is dynamic and adapts to changing data better than the Rule Based Optimizer
- A good understanding of the Cost Based Optimizer is imperative in understanding the rationale behind some of its choices
Data Points Collected by the Cost Based Optimizer
Data Points Collected by the Cost Based Optimizer
- Table Statistics
- Column Statistics
- Index Statistics
- System Statistics
Table Statistics
Statistics collected for tables appear in the data dictionary views *_TABLES:
- Number of rows (NUM_ROWS)
- Number of data blocks below the high water mark (BLOCKS)
- Number of data blocks allocated to the table that have never been used (EMPTY_BLOCKS)
- Average available free space in each data block, in bytes (AVG_SPACE)
- Number of chained rows (CHAIN_CNT)
- Average row length, including the row's overhead, in bytes (AVG_ROW_LEN)
Column Statistics
Statistics collected for columns appear in the data dictionary views *_TAB_COLUMNS and *_TAB_COL_STATISTICS:
- Number of distinct column values (NUM_DISTINCT)
- Lowest value in a column (LOW_VALUE)
- Highest value in a column (HIGH_VALUE)
- Selectivity estimate for the column (DENSITY)
- Number of null values in the column (NUM_NULLS)
- Number of histogram buckets (NUM_BUCKETS)
- Average column length (AVG_COL_LEN)
- Existence/type of histogram (HISTOGRAM)
Index Statistics
Statistics collected for indexes appear in the data dictionary views *_INDEXES:
- Depth of the index from its root block to its leaf blocks (BLEVEL)
- Number of leaf blocks (LEAF_BLOCKS)
- Number of distinct indexed values (DISTINCT_KEYS)
- Average number of leaf blocks in which each distinct value in the index appears (AVG_LEAF_BLOCKS_PER_KEY)
- Average number of data blocks in the table that are pointed to by a distinct value in the index (AVG_DATA_BLOCKS_PER_KEY)
- Clustering factor, i.e. how well ordered the rows are with respect to the indexed values (CLUSTERING_FACTOR)
System Statistics

Statistic   Description
sreadtim    average time (ms) to complete a single-block read
mreadtim    average time (ms) to complete a multi-block read
cpuspeed    average CPU cycles per second (in millions)
cpuspeednw  no-workload estimate of CPU speed (v10)
ioseektim   estimate of average read seek time (v10)
iotfrspeed  estimate of transfer speed of the I/O system (v10)
mbrc        average multiblock read count (in blocks)
maxthr      maximum I/O system throughput (in bytes/second)
slavethr    average slave I/O throughput (in bytes/second)
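The gathered system statistics are stored in the data dictionary and can be inspected directly. The query below is a sketch added for illustration (it assumes you have access to the SYS-owned table; DBMS_STATS.GET_SYSTEM_STATS is an alternative):

select pname, pval1
from   sys.aux_stats$
where  sname = 'SYSSTATS_MAIN';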
DBMS_STATS
- DBMS_STATS is used to collect the data points described above
- There are several debates about:
  - what percentage of the data should be sampled
  - what the options passed to DBMS_STATS should be
- Our perspective:
  - It should be driven by the best plans that can be generated for all the SQL being executed in the system
  - The frequency and the sample percentage of the statistics gathering should not itself become a malignant load on the system
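As a concrete illustration of such a call, the sketch below gathers table, column and index statistics for the demo table used later in this deck; the estimate_percent and method_opt choices are illustrative assumptions, not a recommendation from the deck:

begin
  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => 'MYOBJ',
    estimate_percent => dbms_stats.auto_sample_size,  -- let Oracle pick the sample size
    method_opt       => 'for all columns size auto',  -- histograms only where column usage suggests them
    cascade          => true);                        -- gather the index statistics as well
end;
/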
Recap
- There are several data points that are used by the optimizer
- They are collected using DBMS_STATS
- All these data points help the optimizer choose an execution plan
- The collection of statistics should be a function of the quality of the plans that can be generated, and of the special cases that call for a higher frequency and a larger estimate percentage for statistics collection
Example of Data Points Gathered by the Optimizer
10053 Trace File
10053 Trace File
- The 10053 trace file is a dump of the analysis done by the Cost Based Optimizer while working out the execution plan
- It is unlike the 10046 trace file, which has a specific syntax and is usually the same across versions barring a few exceptions (Millsap, Optimizing Oracle Performance)
- It should be used as a last resort for troubleshooting
- Usually the difference between the rows estimated by the Cost Based Optimizer and the actual rows from the execution plan is enough to see where the problem is
See Webinar 1 – SQL Tuning with Trace Data and DBMS_XPLAN
10053 Trace File Structure
- Peeked values of the bind variables in the SQL statement
- Parameters used by the optimizer
- Base table statistics
  - Table stats
  - Column stats
  - Index stats
- Single table access path
- Join order determination (tries several combinations and determines the type of the join and the order of the join)
- The query itself
Script to Generate 10053 Trace File

alter session set max_dump_file_size = unlimited;
alter session set tracefile_identifier = 'Query1_10053';
alter session set events '10053 trace name context forever, level 1';
explain plan for select * from dba_objects where owner = 'SCOTT';
alter session set events '10053 trace name context off';
exit;
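Once the script has run, you still need to locate the trace file it produced; the tracefile_identifier set above makes the file name easy to spot. On 11g and later a quick way to find the trace directory and default file name is the query below (a sketch; on earlier releases look in the directory pointed to by user_dump_dest):

select name, value
from   v$diag_info
where  name in ('Diag Trace', 'Default Trace File');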
Recap
- The 10053 trace file is a good tool for understanding the reasoning and the data behind the optimizer
- It should usually be used as a last resort
- It produces a large amount of data because of the number of combinations and data points involved in making a decision
- Usually the row difference between the execution plan and the explain plan is more than enough to understand why the optimizer is making a certain choice of row source operation
Case Studies
Index Scans versus Full Table Scans
Setup
- Create a table that is a copy of dba_objects:
  create table myobj as select * from dba_objects;
- Create indexes on owner and object_id (see the sketch after this slide)
- Gather default statistics
- Study some simple queries against this table
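The deck does not show the full setup script, so the following reconstruction is a sketch under the assumptions above; the index names are invented for illustration:

create table myobj as select * from dba_objects;

create index myobj_owner_idx  on myobj (owner);
create index myobj_obj_id_idx on myobj (object_id);

begin
  dbms_stats.gather_table_stats(user, 'MYOBJ', cascade => true);
end;
/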
Should I do a Full Table Scan or Use an Index?
Here is a simple pair of queries.

Full Table Scan:
select object_type, count(1)
from   myobj
where  object_id between 10 and 11558
group  by object_type;

Index Range Scan:
select object_type, count(1)
from   myobj
where  object_id between 10 and 11557
group  by object_type;
Review 10053 Trace Files
Data Points
Full Table Scan: Cost 180, Logical I/Os 772
Index Scan: Cost 179, Logical I/Os 166
Shouldn't Oracle minimize the number of Logical I/Os?
- Yes: it is probably the right thing to do
- No: it is probably not the right thing to do
- The answer is that it depends on several things, the most important being db_file_multiblock_read_count
- In our experiment it was set to 16
- At 16, the cost, i.e. the time Oracle estimates the query will take, is lower for a full table scan using multiblock reads than for sequential single-block reads of the index and the table
- You can set different values for db_file_multiblock_read_count and watch the cost calculations and the plans change (see the sketch below)
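A minimal way to experiment with this yourself (a sketch; the session-level change affects only your session, and the plans you see will depend on your own data and statistics):

-- Check the current setting
show parameter db_file_multiblock_read_count

-- Try a different value for this session only, then look at the plan again
alter session set db_file_multiblock_read_count = 8;

explain plan for
select object_type, count(1)
from   myobj
where  object_id between 10 and 14301
group  by object_type;

select * from table(dbms_xplan.display);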
Now let us set db_file_multiblock_read_count to 8.

Full Table Scan:
select object_type, count(1)
from   myobj
where  object_id between 10 and 14301
group  by object_type;

Index Range Scan:
select object_type, count(1)
from   myobj
where  object_id between 10 and 14300
group  by object_type;
Let us set it to 8 and see what happens.
Full Table Scan: Cost 221, Logical I/Os 772
Index Scan: Cost 220, Logical I/Os 211
So there are a lot of moving parts!!!
- Yes, there are. How do you manage that?
- Keep your optimizer environment as stable as possible
  - The only thing that should be changing is the set of data points collected by the DBMS_STATS gathering process
- Keep a copy of your old statistics (see the sketch below)
  - Just in case you have to revert to the old stats to get back to the good ol' plan
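One common way to keep a copy of your old statistics is the DBMS_STATS export/import procedures. The procedures below are standard; the statistics table name and statid are assumptions made up for illustration:

begin
  -- One-time setup: a holding table for exported statistics
  dbms_stats.create_stat_table(ownname => user, stattab => 'MY_STATS_BACKUP');

  -- Before regathering, save the current statistics for MYOBJ
  dbms_stats.export_table_stats(ownname => user, tabname => 'MYOBJ',
                                stattab => 'MY_STATS_BACKUP', statid => 'BEFORE_REGATHER');
end;
/

-- If the new statistics lead to a plan regression, put the old ones back
begin
  dbms_stats.import_table_stats(ownname => user, tabname => 'MYOBJ',
                                stattab => 'MY_STATS_BACKUP', statid => 'BEFORE_REGATHER');
end;
/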
A couple more concepts
Cardinality/Selectivity
- Cardinality
  - The predicted number of rows generated by a row source operation
  - This is what you see in the explain plan
- Selectivity
  - Cardinality is calculated from the selectivity, which is the expected fraction of rows that will pass the predicate test
  - It is very sensitive to the number of distinct values
- The optimizer chooses the join order based on cardinality, and hence on selectivity
Simple Selectivity
- Calculate the range for the query (Rng)
  - Obtained from the query
- Calculate the difference between the column high value and the column low value (hival - loval)
  - Obtained from dba_tab_columns
- Divide Rng by (hival - loval)
- Add N/num_distinct, where N is the number of range bounds and num_distinct is the number of distinct values for the column, obtained from dba_tab_columns
Sample Selectivity Calculation

select object_type, count(1)
from   myobj
where  object_id between 10 and 11558
group  by object_type;

- Range in the query: 11558 - 10 = 11548
- Column high value: 69892
- Column low value: 14
- N = 2 (two range bounds)
- Number of distinct values: 55618
- Note that the optimizer sees that 10 as a lower bound does not make sense (it is below the column low value) and raises it to 14, so the range used is 11558 - 14 = 11544
- Multiplying the selectivity by the row count (here 55618 is both the number of distinct values and, since object_id is unique, the number of rows) gives the cardinality:

select round(((11558 - 14) / (69892 - 14) + 2 / 55618) * 55618) from dual;

- This returns 9190; now let us compare it with the cardinality shown in the explain plan
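The same arithmetic can be pulled from the dictionary instead of being hard-coded. The sketch below assumes the raw_to_num helper shown later in the Scripts section is available to decode LOW_VALUE and HIGH_VALUE, and it uses NUM_ROWS from user_tables as the row count; the GREATEST call mimics the optimizer raising the out-of-range lower bound:

select round(((11558 - greatest(10, raw_to_num(c.low_value)))
              / (raw_to_num(c.high_value) - raw_to_num(c.low_value))
              + 2 / c.num_distinct) * t.num_rows) predicted_cardinality
from   user_tab_columns c,
       user_tables      t
where  c.table_name  = 'MYOBJ'
and    t.table_name  = c.table_name
and    c.column_name = 'OBJECT_ID';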
Cost
- What do we mean by the cost of the query?
- From the Oracle Performance Tuning Guide:

  Cost = (number of single block reads * single block read time
          + number of multi block reads * multi block read time
          + CPU cycles / CPU cycles per second)
         / time to read a single block

- Dimensional analysis shows that cost is just a number (the time units cancel)
  - Although it is the optimizer's guess at how long the query will take to execute, expressed in units of single-block reads
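As a worked example with made-up numbers (all values below are assumptions purely for illustration, not measurements from the case study): with sreadtim = 5 ms, mreadtim = 20 ms, and a plan that reads 772 blocks via multiblock reads of 16 blocks each (about 49 reads) with negligible CPU, the formula gives roughly:

select round((  0 * 5      -- single block reads * sreadtim
             + 49 * 20     -- multi block reads * mreadtim
             +  0          -- CPU component ignored here
             ) / 5)        -- divided by sreadtim
       estimated_cost
from   dual;
-- about 196: the cost is the plan's estimated elapsed time expressed in single-block-read units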
Selectivity and Cardinality Impact Cost and Join Strategies
- Selectivity and cardinality impact cost and join strategies, as we saw in the example
- You can predict the cost and behavior of your queries as the data volume increases
- The number of distinct values is very important: it determines the selectivity, hence the cardinality, hence the access strategies, and hence the costs
- The number of distinct values for non-unique columns on which predicates are written and joined can be severely impacted by the quality of the statistics being collected
So how can you know if a plan is good?
- Look at the cardinality in the explain plan
- Look at the rows actually obtained from the execution plan (from the trace file or the row source execution statistics)
- See how closely they match (see the sketch below)
- If there is a wide variance, study whether it is impacting your query performance and determine what additional statistics need to be gathered
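On 10g and later, a convenient way to put the estimated and actual row counts side by side is DBMS_XPLAN.DISPLAY_CURSOR with row source statistics enabled. This is a sketch; the gather_plan_statistics hint adds some overhead, so use it while investigating rather than permanently:

select /*+ gather_plan_statistics */ object_type, count(1)
from   myobj
where  object_id between 10 and 11558
group  by object_type;

-- E-Rows (optimizer estimate) versus A-Rows (actual) for the statement just executed
select * from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));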
How can you predict the nature of plans due to data growth?
- Use the cardinality hint to study the performance and behaviour of the query (see Join_strategy_choice.sql in the Scripts section and the sketch below)
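A minimal sketch of the idea: the cardinality hint (undocumented, so treat it as a test-only device) tells the optimizer to assume a given row count, which lets you see which plan it would pick once the table has grown. The 1,000,000 figure is an arbitrary assumption for illustration:

explain plan for
select /*+ cardinality(o 1000000) */ o.object_type, count(1)
from   myobj o
where  o.object_id between 10 and 11558
group  by o.object_type;

select * from table(dbms_xplan.display);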
Scripts
10053_SCOTT.sql

alter session set max_dump_file_size = unlimited;
alter session set tracefile_identifier = 'MYOBJ_SCOTT';
alter session set events '10053 trace name context forever, level 1';
explain plan for
  select object_type, count(*)
  from   myobj
  where  owner = 'SCOTT'
  group  by object_type;
alter session set events '10053 trace name context off';
exit;
10053_sys.sql

alter session set max_dump_file_size = unlimited;
alter session set tracefile_identifier = 'SYS_OBJ';
alter session set events '10053 trace name context forever, level 1';
explain plan for
  select object_type, count(*)
  from   myobj
  where  owner = 'SYS'
  group  by object_type;
alter session set events '10053 trace name context off';
exit;
Col_stats.sql

SELECT table_name   "Table Name",
       column_name  "Column Name",
       num_distinct "Number of Distinct Values",
       low_value    "Low Value",
       high_value   "High Value",
       density      "Density",
       num_nulls    "Number of Nulls",
       num_buckets  "Number of Buckets",
       avg_col_len  "Average Column Length",
       histogram    "Histogram"
FROM   user_tab_columns
WHERE  table_name = 'MYOBJ';
Gather_table_stats.sql

begin
  dbms_stats.gather_table_stats(USER, 'MYOBJ');
end;
/
Index_stats.sql

SELECT table_name              "Table Name",
       index_name              "Index Name",
       blevel                  "Blevel",
       leaf_blocks             "Leaf Blocks",
       distinct_keys           "Distinct Keys",
       avg_leaf_blocks_per_key "Avg Leaf Blocks per Key",
       avg_data_blocks_per_key "Avg Data Blocks Per Key",
       clustering_factor       "Clustering Factor"
FROM   user_indexes
WHERE  table_name = 'MYOBJ';
Join_strategy_choice.sql

select /*+ cardinality(e 6059) cardinality(d 200) */
       e.empno, e.ename, d.dname, d.loc
from   emp e, dept d
where  e.deptno = d.deptno;

select /*+ cardinality(e 6060) cardinality(d 200) */
       e.empno, e.ename, d.dname, d.loc
from   emp e, dept d
where  e.deptno = d.deptno;
Raw_to_num.sql

select column_name,
       raw_to_num(low_value)  low_val,
       raw_to_num(high_value) high_val
from   user_tab_columns
where  table_name  = 'MYOBJ'
and    column_name = 'OBJECT_ID';
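raw_to_num is not a built-in function and its definition is not shown in the deck. A common way to implement it, included here only as an assumption so the script above is runnable, is to wrap DBMS_STATS.CONVERT_RAW_VALUE:

create or replace function raw_to_num (p_raw in raw) return number
as
  n number;
begin
  -- decode the internal RAW format used by LOW_VALUE/HIGH_VALUE into a NUMBER
  dbms_stats.convert_raw_value(p_raw, n);
  return n;
end;
/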
Tab_stats.sql

select table_name   "Table Name",
       num_rows     "No. of Rows",
       blocks       "Blocks",
       empty_blocks "Empty Blocks",
       avg_space    "Average Space",
       chain_cnt    "Chain Count",
       avg_row_len  "Average Row Length",
       sample_size  "Sample Size"
from   user_tables
where  table_name = 'MYOBJ';

Editor's Notes

  1. Note that without properly collected statistics, the CBO will do one of two things:
     - If no statistics exist for any object used in the SQL statement, the CBO may use rule-based optimization (prior to v10) or use dynamic sampling.
     - If statistics exist for some objects in the SQL statement but not others, the CBO may use a set of default statistics for the objects without statistics, or use dynamic sampling.

     CBO default statistics for objects without collected stats (prior to v10; in v10 dynamic sampling is typically used instead of defaults):

     Table setting                  Default statistic
     cardinality                    number of blocks * (block size - cache layer) / average row length
     average row length             100 bytes
     number of blocks               100, or the actual value based on the extent map
     remote cardinality (distrib)   2000 rows
     remote average row length      100 bytes

     Index setting                  Default statistic
     levels                         1
     leaf blocks                    25
     leaf blocks/key                1
     data blocks/key                1
     distinct keys                  100
     clustering factor              800
  2. Plot A illustrates a situation in which the execution plan does not change, but the query response time varies significantly as the number of rows in the table changes. This kind of thing occurs when an application chooses a TABLE ACCESS (FULL) execution plan for a growing table. It's what causes RBO-based applications to appear fast in a small development environment, but then behave poorly in the production environment. Plot B illustrates the marginal improvement that's achievable, for example, by distributing an inefficient application's workload more uniformly across the disks in a disk array. Notice that the execution plan (or "shape of the performance curve") isn't necessarily changed by such an operation (although, if the output of dbms_stats.gather_system_stats changes as a result of the configuration change, then the plan might change). The performance for a given number of rows might change, however, as the plot here indicates. Plot C illustrates what is commonly the most profound type of performance change: an execution plan change. This situation can be caused by a change to any of the CBO inputs. For example, an accidental deletion of a segment's statistics can change a plan from a nice fast plan (depicted by the green curve, which is O(log n)) to a horrifically slow plan (depicted by the red curve, which is O(n^2)). The phenomenon illustrated in plot C is what has happened when a query that was fast last week now runs for 14 hours without completing, before you finally give up and kill the session.