SQL Tuning Workshop v2
D12395GC10
Production 1.0
January 2002
D34339
Authors
Nancy Greenberg
Priya Vennapusa

Technical Contributors and Reviewers
Howard Bradley
Laszlo Czinkoczki
Dan Gabel
Connie Dialeris Green
John Hibbard
Lilian Hobbs
John Hoff
Alexander Hunold
Tamas Kerepes
Susan Kotsovolos
Herve Lejeune
Stefan Lindblad
Diana Lorentz
Howard Ostrow
Arjan Pellenkoft
Stacey Procter
Shankar Raman
Mariajesus Senise
Janet Stern
Don Sullivan
Ric Van Dyke
Lachlan Williams

Publisher
Shane Mattimoe

Copyright © Oracle Corporation, 1998, 1999, 2000, 2001, 2002. All rights reserved.

This documentation contains proprietary information of Oracle Corporation. It is provided
under a license agreement containing restrictions on use and disclosure and is also
protected by copyright law. Reverse engineering of the software is prohibited. If this
documentation is delivered to a U.S. Government Agency of the Department of Defense,
then it is delivered with Restricted Rights and the following legend is applicable:

Restricted Rights Legend
Use, duplication or disclosure by the Government is subject to restrictions for commercial
computer software and shall be deemed to be Restricted Rights software under Federal
law, as set forth in subparagraph (c)(1)(ii) of DFARS 252.227-7013, Rights in Technical
Data and Computer Software (October 1988).

This material or any portion of it may not be copied in any form or by any means without
the express prior written permission of Oracle Corporation. Any other copying is a
violation of copyright law and may result in civil and/or criminal penalties.

If this documentation is delivered to a U.S. Government Agency not within the
Department of Defense, then it is delivered with “Restricted Rights,” as defined in FAR
52.227-14, Rights in Data-General, including Alternate III (June 1987).

The information in this document is subject to change without notice. If you find any
problems in the documentation, please report them in writing to Education Products,
Oracle Corporation, 500 Oracle Parkway, Box SB-6, Redwood Shores, CA 94065. Oracle
Corporation does not warrant that this document is error-free.

All references to Oracle and Oracle products are trademarks or registered trademarks of
Oracle Corporation.

All other products or company names are used for identification purposes only, and may
be trademarks of their respective owners.
Contents
Instructor Preface
3 EXPLAIN and AUTOTRACE
Objectives 3-2
Creating the Plan Table 3-3
The EXPLAIN PLAN Command 3-5
EXPLAIN PLAN Example 3-6
Displaying the Execution Plan 3-7
Interpreting the Execution Plan 3-9
Using V$SQL_PLAN 3-11
V$SQL_PLAN Columns 3-12
Querying V$SQL_PLAN 3-14
SQL*Plus AUTOTRACE 3-15
SQL*Plus AUTOTRACE Examples 3-16
SQL*Plus AUTOTRACE Statistics 3-17
Summary 3-18
Practice Overview 3-19
4 SQL Trace and TKPROF
Objectives 4-2
SQL Trace Facility 4-3
How to Use the SQL Trace Facility 4-4
Initialization Parameters 4-5
Switching On SQL Trace 4-7
Finding Your Trace Files 4-8
Formatting Your Trace Files 4-9
TKPROF Command Options 4-11
Output of the TKPROF Command 4-12
TKPROF Output Example: No Index 4-17
TKPROF Output Example: Unique Index 4-18
Some TKPROF Interpretation Pitfalls 4-19
Summary 4-20
Practice Overview 4-21
5 Rule-Based Optimization Versus Cost-Based Optimization
Objectives 5-2
Overview 5-3
Functions of the Oracle9i Optimizer 5-4
Rule-Based Optimization 5-5
Cost-Based Optimization 5-6
Choosing Between RBO and CBO 5-8
Setting the Optimizer Approach 5-9
First Rows Optimization 5-10
Rule-Based Optimization 5-11
RBO Ranking Scheme 5-12
Rule-Based Optimization Example 5-13
Influencing Rule-Based Optimization 5-14
Summary 5-15
Practice Overview 5-16
Guided Practice Page 5-17
6 Indexes and Basic Access Methods
Objectives 6-2
ROWIDs 6-3
Indexes 6-4
B*-Tree Indexes 6-5
B*-Tree Index Structure 6-6
B*-Tree Index Example 6-7
CREATE INDEX Syntax 6-8
Composite Index Guidelines 6-9
Index Statistics 6-10
Effect of DML Operations on Indexes 6-11
Indexes and Constraints 6-12
Indexes and Foreign Keys 6-13
Basic Access Methods 6-14
Skip Scanning of Indexes 6-15
Identifying Unused Indexes 6-18
Enabling and Disabling the Monitoring of Index Usage 6-19
Clusters 6-20
Cluster Example 6-21
Index Clusters 6-22
Index Clusters: Performance Characteristics 6-23
Index Clusters: Limitations and Guidelines 6-24
Hash Clusters 6-25
Hash Clusters: Limitations 6-26
When to Use Clusters 6-27
Summary 6-29
7 Collecting Statistics
Objectives 7-2
The ANALYZE Command 7-3
Table Statistics 7-5
Index Statistics 7-7
Column Statistics 7-10
The DBMS_STATS Package 7-12
DBMS_STATS: Generating Statistics 7-13
Copy Statistics Between Databases 7-14
Example: Copying Statistics 7-15
Example: Gathering Statistics 7-16
Predicate Selectivity 7-17
Bind Variables and Predicate Selectivity 7-18
Histograms 7-20
Histograms and Selectivity 7-21
Histogram Statistics: Example 7-22
Histogram Tips 7-24
When to Use Histograms 7-25
Choosing a Sample Size 7-26
Choosing the Number of Buckets 7-27
Viewing Histogram Statistics 7-28
Summary 7-29
Practice Overview 7-30
8 Influencing the Optimizer
Objectives 8-2
Setting the Optimizer Mode 8-3
Some Additional Parameters 8-4
Optimizer Hint Syntax 8-6
Rules for Hints 8-7
Hint Recommendations 8-8
Optimizer Hint Example 8-9
Hint Categories 8-10
Basic Access Path Hints 8-11
Advanced Access Path Hints 8-13
Buffer Cache Hints 8-14
Hints and Views 8-15
View Processing Hints 8-17
Summary 8-18
Practice Overview 8-19
9 Sorting and Joining
Objectives 9-2
Tuning Sort Performance 9-3
Top-N SQL 9-4
Join Terminology 9-5
Join Operations 9-7
Nested Loops Joins 9-8
Nested Loops Join Plan 9-9
Sort/Merge Joins 9-10
Sort/Merge Join Plan 9-11
Hash Joins 9-12
Hash Join Plan 9-13
Joining Multiple Tables 9-14
Outer Joins 9-15
SQL: 1999 Outer Joins 9-16
Full Outer Joins 9-17
Execution of Outer Joins 9-18
The Optimizer and Joins 9-19
Join Order Rules 9-20
RBO Join Optimization 9-21
CBO Join Optimization 9-23
Estimating Join Costs 9-24
Star Joins 9-25
Hints for Join Orders 9-27
Hints for Join Operations 9-28
Other Join Hints 9-30
Subqueries and Joins 9-31
Initialization Parameters that Influence Joins 9-33
Throwaway of Rows 9-34
Minimize Throwaway of Rows 9-35
Minimize Processing 9-36
Summary 9-37
10 Optimizer Plan Stability
Objectives 10-2
Optimizer Plan Stability 10-3
Plan Equivalence 10-4
Creating Stored Outlines 10-5
Using Stored Outlines 10-6
Data Dictionary Information 10-7
Execution Plan Logic 10-8
Maintaining Stored Outlines 10-9
Outline Editing Overview 10-11
Editable Attributes 10-13
Outline Cloning 10-14
Outline: Administration and Security 10-15
Configuration Parameters 10-17
Create Outline Syntax 10-18
Outline Cloning Examples 10-19
Summary 10-22
Practice Overview 10-23
11 Advanced Indexes
Objectives 11-2
Bitmapped Indexes 11-3
Bitmapped Index Structure 11-4
Creating Bitmapped Indexes 11-5
Using Bitmapped Indexes for Queries 11-7
Combining Bitmapped Indexes 11-8
When to Use Bitmapped Indexes 11-9
Advantages of Bitmapped Indexes 11-10
Bitmapped Index Guidelines 11-11
What Is a Bitmap Join Index? 11-12
Bitmap Join Index: Advantages and Disadvantages 11-14
Indexes and Row-Access Methods 11-16
Index Hints 11-17
INDEX_COMBINE Hint Example 11-18
Star Transformation 11-20
Star Transformation Example 11-22
Function-Based Indexes 11-24
Function-Based Indexes: Usage 11-25
Data Dictionary Information 11-26
Summary 11-27
12 Materialized Views and Temporary Tables
Objectives 12-2
Materialized Views 12-3
Create Materialized Views 12-4
Refresh Materialized Views 12-5
Materialized Views: Manual Refresh 12-7
Query Rewrites 12-8
Create Materialized Views: Syntax Options 12-11
Enabling and Controlling Query Rewrites 12-12
Query Rewrite Example 12-13
Dimensions: Overview 12-15
Dimensions and Hierarchies 12-16
Dimensions: Example Table 12-17
Dimensions and Hierarchies 12-18
Create Dimensions and Hierarchies 12-19
Dimensions Based on Multiple Tables 12-20
Dimensions with Multiple Hierarchies 12-21
Temporary Tables 12-22
Creating Temporary Tables 12-24
Summary 12-26
13 Alternative Storage Techniques
Objectives 13-2
Storing User Data 13-3
Index-Organized Tables 13-4
IOT Performance Characteristics 13-5
IOT Limitations 13-6
When to Use Index-Organized Tables 13-7
Creating Index-Organized Tables 13-8
IOT Row Overflow 13-9
Retrieving IOT Information 13-10
External Tables 13-11
External Tables Performance Characteristics 13-12
External Tables Limitations 13-13
Why Use External Tables? 13-14
Creating External Tables 13-15
Retrieving External Tables Information 13-16
Summary 13-17
Appendix A: Workshops
Index
A
Workshops
3. Get and run queries ws01_01, ws01_01a, ws01_02, and ws01_03. First remove all
indexes from the CUSTOMERS table by running the dai.sql script.
Note: dai.sql removes all nonprimary key indexes.
SQL> @dai
on which table: customers
9. In this exercise you investigate the treatment of NULL-values and use the CUSTOMERS table
for that purpose. First, run the ws01_11a.sql script to remove some values from the
CUST_EMAIL column. Second, create an index on the CUST_EMAIL column of the
CUSTOMERS table; this column contains many NULL-values. Then start query ws01_11b.
SQL> @atoff
SQL> @ws01_11a
SQL> describe customers
SQL> @ci
on which table : customers
on which column(s): cust_email
Creating index on: customers cust_email
Enter value for index_name: cust_email_idx
SQL> @li customers
SQL> @attox
SQL> get ws01_11b
1 select cust_email -- ws01_11b.sql
2 from customers
3* where cust_email is null
SQL> /
10. Now suppose that you are interested in all rows that do not contain a NULL value. Start
ws01_12 and compare with ws01_11b.
SQL> get ws01_12
1 select cust_id -- ws01_12.sql
2 from customers
3* where cust_email is NOT null
SQL> /
Again, the index is not used, but this time for a different reason. The optimizer decision is
based on selectivity considerations. You can rewrite the SQL statement and force rule-based
optimization, as shown in query ws01_13.
SQL> get ws01_13
1 select cust_id -- ws01_13.sql
2 from customers
3* where cust_email > 'a'
SQL> /
The rule-based optimizer will use the index now; for the cost-based optimizer, it depends on
the statistics. Under which circumstances are queries ws01_12 and ws01_13 logically
equivalent?
Notes:
The best approach to influence the cost-based optimizer is to specify a hint. Start ws01_14
and compare the results with ws01_12.
Note: If the choose.sql script is not run, the cost-based optimizer is still used because of
the presence of the hint.
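The contents of ws01_14.sql are not reproduced here. As a rough sketch only, a hinted variant of ws01_12 might look like the following, assuming the cust_email_idx index created in step 9 still exists:

```sql
-- Hypothetical hinted variant of ws01_12 (not the actual ws01_14.sql script).
-- The INDEX hint names the table (no alias is used here) and the index
-- that the cost-based optimizer is asked to use.
select /*+ INDEX(customers cust_email_idx) */
       cust_id
from   customers
where  cust_email is not null;
```

Whether the hint changes the plan can be checked by comparing the AUTOTRACE output with that of ws01_12.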
11. You already investigated IS NOT NULL. Now look at negations in general (using <>, !=,
or NOT). Note that the distinction between normal conditions and negations is only important
for the rule-based optimizer because it must assume selectivity. Because the cost-based
optimizer is statistics-driven, it disregards negations, although most conditions with a
negation have a bad selectivity.
SQL> @li customers
SQL> @rule
SQL> get ws01_15
1 select cust_last_name -- ws01_15.sql
2 from customers
3* where cust_credit_limit = 7000
SQL> /
SQL> get ws01_16
1 select cust_last_name -- ws01_16.sql
2 from customers
3* where cust_credit_limit != 50000
SQL> /
SQL> get ws01_17
1 select cust_last_name -- ws01_17.sql
2 from customers
3* where NOT (cust_credit_limit > 50000)
SQL> /
Oracle9i: SQL Tuning Workshop A-10
Workshop 1: Single Table, Single Predicate (continued)
Query ws01_15 shows that the index on the CUST_CREDIT_LIMIT column is used.
Query ws01_16 shows that the index on the CUST_CREDIT_LIMIT column is not used
(because of the negation). Query ws01_17 shows that a negation of an inequality will be
internally translated into a positive formulation of the condition. Negations are not part of the
predicate. Note that you issued an ALTER SESSION command first to force rule-based
optimization.
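The translation described above can be illustrated with a short sketch. It assumes that rule.sql issues an ALTER SESSION statement similar to the first line; the actual script is not shown in this workbook.

```sql
-- Assumption: rule.sql contains a statement along these lines.
alter session set optimizer_mode = rule;

-- ws01_17 as written, with a negated inequality.
select cust_last_name
from   customers
where  NOT (cust_credit_limit > 50000);

-- The positive formulation that the condition is translated into.
-- Both forms return the same rows, including for NULL credit limits,
-- because neither condition evaluates to TRUE for a NULL value.
select cust_last_name
from   customers
where  cust_credit_limit <= 50000;
```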
Notes:
Workshop Solutions.
6. The difference is that ws01_08 selects the last name, so a table access is not necessary; this
query only accesses the index. In ws01_07, table access is necessary to retrieve the
customer ID.
8. The search pattern in ws01_10 does not start with a wildcard, but the index is still unusable.
This is because the CUST_ID column is a numeric column, so the optimizer must apply an
implicit data type conversion and rewrites the WHERE clause to read:
where to_char(cust_id) like '7%'
Workshop Summary
After completing this workshop, you should have learned the following:
• Indexes are only usable if they exist and are available.
• Indexes are only usable if the corresponding column is referenced in a WHERE clause.
• Indexes are only usable if the column name appears clean in the predicate.
• Indexes are not used if the column is part of any expression, or function, or in case of implicit
data type conversion.
• The LIKE operator can benefit from indexes only if the leading character of the search
pattern is not a wildcard and the column contains alphanumeric data.
• If all column values to be selected are part of an index, a table access is not necessary.
• NULL values are not stored in single-column B*-tree indexes; therefore, full table scans are
needed for IS NULL searches.
• You use the INDEX hint to force the cost-based optimizer to use an index.
• You use SQL*Plus AUTOTRACE to display SQL statement execution plans.
• The index is not used if the NOT EQUAL (!=) operator is present.
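The "clean column" rule in the summary can be sketched with a few contrasting predicates. These statements are illustrations only and are not part of the workshop scripts; they assume a normal index on the CUST_CREDIT_LIMIT column.

```sql
-- Column appears clean in the predicate: the index is usable.
select cust_last_name
from   customers
where  cust_credit_limit = 7000;

-- Same condition, but the column is part of an expression:
-- a normal index on cust_credit_limit is not usable here.
select cust_last_name
from   customers
where  cust_credit_limit + 1000 = 8000;

-- Moving the arithmetic to the other side keeps the column clean.
select cust_last_name
from   customers
where  cust_credit_limit = 8000 - 1000;
```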
2. Verify the existence of the plan table, and use the attox.sql script to enable AUTOTRACE
TRACEONLY EXPLAIN to suppress SQL statement output and produce execution plans;
check the settings with SHOW AUTOTRACE.
SQL> describe plan_table
SQL> @attox
SQL> show autotrace
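The helper scripts are not listed in this workbook. As an assumption about their contents, they most likely wrap the following SQL*Plus commands:

```sql
-- Assumed content of attox.sql: show execution plans only,
-- suppressing both the query output and the statistics.
set autotrace traceonly explain

-- Display the current AUTOTRACE setting.
show autotrace

-- Assumed content of aton.sql and atoff.sql respectively.
set autotrace on
set autotrace off
```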
Notes:
4. Get ws02_02, and compare the results with ws02_01. Apparently, the index on the
CUST_ID column is used: The rows are retrieved in sorted order by accessing the rows
through the index.
SQL> get ws02_02
1 select cust_first_name -- ws02_02.sql
2 , cust_last_name
3 , cust_credit_limit
4 from customers
5* order by cust_id
SQL> /
Note: The presence of statistics could change the optimizer’s behavior. If you analyzed the
CUSTOMERS table, make sure to delete the statistics or force rule-based optimization.
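A minimal sketch of how the statistics could be removed and the result verified, assuming you own the CUSTOMERS table:

```sql
-- Remove any optimizer statistics gathered on the table and its indexes.
analyze table customers delete statistics;

-- NUM_ROWS and LAST_ANALYZED should be NULL after the statistics are deleted.
select table_name
,      num_rows
,      last_analyzed
from   user_tables
where  table_name = 'CUSTOMERS';
```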
Why do you think the index on the CUST_CREDIT_LIMIT column is not used? What is the
difference between ws02_01 and ws02_02?
Notes:
Investigate what happens if you create an additional index on the CUST_CITY column:
SQL> @ci
on which table : customers
on which column(s): cust_city
Creating index on: customers cust_city
Enter value for index_name: cust_city_idx
SQL> @li customers
SQL> get ws02_03
SQL> /
This time the optimizer apparently prefers to use the index on the CUST_CITY column to
reduce the number of rows that must be sorted.
Notes:
Note: In ws02_05 the index is used, although the indexed column is part of an expression
(cust_credit_limit+1000). Why does ws02_06 not show the same behavior?
7. Start query ws02_07. A WHERE clause is added and you see the result. Note that dropping
the index on the CUST_CITY column and running ws02_07 again shows a full table scan
and a sort. The index on the CUST_CITY column is no longer usable. Verify the existence of
the plan table, and use the attox.sql script to enable AUTOTRACE TRACEONLY
EXPLAIN to suppress SQL statement output and produce execution plans; check the settings
with SHOW AUTOTRACE.
SQL> get ws02_07
1 select max(cust_credit_limit) -- ws02_07.sql
2 from customers
3* where cust_city = 'Paris'
SQL> /
SQL> drop index cust_city_idx;
SQL> get ws02_07
SQL> /
9. The following queries investigate the SQL set operators. These operators unconditionally
result in sort operations, regardless of the presence of indexes. Create any indexes you like to
investigate this. The sorts are needed because the SQL set operators are supposed to filter
duplicate rows from the result.
SQL> get ws02_09
1 select country_id from countries -- ws02_09.sql
2 intersect
3* select country_id from customers
SQL> /
SQL> get ws02_10
1 select country_id from countries -- ws02_10.sql
2 minus
3* select country_id from customers
SQL> /
SQL> get ws02_11
1 select country_id from countries -- ws02_11.sql
2 union
3* select country_id from customers
SQL> /
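For contrast, and not as part of the workshop scripts, UNION ALL does not filter duplicate rows, so it is the one set operator that can avoid the sort. A sketch using the same tables:

```sql
-- UNION ALL returns every row from both branches, duplicates included,
-- so no sort is needed to eliminate duplicate rows.
select country_id from countries
union all
select country_id from customers;
```

Comparing the execution plan with that of ws02_11 should show the difference.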
Notes:
10. Examine the GROUP BY operator. As with SELECT DISTINCT and the set operators, a
sort operation is always part of the execution plan.
SQL> get ws02_13
1 select cust_city -- ws02_13.sql
2 , avg(cust_credit_limit)
3 from customers
4* group by cust_city
SQL> /
SQL> @ci
on which table : customers
on which column(s): cust_city
Creating index on: customers cust_city
Enter value for index_name: cust_city_idx
SQL> get ws02_13
SQL> /
Notes:
Instructor Note
On question 7, nested loops is still favored when the optimizer goal is set to ALL_ROWS.
5. Change the OPTIMIZER_GOAL setting back to ALL_ROWS and repeat your analysis.
SQL> @allrows
SQL> get ws03_02
SQL> /
The optimizer chooses the hash join operation over the sort/merge join, which is reasonable.
No expensive sort operations are involved.
Notes:
Now compare the logical I/O statistics collected so far. For your reference, on a database
with a 2 KB block size, the following values were measured:
Try USE_NL(C CO) ORDERED and be prepared: This could take several minutes.
Remember that nested loops hints (which specify the joining order) are only considered if the
optimizer uses a nested loops operation.
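The statement being hinted is not shown here. As a sketch only, the hint combination could be applied to a hypothetical join of CUSTOMERS (alias C) and COUNTRIES (alias CO) as follows; the select list is an assumption.

```sql
-- Hypothetical join used only to show hint placement.
-- ORDERED asks for the join order of the FROM clause;
-- USE_NL asks for nested loops when the named tables are joined.
select /*+ ORDERED USE_NL(c co) */
       c.cust_last_name
,      co.country_id
from   customers c
,      countries co
where  c.country_id = co.country_id;
```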
This is an example where the merge join operation is more efficient than the other join
operations, even when retrieving all rows from the query. This is caused by the significant
throwaway of rows due to the non-indexed, non-join predicate.
A nested loops join with the COUNTRIES table as driving (outer) table, using the single row
predicate (and CO.COUNTRY_REGION = 'Americas'), is an effective execution plan
(only one row from each row source).
If you experiment with hints, you can force a hash join with full table scans and even a
nested loops join with the opposite join order (resulting in a full table scan of CUSTOMERS).
This means that the rule that single row predicates should always be first in the join order can
be violated when you use sufficiently disruptive hints.
How can the CUSTOMERS table be the outer table of the nested loops join?
Notes:
Having CUSTOMERS as an inner table results in the smallest number of rows thrown away.
This is why (in this case) it is more efficient to have the largest row source as the outer table.
Usually the smallest row source is the outer table in a nested loops join.
Notes:
11. Analyze the SQL statement in ws03_08.sql. Try to find the optimal execution plan.
SQL> get ws03_08
1 select c.cust_last_name -- ws03_08.sql
2 , s.time_id
3 , s.prod_id
4 from sales s, customers c
5 where c.cust_id = s.cust_id (+)
6* and s.prod_id = 7145
SQL> /
Notes:
This is a Top-N query statement. The subquery includes the ORDER BY clause to ensure
that the ranking is in the desired order. For results retrieving the largest values, a DESC
parameter is needed. The outer query is used to limit the number of rows in the final result
set.
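The workshop statement itself is not shown here. A minimal Top-N sketch of the same shape, assuming you want the ten highest credit limits in the CUSTOMERS table:

```sql
-- The inline view orders the rows (DESC puts the largest values first);
-- the outer query uses ROWNUM to keep only the first ten of them.
select cust_last_name
,      cust_credit_limit
from  (select cust_last_name
       ,      cust_credit_limit
       from   customers
       order  by cust_credit_limit desc)
where  rownum <= 10;
```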
To make this a star join, you must create an index with the following properties:
• It must be a concatenated index.
• The columns must correspond to the foreign key constraints to the smaller lookup tables (also
called the dimension tables).
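An index with these properties could be sketched as follows. The index name and the PROMO_ID column are assumptions; the other columns appear in the SALES queries of this workshop.

```sql
-- Hypothetical concatenated index on the fact table, covering the
-- foreign key columns that point to the dimension tables.
create index i_sales_star
on sales (cust_id, prod_id, promo_id, time_id);
```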
Instructor Note
Star queries are unusual, and difficult for query optimizers, because the optimal strategy requires
that the smaller tables (CUSTOMERS, PRODUCTS, PROMOTIONS, and TIMES) undergo
Cartesian-product joins. That is, these smaller tables are joined together despite the fact that there
are no join predicates between them. In general, Cartesian-product joins are expensive and should
be avoided. However, for star queries, it is more efficient to use Cartesian-product joins on
DIMENSION tables than to repeatedly access the data from the FACT table.
Alternatively, you can use a view to construct the intermediate table AVGTAB, but
sometimes this view gets expanded improperly. The NO_MERGE hint can prevent this
expansion of the query.
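The original statement is not reproduced here, so the following is only a sketch of how NO_MERGE could be applied to an inline view named AVGTAB. The query itself (customers whose credit limit exceeds their country average) is an assumption.

```sql
-- Hypothetical query: NO_MERGE asks the optimizer to evaluate the
-- inline view AVGTAB on its own instead of merging it into the outer query.
select /*+ NO_MERGE(avgtab) */
       c.cust_last_name
,      c.cust_credit_limit
,      avgtab.avg_credit
from   customers c
,     (select country_id
       ,      avg(cust_credit_limit) avg_credit
       from   customers
       group  by country_id) avgtab
where  c.country_id = avgtab.country_id
and    c.cust_credit_limit > avgtab.avg_credit;
```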
5. Analyze the SQL statement in ws04_05.sql by using SQL Trace and TKPROF.
SQL> @hashtrue
SQL> @atoff
SQL> alter session set timed_statistics = true;
SQL> alter session set sql_trace = true;
SQL> get ws04_05
1 select c.cust_last_name -- ws04_05
2 from customers c
3 where c.country_id = 'US'
4 and c.cust_id NOT IN (select s.cust_id
5* from sales s)
SQL> /
SQL> alter session set sql_trace = false;
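After tracing is switched off, the trace file can be located and formatted. The following SQL*Plus sketch uses hypothetical file names; the real trace file name depends on your platform and server process ID.

```sql
-- Show the directory where user trace files are written.
show parameter user_dump_dest

-- Format the trace file with TKPROF from within SQL*Plus.
-- ora_12345.trc and ws04_05.prf are placeholder names;
-- sys=no leaves out recursive SQL issued by user SYS.
host tkprof ora_12345.trc ws04_05.prf sys=no
```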
This is a SELECT statement with a subquery. Because the predicate with the subquery
contains a NOT IN operator, this statement is also known as an anti-join.
The default behavior of the Oracle server is to go through the table in the subquery for
every row in the main query. By hinting to use a sort/merge or a hash operation instead (by
using the MERGE_AJ or HASH_AJ hints in the subquery), there is a good chance of
improving the performance of the statement. Try to find the optimal performance for this
query by using ws04_06.sql to specify several hints.
SQL> @aton
SQL> get ws04_06
1 select c.cust_last_name -- ws04_06
2 from customers c
3 where c.country_id = 'US'
4 and c.cust_id NOT IN (select /*+ &hint */
5 s.cust_id
6* from sales s)
SQL> /
Enter value for hint: ...
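With the substitution variable replaced, one of the statements you might test looks like the sketch below. HASH_AJ is shown here; MERGE_AJ goes in the same position. Note that a hint is recognized only when it directly follows the SELECT keyword of the query block it applies to.

```sql
-- The anti-join hint is placed in the subquery, right after SELECT.
select c.cust_last_name
from   customers c
where  c.country_id = 'US'
and    c.cust_id NOT IN (select /*+ HASH_AJ */
                                s.cust_id
                         from   sales s);
```

The optimizer may still ignore the hint, for example when the columns involved can contain NULL values.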
Notes:
The default behavior of the Oracle server is to go through the table in the subquery for every
row in the main query.
By hinting to use a sort/merge or a hash operation instead (by using the MERGE_SJ or
HASH_SJ hints in the subquery), there is only a small chance of improving the performance
of the statement, because the subquery can use a good (selective) index to evaluate the
EXISTS predicate.
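The semi-join statement for this exercise is not reproduced here. As a sketch under that caveat, a hinted EXISTS subquery has the following shape; the query and the PROD_ID value are assumptions.

```sql
-- Hypothetical semi-join: customers with at least one sale of a given product.
-- HASH_SJ (or MERGE_SJ) is placed in the subquery, right after SELECT.
select c.cust_last_name
from   customers c
where  exists (select /*+ HASH_SJ */
                      1
               from   sales s
               where  s.cust_id = c.cust_id
               and    s.prod_id = 7145);
```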
Create an index on the PROD_ID column in the SALES table and retest your results.
Note: This may take some time.
SQL> @ci
on which table : sales
on which column(s): prod_id
SQL> @ws04_08
Enter value for hint…
Notes:
This is also a semi-join statement, as in the previous exercise. Try to find the optimal
performance for this query by using the hints MERGE_SJ or HASH_SJ in the subquery.
This is an example of a statement where the lack of a usable index can be circumvented by
using nondefault execution methods. This is important to remember when you tune
statements that must run sporadically but still require acceptable performance.
SQL> @dai
on which table: customers
SQL> @ci
on which table : customers
on which column(s): cust_gender
Creating index on: customers cust_gender
Enter value for index_name: I_CUSTOMERS_CUST_GENDER
SQL> @ci
on which table : customers
on which column(s): cust_postal_code
Creating index on: customers cust_postal_code
Enter value for index_name: I_CUSTOMERS_CUST_POSTAL_CODE
SQL> @ci
on which table : customers
on which column(s): cust_credit_limit
Creating index on: customers cust_credit_limit
Enter value for index_name: I_CUSTOMERS_CUST_CREDIT_LIMIT
SQL> @index
Enter value for table_name: customers
Notes:
3. Set AUTOTRACE to explain only. Examine query ws05_02.sql. The statement contains an
INDEX hint. Run this statement with different indexes and take notes about the results.
Alternatively, examine and run ws05_02b.sql and ws05_02c.sql, which use other
indexes, and take notes on these results.
SQL> @attox
SQL> get ws05_02
1 select /*+ INDEX (c I_CUSTOMERS_CUST_CREDIT_LIMIT) */
2 c.* -- ws05_02.sql
3 from customers c
4 where cust_gender = 'M'
5 and cust_postal_code = 40804
6* and cust_credit_limit = 10000
SQL> /
Notes:
SQL> @ws05_02c
1 select /*+ INDEX (c I_CUSTOMERS_CUST_POSTAL_CODE) */
2 c.* -- ws05_02c.sql
3 from customers c
4 where cust_gender = 'M'
5 and cust_postal_code = 40804
6* and cust_credit_limit = 10000
SQL> /
SQL> @atoff
Notes:
INDEX_NAME INDEX_TYPE
------------------------------ ----------
I_CUSTOMERS_CUST_CREDIT_LIMIT NORMAL
I_CUSTOMERS_CUST_GENDER NORMAL
I_CUSTOMERS_CUST_POSTAL_CODE NORMAL
SQL> @atto
SQL> get ws05_03
1 select /*+ AND_EQUAL (c &index_name1, &index_name2) */
2 c.* -- ws05_03a.sql
3 from customers c
4 where cust_gender = 'M'
5 and cust_postal_code = 40804
6* and cust_credit_limit = 10000
SQL> /
Enter value for index_name1: I_CUSTOMERS_CUST_CREDIT_LIMIT
Enter value for index_name2: I_CUSTOMERS_CUST_GENDER
SQL> @atoff
Notes:
Try more combinations for index_name1 and index_name2. Also try to enter all three
index names by entering two index names after one of the prompts.
Notes:
SQL> @dai
on which table: customers
SQL> @ci
on which table : customers
on which column(s): cust_gender, cust_credit_limit,
cust_postal_code
Note: The index creation may take a while.
SQL> @atto
SQL> get ws05_01
1 select c.* -- ws05_01.sql
2 from customers c
3 where cust_gender = 'M'
4 and cust_postal_code = 40804
5* and cust_credit_limit = 10000
SQL> /
This is the best approach so far. The concatenated index acts like a premerged set of indexes.
This is an ideal situation because the WHERE clause contains a predicate for all three columns
in the concatenated index.
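The ci.sql script prompts for the table and columns. The equivalent statement is presumably close to the following sketch; the index name is hypothetical.

```sql
-- Concatenated (composite) index on the three columns used in the
-- WHERE clause of ws05_01, in the order entered at the ci.sql prompt.
create index i_customers_gen_cred_post
on customers (cust_gender, cust_credit_limit, cust_postal_code);
```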
Notes:
Get ws05_04b.sql and compare it with the original statement in ws05_01.sql. Now
the predicate on the cust_credit_limit column is removed.
SQL> get ws05_04b
1 select
2 c.* -- ws05_04b.sql
3 from customers c
4 where cust_gender = 'M'
5* and cust_postal_code = 40804
SQL> /
Get ws05_04c.sql and compare it with the original statement in ws05_01.sql. Now
the predicate on the cust_gender column is removed.
SQL> get ws05_04c
1 select
2 c.* -- ws05_04c.sql
3 from customers c
4 where cust_postal_code = 40804
5* and cust_credit_limit = 10000
SQL> /
Notes:
The optimizer may not use the bitmap index even with a hint.
8. Make sure that you have normal B*-tree indexes on the cust_year_of_birth and
cust_credit_limit columns. Now get ws05_05.sql. This time you see two
predicates combined with an OR operator.
SQL> @dai
on which table: customers
SQL> @ci
on which table : customers
on which column(s): cust_year_of_birth
SQL> @ci
on which table : customers
on which column(s): cust_credit_limit
You see that both indexes are used and combined with a CONCATENATION operator.
Investigate what happens if you drop the index on the cust_year_of_birth column:
SQL> drop index I_CUSTOMERS_CUST_YEAR_OF_BIRTH;
SQL> @ws05_05
SQL> /
This time the optimizer apparently prefers to perform a full table scan. Why?
Notes:
9. Examine and start ws05_06.sql. This creates a primary key constraint and unique index
on the CUST_ID column of the CUSTOMERS table. Run index.sql to check available
indexes. Now get and run query ws05_08.sql.
SQL> @ws05_06.sql
SQL> @index
Enter value for table_name: customers
SQL> get ws05_08
1 select c.* -- ws05_08.sql
2 from customers c
3* where cust_id in (88340,104590,44910)
SQL> /
10. Drop all indexes on the customers table, and create three bitmapped indexes again, then get
query ws05_09.sql. This statement has a complicated WHERE clause. Bitmapped indexes
are good for this type of statement; you see several bitmap operations in the execution plan.
SQL> @dai
on which table: customers
SQL> @cbi
on which table : customers
on which column(s): cust_year_of_birth
SQL> @cbi
on which table : customers
on which column(s): cust_postal_code
SQL> @cbi
on which table : customers
on which column(s): cust_credit_limit
SQL> @attox
SQL> get ws05_09
1 select c.*
2 from customers c
3 where (c.cust_year_of_birth = '1970' and
4 c.cust_postal_code = 40804)
5* and not cust_credit_limit = 15000
SQL> /
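The cbi.sql script is not listed in this workbook. As an assumption about what it generates, the three bitmapped indexes could be created with statements like these; the index names are hypothetical.

```sql
-- One bitmapped index per column referenced in the WHERE clause of ws05_09.
create bitmap index bi_customers_yob
on customers (cust_year_of_birth);

create bitmap index bi_customers_postal_code
on customers (cust_postal_code);

create bitmap index bi_customers_credit_limit
on customers (cust_credit_limit);
```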
Notes:
SQL> @ci
on which table : customers
on which column(s): cust_last_name, cust_first_name
SQL> @ws05_11
SQL> /
As you see, the optimizer benefits from a fast full index scan. Remember that the Oracle9i
Server uses multiblock reads but does not guarantee any ordering. When you add an ORDER
BY clause, you see a sort operation in the execution plan.
Notes:
13. Now examine the COUNT function. The COUNT function can benefit from bitmapped indexes
by counting the number of ones in a bitmap, which is an efficient operation.
SQL> @aton
SQL> get ws05_13
1 select count(*) credit_limit
2 from customers -- ws05_13.sql
3* where cust_credit_limit = 10000
SQL> /
The I/O is reduced considerably. If you have some time left, replace the bitmapped index
with a normal one. Then you see that a normal index also improves performance; however,
the bitmapped index is roughly 10 times more efficient. Note that bitmapped indexes are only
considered by the CBO, so you must analyze your tables or force cost-based optimization.
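Either of the following gives the cost-based optimizer a chance to consider the bitmapped index. This is a sketch; your course environment may provide scripts that do the same.

```sql
-- Option 1: gather statistics so that the default CHOOSE mode
-- selects cost-based optimization for this table.
analyze table customers compute statistics;

-- Option 2: force cost-based optimization for the session.
alter session set optimizer_mode = all_rows;
```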
14. Finally, investigate the benefits of function-based indexes. Drop all indexes on the
CUSTOMERS table, analyze the table, and start query ws05_14.sql.
SQL> @dai
on which table: customers
SQL> @atto
SQL> get ws05_14
1 select cust_id, country_id
2 from customers -- ws05_14.sql
3* where lower( cust_last_name) like 'gentle'
SQL> /
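A function-based index that matches the predicate in ws05_14 could be sketched as follows. The index name is hypothetical, and the session settings are assumptions about what your release and configuration require.

```sql
-- Depending on the release and configuration, these session settings
-- may be needed before the optimizer uses a function-based index.
alter session set query_rewrite_enabled = true;
alter session set query_rewrite_integrity = trusted;

-- Index on the same expression that appears in the WHERE clause.
create index i_customers_lower_last_name
on customers (lower(cust_last_name));

-- Function-based indexes are considered only by the cost-based optimizer,
-- so gather statistics after creating the index.
analyze table customers compute statistics;
```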
Notes:
As a last step, run the ws05_15.sql script to recreate the primary key index on the
CUSTOMERS table for the remaining workshops.
SQL> get ws05_15
1 ALTER TABLE customers
2* ADD CONSTRAINT customers_pk PRIMARY KEY (cust_id)
SQL> /
Table altered.
2. Drop indexes on the SALES and PRODUCTS tables. Compare the SQL statements in
ws06_02.sql and ws06_03.sql.
SQL> @dai
on which table: sales
SQL> @dai
on which table: products
SQL> set timing on
SQL> @atto
These two statements produce the same rows (although they may be differently sorted). Why
is there a difference in the execution plans?
Notes:
5. Run the SQL statement in ws06_07.sql. This script creates a view called
V_CUSTYOB_1962, based on customers born in 1962.
SQL> get ws06_07
1 create or replace view v_custyob_1962 -- ws06_07.sql
2 as select cust_id
3 , cust_last_name
4 , cust_income_level
5 from customers
6* where cust_year_of_birth = 1962
SQL> /
6. Run the SQL statement in ws06_09.sql. This script creates a view called
V_AVG_CREDIT_LIMIT, based on the average credit limits per country.
SQL> get ws06_09
1 create or replace view v_avg_credit_limit -- ws06_09.sql
2 as select country_id
3 , avg(cust_credit_limit) AVG_CREDIT
4 from customers
5* group by country_id
SQL> /
Consider the query in ws06_10.sql that accesses the view. The query selects the average
credit limits for customers for a country code.
SQL> get ws06_10
1 select * -- ws06_10.sql
2 from v_avg_credit_limit
3* where country_id = 'US'
SQL> /
Notes:
7. Run the SQL statement in ws06_11.sql. This script creates a view which contains a
SELECT statement with the UNION operator. Run the ws06_12.sql script to query data
from the view.
SQL> get ws06_11
1 create or replace view v_99_00_times -- ws06_11.sql
2 as
3 select *
4 from times
5 where calendar_year = '1999'
6 union
7 select *
8 from times
9* where calendar_year = '2000'
SQL> /
Create an index on the CALENDAR_YEAR column in the TIMES table. Rerun the
ws06_12.sql script and compare the results. Is the index used? Why or why not?
SQL> @dai
on which table: times
SQL> @ci
on which table : times
on which column(s): calendar_year
SQL> get ws06_12
1 select calendar_month_number -- ws06_12.sql
2 , calendar_month_name
3 , calendar_quarter_desc
4 , calendar_quarter_number
5 from v_99_00_times
6* where calendar_year = '1999'
SQL> /
Notes:
SQL> @ws07_stats
SQL> get ws07_01
1 select name, value -- ws07_01.sql
2 from v$parameter
3 where name in ('hash_join_enabled'
4* ,'optimizer_mode')
SQL> /
SQL> @allrows
SQL> @ws07_01
Notes:
PLAN_TABLE Column    Description
STATEMENT_ID The value of the optional STATEMENT_ID parameter
specified in the EXPLAIN PLAN statement
TIMESTAMP The date and time when the EXPLAIN PLAN statement was
issued
REMARKS Any comment (of up to 80 bytes) you want to associate with
each step of the explained plan. If you need to add or change
a remark on any row of the PLAN_TABLE, then use the
UPDATE statement to modify the rows of the PLAN_TABLE.
OPERATION The name of the internal operation performed in this step. In
the first row generated for a statement, the column contains
one of the following values:
• DELETE STATEMENT
• INSERT STATEMENT
• SELECT STATEMENT
• UPDATE STATEMENT
OPTIONS A variation on the operation described in the OPERATION
column.
OBJECT_NODE The name of the database link used to reference the object (a
table name or view name). For local queries using parallel
execution, this column describes the order in which output
from operations is consumed.
OBJECT_OWNER The name of the user who owns the schema containing the
table or index.
OBJECT_NAME The name of the table or index.
OBJECT_INSTANCE A number corresponding to the ordinal position of the object
as it appears in the original statement. The numbering
proceeds from left to right, outer to inner with respect to the
original statement text. View expansion results in
unpredictable numbers.
OBJECT_TYPE A modifier that provides descriptive information about the
object; for example, NON-UNIQUE for indexes.
OPTIMIZER The current mode of the optimizer.
SEARCH_COLUMNS Not currently used.
ID A number assigned to each step in the execution plan.
PARENT_ID The ID of the next execution step that operates on the output
of the ID step.
POSITION For the first row of output, this indicates the optimizer’s
estimated cost of executing the statement. For the other rows,
it indicates the position relative to the other children of the
same parent.
COST The cost of the operation as estimated by the optimizer’s
cost-based approach. For statements that use the rule-based
approach, this column is null. Cost is not determined for
table access operations. The value of this column does not
have any particular unit of measurement; it is merely a
weighted value used to compare costs of execution plans.
The value of this column is a function of the CPU_COST and
IO_COST columns.
CARDINALITY The estimate by the cost-based approach of the number of
rows accessed by the operation.
BYTES The estimate by the cost-based approach of the number of
bytes accessed by the operation.
OTHER_TAG Describes the contents of the OTHER column. (See the next
table for more information on the possible values for this
column.)
PARTITION_START The start partition of a range of accessed partitions. It can
take one of the following values:
• n indicates that the start partition has been identified by the
SQL compiler, and its partition number is given by n.
• KEY indicates that the start partition will be identified at
run time from partitioning key values.
• ROW LOCATION indicates that the start partition (same as
the stop partition) will be computed at run time from the
location of each record being retrieved. The record location
is obtained by a user or from a global index.
• INVALID indicates that the range of accessed partitions is
empty.
PARTITION_STOP The stop partition of a range of accessed partitions. It can
take one of the following values:
• n indicates that the stop partition has been identified by the
SQL compiler, and its partition number is given by n.
• KEY indicates that the stop partition will be identified at
run time from partitioning key values.
• ROW LOCATION indicates that the stop partition (same as
the start partition) will be computed at run time from the
location of each record being retrieved. The record location
is obtained by a user or from a global index.
• INVALID indicates that the range of accessed partitions is
empty.
PARTITION_ID The step that has computed the pair of values of the
PARTITION_START and PARTITION_STOP columns.
OTHER Other information that is specific to the execution step that a
user might find useful.
DISTRIBUTION Stores the method used to distribute rows from producer
query servers to consumer query servers.
CPU_COST The CPU cost of the operation as estimated by the optimizer’s
cost-based approach. For statements that use the rule-based
approach, this column is null. The value of this column is
proportional to the number of machine cycles required
for the operation.
IO_COST The I/O cost of the operation as estimated by the optimizer’s
cost-based approach. For statements that use the rule-based
approach, this column is null. The value of this column is
proportional to the number of data blocks read by the
operation.
TEMP_SPACE The temporary space, in bytes, used by the operation as
estimated by the optimizer’s cost-based approach. For
statements that use the rule-based approach, or for operations
that don’t use any temporary space, this column is null.
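The ID, PARENT_ID, and POSITION columns described above are enough to rebuild the indented plan tree. The following is a minimal Python sketch of that idea, not an Oracle utility: the function name render_plan and the sample rows are hypothetical and only loosely modeled on a nested-loops plan.

```python
from collections import defaultdict


def render_plan(rows):
    """Return indented plan lines built from (id, parent_id, position, operation) rows."""
    children = defaultdict(list)
    for step_id, parent_id, position, operation in rows:
        children[parent_id].append((position, step_id, operation))

    lines = []

    def walk(parent_id, depth):
        # Siblings are ordered by POSITION; each level is indented two more spaces.
        for _position, step_id, operation in sorted(children[parent_id]):
            lines.append("  " * depth + f"{step_id} {operation}")
            walk(step_id, depth + 1)

    walk(None, 0)  # the first row of a plan has no parent
    return lines


# Hypothetical rows. For the first row POSITION holds the cost estimate rather
# than a sibling position, so the 0 used here is only a placeholder.
sample_rows = [
    (0, None, 0, "SELECT STATEMENT"),
    (1, 0, 1, "NESTED LOOPS"),
    (2, 1, 1, "TABLE ACCESS (FULL) OF 'CUSTOMERS'"),
    (3, 1, 2, "TABLE ACCESS (BY INDEX ROWID) OF 'COUNTRIES'"),
    (4, 3, 1, "INDEX (UNIQUE SCAN) OF 'COUNTRY_PK'"),
]

for line in render_plan(sample_rows):
    print(line)
```

Each step is printed under its parent, with siblings ordered by POSITION, which is the same shape that an indented plan listing presents.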
Parameter Description
TIMED_STATISTICS This enables and disables the collection of timed
statistics, such as CPU and elapsed times, by the SQL
trace facility, as well as the collection of various
statistics in the dynamic performance tables. The default
value of false disables timing. A value of true enables
timing. Enabling timing causes extra timing calls for
low-level operations. This is a dynamic parameter. It is
also a session parameter.
MAX_DUMP_FILE_SIZE When the SQL trace facility is enabled at the instance
level, every call to the server produces a text line in a
file in the operating system’s file format. The maximum
size of these files (in operating system blocks) is limited
by this initialization parameter. The default is 500. If
you find that the trace output is truncated, then increase
the value of this parameter before generating another
trace file. This is a dynamic parameter. It is also a
session parameter.
USER_DUMP_DEST This must fully specify the destination for the trace file
according to the conventions of the operating system.
The default value is the default destination for system
dumps on the operating system. This value can be
modified with ALTER SYSTEM SET
USER_DUMP_DEST=newdir. This is a dynamic
parameter. It is also a session parameter.
Column Description
SQL_STATEMENT This is the SQL statement for which the SQL trace
facility collected the row of statistics. Because this
column has datatype LONG, you cannot use it in
expressions or WHERE clause conditions.
DATE_OF_INSERT This is the date and time when the row was inserted into
the table. This value is not exactly the same as the time
the statistics were collected by the SQL trace facility.
DEPTH This indicates the level of recursion at which the SQL
statement was issued. For example, a value of 0
indicates that a user issued the statement. A value of 1
indicates that Oracle generated the statement as a
recursive call to process a statement with a value of 0 (a
statement issued by a user). A value of n indicates that
Oracle generated the statement as a recursive call to
process a statement with a value of n-1.
USER_ID This identifies the user issuing the statement. This value
also appears in the formatted output file.
CURSOR_NUM Oracle uses this column value to keep track of the cursor
to which each SQL statement was assigned.
This is level 1
  This is level 2 position 1
    This is level 3 position 1 (the first operation that is processed)
    This is level 3 position 2
  This is level 2 position 2
    This is level 3 position 1
Determining the Order in Which the Plan Will Be Executed
1. Find the first line that does not have anything subordinate to it. Scan down the list of
operations until the indentation stops increasing; the last line that is indented from the
previous line is your starting location. The object on that line is the driving table or the
associated index or cluster. Repeat steps 2 through 4 until all lines have been processed.
2. Perform the operation on the current line.
3. If the next line that has not already been processed is at the same level as the current line,
drop down to it and proceed to step 4. If not, scan back up the list to find the nearest
unprocessed line at the previous (shallower) level and return to step 2. For example, at
level 3, position 2, if the next line is not level 3, position 3, go back up to the enclosing
level 2 line.
4. If there is nothing subordinate to the current line, return to step 2. Otherwise, scan
down the list to the first line that does not have anything subordinate to it, and then
return to step 2.
Using the above example, the order in which the lines will be processed is:
6 This is level 1
3   This is level 2 position 1
1     This is level 3 position 1
2     This is level 3 position 2
5   This is level 2 position 2
4     This is level 3 position 1
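The procedure above amounts to a post-order walk of the tree implied by the indentation: a line is processed only after every line subordinate to it. The following Python sketch is an illustration under that reading, not part of the course scripts; the function name execution_order is hypothetical, and the input is simply the indentation level of each line in the example.

```python
def execution_order(levels):
    """Return the 1-based processing rank of each plan line.

    levels[i] is the indentation level of line i (1 = outermost).
    """
    rank = [0] * len(levels)
    pending = []  # indexes of lines whose subordinate lines are not all processed
    counter = 0
    for index, level in enumerate(levels):
        # A pending line at the same or a deeper level can have no further
        # subordinate lines, so it is processed now.
        while pending and levels[pending[-1]] >= level:
            counter += 1
            rank[pending.pop()] = counter
        pending.append(index)
    # Lines still pending at the end are processed innermost first.
    while pending:
        counter += 1
        rank[pending.pop()] = counter
    return rank


# Indentation levels of the six example lines, top to bottom.
example_levels = [1, 2, 3, 3, 2, 3]
print(execution_order(example_levels))  # prints [6, 3, 1, 2, 5, 4]
```

The printed ranks match the numbers listed in front of each line of the example.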
@saved_settings
set termout on
undef INDEX_NAME
undef TABLE_NAME
undef COLUMN_NAME
select c.cust_last_name
, c.cust_year_of_birth -- ws03_04.sql
, co.country_name
from customers c
, countries co
where c.country_id = co.country_id
and co.country_region = 'Americas'
/
SELECT * -- ws03_10.sql
FROM (SELECT prod_id
, prod_name
, prod_desc
, prod_list_price
, prod_min_price
FROM products
ORDER BY prod_min_price ASC)
WHERE ROWNUM <= 10
/
select
c.* -- ws05_04a.sql
from customers c
where cust_gender = 'M'
and cust_credit_limit = 10000
/
select
c.* -- ws05_04b.sql
from customers c
where cust_gender = 'M'
and cust_postal_code = 40804
/
select
c.* -- ws05_04c.sql
from customers c
where cust_postal_code = 40804
and cust_credit_limit = 10000
/
select * -- ws06_10.sql
from v_avg_credit_limit
where country_id = 'US'
/
select * -- ws06_14.sql
from fquarter_pscat_costs_mv
where fiscal_year = 1999
/
select * -- ws07_03.sql
from promotions
where promo_id > 300
/
select * -- ws07_04.sql
from promotions_iot
where promo_id > 300
/
select * -- ws07_05.sql
from promotions_iot
where promo_subcategory = 'online discount'
/
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'CUSTOMERS'
   2    1     INDEX (UNIQUE SCAN) OF 'CUSTOMERS_PK' (UNIQUE)
5. Set AUTOTRACE to suppress the command output and show both execution plans
and statistics, and run step 2 again.
SQL> set autotrace traceonly
SQL> select cust_first_name, cust_last_name
2 from customers
3 where cust_id = 100;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'CUSTOMERS'
   2    1     INDEX (UNIQUE SCAN) OF 'CUSTOMERS_PK' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
447 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
6. If you have some time left, enter your own SQL statements and experiment with
EXPLAIN PLAN and AUTOTRACE.
No formal solution.
6. Finally, start TKPROF in such a way that execution plans are added to the
report and recursive SQL statements are suppressed.
OS> tkprof <your_trace_file>.trc run1.txt
explain=<username>/<password> sys=no
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   NESTED LOOPS
   2    1     TABLE ACCESS (FULL) OF 'CUSTOMERS'
   3    1     TABLE ACCESS (BY INDEX ROWID) OF 'COUNTRIES'
   4    3       INDEX (UNIQUE SCAN) OF 'COUNTRY_PK' (UNIQUE)
4. Analyze the SQL statement shown below to see which optimizer mode is used.
SQL> select *
2 from customers
3 where cust_credit_limit = 15000;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   TABLE ACCESS (FULL) OF 'CUSTOMERS'
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=362 Card=1079 Bytes=75530)
   1    0   HASH JOIN (Cost=362 Card=1079 Bytes=75530)
   2    1     TABLE ACCESS (FULL) OF 'COUNTRIES' (Cost=1 Card=1 Bytes=55)
   3    1     TABLE ACCESS (FULL) OF 'CUSTOMERS' (Cost=360 Card=50000 Bytes=750000)
Although the optimizer goal is different, the execution plan is still the same. The rule-
based optimizer is still used because there are no table statistics available yet.
COUNT(*) PROD_STATUS
---------- --------------------
800 available, no stock
8000 available, on stock
400 not available
300 obsolete
500 ordered
Because the data is highly skewed with “available, on stock” occurring much more often
than the other values, the use of a histogram is appropriate.
3. Create a histogram for the PROD_STATUS column with 20 buckets and view the
histogram statistics.
Table analyzed.
ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
800 5.0605E+35
8800 5.0605E+35
9200 5.7341E+35
9500 5.7834E+35
10000 5.7867E+35
Note: You only get five buckets, although you asked for 20. This is because the
PROD_STATUS column only has five distinct values, so Oracle can store exact statistics
in five rows. Note that the ENDPOINT_NUMBER column contains a cumulative total; the
ENDPOINT_VALUE contains the PROD_STATUS values, stored in a floating point
format which makes them unreadable.
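The cumulative nature of ENDPOINT_NUMBER can be checked against the row counts shown earlier for PROD_STATUS. The Python sketch below is only an illustration of the arithmetic: the helper names are hypothetical, and it assumes that Python's default string ordering matches the order in which the five values appear in the histogram.

```python
from itertools import accumulate

# Row counts per PROD_STATUS value, as listed in the practice output.
status_counts = {
    "available, no stock": 800,
    "available, on stock": 8000,
    "not available": 400,
    "obsolete": 300,
    "ordered": 500,
}


def endpoint_numbers(counts):
    """Return (value, cumulative row count) pairs in sorted value order."""
    ordered_values = sorted(counts)
    running_totals = accumulate(counts[value] for value in ordered_values)
    return list(zip(ordered_values, running_totals))


def rows_per_value(endpoints):
    """Recover the per-value row counts by differencing the cumulative totals."""
    totals = [total for _value, total in endpoints]
    return [total - previous for previous, total in zip([0] + totals, totals)]


endpoints = endpoint_numbers(status_counts)
print([total for _value, total in endpoints])  # prints [800, 8800, 9200, 9500, 10000]
print(rows_per_value(endpoints))               # prints [800, 8000, 400, 300, 500]
```

Subtracting each endpoint number from the next one recovers the number of rows for each distinct value, which is the information the optimizer needs for an exact estimate.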
4. Calculate the selectivity of the following select statement, both before and after creating
the histogram:
SQL> select * from products
2 where prod_status = 'obsolete';
- Before the histogram: selectivity = 1/5 = 20%
- After the histogram: selectivity = 300/10000 = 3%
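The two selectivity figures above follow from simple arithmetic. The Python sketch below restates it as a hedged illustration; the function names are hypothetical, and the first formula assumes the optimizer treats all distinct values as equally likely when no histogram exists.

```python
def selectivity_without_histogram(distinct_values):
    """Assume a uniform distribution: every distinct value is equally likely."""
    return 1 / distinct_values


def selectivity_with_histogram(matching_rows, total_rows):
    """Use the actual row count recorded for the value."""
    return matching_rows / total_rows


TOTAL_ROWS = 10000      # rows in PRODUCTS, per the histogram endpoints
DISTINCT_STATUSES = 5   # distinct PROD_STATUS values
OBSOLETE_ROWS = 300     # rows with prod_status = 'obsolete'

before = selectivity_without_histogram(DISTINCT_STATUSES)
after = selectivity_with_histogram(OBSOLETE_ROWS, TOTAL_ROWS)

print(f"before: {before:.0%}, estimated rows: {before * TOTAL_ROWS:.0f}")
print(f"after:  {after:.0%}, estimated rows: {after * TOTAL_ROWS:.0f}")
```

Without the histogram the estimate is 2,000 rows; with it the estimate drops to the actual 300 rows, which is the kind of difference that can change the chosen access path.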
5. Identify the last analyze date and sample size for all tables in your schema.
SQL> select table_name, sample_size, last_analyzed
2 from user_tables;
6. Delete the statistics for the PRODUCTS table and check the data dictionary again.
SQL> analyze table products delete statistics;
SQL> select table_name, sample_size, last_analyzed
2 from user_tables;
7. View the results of the EXPLAIN PLAN command. What optimization method is used?
With the PRODUCTS table, which approach is more efficient?
Because the table is small, a full table scan is preferable. If table statistics were
available on the PRODUCTS table, the cost-based optimizer would automatically
choose a full table scan for this select statement.
3. Check the data dictionary view USER_OUTLINE_HINTS to see which hints are stored
with your outline.
SQL> col hint format a10
SQL>
SQL> select *
2 from user_outline_hints
3 where name = 'TEAM76505'
4 /
6 rows selected.
4. Issue an ALTER SESSION command to enable your session to use the stored outline.
SQL> alter session
2 set use_stored_outlines = TRUE
3 /
5. Execute the SQL statement in step 1, create an index on the PROD_STATUS column in
the PRODUCTS table, and execute the SQL statement again to see whether the execution
plan has changed.
SQL> set autotrace traceonly explain
SQL> select p.prod_id, p.prod_name, p.prod_min_price
2 from products p
3 where p.supplier_id = 260
4 /
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=109 Card=35 Bytes=1295)
6. Check the USER_OUTLINES data dictionary view again to see whether the outline has
been used.
SQL> select category, used, timestamp, sql_text
2 from user_outlines
3 where name = 'TEAM76505'
4 /
CATEGORY USED TIMESTAMP
------------------------------ ------------------------- ---------
SQL_TEXT
----------------------------------------------------------------------
DEFAULT USED 14-NOV-01
select p.prod_id, p.prod_name, p.prod_min_price
from products p
where p.supplier
7. Drop the outline, and issue the SQL statement again to see that the index will now be
used.
SQL> drop outline team76505
2 /
SQL> set autotrace traceonly explain
SQL>
SQL> select p.prod_id, p.prod_name, p.prod_min_price
2 from products p
3 where p.supplier_id = 260
4 /
Execution Plan
-----------------------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=35 Bytes=1295)