SQL Performance Tuning
Shalabh Mehrotra,
Senior Solutions Architect
Noida, India
www.globallogic.com
Table of Contents
Introduction
General SQL Tuning Guidelines
    Write Sensible Queries
    Tuning Subqueries
    Limit the Number of Tables in a Join
    Recursive Calls Should be Kept to a Minimum
    Improving Parse Speed
    Alias Usage
    Driving Tables
    Caching Tables
    Avoid Using Select * Clauses
    Exists vs. In
    Not Exists vs. Not In
    In with Minus vs. Not In for Non-Indexed Columns
    Correlated Subqueries vs. Inline Views
    Views Usage
    Use Decode to Reduce Processing
    Inequalities
    Using Union in Place of Or
    Using Union All Instead of Union
    Influencing the Optimizer Using Hints
    Presence Checking
    Using Indexes to Improve Performance
    Why Indexes are Not Used
Conclusion
References
Introduction

Database performance is one of the most challenging aspects of an organization’s database operations. A well-designed application may still experience performance problems if the SQL it uses is poorly constructed. It is much harder to write efficient SQL than it is to write functionally correct SQL. As such, SQL tuning can help significantly improve a system’s health and performance. The key to tuning SQL is to minimize the search path that the database uses to find the data.

The target audience of this whitepaper includes developers and database administrators who want to improve the performance of their SQL queries.

General SQL Tuning Guidelines

The goals of writing any SQL statement include delivering quick response times, using the least CPU resources, and achieving the fewest number of I/O operations. The following content provides best practices for optimizing SQL performance.

Write Sensible Queries

Identify SQL statements that are taking a long time to execute. Also identify SQL statements that join a large number of big tables or use outer joins. The simplest way to do this usually involves running the individual statements in SQL*Plus and timing them (SET TIMING ON). Use EXPLAIN PLAN to look at the execution plan of the statement. Look for any full table accesses that look dubious. Remember, a full table scan of a small table is often more efficient than access by rowid.

Check to see if there are any indexes that may help performance. A quick way to do this is to run the statement using the Rule Based Optimizer (RBO) (SELECT /*+ RULE */). Under the RBO, if an index is present, it will be used. The resulting execution plan may give you some ideas as to which indexes to experiment with. You can then remove the RULE hint and replace it with the specific index hints you require. This way, the Cost Based Optimizer (CBO) will still be used for table accesses where hints aren’t present. Remember, if data volumes change over time, the hint that helped may become a hindrance! For this reason, hints should be avoided if possible, especially the /*+ RULE */ hint.

Try adding new indexes to the system to reduce excessive full table scans. Typically, foreign key columns should be indexed, as these are regularly used in join conditions. On occasion it may be necessary to add composite (concatenated) indexes that will only aid individual queries. Remember, excessive indexing can reduce INSERT, UPDATE and DELETE performance.

Tuning Subqueries

If the SQL contains subqueries, tune them. In fact, tune them first. The main query will not perform well if the subqueries can’t perform well themselves. If a join will provide you with the functionality of the subquery, try the join method first before trying the subquery method. Pay attention to correlated subqueries, as they tend to be very costly and CPU-intensive.

Limit the Number of Tables in a Join

There are several instances when the processing time can be reduced several times over by breaking the SQL statement into smaller statements and writing a PL/SQL block to reduce database calls. Also, packages reduce I/O since all related functions and procedures are cached together. Use the DBMS_SHARED_POOL package to pin a SQL or PL/SQL area. To pin a set of packages to the SQL area, start up the database and make a reference to the objects that causes them to be loaded. Use DBMS_SHARED_POOL.KEEP to pin them. Pinning prevents memory fragmentation. It also helps to reserve memory for specific programs.

Recursive Calls Should be Kept to a Minimum

Recursive calls are SQL statements that are triggered by Oracle itself. A large amount of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place. This is not scalable and impacts user response time. Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem.
The Oracle trace utility tkprof provides information about recursive calls. This value should be taken into consideration when calculating resource requirements for a process. Tkprof also reports library cache misses and the username of the individual who executed the SQL statement. Tkprof-generated statistics can be stored in a table (tkprof_table) to be queried later.

Improving Parse Speed

Execution plans for SELECT statements are cached by the server, but unless the exact same statement is repeated, the stored execution plan details will not be reused. Even differing spaces in the statement will cause this lookup to fail. Use of bind variables allows you to repeatedly use the same statements while changing the WHERE clause criteria. Assuming the statement does not have a cached execution plan, it must be parsed before execution. The parse phase for statements can be decreased by efficient use of aliasing.

Alias Usage

If an alias is not present, the engine must resolve which tables own the specified columns. A short alias is parsed more quickly than a long table name or alias. If possible, reduce the alias to a single letter. The following is an example:

Bad Statement

    SELECT first_name, last_name, country
    FROM employee, countries
    WHERE country_id = id
    AND last_name = 'HALL';

Good Statement

    SELECT e.first_name, e.last_name, c.country
    FROM employee e, countries c
    WHERE e.country_id = c.id
    AND e.last_name = 'HALL';

Driving Tables

Tuning joins is complicated by the fact that the engine may perform a Merge Join or a Nested Loop join to retrieve the data. Despite this challenge, there are a few rules you can use to improve the performance of your SQL.

Oracle processes result sets one table at a time. It starts by retrieving all the data for the first (driving) table. Once this data is retrieved, it is used to limit the number of rows processed for subsequent (driven) tables. In the case of multiple table joins, the driving table limits the rows processed for the first driven table.

Once processed, this combined set of data is the driving set for the second driven table, etc. Roughly translated, this means that it is best to process tables that will retrieve a small number of rows first. The optimizer will do this to the best of its ability, regardless of the structure of the DML, but the following factors may help.

Both the Rule and Cost-based optimizers select a driving table for each DML statement. If a decision cannot be made, the order of processing is from the end of the FROM clause to the start. Therefore, you should always place your driving table at the end of the FROM clause. Always choose the table with the fewest records as the driving table. If three tables are being joined, select the intersection table as the driving table.

The intersection table is the table that has many tables dependent on it. Subsequent driven tables should be placed in order so that those retrieving the most rows are nearer to the start of the FROM clause. However, the WHERE clause should be written in the opposite order, with the driving table’s conditions first and the final driven table last (example below).

    FROM d, c, b, a
    WHERE a.join_column = 12345
    AND a.join_column = b.join_column
    AND b.join_column = c.join_column
    AND c.join_column = d.join_column;
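To see why a small driving set matters, here is a minimal Python sketch of an indexed nested-loop join that counts how many probes it makes into the driven table. It is an illustration only and does not reproduce Oracle's join implementation or optimizer. The function names (build_index, nested_loop_join), the name column on table D, and the row counts are all assumptions invented for the example.

```python
from collections import defaultdict


def build_index(rows, key):
    """Model an index: map each key value to the rows that carry it."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def nested_loop_join(driving_rows, driven_index, key):
    """Probe the driven table's index once per driving row.

    Returns the joined pairs and the number of index probes performed.
    """
    joined, probes = [], 0
    for outer in driving_rows:
        probes += 1
        for inner in driven_index.get(outer[key], []):
            joined.append((outer, inner))
    return joined, probes


# Hypothetical data: 1,000 rows per table, one matching row per key.
table_a = [{"join_column": i, "payload": f"a{i}"} for i in range(1000)]
table_d = [{"join_column": i, "name": "JONES" if i % 100 == 0 else "OTHER"}
           for i in range(1000)]

# Plan 1: drive from D after applying its limiting condition.
jones = [row for row in table_d if row["name"] == "JONES"]
result_small, probes_small = nested_loop_join(
    jones, build_index(table_a, "join_column"), "join_column")

# Plan 2: drive from all of A and filter D's rows afterwards.
pairs, probes_large = nested_loop_join(
    table_a, build_index(table_d, "join_column"), "join_column")
result_large = [(d, a) for a, d in pairs if d["name"] == "JONES"]

print(probes_small, probes_large, len(result_small), len(result_large))
```

Both plans return the same joined rows, but the plan that drives from the filtered table performs far fewer probes, which is the effect the guidance on choosing a driving table is aiming for.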
If a further condition on the name column of “D” is added to the statement, then depending on the number of rows and the presence of indexes, Oracle may now pick “D” as the driving table. Since “D” now has two limiting factors (join_column and name), it may be a better candidate as a driving table. The statement may be better written as:

    FROM c, b, a, d
    WHERE d.name = 'JONES'
    AND d.join_column = 12345
    AND d.join_column = a.join_column
    AND a.join_column = b.join_column
    AND b.join_column = c.join_column

This grouping of limiting factors will guide the optimizer more efficiently, making table “D” return relatively few rows, and so making it a more efficient driving table. Remember, the order of the items in both the FROM and WHERE clause will not force the optimizer to pick a specific table as a driving table, but it may influence the optimizer’s decision. The grouping of limiting conditions onto a single table will reduce the number of rows returned from that table, which will therefore make it a stronger candidate for becoming the driving table. Also, you can control which table will drive the query through the use of the ORDERED hint. No matter what order the optimizer would choose, that order can be overridden by the ORDERED hint. The key is to use the ORDERED hint and vary the order of the tables to get the correct order from a performance standpoint.

Caching Tables

Queries will execute much faster if the data they reference is already cached. For small, frequently used tables, performance may be improved by caching tables. Normally, when full table scans occur, the cached data is placed on the Least Recently Used (LRU) end of the buffer cache. This means that it is the first data to be paged out when more buffer space is required.

If the table is cached (ALTER TABLE employees CACHE;), the data is placed on the Most Recently Used (MRU) end of the buffer, and so it is less likely to be paged out before it is re-queried. Caching tables may alter the CBO’s path through the data and should not be used without careful consideration.

Avoid Using Select * Clauses

The dynamic SQL column reference (*) gives you a way to refer to all of the columns of a table. Do not use the * feature because it is very inefficient: the * has to be converted to each column in turn. The SQL parser handles all the field references by obtaining the names of valid columns from the data dictionary and substituting them on the command line, which is time consuming.

Exists vs. In

The EXISTS function searches for the presence of a single row that meets the stated criteria, as opposed to the IN statement that looks for all occurrences. For example:

    PRODUCTS - 1000 rows
    ITEMS - 1000 rows

(A)

    SELECT p.product_id
    FROM products p
    WHERE p.item_no IN (SELECT i.item_no
                        FROM items i);

(B)

    SELECT p.product_id
    FROM products p
    WHERE EXISTS (SELECT '1'
                  FROM items i
                  WHERE i.item_no = p.item_no);

For query A, all rows in ITEMS will be read for every row in PRODUCTS. The effect will be 1,000,000 rows read from ITEMS. In the case of query B, a maximum of 1 row from ITEMS will be read for each row of PRODUCTS, thus reducing the processing overhead of the statement.

Not Exists vs. Not In

In subquery statements such as the following, the NOT IN clause causes an internal sort/merge.

    SELECT * FROM student
    WHERE student_num NOT IN (SELECT student_num
                              FROM class)
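As a hedged illustration of the two forms named in the section title, the sketch below compares NOT IN with a NOT EXISTS rewrite. It uses Python's built-in sqlite3 module rather than Oracle, so it says nothing about Oracle's execution plans or the sort/merge mentioned above. It only shows that the two forms return the same rows when student_num is never NULL, and that they diverge once the subquery can return a NULL. The table layouts and sample rows are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_num INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE class (class_id INTEGER, student_num INTEGER);
    INSERT INTO student VALUES (1, 'ANN'), (2, 'BOB'), (3, 'CAT');
    INSERT INTO class VALUES (10, 1), (11, 3);
""")

NOT_IN = """
    SELECT s.student_num FROM student s
    WHERE s.student_num NOT IN (SELECT c.student_num FROM class c)
    ORDER BY s.student_num
"""
NOT_EXISTS = """
    SELECT s.student_num FROM student s
    WHERE NOT EXISTS (SELECT 1 FROM class c
                      WHERE c.student_num = s.student_num)
    ORDER BY s.student_num
"""

# With no NULLs in class.student_num, both forms find the same students.
not_in_rows = conn.execute(NOT_IN).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS).fetchall()
print(not_in_rows, not_exists_rows)

# A NULL in the subquery result makes NOT IN unknown for every
# non-matching row, so the NOT IN query returns nothing at all.
conn.execute("INSERT INTO class VALUES (12, NULL)")
not_in_with_null = conn.execute(NOT_IN).fetchall()
not_exists_with_null = conn.execute(NOT_EXISTS).fetchall()
print(not_in_with_null, not_exists_with_null)
```

The practical point is that a NOT EXISTS rewrite is only a drop-in replacement for NOT IN when the compared column cannot be NULL, so check the column definition before swapping one for the other.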
Influencing the Optimizer Using Hints

Hints are special instructions to the optimizer. You can change the optimization goal for an individual statement by using a hint. Some commonly used hints are CHOOSE, RULE, FULL(table_name), INDEX(table_name index_name), USE_NL, USE_HASH(table_name), PARALLEL(table_name parallelism), etc.

    SELECT /*+ RULE */ name,
           acct_allocation_percentage
    FROM accounts
    WHERE account_id = 1200

Presence Checking

    IF v_count = 0 THEN
      -- Do processing related to no small items present
    END IF;

OR

    BEGIN
      SELECT '1'
      INTO v_dummy
      FROM items
      WHERE item_size = 'SMALL'
      AND rownum = 1;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        -- Do processing related to no small items present
    END;
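The single-row presence check can also be sketched outside PL/SQL. The following uses Python's built-in sqlite3 module as a stand-in, with LIMIT 1 playing the role that rownum = 1 plays in the Oracle block. The function name has_item_of_size and the sample rows are assumptions for illustration, and SQLite's behavior should not be read as a statement about Oracle's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_no INTEGER PRIMARY KEY, item_size TEXT);
    INSERT INTO items VALUES (1, 'LARGE'), (2, 'SMALL'), (3, 'SMALL');
""")


def has_item_of_size(connection, size):
    """Return True if at least one item of the given size exists.

    LIMIT 1 lets the database stop at the first matching row, and the
    bind variable (?) keeps the statement text identical across calls.
    """
    row = connection.execute(
        "SELECT 1 FROM items WHERE item_size = ? LIMIT 1", (size,)
    ).fetchone()
    # fetchone() returns None when no row matched, which corresponds to
    # the NO_DATA_FOUND branch of the PL/SQL version.
    return row is not None


small_present = has_item_of_size(conn, "SMALL")
tiny_present = has_item_of_size(conn, "TINY")
print(small_present, tiny_present)
```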
In these examples, only a single record is retrieved in the presence/absence check.

Using Indexes to Improve Performance

Indexes primarily exist to enhance performance. But they do not come without a cost. Indexes must be updated during INSERT, UPDATE and DELETE operations, which may slow down performance. Some factors to consider when using indexes include:

• Choose and use indexes appropriately. Indexes should have high selectivity. Bitmapped indexes improve performance when the indexed column has few distinct values, such as Male or Female.

• Avoid using functions like “UPPER” or “LOWER” on the column that has an index. If the function cannot be avoided, use function-based indexes.

• Index partitioning should be considered if the table on which the index is based is partitioned. Furthermore, all foreign keys must have indexes or should form the leading part of the primary key.

• Occasionally you may want to use a concatenated index that includes the SELECT columns. This is the most favored solution when the index not only has all the columns of the WHERE clause, but also the columns of the SELECT clause. In this case there is no need to access the table. You may also want to use a concatenated index when all the columns of the WHERE clause form the leading columns of the index.

• When using Oracle 9i, you can take advantage of skip scans. Index skip scans remove the limitation posed by column positioning, as column order does not restrict the use of the index.

• Large indexes should be rebuilt at regular intervals to avoid data fragmentation. The frequency of rebuilding depends on the extent of table inserts.

Why Indexes are Not Used

The presence of an index on a column does not guarantee it will be used. The following is a list of factors that may prevent an index from being used:

• The optimizer decides it would be more efficient not to use the index. As a rough rule of thumb, an index will be used on evenly distributed data if it restricts the number of rows returned to 5% or less of the total number of rows. In the case of randomly distributed data, an index will be used if it restricts the number of rows returned to 25% or less of the total number of rows.

• You perform mathematical operations on the indexed column, e.g. WHERE salary + 1 = 10001

• You concatenate a column, e.g. WHERE firstname || ' ' || lastname = 'JOHN JONES'

• You do not include the first column of a concatenated index in the WHERE clause of your statement. For the index to be used in a partial match, the first column (leading edge) must be used.

• The use of OR statements confuses the CBO. It will rarely choose to use an index on a column referenced using an OR statement. It will even ignore optimizer hints in this situation. The only way to guarantee the use of indexes in these situations is to use the /*+ RULE */ hint.

• You use the IS NULL operator on a column that is indexed. In this situation, the optimizer will ignore the index.

• You mix and match values with column data types. This practice will cause the optimizer to ignore the index. For example, if the column data type is a number, do not use single quotes around the value in the WHERE clause. Likewise, do not fail to use single quotes around a value when it is defined as an alphanumeric column. For example, if a column is defined as a varchar2(10), and if there is an index built on that column, reference the column values within single quotes. Even if you only store numbers in it, you still need to use single quotes around your values in the WHERE clause, as not doing so will result in a full table scan.

• When Oracle encounters a NOT, it will choose not to use an index and will perform a full table scan instead. Remember, indexes are built on what is in a table, not what isn’t in a table.

Conclusion

To tune effectively, you must know your data. Your system is unique, so you must adjust your methods to suit your system. A single index or a single query can bring an entire system to a near standstill. Find the bad SQL statements and fix them. Make it a habit...and stick with it.

References

1. Oracle 9i Documentation
2. Oracle Performance Tuning 101 by Gaja Krishna Vaidyanatha, Kirtikumar Deshpande and John Kostelac
3. http://www.dba-village.com/dba/village/dvp_papers.Main?CatA=45 by Sumit Popli and Puneet Goenka
Contact
Emily Younger
+1.512.394.7745
emily.younger@globallogic.com