Increasing Database Performance Using Indexes
The performance issues are presented. The process of performance tuning is described. The
indexing concept is introduced. The two ways of defining an index are identified. The paper
presents and explains the way clustered and nonclustered indexes work. A comparative analysis
of clustered and nonclustered indexes is made. The benefits of using indexes on databases
are identified and explained. A demonstration of index SEEK and index SCAN operations is
presented, showing the logic behind these two types of index access. Cases in which an index
must be used and the way this index is created are presented, along with the performance
improvement obtained after successfully creating the index.
Keywords: Clustered index, Nonclustered index, Optimization, Database, Performance
1 Introduction
Performance is one of the most important metrics that describe whether a project is a
success or a failure. It is also one of the most common problems programmers deal with.
Whether we consider a newly started project or an application that is already running in
production, we should always keep the performance aspects in mind. This means that
designing for performance should start early in the development of an application. The
architecture of the system should be designed to meet the performance requirements and to
allow performance tuning. The book [1] presents techniques to improve or fine-tune a
database to achieve maximum performance.
There is no recipe for designing perfect databases, but there are techniques and tips that
can improve the quality of the design. Improving database performance is a cyclic activity
that should be included in each development stage.
The performance tuning process includes three steps:
- response time measurement before tuning;
- tuning performed;
- response time measurement after tuning.
The database designer should focus on the techniques that provide the most benefits. Among
all the techniques for improving database performance, indexing and query optimization
stand out because they provide visible results. On the other hand, overusing indexes or
creating inappropriate indexes might harm the performance of the system.
The structure of the database used for this demonstration is described in figure 1.

Fig. 1 Structure of test database

In order to define the index concept, we will look at how data is retrieved from a
database with no indexes.

    SELECT * FROM Client WHERE LastName = 'LastName4'

In the above query, SQL Server will look through all the rows, from the beginning of the
table to the end, searching for those rows that meet the condition in the
WHERE clause. If the searched table contains few rows, the response to the above query
might be very prompt, but for tables that contain millions of rows the query might take a
while. For this reason, creating an index on the table allows SQL Server to get the result
without searching all the data in the table. Indexing is the best way to reduce logical
reads and disk input/output, as it provides a mechanism of structured search instead of a
row-by-row search [8].
Basically, an index is a copy of the data in the table, sorted in a certain logical manner.
There are two different ways of defining an index:
- Like a dictionary: a dictionary is a list of words ordered alphabetically. An index
  defined like a dictionary is a set of data ordered in a lexicographic manner. For this
  reason, a search on the index does not have to visit all the rows; it can position
  itself directly using the ordered data.
- Like a book index: this approach does not alter the layout of the data in the table but,
  just like the index of a book, points to the corresponding position of the data in the
  table. An index defined in this manner contains the data in the indexed column and the
  corresponding row number of the data.
An example of how to design a spatial database is presented in detail in [3].

2 Clustered indexes
A clustered index is an index that contains the table data in physical order. When a
clustered index is created on a table, the data rows are reordered on disk based on the
index key sequence, so that they follow the indexed ordering. For this reason, only one
clustered index can be created on a table.
For the Client table, when a clustered index is created on the FirstName column, the data
in the table is physically sorted alphabetically by the FirstName value. When a new row is
inserted into the table, it is placed at a position that preserves the sort order.
Figure 2 schematically presents the tree structure of the clustered index on the FirstName
column of the Client table.

Fig. 2 The structure of the clustered index

As shown in the figure above, the leaf nodes represent the actual data pages, while the
intermediate nodes of the tree structure are index pages. All the pages in the structure
are linked. The top node in the structure is the root index page, while the middle-level
nodes are intermediate index pages. Each row in an index page refers to either another
index page or a data page. This reference is a pointer to a page number. The root index
page contains a row with the value Munteanu, which points to the intermediate index page
number 98, while index page 98 contains a row with the value Danila, which points to data
page 78. Data page 78 contains two rows with the corresponding values.
When searching using the clustered index, the row-by-row search is avoided. For the
following query

    SELECT * FROM Client WHERE LastName = 'Cioloca'

SQL Server will first get the root index page from the sysindexes table, and then it will
search its rows to find the highest key value not greater than Cioloca, and will
Database Systems Journal vol. II, no. 2/2011 15
Query 1:
    select * from Client where LastName = 'LastName4'
Query 2:
    select * from Client where FirstName = 'FirstName4'

Fig. 5 Execution plan of Query 1 and Query 2

An index scan, either clustered or nonclustered, will do a complete scan in a
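The difference between a scan and a seek for two queries of this kind can be reproduced with a small experiment. The following sketch uses Python's built-in sqlite3 module rather than SQL Server, so it is only an analogy: SQLite reports SCAN for a full pass over a table and SEARCH ... USING INDEX for an index lookup. The generated rows, the index name IX_Client_LastName, the choice to index only LastName, and the helper query_plan are assumptions made for this illustration and are not taken from the paper's test database.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Client (Id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)"
)
conn.executemany(
    "INSERT INTO Client (FirstName, LastName) VALUES (?, ?)",
    [(f"FirstName{i}", f"LastName{i}") for i in range(1000)],
)
# Assumption: only LastName is indexed, so only Query 1 can use an index.
conn.execute("CREATE INDEX IX_Client_LastName ON Client (LastName)")

plan_q1 = query_plan(conn, "SELECT * FROM Client WHERE LastName = ?", ("LastName4",))
plan_q2 = query_plan(conn, "SELECT * FROM Client WHERE FirstName = ?", ("FirstName4",))

print(plan_q1)
print(plan_q2)
```

With this setup the plan for the LastName query starts with SEARCH and names the index, while the plan for the FirstName query starts with SCAN, because no index covers that column.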
Nonclustered indexes should be used for queries that return a small amount of data and on
columns used in WHERE clauses that return exact matches. Large result sets read more table
pages anyway, so they will not benefit from a nonclustered index.

6 Index fragmentation
When the user performs operations like INSERT, UPDATE or DELETE on the database, table
fragmentation may occur. Likewise, when changes affect the data covered by an index, index
fragmentation occurs and the information in the index is scattered over the database.
Because of this, a query performed against a heavily fragmented table takes longer.
Index fragmentation comes in two different forms: external fragmentation and internal
fragmentation. In each of these forms, the pages within the index are used inefficiently.
External fragmentation means that the logical order of the pages is wrong, and internal
fragmentation means that the amount of data stored within each page is less than the page
can hold.
The leaf pages of an index must be in logical order; otherwise external fragmentation
occurs. When an index is created, the index keys are sorted in logical order on a set of
index pages. Each time new data is inserted in the index, the new keys may have to be
inserted between existing keys. This may lead to the creation of new index pages to
accommodate the existing keys that were moved so that the new keys can be inserted in the
correct order.
Let us assume the current index structure presented in figure 6.
new data is added to the index. In our example we will add 5, and automatically a new page
will be created and the 7 and 8 will be moved to the new page in order to make room for
the 5 on the original page. Because of this, the index will be out of logical order, as
seen in figure 7.

Fig. 7 New index structure

This type of fragmentation will not affect the performance of queries that perform
specific searches or that return unordered result sets, but it will affect queries that
return ordered sets. An example of an ordered result set is a query that returns
everything from page 4 to 12. This query has to complete an extra page switch in order to
return the 7 and 8 pages. If the fragmentation affects tables with hundreds of pages, the
number of extra page switches will be significantly greater.
In order to determine the level of fragmentation, the following command can be used:

    DBCC SHOWCONTIG (TableName)

The syntax of this command is:

    DBCC SHOWCONTIG
    [ ( { table_name | table_id | view_name | view_id }
        [ , index_name | index_id ]
      )
    ]
    [ WITH { ALL_INDEXES
           | FAST [ , ALL_INDEXES ]
           | TABLERESULTS [ , { ALL_INDEXES } ]
             [ , { FAST | ALL_LEVELS } ]
           }
    ]
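The extent counters reported by this command are easier to read against the values one would expect for an index with no external fragmentation. The following Python sketch computes those expected values, assuming that an extent groups 8 pages. The function names and the sample figures are hypothetical and do not come from the paper's test database.

```python
import math

PAGES_PER_EXTENT = 8  # assumption: one extent groups 8 pages


def expected_extents(pages_scanned):
    """Smallest number of extents that can hold the scanned pages."""
    return math.ceil(pages_scanned / PAGES_PER_EXTENT)


def expected_extent_switches(pages_scanned):
    """Extent switches of a perfectly contiguous scan: one fewer than the extents."""
    return max(expected_extents(pages_scanned) - 1, 0)


def suggests_external_fragmentation(pages_scanned, extents_scanned, extent_switches):
    """True when the reported counters exceed the values of a contiguous layout."""
    return (
        extents_scanned > expected_extents(pages_scanned)
        or extent_switches > expected_extent_switches(pages_scanned)
    )


# Hypothetical readings for an index of 100 pages.
contiguous = suggests_external_fragmentation(100, 13, 12)
fragmented = suggests_external_fragmentation(100, 20, 45)
```

For 100 pages the sketch expects 13 extents and 12 extent switches, so the first hypothetical reading does not point to external fragmentation while the second one does.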
The command returns the number of pages scanned, the number of extents scanned, the number
of times the DBCC statement moved from one extent to another while traversing the pages of
the table or index, the average number of pages per extent, and the scan density.
Pages Scanned: If the number of pages scanned is significantly greater than the number of
pages estimated from the number of rows in the table or index and the approximate row
size, then there could be internal fragmentation of the index.
Extents Scanned: Take the number of pages scanned, divide it by 8 and round up to the next
whole number. This figure should match the number of extents scanned returned by DBCC
SHOWCONTIG. If the number returned by DBCC SHOWCONTIG is higher, then there is some
external fragmentation. The seriousness of the fragmentation depends on how far the
reported value is above the estimated value.
Extent Switches: This number should be equal to (Extents Scanned - 1). Higher numbers
indicate external fragmentation.
Extent Scan Fragmentation: Shows any gaps between extents. This percentage should be 0%;
higher percentages indicate external fragmentation.
By analyzing the results provided by DBCC SHOWCONTIG on our Person table, we can see that
the number of extent switches is much greater than the number of extents scanned. In this
case there is external fragmentation.
Dropping and rebuilding an index has the advantage of completely rebuilding the index: it
reorders the index pages, compacts the pages and drops any unneeded pages. This operation
should be performed on indexes that show high levels of both internal and external
fragmentation.

    DROP INDEX and CREATE INDEX

Rebuilding the index can reduce fragmentation, and it is done by using the following
statement:

    DBCC DBREINDEX

This operation is similar to dropping and creating the index, except that it rebuilds the
index physically, allowing SQL Server to assign new pages to the index and to reduce both
internal and external fragmentation. This statement also recreates indexes with existing
constraints.
Defragmenting an index by using the DBCC INDEXDEFRAG statement reduces external
fragmentation by rearranging the existing leaf pages of the index into the logical order
of the index key, and reduces internal fragmentation by compacting the rows within index
pages and then discarding unnecessary pages. The time needed to execute this statement is
longer than recreating an index if the amount of fragmentation is high. DBCC INDEXDEFRAG
can defragment an index while other processes are accessing the index, eliminating the
blocking restrictions.

7 Conclusions