SQL Server - What Are The Differences Between A Clustered and A Non-Clustered Index - Stack Overflow
1 You can only have one clustered index per table. But there are plenty of other differences... – tjrobinson Sep
18 '08 at 11:17
A clustered index actually describes the order in which records are physically stored on the disk, hence the
reason you can only have one. A Non-Clustered Index defines a logical order that does not match the
physical order on disk. – Josh Sep 18 '08 at 11:19
Clustered basically means that the data is in that physical order in the table. This is why you can have only
one per table. Unclustered means it's "only" a logical order. – Biri Sep 18 '08 at 11:20
1 @biri what is "logical" order? a Non clustered index stores the index keys in order physically and it stores a
pointer to the table, namely the clustered index key. – Stephanie Page Apr 27 '12 at 2:46
@Stephanie Page: logical from the table point of view. Of course non-clustered indexes are ordered
physically in the index itself. – Biri Jun 17 '13 at 12:08
9 Answers
Clustered Index
Both types of index will improve performance when selecting data with fields that use the index, but
will slow down update and insert operations.
Because of the slower insert and update, clustered indexes should be set on a field that is
normally incremental, i.e. Id or Timestamp.
SQL Server will normally only use an index if its selectivity is above 95%.
7 There are also storage considerations. When inserting rows into a table with no clustered index, the rows
are stored back to back on the page and updating a row may result in the row being moved to the end of
table, leaving empty space and fragmenting the table and indexes. – Jeremiah Peschka Sep 18 '08 at 15:44
http://stackoverflow.com/questions/91688/what-are-the-differences-between-a-clustered-and-a-non-clustered-index 1/4
4/25/2016 sql server - What are the differences between a clustered and a non-clustered index? - Stack Overflow
What does it mean that an index is "faster to read"? How many more x per second can you do? What is
x? – Stephanie Page Aug 9 '10 at 22:23
3 you don't have to care what is x. All you need to know is that for an app with millions of users, x will be
significant – Pacerier Jul 23 '11 at 13:42
11 It's purely dogma. It's not "faster to read because the data is stored in order". It's faster to read because
you avoid an index read AND THEN the table read. It's faster to range scan (if that's meaningful) because
the data is stored in order. i.e. the clustering factor is perfect. – Stephanie Page Apr 27 '12 at 2:52
4 Also the idea that 95% of the records need to be unique is a fallacy. Say you have a table with 1,000,000
rows and you index a column with 500,000 keys. 0% are unique but each key returns 2 out of a million
rows. This index is absolutely useful regardless that 0% of the records are unique. – Stephanie Page Apr
27 '12 at 3:29
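To make the arithmetic in that last comment concrete, here is a minimal sketch in Python. The function name is made up for illustration, and it assumes the rows are spread evenly across the keys, which a real table may not do.

```python
def rows_per_key(total_rows, distinct_keys):
    """Return (average rows per key, fraction of the table one key matches).

    Assumption: rows are distributed evenly across the distinct keys.
    """
    average = total_rows / distinct_keys
    return average, average / total_rows


# The numbers from the comment: 1,000,000 rows, 500,000 distinct keys.
average, fraction = rows_per_key(1_000_000, 500_000)
print(average)   # 2.0 rows per key, so no key value is unique
print(fraction)  # 2e-06 of the table is touched by one lookup
```

No key is unique, yet a lookup on one key touches only two rows in a million, which is why the index is still useful.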
Clustered indexes physically order the data on the disk. This means no extra copy of the data is needed for
the index, but there can be only one clustered index (obviously). Accessing data using a clustered
index is fastest, because the index lookup lands directly on the row.
All other indexes must be non-clustered. A non-clustered index has a duplicate of the data from
the indexed columns kept ordered together with pointers to the actual data rows (pointers to the
clustered index if there is one). This means that accessing data through a non-clustered index
has to go through an extra layer of indirection. However if you select only the data that's available
in the indexed columns you can get the data back directly from the duplicated index data (that's
why it's a good idea to SELECT only the columns that you need and not use *).
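The difference between the two lookup paths can be sketched as a toy model in Python. This is not how SQL Server stores anything internally. The table, its columns and the function names are all invented for illustration, and the lookups assume the key exists.

```python
import bisect

# Hypothetical table (id, name, city). The "clustered index" is simply the
# rows themselves, kept sorted by id.
clustered = [
    (1, "Ann", "Oslo"),
    (2, "Bob", "Rome"),
    (3, "Cid", "Oslo"),
    (4, "Dee", "Lima"),
]
clustered_keys = [row[0] for row in clustered]

# The "non-clustered index" on name is a separate sorted copy of the indexed
# column; each entry points back at its row via the clustered key (id).
by_name = sorted((name, row_id) for row_id, name, _city in clustered)
by_name_keys = [name for name, _row_id in by_name]


def seek_by_id(row_id):
    """Clustered seek: one search lands directly on the full row."""
    pos = bisect.bisect_left(clustered_keys, row_id)
    return clustered[pos]


def seek_by_name(name):
    """Non-clustered seek: find the index entry, then follow the pointer."""
    pos = bisect.bisect_left(by_name_keys, name)
    _name, row_id = by_name[pos]
    return seek_by_id(row_id)  # the extra layer of indirection


def id_for_name(name):
    """Covered query: the id is already in the index entry, so the row
    itself is never visited."""
    pos = bisect.bisect_left(by_name_keys, name)
    return by_name[pos][1]


print(seek_by_id(3))        # full row, one search
print(seek_by_name("Dee"))  # full row, two searches
print(id_for_name("Bob"))   # answered from the index alone
```

`seek_by_name` does two searches where `seek_by_id` does one, while `id_for_name` never touches the rows because everything it returns is already in the index entry.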
1 'However if you select only the data that's available in the indexed columns you can get the data back
directly from the duplicated index data' - yes that is the important exception to the prefer clustered index
heuristic. I guess in this case you essentially have a clustered index, but less data in the table you are
querying so potentially it can be read faster off disk. – briantyler Sep 19 '12 at 17:02
A clustered index determines how the table's rows are physically stored. This means it is usually the
fastest to read and you can only have one clustered index per table.
Non-clustered indexes are stored separately, and you can have many of them per table.
The best option is to set your clustered index on the most used unique column, usually the PK.
You should always have a well-selected clustered index in your tables, unless there is a very compelling
reason not to (I can't think of a single one, but hey, it may be out there).
2 can you elaborate more on "we should always have a clustered index in our tables" ? without elaboration
that statement is simply wrong because of the word always – Pacerier Jul 23 '11 at 13:43
1 You're right Pacerier, one shouldn't use absolute statements lightly. Though I don't know of a single case
when you shouldn't have a well selected clustered index, such case might exist so I've changed my answer
to a more generic version. – Santiago Cepas Jul 27 '11 at 10:24
Clustered Index
1. There can be only one clustered index for a table.
2. Usually made on the primary key.
3. The leaf nodes of a clustered index contain the data pages.
Non-Clustered Index
1. There can be up to 249 non-clustered indexes for a table (999 as of SQL Server 2008).
2. Usually made on any key.
3. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf
nodes contain index rows.
Clustered basically means that the data is in that physical order in the table. This is why you can
have only one per table.
Pros:
Clustered indexes work great for ranges (e.g. select * from my_table where my_key between
@min and @max)
In some conditions, the DBMS will not have to do work to sort if you use an ORDER BY clause.
Cons:
Clustered indexes can slow down inserts because the physical layouts of the records have to
be modified as records are put in if the new keys are not in sequential order.
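Both points can be illustrated with a rough Python sketch. It uses a plain sorted list as a stand-in for the clustered storage, which is an assumption made only for illustration: a real engine uses B-tree pages and page splits rather than shifting an array. The function name is hypothetical.

```python
import bisect
import random


def insert_in_key_order(keys_in_arrival_order):
    """Insert keys into a sorted list and count how many existing entries
    sit after each insertion point (a crude proxy for data that has to move)."""
    stored = []
    shifted = 0
    for key in keys_in_arrival_order:
        pos = bisect.bisect_right(stored, key)
        shifted += len(stored) - pos
        stored.insert(pos, key)
    return stored, shifted


sequential = list(range(1000))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

stored, sequential_shifts = insert_in_key_order(sequential)
_, random_shifts = insert_in_key_order(shuffled)

print(sequential_shifts)  # 0: every new key goes at the end
print(random_shifts)      # large: most inserts land in the middle

# Range query (my_key between @min and @max): two binary searches, then one
# contiguous slice of the stored data.
lo = bisect.bisect_left(stored, 100)
hi = bisect.bisect_right(stored, 110)
in_range = stored[lo:hi]
```

Sequential keys never force anything to move, while the shuffled keys do so on most inserts. The range query needs two searches and then reads one contiguous slice, already in key order.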
A clustered index actually describes the order in which records are physically stored on the disk,
hence the reason you can only have one.
A Non-Clustered Index defines a logical order that does not match the physical order on disk.
A clustered index is essentially the table's data itself, kept sorted by the indexed columns.
The main advantage of a clustered index is that when your query (seek) locates the data in the
index then no additional IO is needed to retrieve that data.
The overhead of maintaining a clustered index, especially in a frequently updated table, can lead
to poor performance and for that reason it may be preferable to create a non-clustered index.
An indexed database has two parts: a set of physical records, which are arranged in some
arbitrary order, and a set of indexes which identify the sequence in which records should be read
to yield a result sorted by some criterion. If there is no correlation between the physical
arrangement and the index, then reading out all the records in order may require making lots of
independent single-record read operations. Because a database may be able to read dozens of
consecutive records in less time than it would take to read two non-consecutive records,
performance may be improved if records which are consecutive in the index are also stored
consecutively on disk. Specifying that an index is clustered will cause the database to make
some effort (different databases differ as to how much) to arrange things so that groups of
records which are consecutive in the index will be consecutive on disk.
For example, if one were to start with an empty non-clustered database and add 10,000 records
in random sequence, the records would likely be added at the end in the order they were added.
Reading out the database in order by the index would require 10,000 one-record reads. If one
were to use a clustered database, however, the system might check when adding each record
whether the previous record was stored by itself; if it found that to be the case, it might write that
record with the new one at the end of the database. It could then look at the physical record
before the slots where the moved records used to reside and see if the record that followed that
was stored by itself. If it found that to be the case, it could move that record to that spot. Using
this sort of approach would cause many records to be grouped together in pairs, thus potentially
nearly doubling sequential read speed.
In reality, clustered databases use more sophisticated algorithms than this. A key thing to note,
though, is that there is a tradeoff between the time required to update the database and the time
required to read it sequentially. Maintaining a clustered database will significantly increase the
amount of work required to add, remove, or update records in any way that would affect the
sorting sequence. If the database will be read sequentially much more often than it will be
updated, clustering can be a big win. If it will be updated often but seldom read out in sequence,
clustering can be a big performance drain, especially if the sequence in which items are added to
the database is independent of their sort order with regard to the clustered index.
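The read-side half of that tradeoff can be sketched as a back-of-the-envelope simulation in Python. The page size, the one-page cache and the function name are assumptions chosen only for illustration, not a description of any particular database.

```python
import random

RECORDS_PER_PAGE = 50  # assumed page capacity, picked only for this sketch


def page_reads_in_key_order(physical_layout):
    """Count page fetches needed to read every record in key order.

    physical_layout[i] is the key stored in physical slot i. A fetch is
    counted whenever the next record lives on a different page than the
    previous one (in other words, a one-page cache).
    """
    slot_of = {key: slot for slot, key in enumerate(physical_layout)}
    reads = 0
    current_page = None
    for key in sorted(physical_layout):
        page = slot_of[key] // RECORDS_PER_PAGE
        if page != current_page:
            reads += 1
            current_page = page
    return reads


keys = list(range(10_000))
clustered_layout = keys[:]              # physical order matches key order
heap_layout = keys[:]
random.Random(0).shuffle(heap_layout)   # records stored in arrival order

clustered_reads = page_reads_in_key_order(clustered_layout)
heap_reads = page_reads_in_key_order(heap_layout)

print(clustered_reads)  # 200: each page is fetched exactly once
print(heap_reads)       # far more: almost every record needs its own fetch
```

With 10,000 records and 50 per page, the clustered layout needs 200 page fetches to read everything in order. The shuffled layout needs close to one fetch per record, which matches the "10,000 one-record reads" in the example above.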