An index, clustered or non clustred, can be used by the query optimizer if and only if the leftmost key in the index is filtered on. So if you define an index on columns (A, B, C), a WHERE condition on B=@b
, on C=@c
or on B=@b AND C=@c
will not fully leverage the index (see note). This applies also to join conditions. Any WHERE filter that includes A
will consider the index: A=@a
or A=@a AND B=@b
or A=@a AND C=@c
or A=@a AND B=@b AND C=@c
.
So in your example if you make the clustred index on part_no
as the leftmost key, then a query looking for a specific part_id
will not use the index and a separate non-clustered index must exist on part-id
.
Now about the question which of the many indexes should be the clustered one. If you have several query patterns that are about the same importance and frequency and contradict each other on terms of the keys needed (eg. frequent queries by either part_no
or part_id
) then you take other factors into consideration:
- width: the clustered index key is used as the lookup key by all other non-clustered indexes. So if you choose a wide key (say two uniquidentifier columns) then you are making all the other indexes wider, thus consuming more space, generating more IO and slowing down everything. So between equaly good keys from a read point of view, choose the narrowest one as clustered and make the wider ones non-clustered.
- contention: if you have specific patterns of insert and delete try to separate them physically so they occur on different portions of the clustered index. Eg. if the table acts as a queue with all inserts at one logical end and all deletes at the other logical end, try to layout the clustered index so that the physical order matches this logical order (eg. enqueue order).
- partitioning: if the table is very large and you plan to deploy partioning then the partitioning key must be the clustered index. Typical example is historical data that is archived using a sliding window partitioning scheme. Even thow the entities have a logical primary key like 'entity_id', the clustred index is done by a datetime column that is also used for the partitioning function.
- stability: a key that changes often is a poor candidate for a clustered key as each update the clustered key value and force all non-clustered indexes to update the lookup key they store. As an update of a clustered key will also likely relocate the record into a different page it can cause fragmentation on the clustered index.
Note: not fully leverage as sometimes the engine will choose an non-clustered index to scan instead of the clustered index simply because is narrower and thus has fewer pages to scan. In my example if you have an index on (A, B, C) and a WHERE filter on B=@b
and the query projects C
, the index will be likely used but not as a seek, as a scan, because is still faster than a full clustered scan (fewer pages).
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…