table clustered index decleration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

table clustered index decleration

Author

Topic

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2005-04-29 : 05:23:18

I have a table with a handfull of columns. Among them is a Primary Key (PK) column and a foreign key (FK) column. Not composite.

Now normally I would create the clustered unique PK index just on my PK column. Followed by a non-clustered index on the FK Column.

Why do I see some developers on this site suggest a composite clustered PK index on the PK & FK Column when the PK column is already unique?

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2005-04-29 : 12:20:07

Some ramblings:

the ideal indexing for a table totaly depends on the queries that are run against the table(s). One advantage to combining those columns in a single index would be that if those columns together satisfy a query requirement, the "covered" index is all the optimizer would need. Wouldn't even need to go to the actual table.

One disadvantage to that design is that if the table whose PK is this tables FK is being updated/deleted, then Sql Server needs to find that value in tables that reference the PK. So without an individual index on the foreign key column (or a composite index where the fk column is the first key) Sql server won't be able to utilize an index for checking the fk.

Just for general preferences, I like to keep the pk as narrow as possible so that other indexes on the table don't need the overhead of a wide PK.

Be One with the Optimizer
TG

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-04-29 : 22:12:24

quote:
Why do I see some developers on this site suggest a composite clustered PK index on the PK & FK Column when the PK column is already unique?

You're assuming that a clustering key and a unique key are either the same, and/or a good combination. This is not always the case, and in fact can be quite detrimental.

An example: a phone directory has the phone number as primary key, but it is clustered on last name, first name. To cluster a phone book on phone number would make it pretty useless.

quote:
I like to keep the pk as narrow as possible so that other indexes on the table don't need the overhead of a wide PK.

Again, this is a misconception that the PK is always the clustering key. You want to have a narrow CLUSTERING KEY because all non-clustered indexes are built upon the clustered index, if there is one. Your primary key does not have to be clustered, and if it is not then its size is irrelevant, as no other indexes will be built upon it.

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2005-05-03 : 03:44:33

Why would a phone directory with a clustered index on phonenumber and a non-clustered index on the name combination be useless?
You would have a "SELECT FirstName, LastName FROM Phonebook WHERE Number = @MyNumber"
By having a clustered index on number, you are enabling SQL to find that unique record faster than it otherwise could of. Once it's there, it can get the Names using a non-clustered index. (I havent tested this too see if that's what happens.)
If there is something wrong with what I am saying (and I think there might be) please tell me what.

And on a table where a lot of different information will be selected (unlike phone directory) or all 5/10/15 columns, surely it makes sense to have a unique id column, declare it as unique, PK & put a clustered index on it.
Isn't that a no-brainer?

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-05-03 : 08:28:49

quote:
Why would a phone directory with a clustered index on phonenumber and a non-clustered index on the name combination be useless?

Sorry I wasn't clear, I didn't mean a computer database containing phone numbers, I meant an actual, physical phone book. The point was to illustrate that it is not always a good idea to cluster on the primary key. How often would you use a phone book sorted by phone number? See the next bit:

quote:
By having a clustered index on number, you are enabling SQL to find that unique record faster than it otherwise could of.

Consider the complementary query for a second. If you're searching by name and expect to retrieve multiple rows, clustering on phone number could easily result in a clustered index or table scan, since the names would be scattered throughout the database. Under those circumstances, doing hundreds of seeks and bookmark lookups would be less efficient.

As you mentioned about the names, the query optimizer can certainly seek using a non-clustered index on phone number. Anything that would return a singleton result would be best served by a non-clustered index. The key point is that you are rarely searching for multiple phone numbers, but you would often query by name and expect to return more than one.

quote:
And on a table where a lot of different information will be selected (unlike phone directory) or all 5/10/15 columns, surely it makes sense to have a unique id column

I never said otherwise.

quote:
declare it as unique, PK & put a clustered index on it.

That makes sense, except for the red part. Again, I meant to point out that PK/unique key and clustering key are NOT the same thing. You may well have a table where the two coincide on the same column(s), but you should not just automatically create clustered primary keys without looking at your data and the kinds of queries you expect to run against that table.

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2005-05-03 : 08:57:28

If I have a clustered index on the number and then a non-clustered index ontop of that for the name and surname, why would that cause a table scan? I don't get it.

And if you retrieve the information in numerous ways: sometimes by a certain data range, other times by a name, other times by a number (and all are used as often as the other) where do you put the clustered index then??

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-05-03 : 16:14:24

First off, literally, imagine a phone book sorted by phone number. Then look up the phone number(s) for John Smith. Pay attention to how many times you (would have to) flip pages. Depending on the name(s) you choose, you'll flip pages so often that it's more efficient to simply start at page 1 and scan for John Smith.

I know this sounds dumb, but I'm using a physical manifestation for a reason. The logic that would govern your physical actions has a very close analogy to how the computer would do it. The disk provides better sequential read performance than random performance, so scanning a large chunk of sequential data is much more efficient than many smaller chunks of random data.

As far as choosing what to cluster on, that's the trick.

The example I used is not necessarily a good one, since Last Name, First Name would be a wide key. But again, the underlying goal is to break the mental linking of primary key and clustering key.

A better example of why clustering on primary key is a bad idea is if you have a uniqueidentifier as primary key. Clustering on a GUID is bad because the values are generated randomly, they are basically nonsense values that are only unique and nothing else. You can seek a singleton GUID easily, but inserting or deleting GUIDs that are clustered can cause all kinds of extra, random I/O that decrease performance significantly.

I recently re-clustered a 45 million+ row table from GUID, date to Date, GUID, and saw a 100% improvement in every single query, because almost every one of them dealt with date ranges, and could be read and written sequentially.

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2005-05-04 : 06:03:51

I'm starting to get it now. Had a look back at some of my old 6.5 notes as well. One other thing that came out of it was that a UNIQUE and PRIMARY KEY CONSTRAINT on the ID column can perform near as dammit to a clustered index. So save the clustered index for something else.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources