Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 To Cluster, or Not To Cluster. That is the.....

Author  Topic 

SamC
White Water Yakist

3467 Posts

Posted - 2003-03-13 : 18:35:01
I've read what Henderson and BOL had to say about clustered indicies. Pretty obscure stuff. I can't claim to have a good grasp of the issues.

I am still mystified by the behavior of a clustered index when all elements of the cluster are not referenced.
[url]http://sqlteam.com/Forums/topic.asp?TOPIC_ID=24432[/url]
Nigel states the first element of the cluster must always be referenced. I couldn't find any discussion on this in BOL.

My PageHits table is a textbook for a good index design.

CREATE TABLE [PageHits] (
[PageHitID] [int] IDENTITY (1000000, 1) NOT NULL ,
[ClientID] [int] NOT NULL ,
[UserID] [int] NOT NULL ,
[CourseID] [int] NULL ,
[ModuleID] [int] NULL ,
[UrlDate] [datetime] NOT NULL DEFAULT (getdate()),
[Url] [varchar] (300) NOT NULL ,
[NextURLDate] [datetime] NULL
) ON [PRIMARY]
END

Background on the Column values:

ClientID has 2 maybe 4 values. Not a lot of clients. For practical purposes, ClientID is a lousy column. I think I'll remove it.
UserID has about 40,000+ values. Each UserId has about 100 rows in the table which may contain several million rows.
CourseID, only 1 or 2 values. Could grow to 5 values !
ModuleID, only 20 values per CourseID

The most frequent operation is a table insert, so we don't want to over index the table.

The most frequent query is a lookup UserID and we don't want a table scan. This query will return 100 rows tops for a single user.

The second most frequent query is the CourseID - to retrieve statistics on a course. This query would return 40,000*100 rows, quite a lot of data to summarize.

What reasoning comes into play when designing an index for this table? I'm beginning to think a single index on UserID is all that is needed.

Sam

nr
SQLTeam MVY

12543 Posts

Posted - 2003-03-13 : 19:04:43
Found it eventually.

Under Indexes

An index contains an entry with one or more columns (the search key) from each row in a table. A B-tree is sorted on the search key, and can be searched efficiently on any leading subset of the search key. For example, an index on columns A, B, C can be searched efficiently on A, on A, B, and A, B, C.


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-03-13 : 21:30:23
So it is in BOL. First paragraph too.

Nuts.

Thanks Nigel.

Now how to design the best index for PageHits?



Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-03-14 : 04:48:42
You will need an index on UserID.
As you are doing a lot of inserts make it non clustered.

What are you doing with the courseID query.
Maybe you could maintain an aggregate table to service it.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page
   

- Advertisement -