Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Clustered index advice

Author  Topic 

aiken
Aged Yak Warrior

525 Posts

Posted - 2002-03-02 : 12:22:17
I've got a table that's basically a web server pageview log. As such, it sees a ton of inserts, and few queries. I'm confused about where to put the clustered index; at the moment, I have it on the identity column (a bad idea, I know), but I'm just about to remove the identity column as it's just a waste of space anyways (Note to new DBA's: not every table needs an identity column!).

Here's the schema as is. "i" is going away (and no, I didn't name it "datetime," I inherited it that way.


CREATE TABLE [dbo].[log_running] (
[pagename] [varchar] (35) NOT NULL ,
[i_users] [int] NOT NULL ,
[userlevel] [tinyint] NOT NULL ,
[datetime] [smalldatetime] NOT NULL ,
[site_id] [tinyint] NOT NULL ,
[rowguid] uniqueidentifier ROWGUIDCOL NOT NULL ,
[iServers] [tinyint] NOT NULL ,
[i] [int] IDENTITY (1, 1) NOT NULL
) ON [PRIMARY]


Now, my understanding is that I want to put the clustered index on something that will minimize hotspots during inserts, so datetime doesn't seem like a good idea. iServers and site_id only have 3-4 possible values. Rowguid? It's for merge replication, but I could put the index on it. However, those values seem pretty sequential at any given moment.

Any advice appreciated...

Thanks
-b

aiken
Aged Yak Warrior

525 Posts

Posted - 2002-03-04 : 20:04:34
I'm afraid this got lost over the weekend, and I could really use some advice. Anyone?

Thanks
-b

Go to Top of Page

Jay99

468 Posts

Posted - 2002-03-05 : 09:17:38
If this table were all by itself on its own spindle (or RAID), I would say the DATETIME would be you best bet. Minimize the latency of the disk jumping all over the place and minimize the page splits as it tries to order the table on one of the other fields. Same idea as you transaction log . . . you want it to n+1 record to be right next the the nth record and preferable at the end of the file. The 'hotspot' is only bad if it is in the middle of the table; if its at the end, the engine will just allocate pages and extents and no reordering will need to take place.

However, what really matters are the 'few queries' that will be run. If your queries don't 'WHERE' on dates, then you would be better off with a heap, since the lookup on the internal row number will be faster than the scan/seek on the datetime. You will find the answer to your question in the joins and conditions of your 'few queries'. Which fields are even candidates for the clustered index, and then which one of those will give the greatest performance on your queries at the least cost to inserts?

As a side note, I have had a situation that was similar; however, in my case, the queries where run as a batch, together, once a week. I ended up creating and dropping the indexes around the batch of queries so that all week long, the table was just a heap, but the create index + queries-using-indexes was faster than the queries-not-using-indexes . . .


Jay
Go to Top of Page
   

- Advertisement -