Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 performance considerations when adding a column ..

Author  Topic 

sql777
Constraint Violating Yak Guru

314 Posts

Posted - 2005-09-01 : 07:02:23
So my table looks like:
Table: Content

contentid websiteid type subject body datecreated
1 1 1 'sql server' NTEXT datetime
2 1 2
3 2 1


Now say I have 1 million rows in the table.
The websiteid tells you which website the content is for.

The 'type' column I have basically determines what section of a website that row of content is for. So 1 could be articles, 2 could be blogs, 3 could be comments etc.

Select statements will look like:

SELECT *
FROM Content
WHERE websiteid=1 AND type=1 AND datecreated > @somedate


I believe I should put a index on columns websiteid & type.

The goal here is to not have to create 3 tables (articles, blogs, comments) since there are exactly the same, but performance is VERY important for this application so if need be i'll do the extra work of 3 tables if it really makes a difference (and when does it make a difference...3 million rows?)

thanks for the advice!

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2005-09-01 : 07:22:03
And your question is ...?
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-09-01 : 07:24:36
Adding indexes should help, but you run the risk of having low selectivity because you'll have 100,000+ rows with the same values, and the optimizer may not use them unless you force index hints on your queries. Splitting the data into 3 different tables wouldn't improve anything in that regard.

If you're always going to be searching with a date range or value, definitely make sure to index that column, and perhaps consider making that your clustered index as well. It will cost more in index space, but date range queries benefit tremendously when the date value is clustered.
Go to Top of Page

sql777
Constraint Violating Yak Guru

314 Posts

Posted - 2005-09-01 : 07:43:10
>>Splitting the data into 3 different tables wouldn't improve anything in that regard.
Well if I split the into 3 tables, I won't need the 'type' column so I thought it would speed things up.

I see what your saying, if there are 2 million rows, and there are 3 'type' values (1,2,3), it doesn't really help having a index as much as you would think because 1/3 of the table has that same value.
Go to Top of Page

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-09-01 : 07:51:10
1. For that specific query you may add dateCreated as a third column in the index.
2. If you never select content from more then one type you may use 3 tables, but I see no problem in adding type column.
3. Having type as a second column of already selective index helps for sure.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-09-01 : 07:53:52
quote:
I won't need the 'type' column so I thought it would speed things up.
Not really. You'd save a little tiny bit of space but lose an indexable attribute that could make a significant difference. Again, you're at a point with the row counts where it may not matter much.

I'd say it's worth testing on the full table to see if the index helps, and if you need to add hints, try it and see if they help. Even if they don't help, I still think 1 table will be a lot easier to manage than 3 identical tables. Also try clustering on the date column, see if that makes a difference.

Also, if you benchmark your tests, make sure to run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE before each query to make sure each run starts with a clean slate, so your measures will be accurate.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-09-01 : 09:57:28
We do much like you mention (lots of HTML stuff in a single table, but serving different "purposes")

Our website has a cache of "Recently retrieved stuff", so things that are requested often never result in a SQL request at all.

In our experience this knocks spots off improving the SQL Server efficiency.

1) User wants ID=1234
2) Is it in our application cache?
3a) Yes - > Get it and output it
3b) Mark the cache entry as "used" (date/time, or incrementing number)
4a) No, get it from SQL, output it
4b) If cache is full delete oldest entry
4c) Store the text in the cache

Kristen
Go to Top of Page
   

- Advertisement -