What Indices For Many-To-Many Join Table? - SQL Server Forums

Home | Weblogs | Forums

Site Sponsored By: SQLDSC - SQL Server Desired State Configuration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

What Indices For Many-To-Many Join Table?

Author

Topic

dbleyl
Starting Member

21 Posts

Posted - 2002-01-18 : 08:48:37

Hi,

A table of Warehouses, a table of Products, a Sales intersection table with WarehouseID, ProductID and UnitsSold.

100 Warehouses, 500,000 Products.

What indices are most appropriate for the Sales table?

In general, what are the most appropriate indices for a many-to-many relationship?

Thanks
Don

AjarnMark
SQL Slashing Gunting Master

3246 Posts

Posted - 2002-01-18 : 15:00:10

Indices on the fields used in the join (WarehouseID and ProductID) are usually a good start.

--------------------------------
There's a new General in town...

dbleyl
Starting Member

21 Posts

Posted - 2002-01-18 : 16:24:25

Thanks, sorry for not qualifying my post--what I'm really asking is:

1.)Is a clustered index appropriate for join tables?

2.) What criteria should be used to pick the column(s).

Selectivity? Most frequent query? How often the rows are updated? The ratio of rows in the two tables?

2 indices and 1 cluster per table means choosing one over the other. And in this case, testing means a lot of index rebuilding. This could be begging the question; is there something better than m-to-m?

JustinBigelow
SQL Gigolo

1157 Posts

Posted - 2002-01-18 : 17:19:07

quote:

2 indices and 1 cluster per table means choosing one over the other.

You can build a multi-column clustered index. Which ever column you specify first in the index creation will create the inital groupings on that column and then sub-grouped on the second. IMHO if you have 100 warehouses compared to 500,000 products I would specify it as "constraint PK_WarehouseProduct primary key (warehouseID, productID)". Otherwise you'll have a pretty good distribution on both columns in the index as opposed to just the ProductID.

m2c,
Justin

Lavos
Posting Yak Master

200 Posts

Posted - 2002-01-19 : 01:38:59

Another important note is "covered indexes". If everything Sql Server needs is served up by the index, then it need never go to the actual table rows. I've heard it described as the closest you can get to having multiple clustered indexes.

Personally, I think it really depends on how your data is accessed. If 80% of the time you are seeing which warehouses have sold a particular product, then it makes sense for it to be indexed on the productID first, then warehouseID, and vice versa if you are seeing what products a warehouse has sold.

How the data is used and input will generally be the best indication of what indexes you should create. If you have time to play with it, create multiple indexes and view their usage statistics over time. Drop the unused ones and make your decision of which one to make clustered by how the data is accessed from disk. (try to keep the clustered key small. The non-clustered keys have to store it's value, so a large clustered key will slow down the access times for all your other indexes.)

----------------------
"O Theos mou! Echo ten labrida en te mou kephale!"

dbleyl
Starting Member

21 Posts

Posted - 2002-01-22 : 10:16:45

Thanks. I've got a clustered index on WarehouseID, and it seems to be working fine.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources