Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
chris_lunt
Starting Member
25 Posts |
Posted - 2014-01-29 : 11:44:47
|
Hello FolksCan anybody tell me wehat is 'generally' the best approach, is one index with many columns better than many indexes with one column?More. . .When I have an cluserted index on the primary key that is a foreign key in another table, that seems fine for the join but then I need another index or indexes for the fields in the where clause. Currently the tables seems to have quite a few indexes built on various columns (sometimes more than one), sometimes the first column is repeated in another index but the second is different. Would you just convert these to one index with many columns?Also . . .I understand if the first column in an index is not used, then the index will not be used. How does this pan out in the following situation, the join uses the first column and the where has some of the other columns?Thanks for anyones help, I understand indexes are very subjective to their environment but i'm struggling to find any really good explanations of how to move on with indexes. Does anybody have sometghing they could point me at? |
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2014-01-29 : 12:38:49
|
In general, in very broad terms, a single index with multiple columns is much better than multiple indexes each with a single column, simply because so often SQL will be able to use the multi-column index and won't use the single column index(es).You also need to understand the importance of the clustered index vs. non-clustered indexes.Equally importantly, you should use the information SQL itself gives you about index usage. For example, pick the main db/table(s) you are interested in examining index usage on, then run the following code on them and post the results. From there, we can combine your knowledge of the table WHERE conditions with what SQL tells us to reconfigure your indexes.NOTE: Do NOT automatically build every index that SQL "says" is "missing". You need someone knowledgeable to review the missing index results and decide which indexes to implement, and to decide the specific details of how to implement that index (clus vs nonclus, key cols vs included cols, etc.).DECLARE @list_missing_indexes bitDECLARE @table_name_pattern sysnameSET @list_missing_indexes = 1SET @table_name_pattern = '%'PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)--SELECT create_date FROM sys.databases WITH (NOLOCK) WHERE name = N'tempdb'IF @list_missing_indexes = 1BEGIN SELECT GETDATE() AS capture_date, DB_NAME(mid.database_id) AS Db_Name, OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name, mid.equality_columns, mid.inequality_columns, mid.included_columns, migs.*, mid.statement, mid.object_id, mid.index_handle FROM sys.dm_db_missing_index_details mid WITH (NOLOCK) LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON mig.index_handle = mid.index_handle LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON migs.group_handle = mig.index_group_handle --order by --DB_NAME, Table_Name, equality_columns WHERE 1 = 1 AND mid.database_id = DB_ID() AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern AND OBJECT_NAME(mid.object_id) NOT LIKE 'tmp%' --AND mid.object_id IN (OBJECT_ID('RATING_LCC_LOG')) --AND mid.object_id IN (OBJECT_ID('dbo.shipStatusHeader_trn')) --AND mid.object_id IN (OBJECT_ID('dbo.fgt'), OBJECT_ID('dbo.mul'), OBJECT_ID('dbo.shipment')) --AND mid.object_id IN (OBJECT_ID('dbo.MoveHeader')) --AND mid.object_id IN (OBJECT_ID('dbo.loadCarrierStatus_trn'), OBJECT_ID('dbo.loadHeader_trn')) ORDER BY --avg_total_user_cost * (user_seeks + user_scans) DESC, Db_Name, Table_Name, equality_columns, inequality_columnsEND --IF-- list index usage stats (seeks, scans, etc.)SELECT ius2.row_num, DB_NAME() AS db_name, OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name, i.name AS index_name, FILEGROUP_NAME(i.data_space_id) AS filegroup_name, (SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days, dps.row_count, fk.Reference_Count AS fk_ref_count, ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads, ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates, ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update, ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates, ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_updateFROM sys.indexes i WITH (NOLOCK)INNER JOIN sys.objects o WITH (NOLOCK) ON o.object_id = i.object_idLEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON dps.object_id = i.object_id AND dps.index_id = i.index_idLEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON ius.database_id = DB_ID() AND ius.object_id = i.object_id AND ius.index_id = i.index_idLEFT OUTER JOIN ( SELECT database_id, object_id, MAX(user_scans) AS user_scans, ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans FROM sys.dm_db_index_usage_stats WITH (NOLOCK) WHERE database_id = DB_ID() --AND object_name(object_id, database_id) = N'shipStatusHeader_trn' --AND index_id > 0 GROUP BY database_id, object_id) AS ius2 ON ius2.database_id = DB_ID() AND ius2.object_id = i.object_idLEFT OUTER JOIN ( SELECT referenced_object_id, COUNT(*) AS Reference_Count FROM sys.foreign_keys WHERE is_disabled = 0 GROUP BY referenced_object_id ) AS fk ON fk.referenced_object_id = i.object_idWHERE i.object_id > 100 AND i.is_hypothetical = 0 AND i.type IN (0, 1, 2) AND o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND ( o.name LIKE @table_name_pattern AND o.name NOT LIKE 'dtprop%' AND o.name NOT LIKE 'filestream[_]' AND o.name NOT LIKE 'MSpeer%' AND o.name NOT LIKE 'MSpub%' AND o.name NOT LIKE 'sys%' )ORDER BY --row_count DESC, --ius2.row_num, --user_scans&|user_seeks -- list clustered index first, if any, then other index(es) db_name, table_name, CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END, index_namePRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120) |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2014-01-29 : 17:12:31
|
quote: I understand if the first column in an index is not used, then the index will not be used.
That's not quite true. SQL cannot do a seek on the index if the first column is not used, but it still would scan that index if it had all the columns used in the query. |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2014-01-29 : 17:13:31
|
quote: Originally posted by ScottPletcher In general, in very broad terms, a single index with multiple columns is much better than multiple indexes each with a single column, simply because so often SQL will be able to use the multi-column index and won't use the single column index(es).You also need to understand the importance of the clustered index vs. non-clustered indexes.Equally importantly, you should use the information SQL itself gives you about index usage. This is best shown with a specific example. Therefore, pick the main db/table(s) you are most interested in tuning indexes on, then run the following code on them and post the results. From there, we can combine your knowledge of the table WHERE conditions with what SQL tells us to reconfigure your indexes.NOTE: Do NOT automatically build every index that SQL "says" is "missing". You need someone knowledgeable to review the missing index results and decide which indexes to implement, and to decide the specific details of how to implement that index (clus vs nonclus, key cols vs included cols, etc.).DECLARE @list_missing_indexes bitDECLARE @table_name_pattern sysnameSET @list_missing_indexes = 1SET @table_name_pattern = '%'PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)--SELECT create_date FROM sys.databases WITH (NOLOCK) WHERE name = N'tempdb'IF @list_missing_indexes = 1BEGIN SELECT GETDATE() AS capture_date, DB_NAME(mid.database_id) AS Db_Name, OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name, mid.equality_columns, mid.inequality_columns, mid.included_columns, migs.*, mid.statement, mid.object_id, mid.index_handle FROM sys.dm_db_missing_index_details mid WITH (NOLOCK) LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON mig.index_handle = mid.index_handle LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON migs.group_handle = mig.index_group_handle --order by --DB_NAME, Table_Name, equality_columns WHERE 1 = 1 AND mid.database_id = DB_ID() AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern AND OBJECT_NAME(mid.object_id) NOT LIKE 'tmp%' --AND mid.object_id IN (OBJECT_ID('RATING_LCC_LOG')) --AND mid.object_id IN (OBJECT_ID('dbo.shipStatusHeader_trn')) --AND mid.object_id IN (OBJECT_ID('dbo.fgt'), OBJECT_ID('dbo.mul'), OBJECT_ID('dbo.shipment')) --AND mid.object_id IN (OBJECT_ID('dbo.MoveHeader')) --AND mid.object_id IN (OBJECT_ID('dbo.loadCarrierStatus_trn'), OBJECT_ID('dbo.loadHeader_trn')) ORDER BY --avg_total_user_cost * (user_seeks + user_scans) DESC, Db_Name, Table_Name, equality_columns, inequality_columnsEND --IF-- list index usage stats (seeks, scans, etc.)SELECT ius2.row_num, DB_NAME() AS db_name, OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name, i.name AS index_name, FILEGROUP_NAME(i.data_space_id) AS filegroup_name, (SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days, dps.row_count, fk.Reference_Count AS fk_ref_count, ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads, ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates, ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update, ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates, ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_updateFROM sys.indexes i WITH (NOLOCK)INNER JOIN sys.objects o WITH (NOLOCK) ON o.object_id = i.object_idLEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON dps.object_id = i.object_id AND dps.index_id = i.index_idLEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON ius.database_id = DB_ID() AND ius.object_id = i.object_id AND ius.index_id = i.index_idLEFT OUTER JOIN ( SELECT database_id, object_id, MAX(user_scans) AS user_scans, ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans FROM sys.dm_db_index_usage_stats WITH (NOLOCK) WHERE database_id = DB_ID() --AND object_name(object_id, database_id) = N'shipStatusHeader_trn' --AND index_id > 0 GROUP BY database_id, object_id) AS ius2 ON ius2.database_id = DB_ID() AND ius2.object_id = i.object_idLEFT OUTER JOIN ( SELECT referenced_object_id, COUNT(*) AS Reference_Count FROM sys.foreign_keys WHERE is_disabled = 0 GROUP BY referenced_object_id ) AS fk ON fk.referenced_object_id = i.object_idWHERE i.object_id > 100 AND i.is_hypothetical = 0 AND i.type IN (0, 1, 2) AND o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND ( o.name LIKE @table_name_pattern AND o.name NOT LIKE 'dtprop%' AND o.name NOT LIKE 'filestream[_]' AND o.name NOT LIKE 'MSpeer%' AND o.name NOT LIKE 'MSpub%' AND o.name NOT LIKE 'sys%' )ORDER BY --row_count DESC, --ius2.row_num, --user_scans&|user_seeks -- list clustered index first, if any, then other index(es) db_name, table_name, CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END, index_namePRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)
|
|
|
chris_lunt
Starting Member
25 Posts |
Posted - 2014-01-30 : 04:01:45
|
Hello Scott, thanks for yur help here. I've just run that query against a couple of tables and wanted to upload the results vthat I've copied to an excel file. Can that be done in this forum (presume I just can't see where to attach files)? The results are quite uinreadable when pasted directly into the message field.My email address is chris.lunt@uhns.nhs.uk if you want to drop me a line and I'll email the results directly to you.Regarding the basics, I think I understand the cluserted versus non clustered OK but I'd be happy to hear your interpretation. Mine is as follows, you can have one clustered index, it's always a covered index so once you're at the leaf level you also have the data. Non clustered require an extra step once you're at the leaf to go and find the data (unless the column is included), non cluserted differ slighty on a table with a clustered key too as once you get to the leaf level you hvae enough info to use the cluistered key to find the data rather than using the rowid of the data as you would on a non clustered index on a table without a clustered index.Probably loads more to it and I'd like to hear more of what you've got to say.Thanks |
|
|
jackv
Master Smack Fu Yak Hacker
2179 Posts |
Posted - 2014-01-30 : 06:49:27
|
Hi, BOL has some very good details regarding differences between Clustered Indexes and Non Clustered Indexes. You are right , in the sense a) The clustered index sorts table data on physical 2) Non-clustered indexes “point” back to the clustered indexes data pagesA key element to consider when creating an indexing strategy is that all index definitions are profiled with the nature of the usage and query demands.Jack Vamvas--------------------http://www.sqlserver-dba.com |
|
|
|
|
|
|
|