Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Indexing

Author  Topic 

chris_lunt
Starting Member

25 Posts

Posted - 2014-01-29 : 11:44:47
Hello Folks

Can anybody tell me wehat is 'generally' the best approach, is one index with many columns better than many indexes with one column?

More. . .
When I have an cluserted index on the primary key that is a foreign key in another table, that seems fine for the join but then I need another index or indexes for the fields in the where clause. Currently the tables seems to have quite a few indexes built on various columns (sometimes more than one), sometimes the first column is repeated in another index but the second is different. Would you just convert these to one index with many columns?

Also . . .
I understand if the first column in an index is not used, then the index will not be used. How does this pan out in the following situation, the join uses the first column and the where has some of the other columns?

Thanks for anyones help, I understand indexes are very subjective to their environment but i'm struggling to find any really good explanations of how to move on with indexes. Does anybody have sometghing they could point me at?

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2014-01-29 : 12:38:49
In general, in very broad terms, a single index with multiple columns is much better than multiple indexes each with a single column, simply because so often SQL will be able to use the multi-column index and won't use the single column index(es).

You also need to understand the importance of the clustered index vs. non-clustered indexes.

Equally importantly, you should use the information SQL itself gives you about index usage. For example, pick the main db/table(s) you are interested in examining index usage on, then run the following code on them and post the results. From there, we can combine your knowledge of the table WHERE conditions with what SQL tells us to reconfigure your indexes.

NOTE: Do NOT automatically build every index that SQL "says" is "missing". You need someone knowledgeable to review the missing index results and decide which indexes to implement, and to decide the specific details of how to implement that index (clus vs nonclus, key cols vs included cols, etc.).




DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname

SET @list_missing_indexes = 1
SET @table_name_pattern = '%'

PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)

--SELECT create_date FROM sys.databases WITH (NOLOCK) WHERE name = N'tempdb'

IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
migs.*,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
migs.group_handle = mig.index_group_handle
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID()
AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern
AND OBJECT_NAME(mid.object_id) NOT LIKE 'tmp%'
--AND mid.object_id IN (OBJECT_ID('RATING_LCC_LOG'))
--AND mid.object_id IN (OBJECT_ID('dbo.shipStatusHeader_trn'))
--AND mid.object_id IN (OBJECT_ID('dbo.fgt'), OBJECT_ID('dbo.mul'), OBJECT_ID('dbo.shipment'))
--AND mid.object_id IN (OBJECT_ID('dbo.MoveHeader'))
--AND mid.object_id IN (OBJECT_ID('dbo.loadCarrierStatus_trn'), OBJECT_ID('dbo.loadHeader_trn'))
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF

-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name, OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name, i.name AS index_name,
FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
(SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days,
dps.row_count,
fk.Reference_Count AS fk_ref_count,
ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
WHERE
database_id = DB_ID()
--AND object_name(object_id, database_id) = N'shipStatusHeader_trn'
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
o.name NOT LIKE 'sys%'
)
ORDER BY
--row_count DESC,
--ius2.row_num, --user_scans&|user_seeks
-- list clustered index first, if any, then other index(es)
db_name, table_name, CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END, index_name

PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)

Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2014-01-29 : 17:12:31
quote:
I understand if the first column in an index is not used, then the index will not be used.


That's not quite true. SQL cannot do a seek on the index if the first column is not used, but it still would scan that index if it had all the columns used in the query.
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2014-01-29 : 17:13:31
quote:
Originally posted by ScottPletcher

In general, in very broad terms, a single index with multiple columns is much better than multiple indexes each with a single column, simply because so often SQL will be able to use the multi-column index and won't use the single column index(es).

You also need to understand the importance of the clustered index vs. non-clustered indexes.

Equally importantly, you should use the information SQL itself gives you about index usage. This is best shown with a specific example. Therefore, pick the main db/table(s) you are most interested in tuning indexes on, then run the following code on them and post the results. From there, we can combine your knowledge of the table WHERE conditions with what SQL tells us to reconfigure your indexes.

NOTE: Do NOT automatically build every index that SQL "says" is "missing". You need someone knowledgeable to review the missing index results and decide which indexes to implement, and to decide the specific details of how to implement that index (clus vs nonclus, key cols vs included cols, etc.).




DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname

SET @list_missing_indexes = 1
SET @table_name_pattern = '%'

PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)

--SELECT create_date FROM sys.databases WITH (NOLOCK) WHERE name = N'tempdb'

IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
migs.*,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
migs.group_handle = mig.index_group_handle
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID()
AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern
AND OBJECT_NAME(mid.object_id) NOT LIKE 'tmp%'
--AND mid.object_id IN (OBJECT_ID('RATING_LCC_LOG'))
--AND mid.object_id IN (OBJECT_ID('dbo.shipStatusHeader_trn'))
--AND mid.object_id IN (OBJECT_ID('dbo.fgt'), OBJECT_ID('dbo.mul'), OBJECT_ID('dbo.shipment'))
--AND mid.object_id IN (OBJECT_ID('dbo.MoveHeader'))
--AND mid.object_id IN (OBJECT_ID('dbo.loadCarrierStatus_trn'), OBJECT_ID('dbo.loadHeader_trn'))
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF

-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name, OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name, i.name AS index_name,
FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
(SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days,
dps.row_count,
fk.Reference_Count AS fk_ref_count,
ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
WHERE
database_id = DB_ID()
--AND object_name(object_id, database_id) = N'shipStatusHeader_trn'
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
o.name NOT LIKE 'sys%'
)
ORDER BY
--row_count DESC,
--ius2.row_num, --user_scans&|user_seeks
-- list clustered index first, if any, then other index(es)
db_name, table_name, CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END, index_name

PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)



Go to Top of Page

chris_lunt
Starting Member

25 Posts

Posted - 2014-01-30 : 04:01:45
Hello Scott, thanks for yur help here. I've just run that query against a couple of tables and wanted to upload the results vthat I've copied to an excel file. Can that be done in this forum (presume I just can't see where to attach files)? The results are quite uinreadable when pasted directly into the message field.

My email address is chris.lunt@uhns.nhs.uk if you want to drop me a line and I'll email the results directly to you.

Regarding the basics, I think I understand the cluserted versus non clustered OK but I'd be happy to hear your interpretation. Mine is as follows, you can have one clustered index, it's always a covered index so once you're at the leaf level you also have the data. Non clustered require an extra step once you're at the leaf to go and find the data (unless the column is included), non cluserted differ slighty on a table with a clustered key too as once you get to the leaf level you hvae enough info to use the cluistered key to find the data rather than using the rowid of the data as you would on a non clustered index on a table without a clustered index.

Probably loads more to it and I'd like to hear more of what you've got to say.

Thanks
Go to Top of Page

jackv
Master Smack Fu Yak Hacker

2179 Posts

Posted - 2014-01-30 : 06:49:27
Hi, BOL has some very good details regarding differences between Clustered Indexes and Non Clustered Indexes. You are right , in the sense a) The clustered index sorts table data on physical 2) Non-clustered indexes “point” back to the clustered indexes data pages
A key element to consider when creating an indexing strategy is that all index definitions are profiled with the nature of the usage and query demands.





Jack Vamvas
--------------------
http://www.sqlserver-dba.com
Go to Top of Page
   

- Advertisement -