Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2002-01-04 : 09:28:53
|
Narendra writes "I have posted this question on a couple of sites for which i did not get a satisfactory response other than that it was expected behaviour and that the plans would be different because of use of indexes. --Create tableCREATE TABLE [dbo].[Test] ( [TestID] [INT] IDENTITY (1,1) NOT NULL , [Fk] [INT] NOT NULL , [TStamp] [TIMESTAMP] NOT NULL )GO--Add a Clustered IndexALTER TABLE [dbo].[Test] WITH NOCHECK ADD CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED ( [TestID] )GO--load the table with 10000 recordsDECLARE @CounterFk INT, @Counter INTSET @CounterFk = 1WHILE( @CounterFk <= 1000 )BEGIN SET @Counter = 1 WHILE ( @Counter < = 10 ) BEGIN INSERT INTO Test (Fk) Values ( @CounterFk) SET @Counter = @Counter + 1 END SET @CounterFk = @CounterFk + 1ENDGO--view all teh records in the tableSELECT * FROM TestGO--Select statement which gives the occurence of Fk's value--which uses Tstamp in where clause SELECT TestOuter.TestID, TestOuter.Fk, ( SELECT COUNT( TestInner.Fk) FROM Test TestInner WHERE TestOuter.Fk = TestInner.Fk And Cast(TestOuter.TStamp As Money) >= Cast( TestInner.TStamp As Money ) ) AS CounterFROM Test TestOuter--Select statement which gives the occurence of Fk's value-- which uses TestID in the where clauseSELECT TestOuter.TestID, TestOuter.Fk, ( SELECT COUNT( TestInner.Fk) FROM Test TestInner WHERE TestOuter.Fk = TestInner.Fk And TestOuter.TestID >= TestInner.TestID ) AS CounterFROM Test TestOuter In the select statements above the execution plans differ and also the time taken for executions(1 sec for involving tstamp column vs 15 sec involving pk of the table which has a clustered index) differ drastically when executed repeatedly. I could not figure out much from teh execution plans. I would be glad if u could explain them and why the difference in performance." |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2002-01-04 : 12:01:37
|
| Assuming SQL Server 2000.The query plan for the last query is a very naive approach: for each row in TestOuter, seek the range of TestInner.TestID <= TestOuter.TestID, keep only the ones with the same Fk value and count them. While it may be doing an index seek, it is having to look at O(n^2) rows in that inner loop, even though the seek is only delivering 1-10 rows to be aggregated.The previous query first builds a temporary index for TestInner on Fk and then seeks on that for each row TestOuter row. So it only looks at 10 rows in the temporary index for each TestOuter row. Then it throws away the ones with smaller timestamps and counts what's left. The Hash Match/Cache doesn't look very useful since it's hashing on (tstamp, fk) which is unique by definition.As to why the first query doesn't take the temporary index approach, I don't know. Given that the query processor estimates doing it the naive way will take much longer it does seem pretty odd. What's totally weird is that it does take that approach if you ALTER TABLE TEST DROP CONSTRAINT PK_TEST first. I'd suggest it's a query optimizer bug. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-01-04 : 15:01:37
|
| My 2¢, could be (probably is!) completely wrong:I don't know how or if this relates to Arnold's observations, but it may also be that you're using the WHERE clause to perform a join, instead of using the JOIN syntax:SELECT TestOuter.TestID, TestOuter.Fk, COUNT( TestInner.Fk) AS Counter FROM Test TestInner INNER JOIN Test TestOuter ON (TestOuter.Fk=TestInner.Fk And Cast(TestOuter.TStamp As Money)>=Cast(TestInner.TStamp As Money))I think this might work too:SELECT TestOuter.TestID, TestOuter.Fk, COUNT(*) AS Counter FROM Test TestInner INNER JOIN Test TestOuter ON (TestOuter.Fk=TestInner.Fk And TestOuter.TStamp>=TestInner.TStamp)I only translated the first query.The Count(*) might generate a more favorable plan than Count(TestInner.FK). If the JOIN works without using CAST() then the performance should also pick up a little.In my experience (not tested or confirmed) SQL Server's optimizer joins better when using the JOIN syntax rather than the WHERE syntax. The JOIN syntax seems to generate a more consistent plan between slight changes than the WHERE syntax does. This could be the "bug" Arnold mentions. I don't remember ever getting the same plan using both styles, but again I haven't really researched it either.HTH |
 |
|
|
GreatInca
Posting Yak Master
102 Posts |
Posted - 2002-01-04 : 15:30:45
|
| The join syntax is almost always better than the where syntax. |
 |
|
|
ToddV
Posting Yak Master
218 Posts |
Posted - 2002-01-04 : 16:03:12
|
quote: SELECT TestOuter.TestID, TestOuter.Fk, COUNT( TestInner.Fk) AS Counter FROM Test TestInner INNER JOIN Test TestOuter ON (TestOuter.Fk=TestInner.Fk And Cast(TestOuter.TStamp As Money)>=Cast(TestInner.TStamp As Money))
You need a group by here: GROUP BY TestOuter.TestID, TestOuter.Fk |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-01-04 : 18:04:49
|
       Thanks Todd! I owe you a |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2002-01-05 : 07:20:52
|
| Yes, knowing that SQL Server's optimization of subqueries in the SELECT part tends to be rather naive, I'd probably write the query that way myself -- assuming the temporary storage for the hash join is likely to fit. I was just concentrating on why the performance with the subquery is made worse by having an index, albeit a not very useful index for the query.CREATE INDEX IX_Test_Fk ON Test (Fk)Adding an index on Fk will help the subquery that uses TestID, as expected.But now the version using TStamp varies depending on whether that CAST is left in or not. With the CAST, it produces a query plan with a higher cost than before, using a bookmark lookup from the new index. Without the CAST, it avoids IX_Test_Fk and keeps the temporary index approach!CREATE INDEX IX_Test_Fk ON Test (Fk, TStamp)Alternatively, creating an index that covers for both TestID and TStamp subqueries yields 3 query plans with the same approach using that index.Perhaps "bug" was an overstatement: by the very nature of optimization there is always an element of heuristics involved.Edited by - Arnold Fribble on 01/05/2002 07:38:56 |
 |
|
|
narendrareddy
Starting Member
6 Posts |
Posted - 2002-01-07 : 01:36:08
|
| Thanks a lot for your explanations and suggestions. Satisfied,Reddy. |
 |
|
|
|
|
|
|
|