Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Correlated Query Performance low when Indexes are used.

Author  Topic 

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2002-01-04 : 09:28:53
Narendra writes "I have posted this question on a couple of sites for which i did not get a satisfactory response other than that it was expected behaviour and that the plans would be different because of use of indexes.

--Create table
CREATE TABLE [dbo].[Test] (
[TestID] [INT] IDENTITY (1,1) NOT NULL ,
[Fk] [INT] NOT NULL ,
[TStamp] [TIMESTAMP] NOT
NULL
)
GO

--Add a Clustered Index
ALTER TABLE [dbo].[Test] WITH NOCHECK ADD
CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED
(
[TestID]
)
GO

--load the table with 10000 records
DECLARE @CounterFk INT,
@Counter INT
SET @CounterFk = 1
WHILE( @CounterFk <= 1000 )
BEGIN
SET @Counter = 1
WHILE ( @Counter < = 10 )
BEGIN
INSERT INTO Test (Fk) Values ( @CounterFk)
SET @Counter = @Counter + 1
END
SET @CounterFk = @CounterFk + 1
END

GO
--view all teh records in the table
SELECT * FROM Test

GO
--Select statement which gives the occurence of Fk's value
--which uses Tstamp in where clause
SELECT
TestOuter.TestID,
TestOuter.Fk,
(
SELECT
COUNT( TestInner.Fk)
FROM
Test TestInner
WHERE
TestOuter.Fk = TestInner.Fk
And
Cast(TestOuter.TStamp As Money)
>= Cast( TestInner.TStamp As Money )
) AS Counter
FROM
Test TestOuter

--Select statement which gives the occurence of Fk's value
-- which uses TestID in the where clause
SELECT
TestOuter.TestID,
TestOuter.Fk,
(
SELECT
COUNT( TestInner.Fk)
FROM
Test TestInner
WHERE
TestOuter.Fk = TestInner.Fk
And
TestOuter.TestID >=
TestInner.TestID
) AS Counter
FROM
Test TestOuter


In the select statements above the execution plans differ
and also the time taken for executions(1 sec for involving tstamp column vs 15 sec involving pk of the table which has a clustered index) differ drastically when executed repeatedly.

I could not figure out much from teh execution plans. I would be glad if u could explain them and why the difference in performance."

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-01-04 : 12:01:37
Assuming SQL Server 2000.

The query plan for the last query is a very naive approach: for each row in TestOuter, seek the range of TestInner.TestID <= TestOuter.TestID, keep only the ones with the same Fk value and count them. While it may be doing an index seek, it is having to look at O(n^2) rows in that inner loop, even though the seek is only delivering 1-10 rows to be aggregated.

The previous query first builds a temporary index for TestInner on Fk and then seeks on that for each row TestOuter row. So it only looks at 10 rows in the temporary index for each TestOuter row. Then it throws away the ones with smaller timestamps and counts what's left. The Hash Match/Cache doesn't look very useful since it's hashing on (tstamp, fk) which is unique by definition.

As to why the first query doesn't take the temporary index approach, I don't know. Given that the query processor estimates doing it the naive way will take much longer it does seem pretty odd. What's totally weird is that it does take that approach if you ALTER TABLE TEST DROP CONSTRAINT PK_TEST first. I'd suggest it's a query optimizer bug.


Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-01-04 : 15:01:37
My 2¢, could be (probably is!) completely wrong:

I don't know how or if this relates to Arnold's observations, but it may also be that you're using the WHERE clause to perform a join, instead of using the JOIN syntax:

SELECT TestOuter.TestID, TestOuter.Fk, COUNT( TestInner.Fk) AS Counter
FROM Test TestInner INNER JOIN Test TestOuter
ON (TestOuter.Fk=TestInner.Fk And Cast(TestOuter.TStamp As Money)>=Cast(TestInner.TStamp As Money))


I think this might work too:

SELECT TestOuter.TestID, TestOuter.Fk, COUNT(*) AS Counter
FROM Test TestInner INNER JOIN Test TestOuter
ON (TestOuter.Fk=TestInner.Fk And TestOuter.TStamp>=TestInner.TStamp)


I only translated the first query.

The Count(*) might generate a more favorable plan than Count(TestInner.FK). If the JOIN works without using CAST() then the performance should also pick up a little.

In my experience (not tested or confirmed) SQL Server's optimizer joins better when using the JOIN syntax rather than the WHERE syntax. The JOIN syntax seems to generate a more consistent plan between slight changes than the WHERE syntax does. This could be the "bug" Arnold mentions. I don't remember ever getting the same plan using both styles, but again I haven't really researched it either.

HTH

Go to Top of Page

GreatInca
Posting Yak Master

102 Posts

Posted - 2002-01-04 : 15:30:45
The join syntax is almost always better than the where syntax.

Go to Top of Page

ToddV
Posting Yak Master

218 Posts

Posted - 2002-01-04 : 16:03:12
quote:


SELECT TestOuter.TestID, TestOuter.Fk, COUNT( TestInner.Fk) AS Counter
FROM Test TestInner INNER JOIN Test TestOuter
ON (TestOuter.Fk=TestInner.Fk And Cast(TestOuter.TStamp As Money)>=Cast(TestInner.TStamp As Money))



You need a group by here:
GROUP BY TestOuter.TestID, TestOuter.Fk

Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-01-04 : 18:04:49


Thanks Todd! I owe you a

Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-01-05 : 07:20:52
Yes, knowing that SQL Server's optimization of subqueries in the SELECT part tends to be rather naive, I'd probably write the query that way myself -- assuming the temporary storage for the hash join is likely to fit. I was just concentrating on why the performance with the subquery is made worse by having an index, albeit a not very useful index for the query.

CREATE INDEX IX_Test_Fk ON Test (Fk)
Adding an index on Fk will help the subquery that uses TestID, as expected.
But now the version using TStamp varies depending on whether that CAST is left in or not. With the CAST, it produces a query plan with a higher cost than before, using a bookmark lookup from the new index. Without the CAST, it avoids IX_Test_Fk and keeps the temporary index approach!

CREATE INDEX IX_Test_Fk ON Test (Fk, TStamp)
Alternatively, creating an index that covers for both TestID and TStamp subqueries yields 3 query plans with the same approach using that index.

Perhaps "bug" was an overstatement: by the very nature of optimization there is always an element of heuristics involved.


Edited by - Arnold Fribble on 01/05/2002 07:38:56
Go to Top of Page

narendrareddy
Starting Member

6 Posts

Posted - 2002-01-07 : 01:36:08
Thanks a lot for your explanations and suggestions.

Satisfied,
Reddy.

Go to Top of Page
   

- Advertisement -