Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Correlated Sub Query Causes Poor Performance

Author  Topic 

souLTower
Starting Member

39 Posts

Posted - 2009-02-18 : 07:53:22
Here's an oddball. My software has the following query:


select A.fldA, A.fldB from tableA A where A.fldKey IN
(
select AA.fldKey from tableA AA inner join tableB B ON
AA.fldKey = B.fldKey inner join tableC C ON B.fldChild = C.fldChild
)
OR A.fldKey in
(
select AAA.fldKey from tableA AAA inner join tableX X ON
AAA.fldKey = X.fldKey inner join tableC C ON X.fldChild = C.fldChild
)



The app was hanging and when I checked statistics I found that tableB was being read 4 million times. The table contains about 100K worth of data. As a test I changed the query to read like this:


select A.fldA, A.fldB from tableA A where A.fldKey IN
(
select AA.fldKey from tableA AA inner join tableB B ON
AA.fldKey = B.fldKey inner join tableC C ON B.fldChild = C.fldChild
)
OR A.fldKey = 7


I got the same results. I know how to change the query so this doesn't happen but am curious as to why the DB is reading the table so many times. It seems that the optimizer sees that the correlated sub query can be treated as an inner join. Then when you add the additional conditional (I like how that sounds) it checks the condition against every row and subsequently performs a join for every row.

Here is a testing script that shows what I mean:

set statistics io OFF
set nocount on

CREATE TABLE #foo (
fldID int identity(1, 1),
fldNum int not null
)

-- Get a few records started
insert into #foo(fldNum)
select 1 as X union all
select 1 as X union all
select 1 as X union all
select 1 as X union all
select 1 as X union all
select 1 as X union all
select 1 as X



while (select count(fldID) from #foo) < 100000
insert into #foo(fldNum) select fldID from #foo

select count(fldID) from #foo

set statistics io ON

-- Get statistics on sub query
select top 500 fldID from #foo
-- Results Table #foo Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.


-- Get statistics on the main query and the sub query
select A.fldID from #foo A where A.fldID IN
(
select top 500 B.fldID from #foo B
)
-- Results Table #foo Scan count 2, logical reads 256, physical reads 0, read-ahead reads 0.


-- Get statistics on the second condition
select A.fldID from #foo A where A.fldID = 1
-- Results Table #foo Scan count 1, logical reads 254, physical reads 0, read-ahead reads 0.


-- Get statistics on the whole thing
select A.fldID from #foo A where A.fldID IN
(
select top 500 B.fldID from #foo B
) OR A.fldID = 1

-- Results Table #foo Scan count 3, logical reads 256, physical reads 0, read-ahead reads 0.
-- Table 'Worktable'. Scan count 498, logical reads 997, physical reads 0, read-ahead reads 0.
-- Table 'Worktable'. Scan count 500, logical reads 1001, physical reads 0, read-ahead reads 0.
-- Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0.








God Bless

robvolk
Most Valuable Yak

15732 Posts

Posted - 2009-02-18 : 08:10:56
How about:
select A.fldA, A.fldB from tableA A 
where exists(select * from tableA AA inner join tableB B ON AA.fldKey = B.fldKey
inner join tableC C ON B.fldChild = C.fldChild
where A.fldKey=AA.fldKey)
OR exists(select * from tableA AAA inner join tableX X ON AAA.fldKey = X.fldKey
inner join tableC C ON X.fldChild = C.fldChild
where A.fldKey=AAA.fldKey)
This should reduce scans and hopefully turn them into seeks, however, if you have a lot of distinct values for fldKey then it may be more efficient to scan anyway.
Go to Top of Page

souLTower
Starting Member

39 Posts

Posted - 2009-02-18 : 08:15:37
Thanks. As I mentioned, I have a solution. The post was more of a discussion on why the optimizer thinks that it needs to create additional working tables to resolve the query when it did not need the additional tables to resolve the single "IN" clauses.

God Bless
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2009-02-18 : 22:18:09
Typically when using the IN() operator with a embedded query, the query processor will evaluate that separately and store the results in a work table (look for spool operators in the query plan) Then it will perform a join to that work table, usually nested loops.

Since EXISTS() does not have to process the entire list, but only up to the first match, it can more readily use a seek to answer the condition. It depends on the indexing and other query semantics, sometimes EXISTS(), JOINS, and IN() syntax can generate the same or very similar plans. The only way to know is to try them all.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2009-02-18 : 22:27:53
why not just do the joins and a union?

why the derived tables?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam



Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2009-02-18 : 22:41:53
quote:
Originally posted by X002548

why not just do the joins and a union?
Well, joining these could create dupes that would then be removed by the UNION. Using EXISTS avoids both and gives the same results.
Go to Top of Page

souLTower
Starting Member

39 Posts

Posted - 2009-02-19 : 05:33:19
For the record, in this case I actually needed to use a join (I trimmed the original query to make it easier to see the original concept). It just goes to show that you can always learn something. The fact that adding a static conditional to an IN() statement:

where IN(...) OR fld = value

can cause such a serious drop in performance will definitely go in my mental list of "don't do" items.

Thanks

God Bless
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-02-19 : 09:57:20
you can try using EXISTS instead of IN and compare performance.
Go to Top of Page
   

- Advertisement -