Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
narendrareddy
Starting Member
6 Posts |
Posted - 2002-01-09 : 01:33:17
|
| Many people suggest using Exists clause for an existence check. But the 2 queries below result in the same execution plan being used and have the same query cost. (50% relative cost to batch when both are executed in QA ) Could anyone suggest any cases where there is an adavantage of using EXISTS compared to SubQueries.USE pubsGOSELECT DISTINCT pub_nameFROM publishersWHERE EXISTS (SELECT * FROM titles WHERE pub_id = publishers.pub_id AND type = 'business')GO-- Or, using the IN clause:USE pubsGOSELECT distinct pub_nameFROM publishersWHERE pub_id IN (SELECT pub_id FROM titles WHERE type = 'business')GOWhen the above query is written using joins it seems to have a perf gain. So are joins with distinct or groupby the best bet for existence check ?SELECT distinct pub_nameFROM publishersinner join titleson titles.pub_id = publishers.pub_idwhere titles.type = 'business'SELECT pub_nameFROM publishersinner join titleson titles.pub_id = publishers.pub_idwhere titles.type = 'business'group by pub_name |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2002-01-09 : 02:03:16
|
The EXISTS keyword will check for a single instance meeting the criteria but the IN clause will check for all occurrences.Assumptions:Parts has 100 rows and Orders has a 100 rowseg.SELECT P.PartIDFROM Parts PWHERE P.PartID IN (SELECT PartID FROM Orders) Every row int the Orders table will be read for every row in the Parts table ie. 100*100 = 10,000 records read!SELECT P.PartIDFROM Parts PWHERE EXISTS (SELECT PartID FROM Orders O where O.PartID = P.PartID) Only 1 row of Orders will be read for every Product thus only 100 records read!Don't just rely on the query plan for performance measurement..Use.. SET STATISTICS TIME ON SET STATISTICS IO ONand SET SHOWPLAN_ALL ONAs a general rule, a join will be faster than EXISTS or IN, so use a join when you can....DavidMTomorrow is the same day as Today was the day before. |
 |
|
|
ToddV
Posting Yak Master
218 Posts |
Posted - 2002-01-09 : 09:22:35
|
quote: As a general rule, a join will be faster than EXISTS or IN, so use a join when you can....
My 2 cents: Almost all INs can be written and perform better as joins. |
 |
|
|
davidpardoe
Constraint Violating Yak Guru
324 Posts |
Posted - 2002-01-09 : 10:20:13
|
| I remember reading somewhere that EXISTS will break out as soon as it finds an occurrence so it doesn't need to go through the entire SELECT clause after the EXISTS. I have a feeling IN would have to 'finish' the entire SELECT clause.============The Dabbler! |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-01-09 : 10:48:33
|
| David (both Davids!) are correct and echo conventional wisdom: joins are usually faster, and EXISTS is usually faster because it won't have to process an entire table if one match is made. I also agree with Todd, if you can write a query as a JOIN instead of an IN clause you will very likely get much better performance.I think the reason you don't see much improvement is a factor of your example. The pubs database is pretty small, and pub_id is an indexed column in all the selected tables (PK in publishers). The query optimizer will notice this and probably start off using these indexes no matter what your query is.If you had a query where there weren't convenient indexes such as these, or one that is much larger (1 million+ rows), you will almost certainly see an improvement using EXISTS. |
 |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-01-09 : 11:34:12
|
| >> My 2 cents: Almost all INs can be written and perform better as joins.not true.I did an article on this as part of performance optimisation.It was to find all the records in one table that weren't in another.The three options covered (under v7) were exists, left outer join and in clause.Of the three (on 50 - 500,000 recs) the in clause was fastest, outer join next then exists - but not much difference.This changed drastically if it was part of update where the outer join was fastests, exists next (not much difference) then much slower (orders of magnitude) was the in clause.I tend to use the outer join if there is no problem with nulls or duplicate recs and the query is simple. If it is part of a complicated query I will use an exists. I try never to use the in clause as this will always fail if it includes a null and would need major changes if you want to add another field to the check - it does have the advantage that you can just run the subquery to check the data though.==========================================Cursors are useful if you don't know sql.Beer is not cold and it isn't fizzy. |
 |
|
|
narendrareddy
Starting Member
6 Posts |
Posted - 2002-01-10 : 00:23:48
|
| Based on what you have said would it be correct to say that what i was looking at were just plans of executions they would be the same for both subquery and exists. When they are executed over a large table we can definitely see a perf adv in using exists over subquery. When Exists is used it stops reading the table once a match is found but with a subquery it would continue..Thanks for your help,Reddy. |
 |
|
|
|
|
|
|
|