| Author |
Topic |
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 06:16:19
|
| I'm working on a pretty complex search/distribution list system for a company database which holds a very broad range of data. They need to be able to select lists of Contacts from the database based on pretty much any criteria on the database (all 70 or so tables) and then AND, OR or NOT the results of each criteria together to get their final list. I've written a filter/plugin system for the front end, which is working fine, and the plugins individually are running and selecting the correct results for the criteria and storing them in table 'lstRunFilter'.When I need to pull the final list out, I simply AND/OR/NOT a bunch of subqueries together, pulling the individual information for each filter/plugin out and then DISTINCT the list at the end, but its way too slow. Over a minute to run the statement below (which is using 8 filters/plugins). The question is how can the following statement be re-written to be more speed efficient. I'm no SQL guru so I'm more than willing to believe that there is/are simple ways to make this many times faster.Thanks in advance for any help, much appreciated!SELECT DISTINCT ContactID FROM crmContact WHERE ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 93) AND ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 98) OR ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 99) OR ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 100) OR ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 101) OR ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 102) OR ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 103) AND NOT ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 94) |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 07:05:13
|
Rewrite code to something likeSELECT DISTINCT ContactIDFROM crmContactWHERE ContactID IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID IN (93, 98, 99, 100, 101, 102, 103) ) AND NOT ContactID IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 100 AND ListFilterID = 94 ) Peter LarssonHelsingborg, Sweden |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 07:15:08
|
| Nice one Peso. So that means there will only ever be a maximum of 3 logical operations on the data, and that will only happen when one list uses AND, OR and NOT on at least one plugin each. That should speed it up nicely, i'll post some times later when I rewrite the code generator.Many thanks! :)Any more suggestions or improvements are still gratefully receieved! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 07:18:50
|
quote: Originally posted by vallis Nice one Peso. So that means there will only ever be a maximum of 3 logical operations on the data, and that will only happen when one list uses AND, OR and NOT on at least one plugin each. That should speed it up nicely, i'll post some times later when I rewrite the code generator.Many thanks! :)Any more suggestions or improvements are still gratefully receieved!
I am not sure if I have got all the INs and NOT INs in the right places. Please check and see if the rewritten query gives you the same result as before. And if you have time, please post back the time the query took before, and the time the query take after rewriting.Peter LarssonHelsingborg, Sweden |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 07:45:56
|
| I just re-wrote the SQL generator, seems to be working fine, it's now creating SQL like the following:SELECT DISTINCT ContactID FROM crmContact WHERE ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN(98, 99, 100, 101, 103)) AND ContactID IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN(102)) AND NOT ContactID NOT IN(SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN(94))The above statement (which is operating on about 30000 records in lstRunFilter) now takes about half a second to execute, compared to 40-60 seconds before. (It's a slightly different statement but its close).Thanks Peso! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 07:56:39
|
Your welcome! I think the rewritten query is much easier to understand too. But...SELECT DISTINCT ContactIDFROM crmContactWHERE ContactID IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103) ) AND ContactID IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN (102) ) AND NOT ContactID NOT IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN (94) ) 1. There are TWO negative operators marked with red.2. ListFilterID IN (102) could be written as ListFilterID = 1023. ListFilterID IN (94) could be written as ListFilterID = 94Unless this is the result of an dynamically built query. Then ignore 2) and 3).Just some minor details... Good luck!Peter LarssonHelsingborg, Sweden |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 08:04:48
|
| Yeah, 2 & 3 are just a result of dynamically building the query. I ran some tests a while back and passing just one parameter to IN performed the same as using equals so I figure it's not worth the additional code branch in the generator.Thanks for pointing out 1, just fixed the double negative.Cheers! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 08:06:34
|
quote: Originally posted by vallis The above statement (which is operating on about 30000 records in lstRunFilter) now takes about half a second to execute, compared to 40-60 seconds before. (It's a slightly different statement but its close).
Great! A speed difference of 100 times better is not bad. Any day.Peter LarssonHelsingborg, Sweden |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 08:11:12
|
Definitely, and much appreciated too. Thank you for your time |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-06-14 : 10:00:15
|
so all those IN's and NOT IN's....why don't you replace them with a single EXISTS?Go with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"] |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 10:09:58
|
quote: Originally posted by spirit1 so all those IN's and NOT IN's....why don't you replace them with a single EXISTS?Go with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"]
Can you explain how? Doesn't EXISTS only returns true or false based on the result of the sub query? How would that be useful for what I'm doing (honest question)? |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-06-14 : 10:36:58
|
Indeed exists return True or Falseso If the condition specified is true it will return 1 and the row will get selected else it won't.SELECT DISTINCT ContactIDFROM crmContact t1WHERE exists (SELECT * FROM lstRunFilter WHERE ContactId = t1.ContactID RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103, 104, 94) Go with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"] |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 11:01:58
|
| Thanks for the reply spirit1. I've just been playing with it in query analyzer, I've got both versions running alongside each other (one using the INs as before, and one using EXISTS as you suggested) but they both seem to end up with identical execution plans. Is there any speed/efficiency advantage to doing it with EXISTS or is it purely a cosmetic change? Thanks. |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-06-14 : 11:08:27
|
well i guess it's a cosmetic change in your case...exists might be faster in tables that have a lot of rows 10k+But the biggest advantage of exists is that you can check for more than one value in one go i.e.:select * from table1 t1where exists (select * from table2 where t1.id = t2.id anb t1.someDate > t2.someDate) -- or any other condition while with in you can'tGo with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"] |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 11:19:26
|
| Thank you for the explanation spirit1. During the tests I was operating on about 30k records but the execution plans were working out identically so I guess internally SQL Server was turning both methods into the same operations. Always good to know the limitations of the code I'm using though. |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 12:05:04
|
quote: Originally posted by spirit1 Indeed exists return True or Falseso If the condition specified is true it will return 1 and the row will get selected else it won't.SELECT DISTINCT ContactIDFROM crmContact t1WHERE exists (SELECT * FROM lstRunFilter WHERE ContactId = t1.ContactID RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103, 104, 94) Go with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"]
How do you deny ContactID when runid = 124 and listfilterid = 94?If ContactID is on this list don't return it. It can exist on all other lists, but if present on 124/94 list, don't return it.Peter LarssonHelsingborg, Sweden |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-06-14 : 12:15:34
|
you mean this?SELECT DISTINCT ContactIDFROM crmContact t1WHERE exists (SELECT * FROM lstRunFilter WHERE ContactId = t1.ContactID AND RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103, 104) AND ListFilterID NOT IN (94) Go with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"] |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 12:31:14
|
quote: Originally posted by spirit1
SELECT DISTINCT ContactIDFROM crmContact t1WHERE exists (SELECT * FROM lstRunFilter WHERE ContactId = t1.ContactID AND RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103, 104) AND ListFilterID NOT IN (94)
Is that true? Are you sure?Consider this table dataContactID RunID ListFilterID--------- ----- ------------123 124 98123 124 94124 124 99125 124 100126 124 101127 124 103128 124 104 The top row is IN ok but the second top row is not since 94 is a NOT IN.But since 123 is allowed from the top row, ContactID 123 will evaluate as valid anyway with your code suggestion.Optimized querySELECT DISTINCT ContactIDFROM crmContactWHERE ContactID IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103, 104) ) AND ContactID NOT IN ( SELECT ContactID FROM lstRunFilter WHERE RunID = 124 AND ListFilterID IN (94) ) This query rejects ContactID 123 because it is allowed for first IN but disallowed for second IN.Peter LarssonHelsingborg, Sweden |
 |
|
|
vallis
Starting Member
12 Posts |
Posted - 2006-06-14 : 12:36:53
|
Peso is correct. You would need to do something like this...SELECT DISTINCT ContactIDFROM crmContact t1WHERE EXISTS (SELECT * FROM lstRunFilter WHERE ContactId = t1.ContactID AND RunID = 124 AND ListFilterID IN (98, 99, 100, 101, 103, 104)) AND NOT EXISTS (SELECT * FROM lstRunFilter WHERE ContactId = t1.ContactID AND RunID = 124 AND ListFilterID IN (94)) Which looks almost identical to the original IN based method, but with the INs replaced with EXISTS instead. |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-06-14 : 12:41:30
|
True Go with the flow & have fun! Else fight the flow Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"] |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-14 : 12:44:28
|
quote: Originally posted by vallis Peso is correct. You would need to do something like this...Which looks almost identical to the original IN based method, but with the INs replaced with EXISTS instead.
And since there are SELECT * uses, no covering index, if any, will be used. IN will use any index where ContactID, RunID and ListFilterID are present.Peter LarssonHelsingborg, Sweden |
 |
|
|
Next Page
|