Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 writing an exclusive join

Author  Topic 

dimkanewtown
Starting Member

13 Posts

Posted - 2004-04-01 : 15:03:22
Hi I have the following query




select
1,
count(*)

from dpo.dp d
join dpo.dpotherinfo o ON ( d.donor_id = o.donor_id )
join dpo.dpotherinfoudf oi ON ( o.other_id = oi.other_id )
join (
select distinct donor_id
from dpo.dpcontact c
where c.contact_date > '03/12/2004' and c.contact_date <= '03/27/2004' and
c.activity_code in ('MA')

) cc ON cc.donor_id = d.donor_id

where oi.event = '1050' and d.home_phone is not null and oi.participationtype = 'LEAD'


what i want is for the third join to be exclusive...
so I want to count the records from table d which do not match the records in the subquery

any clues?

mohdowais
Sheikh of Yak Knowledge

1456 Posts

Posted - 2004-04-01 : 15:12:34
How's this:


select
1,
count(*)

from dpo.dp d
join dpo.dpotherinfo o ON ( d.donor_id = o.donor_id )
join dpo.dpotherinfoudf oi ON ( o.other_id = oi.other_id )
left join (
select distinct donor_id
from dpo.dpcontact c
where c.contact_date > '03/12/2004' and c.contact_date <= '03/27/2004' and
c.activity_code in ('MA')

) cc ON cc.donor_id = d.donor_id

where oi.event = '1050' and d.home_phone is not null and oi.participationtype = 'LEAD'
and cc.donor_id is null


OS
Go to Top of Page

dimkanewtown
Starting Member

13 Posts

Posted - 2004-04-01 : 15:19:12
That works, but my problem is performance... :(

If I run the query in the original I get a count 3707 records in 1 second, when I add the "and cc.donor_id is null" clause it
returns a count of 3524 records (that's the correct number) in 13 seconds...

Somehow I am gaining 10 seconds of execution time with that one statement...
Go to Top of Page

mohdowais
Sheikh of Yak Knowledge

1456 Posts

Posted - 2004-04-01 : 15:25:19
It's not the IS NULL bit that's hurting performance, it's the LEFT JOIN. You could try re-writing that bit using a NOT EXISTS or NOT IN condition. How does this look?

select
1,
count(*)

from dpo.dp d
join dpo.dpotherinfo o ON ( d.donor_id = o.donor_id )
join dpo.dpotherinfoudf oi ON ( o.other_id = oi.other_id )

where oi.event = '1050' and d.home_phone is not null and oi.participationtype = 'LEAD'
and not exists
(
select distinct donor_id
from dpo.dpcontact c
where c.contact_date > '03/12/2004' and c.contact_date <= '03/27/2004' and
c.activity_code in ('MA')
and c.donor_id = d.donor_id
)


OS
Go to Top of Page

dimkanewtown
Starting Member

13 Posts

Posted - 2004-04-01 : 15:33:48
I tried that already,

if I use "not exists" the execution time goes up to 16/17 seconds
if I use "d.donor_id not in" the execution time goes up to 16/17 seconds

the left join seems to be the fastest method....

I need it faster still :(

I used an index optimizer wizard to check the query and it recommended adding a new composite index that would result in 56% performance gain, but I have yet to see it... :(

Maybe I need to rebuild the indexes for the database?
Go to Top of Page

mohdowais
Sheikh of Yak Knowledge

1456 Posts

Posted - 2004-04-01 : 15:39:05
Yup, make sure you've got appropriate indexes...are the donor_id columns indexed in all the tables? You might also need to defragment and rebuild the indexes once in a while, and make sure the auto-update statistics is turned on for all the tables. There were couple of good threads here recently dealing with maintenance of indexes, but its half past midnight and I can barely find my bed...good night


OS
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-04-01 : 15:41:11
in your subquery:

select distinct donor_id
from dpo.dpcontact c
where c.contact_date > '03/12/2004' and c.contact_date <= '03/27/2004' and
c.activity_code in ('MA')

try changing the IN to an EQUAL: c.activity_code = 'MA'

(not sure if that will help but it is better form, anyway)

also, do you have a clustered index on contact date? If you can add one, that would help since you are using a range of contact dates. At the very least, make sure activity_code and Contact_Date are indexed.

- Jeff
Go to Top of Page

drymchaser
Aged Yak Warrior

552 Posts

Posted - 2004-04-01 : 15:41:20
You could try removing the DISTINCT from the subquery, no need for it. In the exists you could do (SELECT 1...) instead of a column. Might help.
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-04-01 : 15:43:37
also -- keep in mind you should not be comparing the two queries, with the inner join vs. the exlcusive outer joins. They are complete opposites -- one returns only rows with matches, the other only returns rows WITHOUT matches ...

how long does the left outer join version take if you remove the "WHERE cc.donor_id is null" part?

- Jeff
Go to Top of Page

dimkanewtown
Starting Member

13 Posts

Posted - 2004-04-01 : 15:47:25
I can't change the IN to an EQUAL because this is a dynamic query and the activity_code can contain more tha one value

all the donor_id columns are indexed, activity_code is indexed and contact_date is indexed but not clustered...
every single field used in this query is indexed. The only thing I can think of is that maybe after the indexes are rebuilt, I'll see some improvement...
Go to Top of Page

dimkanewtown
Starting Member

13 Posts

Posted - 2004-04-01 : 15:48:32
quote:
Originally posted by jsmith8858

also -- keep in mind you should not be comparing the two queries, with the inner join vs. the exlcusive outer joins. They are complete opposites -- one returns only rows with matches, the other only returns rows WITHOUT matches ...

how long does the left outer join version take if you remove the "WHERE cc.donor_id is null" part?

- Jeff



the left outer join takes 1 second if I remove that part...
Go to Top of Page

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2004-04-01 : 22:15:54
select
1,
count(*)

from
dpo.dp d
inner join dpo.dpotherinfo o ON ( d.donor_id = o.donor_id )
inner join dpo.dpotherinfoudf oi ON ( o.other_id = oi.other_id )
inner join (
select distinct donor_id
from dpo.dpcontact c
where
c.contact_date > '03/12/2004'
and c.contact_date <= '03/27/2004'
and c.activity_code in ('MA')) cc ON cc.donor_id <> d.donor_id
where
oi.event = '1050'
and oi.participationtype = 'LEAD'
and d.home_phone is not null

MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.
Go to Top of Page
   

- Advertisement -