Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Filtering duplicates (non-standard)

Author  Topic 

Cyclonik
Posting Yak Master

114 Posts

Posted - 2006-01-24 : 10:42:57
Below is a query that looks to match customer records based on a combined firstname, lastname, address string.
THe problem is that I get matchtes on both sides i.e. i get 1, 2 and 2,1 in the result set if 1 and 2 have a matching firstname, lastname, address string. Is there a simple way to filter the second set of matches from the resultset?

select a.PID, b.PID, a.type from
(
(select top 10000 (FirstName + LastName + LTrim(RTrim(d.Address1))) as MatchString, PCM.PID
from
Person Pn
inner join PCM on Pn.PID = PCM.PID
inner join CM on PCM.CMID = CM.CMID
left outer join postaladdress d on CM.cmid = d.cmid ) a Inner JOIN

(select top 10000 (FirstName + LastName + LTrim(RTrim(d.Address1))) as MatchString, PCM.PID
from
Person Pn
inner join PCM on Pn.PID = PCM.PID
inner join CM CM on PCM.CMID = CM.CMID
left outer join postaladdress d on CM.CMid = d.Cmid ) b ON
a.MatchString = b.MatchString and a.PID <> b.PID)


Once I get this figured out I have a more complex problem to deal with. I also need to complete the same matching with other fields and recursively? determine other matches i.e. person 1 = person 2 based on a (firstname, lastname, address) match and person 2 = person 3 based on a (firstname, lastname, email) match so I need to recursive determine that record 1 = record 3 as well. Make sense?

Thanks,
Christian

-=:SpasmatiK:=-

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2006-01-24 : 11:19:25
instead of:
ON a.MatchString = b.MatchString and a.PID <> b.PID

use:
ON a.MatchString = b.MatchString and a.PID < b.PID
Go to Top of Page

Cyclonik
Posting Yak Master

114 Posts

Posted - 2006-01-24 : 11:21:40
Why is that?

edit: nevermind it obviously filters out all duplicates :) Thanks.
-=:SpasmatiK:=-
Go to Top of Page

Cyclonik
Posting Yak Master

114 Posts

Posted - 2006-01-24 : 11:58:43
Should I have the a.PID < b.PID in a where clause instead of the join?

-=:SpasmatiK:=-
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2006-01-24 : 12:40:28
Either works. JOIN is fine, and preferable in my opinion.
Go to Top of Page
   

- Advertisement -