Author |
Topic  |
idsck
Starting Member
16 Posts |
Posted - 05/04/2007 : 22:11:08
|
Scenario: 2 sets of data Set 1 (200 records): firstname, lastname, address, city, phone Set 2 (100 records): firstname, lastname, address, city, phone
I run a query using phone as criteria/condition (where xx.phone=yy.phone). I got 75 matches based on the phone numbers
However, if I deleted the phone numbers from set 2 data, used below query, the return result is only 45 matches. My question is what technique that I can use to optimize the result just based on criteria like first, last, and address.
select xx.firstname, xx.lastname, yy.Firstname, yy.Lastname, xx.address, yy.address, xx.city, yy.city, yy.phone from set2 xx, set1 yy where substring(soundex(xx.Firstname),0,3)=substring(soundex(yy.Firstname),0,3) and substring(soundex(xx.lastname),0,4)=substring(soundex(yy.Lastname),0,4) and substring(xx.address,1,7)= substring(yy.address,1,7) and xx.city=yy.city;
Thank you
id.... |
Edited by - idsck on 05/04/2007 22:14:32
|
|
rmiao
Flowing Fount of Yak Knowledge
USA
7266 Posts |
Posted - 05/04/2007 : 22:26:54
|
How can you get same result with different search condition? |
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/04/2007 : 23:15:22
|
of course not. What I want or wish to find out is how to optimize my query and get more similar match.
|
 |
|
rmiao
Flowing Fount of Yak Knowledge
USA
7266 Posts |
Posted - 05/05/2007 : 12:33:14
|
Use 'where xx.phone=yy.phone' only in your select statement, then check values in other columns to see if can find some logical there. |
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/05/2007 : 14:10:37
|
rmiao: thanks. I think it's good idea to do that and identify the pattern of other columns. I have one last question on this. How can I use something like vowel algorithm or _similarity to compare these 2 sets of data? Thanks |
 |
|
rmiao
Flowing Fount of Yak Knowledge
USA
7266 Posts |
Posted - 05/05/2007 : 18:07:55
|
Try tablediff if you use sql2k5. |
 |
|
jezemine
Flowing Fount of Yak Knowledge
USA
2886 Posts |
|
blindman
Flowing Fount of Yak Knowledge
USA
2365 Posts |
Posted - 05/06/2007 : 10:01:24
|
This algorithm was developed specifically to do what you are asking. I've been using if for more than 10 years. http://sqlblindman.googlepages.com/fuzzysearchalgorithm SOUNDEX is not really appropriate for this. As a matter of fact, I'm not sure how many things SOUNDEX is useful for, except as a linguistic curiosity.
e4 d5 xd5 Nf6 |
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/06/2007 : 13:54:21
|
Hi Blindman: wonder if you could briefly walk me through how to use your fuzzy algorithm and what are the steps I need to do before testing it. I am new and it so advance for me. Thank you. id |
 |
|
blindman
Flowing Fount of Yak Knowledge
USA
2365 Posts |
Posted - 05/07/2007 : 00:45:29
|
There are some tricks you can do to opimize this for large recordsets, but for only a few hundred records you can just do this:
select Set1.Firstname,
Set1.Lastname,
Set2.Firstname,
Set2.Lastname
from Set1
inner join Set2 on
dbo.CompareText(
dbo.MatchText(Set1.FirstName + Set1.LastName),
dbo.MatchText(Set2.FirstName + Set2.LastName)
) > 80 --Increase number for tighter search
e4 d5 xd5 Nf6 |
Edited by - blindman on 05/07/2007 00:46:30 |
 |
|
SwePeso
Patron Saint of Lost Yaks
Sweden
30421 Posts |
Posted - 05/07/2007 : 02:53:48
|
How do your phone numbers look like? Are there hyphens? Spaces? Other characters than numbers?
Peter Larsson Helsingborg, Sweden |
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/07/2007 : 10:22:13
|
Set1: 999-999-9999 Set2: 9999999999
so it's better to convert them into same format?
thanks id
|
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/07/2007 : 10:24:15
|
Thanks Blindman!!!
quote: Originally posted by blindman
There are some tricks you can do to opimize this for large recordsets, but for only a few hundred records you can just do this:
select Set1.Firstname,
Set1.Lastname,
Set2.Firstname,
Set2.Lastname
from Set1
inner join Set2 on
dbo.CompareText(
dbo.MatchText(Set1.FirstName + Set1.LastName),
dbo.MatchText(Set2.FirstName + Set2.LastName)
) > 80 --Increase number for tighter search
e4 d5 xd5 Nf6
|
 |
|
SwePeso
Patron Saint of Lost Yaks
Sweden
30421 Posts |
Posted - 05/07/2007 : 14:32:50
|
select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastname from Set1 inner join Set2 on replace(set2.phonenumber, '-', '') = replace(set1.phonenumber, '-', '')
Peter Larsson Helsingborg, Sweden |
 |
|
blindman
Flowing Fount of Yak Knowledge
USA
2365 Posts |
Posted - 05/07/2007 : 15:16:00
|
Not what he is looking for now...
quote: Originally posted by idsck My question is what technique that I can use to optimize the result just based on criteria like first, last, and address.
e4 d5 xd5 Nf6 |
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/08/2007 : 22:17:41
|
quote: Originally posted by blindman
Not what he is looking for now...
quote: Originally posted by idsck My question is what technique that I can use to optimize the result just based on criteria like first, last, and address.
e4 d5 xd5 Nf6
Thanks ALL!! it works well......
|
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/12/2007 : 04:20:00
|
[quote]Originally posted by idsck
Thanks Blindman!!!
[quote]Originally posted by blindman
There are some tricks you can do to opimize this for large recordsets, but for only a few hundred records you can just do this:
select Set1.Firstname,
Set1.Lastname,
Set2.Firstname,
Set2.Lastname
from Set1
inner join Set2 on
dbo.CompareText(
dbo.MatchText(Set1.FirstName + Set1.LastName),
dbo.MatchText(Set2.FirstName + Set2.LastName)
) > 80 --Increase number for tighter search
how can i put this result into a new table? I tried to use something like this but it gave me syntax error. I also tried to use insert statement, but nothing work. Please advises and thank you so much!
Create Table exMatch As (select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastname from Set1 inner join Set2 on dbo.CompareText( dbo.MatchText(Set1.FirstName + Set1.LastName), dbo.MatchText(Set2.FirstName + Set2.LastName) ) > 80) --Increase number for tighten
|
 |
|
blindman
Flowing Fount of Yak Knowledge
USA
2365 Posts |
Posted - 05/12/2007 : 20:02:30
|
You need to use SELECT INTO rather than CREATE TABLE:
select Set1.Firstname,
Set1.Lastname,
Set2.Firstname,
Set2.Lastname
into exMatch
from Set1
inner join Set2 on
dbo.CompareText(
dbo.MatchText(Set1.FirstName + Set1.LastName),
dbo.MatchText(Set2.FirstName + Set2.LastName)
) > 80)
e4 d5 xd5 Nf6 |
 |
|
idsck
Starting Member
16 Posts |
Posted - 05/12/2007 : 22:02:08
|
thanks!!! |
 |
|
marat
Yak Posting Veteran
Australia
85 Posts |
Posted - 04/08/2008 : 01:20:23
|
This is quite a complicated issue. I suggest use scoring to set up a matching threashold and to filter matching records. You have to create scoring rules: 1. exact match on first name, last name, < other criterion>, <score> 2. exact match on first name initial, surname, <other criterion>, <score> ....all othere matching criterias It all depends on business requirements; how complicated your matching engine should be. Good luck.
|
 |
|
jezemine
Flowing Fount of Yak Knowledge
USA
2886 Posts |
Posted - 04/08/2008 : 02:38:11
|
quote: Originally posted by marat
Good luck.
they will need a great amount of luck to stumble on this post after it's been dead a year...

elsasoft.org |
 |
|
Topic  |
|