Author |
Topic |
idsck
Starting Member
16 Posts |
Posted - 2007-05-04 : 22:11:08
|
Scenario: 2 sets of dataSet 1 (200 records): firstname, lastname, address, city, phoneSet 2 (100 records): firstname, lastname, address, city, phoneI run a query using phone as criteria/condition (where xx.phone=yy.phone). I got 75 matches based on the phone numbersHowever, if I deleted the phone numbers from set 2 data, used below query, the return result is only 45 matches. My question is what technique that I can use to optimize the result just based on criteria like first, last, and address. select xx.firstname, xx.lastname, yy.Firstname, yy.Lastname, xx.address, yy.address, xx.city, yy.city, yy.phone from set2 xx, set1 yy where substring(soundex(xx.Firstname),0,3)=substring(soundex(yy.Firstname),0,3) and substring(soundex(xx.lastname),0,4)=substring(soundex(yy.Lastname),0,4)and substring(xx.address,1,7)= substring(yy.address,1,7)and xx.city=yy.city;Thank youid.... |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-05-04 : 22:26:54
|
How can you get same result with different search condition? |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-04 : 23:15:22
|
of course not. What I want or wish to find out is how to optimize my query and get more similar match. |
 |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-05-05 : 12:33:14
|
Use 'where xx.phone=yy.phone' only in your select statement, then check values in other columns to see if can find some logical there. |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-05 : 14:10:37
|
rmiao: thanks. I think it's good idea to do that and identify the pattern of other columns. I have one last question on this. How can I use something like vowel algorithm or _similarity to compare these 2 sets of data? Thanks |
 |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-05-05 : 18:07:55
|
Try tablediff if you use sql2k5. |
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2007-05-06 : 10:01:24
|
This algorithm was developed specifically to do what you are asking. I've been using if for more than 10 years. http://sqlblindman.googlepages.com/fuzzysearchalgorithmSOUNDEX is not really appropriate for this. As a matter of fact, I'm not sure how many things SOUNDEX is useful for, except as a linguistic curiosity.e4 d5 xd5 Nf6 |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-06 : 13:54:21
|
Hi Blindman: wonder if you could briefly walk me through how to use your fuzzy algorithm and what are the steps I need to do before testing it. I am new and it so advance for me. Thank you.id |
 |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2007-05-07 : 00:45:29
|
There are some tricks you can do to opimize this for large recordsets, but for only a few hundred records you can just do this:select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastnamefrom Set1 inner join Set2 on dbo.CompareText( dbo.MatchText(Set1.FirstName + Set1.LastName), dbo.MatchText(Set2.FirstName + Set2.LastName) ) > 80 --Increase number for tighter search e4 d5 xd5 Nf6 |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-05-07 : 02:53:48
|
How do your phone numbers look like?Are there hyphens? Spaces? Other characters than numbers?Peter LarssonHelsingborg, Sweden |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-07 : 10:22:13
|
Set1: 999-999-9999Set2: 9999999999so it's better to convert them into same format?thanksid |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-07 : 10:24:15
|
Thanks Blindman!!!quote: Originally posted by blindman There are some tricks you can do to opimize this for large recordsets, but for only a few hundred records you can just do this:select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastnamefrom Set1 inner join Set2 on dbo.CompareText( dbo.MatchText(Set1.FirstName + Set1.LastName), dbo.MatchText(Set2.FirstName + Set2.LastName) ) > 80 --Increase number for tighter search e4 d5 xd5 Nf6
|
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-05-07 : 14:32:50
|
select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastnamefrom Set1inner join Set2 on replace(set2.phonenumber, '-', '') = replace(set1.phonenumber, '-', '')Peter LarssonHelsingborg, Sweden |
 |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2007-05-07 : 15:16:00
|
Not what he is looking for now...quote: Originally posted by idsckMy question is what technique that I can use to optimize the result just based on criteria like first, last, and address.
e4 d5 xd5 Nf6 |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-08 : 22:17:41
|
quote: Originally posted by blindman Not what he is looking for now...quote: Originally posted by idsckMy question is what technique that I can use to optimize the result just based on criteria like first, last, and address.
e4 d5 xd5 Nf6
Thanks ALL!! it works well...... |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-12 : 04:20:00
|
[quote]Originally posted by idsck Thanks Blindman!!![quote]Originally posted by blindman There are some tricks you can do to opimize this for large recordsets, but for only a few hundred records you can just do this:select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastnamefrom Set1 inner join Set2 on dbo.CompareText( dbo.MatchText(Set1.FirstName + Set1.LastName), dbo.MatchText(Set2.FirstName + Set2.LastName) ) > 80 --Increase number for tighter search how can i put this result into a new table? I tried to use something like this but it gave me syntax error. I also tried to use insert statement, but nothing work. Please advises and thank you so much!Create Table exMatch As(select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastnamefrom Set1 inner join Set2 on dbo.CompareText( dbo.MatchText(Set1.FirstName + Set1.LastName), dbo.MatchText(Set2.FirstName + Set2.LastName) ) > 80) --Increase number for tighten |
 |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2007-05-12 : 20:02:30
|
You need to use SELECT INTO rather than CREATE TABLE:select Set1.Firstname, Set1.Lastname, Set2.Firstname, Set2.Lastnameinto exMatchfrom Set1 inner join Set2 on dbo.CompareText( dbo.MatchText(Set1.FirstName + Set1.LastName), dbo.MatchText(Set2.FirstName + Set2.LastName) ) > 80) e4 d5 xd5 Nf6 |
 |
|
idsck
Starting Member
16 Posts |
Posted - 2007-05-12 : 22:02:08
|
thanks!!! |
 |
|
marat
Yak Posting Veteran
85 Posts |
Posted - 2008-04-08 : 01:20:23
|
This is quite a complicated issue.I suggest use scoring to set up a matching threashold and to filter matching records.You have to create scoring rules:1. exact match on first name, last name, < other criterion>, <score>2. exact match on first name initial, surname, <other criterion>, <score>....all othere matching criteriasIt all depends on business requirements; how complicated your matching engine should be.Good luck. |
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-04-08 : 02:38:11
|
quote: Originally posted by marat Good luck.
they will need a great amount of luck to stumble on this post after it's been dead a year... elsasoft.org |
 |
|
Next Page
|