Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Update Column

Author  Topic 

cgunner
Yak Posting Veteran

95 Posts

Posted - 2005-08-02 : 13:36:21
I have two tables in my database that I need to reference from one to update the other. The table that needs the update contains city names that are incorrectly spelled and the reference table contains the correct spelling. Does anyone know of a way to correct the table with the incorrect spellings?

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2005-08-02 : 13:41:02
emm.... maybe using soundex() and difference functions??
i may be way off here...

Go with the flow & have fun! Else fight the flow
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-08-02 : 13:57:00
Is the Levenshtein Edit Distance Algorithm useful for this sort of thing?

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=51540

Kristen
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2005-08-02 : 13:58:22
good thinking man... forgot we have that here...

Go with the flow & have fun! Else fight the flow
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-08-02 : 14:38:06
I'm itching to use that, I never knew about that algorith until it popped up here.

Up until then I was just gob-smacked by the Boyer-Moyer algorithm.

Blimey! That's sad. TWO hits on Google. But I thought it was so cool</cry>

You be having yourself a good evening now Spirit!

Kristen
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2005-08-02 : 14:43:37
we're using it... although it's coded in asp
you too Kristen...

Go with the flow & have fun! Else fight the flow
Go to Top of Page

cgunner
Yak Posting Veteran

95 Posts

Posted - 2005-08-02 : 14:49:31
Very interesting! I am a newbie and this appears to be very complex. How can you implement this to update the table?
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-08-02 : 15:01:46
In essence you'd have to use it to find likely "replacement candidates". And then create an UPDATE of that to put the "best one" back.

However, its very difficult to decide if your "best one", based on some computer score, is the right one. A human could decide that OK most of the time. I foresee a problem that once you've replaced a "broken" name with a perfectly correct one, even if wrong, its going to be very hard for anyone, the postman included!, to establish the correct city name.

"I have two watches, one broken the other 5 minutes slow - which is better?"

Well .. one answer would be the broken one because at least that is right twice a day, the one that is 5 minutes slow is never right ...

Kristen
Go to Top of Page

cgunner
Yak Posting Veteran

95 Posts

Posted - 2005-08-02 : 15:10:48
The reference table has the correct information. I just need the function to decide which city to use to replace the incorrectly spelt one. i.e. Miami to overwrite Mami
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-08-02 : 15:20:44
"I just need the function to decide which city to use to replace"

That's the crux of it IMHO

"Miami to overwrite Mami"

Why would it not choose to overwrite with Mamie in AL Montgomery, or Mamie NC Currituck?

That's the rub, isn't it?

Kristen
Go to Top of Page

cgunner
Yak Posting Veteran

95 Posts

Posted - 2005-08-02 : 15:53:35
I ran this query again sample data and got:

BadCity GoodCity distance
MAMI MAYO 2
MAMI MIAMI 1
MAMI MIMS 2
MAMI VAMO 2

VERY COOL! I am assuming that the distance is the number 1 pick.


select a.BadCity [ABadCity]
,b.GoodCity [BGoodCity]
,dbo.edit_distance(a.BadCity, b.GoodCity) [distance]
Into #resultsPrep

from BTable a
cross join GTable b
where dbo.edit_distance(a.BadCity, b.GoodCity) < 3


Select Y.*
From
(Select ABadCity, BGoodCity, distance=min(distance)
From #resultsPrep s1
Group By ABadCity, BGoodCity
Having (Select count(*) From #resultsPrep Where ABadCity=s1.ABadCity and BGoodCity=s1.BGoodCity and distance=min(s1.distance))=1) Z
Inner Join #resultsPrep Y
On Z.ABadCity=Y.ABadCity
and Z.BGoodCity=Y.BGoodCity
and Z.distance=Y.distance

Drop Table #resultsPrep
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-08-02 : 15:58:13
"I ran this query again sample data"

EXCELLENT!!!!!!!!!!!!

"I am assuming that the distance is the number 1 pick"

"smallest distance" -> Yes

As I understand it its an indication of the numbers of characters that had to be changes / rearranged / inserted / deleted / etc. to get from Start -> Suggestion.

Kristen
Go to Top of Page

cgunner
Yak Posting Veteran

95 Posts

Posted - 2005-08-02 : 16:42:25
Thanks SOOOOO much!
Go to Top of Page
   

- Advertisement -