Author |
Topic |
anoushm
Starting Member
2 Posts |
Posted - 2012-08-30 : 20:01:23
|
I have a table that needs to be cleaned, and I need to figure out a way to do it programmatically. I think this type of issue is common, and maybe someone has a solution or suggestion for it. Basically, I have an employer table that contains each employer's name and address. The issue is that the same employer's name is spelled several different ways. For example, McDonalds is spelled Mc Donalds, McDdonalds, The McDonalds, and MacDonalds. I need to figure out a way to have one correct common name for employers that have the same address. Is this possible to do programmatically in a SQL script? The data is stored in a SQL Server 2008 database. Thanks in advance. |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-08-30 : 21:20:20
|
You need either a master list of correct names or a fuzzy matching algorithm. Either way it is an approximate method and will take several iterations to get right.
------
SQL Server MVP
http://visakhm.blogspot.com/ |
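As an illustration of the master-list-plus-fuzzy-matching idea, here is a minimal sketch in Python using the standard library's `difflib`. The master list, the `normalize` and `canonical_name` helpers, and the 0.8 cutoff are all assumptions made up for this example, not something from the thread. The matching is approximate, so anything that comes back as `None` (or looks wrong) would still need a manual pass.

```python
import difflib

# Hypothetical master list of canonical employer names (an assumption for this sketch).
MASTER_NAMES = ["McDonalds", "Burger King", "Walmart"]


def normalize(name):
    """Lowercase, drop a leading 'the', and keep only letters and digits."""
    cleaned = name.strip().lower()
    if cleaned.startswith("the "):
        cleaned = cleaned[4:]
    return "".join(ch for ch in cleaned if ch.isalnum())


def canonical_name(raw, master=MASTER_NAMES, cutoff=0.8):
    """Return the closest master name, or None if nothing scores at or above cutoff."""
    lookup = {normalize(m): m for m in master}
    matches = difflib.get_close_matches(
        normalize(raw), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    for raw in ["Mc Donalds", "McDdonalds", "The McDonalds", "MacDonalds", "Acme Corp"]:
        print(raw, "->", canonical_name(raw))
```

The cutoff is the knob to iterate on: lower values catch more misspellings but also produce more false matches, which is why several review rounds are usually needed.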
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-08-30 : 21:40:22
|
quote: I need to figure out a way to have one correct common name for employer's that has same address.
If the duplicates really do have the SAME ADDRESS, then you can group by the address and find which ones have dups. However, I highly doubt that you have precise addresses when the employers' names were entered with such wanton abandon! |
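The group-by-address check can be sketched as below. This is only an illustration: the `employer` table with `name` and `address` columns is a hypothetical schema, and an in-memory SQLite database (via Python's `sqlite3`) stands in for SQL Server so that the example runs on its own. The `GROUP BY ... HAVING COUNT(DISTINCT ...)` query itself is standard SQL and should carry over to T-SQL with the real table and column names. It only helps when the addresses match exactly.

```python
import sqlite3

# Standard SQL: list each address that appears with more than one distinct name.
DUPLICATE_QUERY = """
SELECT address, COUNT(DISTINCT name) AS name_variants
FROM employer
GROUP BY address
HAVING COUNT(DISTINCT name) > 1
"""


def build_sample_db():
    """Create an in-memory database with a hypothetical employer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employer (name TEXT, address TEXT)")
    conn.executemany(
        "INSERT INTO employer (name, address) VALUES (?, ?)",
        [
            ("Mc Donalds", "1 Main St"),
            ("McDdonalds", "1 Main St"),
            ("MacDonalds", "1 Main St"),
            ("Burger King", "5 Oak Ave"),
        ],
    )
    return conn


def addresses_with_name_variants(conn):
    """Return (address, number_of_distinct_names) rows for suspect addresses."""
    return conn.execute(DUPLICATE_QUERY).fetchall()


if __name__ == "__main__":
    for address, variants in addresses_with_name_variants(build_sample_db()):
        print(address, variants)
```

Each address returned is a candidate for review: someone still has to decide which of the spellings at that address is the correct one.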
|
|
anoushm
Starting Member
2 Posts |
Posted - 2012-08-31 : 06:26:12
|
Yep, the addresses are even worse. This table is horrible. So I may need to ask one of the PMs to clean this table in an Excel file, and then I'll import it back into the db. But if there are any other ideas, please share them. |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-08-31 : 10:18:35
|
This is mostly what we do as part of a data quality exercise. Microsoft even has Data Quality Services (introduced in SQL Server 2012), which can be used for data cleansing. |
|
|