Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
zgall1
Starting Member
9 Posts |
Posted - 2014-09-25 : 11:51:46
|
I have a dataset with thousands of URLs stored in a column called Website (type VARCHAR) in a table called WebsiteData. There are many pairs of URLs (stored in separate rows) that are identical except that one begins with www, e.g. www.google.com and the other does not, e.g., google.com. How would I design a SQL query that identifies these pseudo-duplicates and deletes the version that does not start with www? |
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-25 : 12:19:06
|
[code];with FindDupsas (select Replace(Website, 'www.', '') RawWebsitefrom WebsiteDatagroup by Replace(Website, 'www.', '')having count(*) > 1)select w.Websitefrom WebsiteData wwhere Replace(Website, 'www.', '') in (select RawWebsite from FindDups)[/code]Can you take it from here? Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
zgall1
Starting Member
9 Posts |
Posted - 2014-09-25 : 13:19:18
|
Thanks! |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-25 : 18:28:12
|
m'kay Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
|
|
|