Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 deleting duplicates..

Author  Topic 

eddie
Starting Member

45 Posts

Posted - 2002-04-23 : 14:38:16
Ok..last question of the day. I have a table that has duplicate data. I need to delete one or more of the duplicate rows leaving only one row per id. I can use this to get the duplicate rows:

Select column1
from table1
group by column1
having count(*)>1

How do I delete all the duplicates in a script leaving only one? I can write a script to delete all of the duplicates so only the records who were not duplicated to start remain but can write on to take

column 1 column2
1212 a
1212 b
333 c
333 d
444 e

and make
1212 a
333 c
444 e

Any thoughts?

Thanks again!
Eddie

skond
Yak Posting Veteran

55 Posts

Posted - 2002-04-23 : 14:54:39
you could do this

select distinct * into #temp from table_name
drop table_name
select * into table_name from #temp

Go to Top of Page

eddie
Starting Member

45 Posts

Posted - 2002-04-23 : 15:04:11
Skond,

Where do you specify what column should be distict when you are using *? SHould I just add all the columns to the select with the distinct wrapped around the one that should be distinct?

Thanks,
Eddie

Go to Top of Page

skond
Yak Posting Veteran

55 Posts

Posted - 2002-04-23 : 15:04:55
-- Add a new identity column to the input table
alter table test
add seq int identity

--assumes col1 is the duplicating field, you may want to add more fields here
delete from test
from test a,
(
select col1,
count(*) count,
max(seq) max_seq
from test
group by col1
having count(*) > 1
)B
where a.col1 = b.col1 and
a.seq <> b.max_seq

Go to Top of Page

skond
Yak Posting Veteran

55 Posts

Posted - 2002-04-23 : 15:30:18
ALTER PROC usp_DeleteDuplicates
(
@InputTable varchar(128) = 'buswave.dbo.test',
@key varchar(1000) = 'col1,COL2' --- more than one column use 'col1,col2,col3'
)
AS

declare @sql varchar(8000)

exec ('alter table ' + @InputTable + ' add seq int identity')

set @sql =
'
delete from ' + @InputTable + '
where seq not in
(
select max_seq from
(
select ' + @key + ',
count(*) count,
max(seq) max_seq
from ' + @InputTable + '
group by ' + @key + '
)B
)
'

print @sql
EXEC (@sql)


exec ('alter table ' + @InputTable + ' drop column seq')


Edited by - skond on 04/23/2002 15:47:56
Go to Top of Page

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2002-04-23 : 16:33:46
We also have an article on deleting duplicates (http://www.sqlteam.com/item.asp?ItemID=3331).

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
Go to Top of Page
   

- Advertisement -