Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Deleting Duplicates Automatically

Author  Topic 

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2005-02-04 : 08:24:16
Janusz writes "Hi,

I have a table which has values as follows where pID is the primary key:

pID | Reference
-----------------
1 | 1
2 | 1
3 | 1
4 | 2
5 | 2
6 | 3
7 | 3

Right, How do I delete records where pid is (1, 2, 4, 6) Idunno how to explain this properly but in 'Reference' we have instance of a number. I want to keep the reference for that instance of the number where pID is at it's highest... using the above example pID=(1, 2, 4 and 6) will be deleted leaving the table below:

pID | Reference
----------------
3 | 1
5 | 2
7 | 3

Many thanks in advance. The script has to work without any values being passed to it so when I open the ASP page it automatically goes through the DB and deletes archived pages where the it's not the MAX record for that particular section."

nr
SQLTeam MVY

12543 Posts

Posted - 2005-02-04 : 08:30:13
delete tbl
from tbl t
where exists (select * rfom tbl t2 where t.Reference = t2.Reference and t2.pID < t.pID)

delete tbl
from tbl t
join (select Reference, pID = max(pID) group by Reference having count(*) > 1) t2
on t.Reference = t2.Refernece
and t.pID <> t2.pID


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2005-02-04 : 08:38:40
Try this

Declare @t table(pID int, Referenc int)
insert into @t values(1,1)
insert into @t values(2,1)
insert into @t values(3,1)
insert into @t values(4,2)
insert into @t values(5,2)
insert into @t values(6,3)
insert into @t values(7,3)

Delete from @t where pId not in
(select Max(pid) from @t group by Referenc)

Select * from @t

Madhivanan
Go to Top of Page

BammBamm
Starting Member

9 Posts

Posted - 2005-02-04 : 15:30:04
Yet another variant on the delete query:

Delete From @t
From @t A
Where A.pid < (Select Max(B.pid) from @t B where B.Referenc = A.Referenc)

[Had to double reference the table to get the alias reference and use it in the subquery, the query works fine]

The "not in" approach proposed earlier is fine. However:
(1) If the table is "large", and/or
(2) the possible return values to search is equally numerous,

you may want to consider the comparison of values to a subquery like this instead of comparing to a list of values. This will give you more options on performance tuning (i.e.: creating an index/indices with the Referenc column).
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-02-04 : 15:41:49
Janusz,

A slightly different approach.

Why not "virtually" delete them by creating a view that filters those rows.
Then do mass archiving at the end of the day...

DavidM

"Always pre-heat the oven"
Go to Top of Page
   

- Advertisement -