Deleting duplicate records - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Deleting duplicate records

Author

Topic

ninel
Posting Yak Master

141 Posts

Posted - 2006-06-14 : 12:41:33

I have a table tmPunchtimeSummary which contains a sum of employee's hours
per day. The table contains some duplicates.


CREATE TABLE [tmPunchtimeSummary] 
(
[iTmPunchTimeSummaryId] [int] IDENTITY (1, 1) NOT NULL ,
[sCalldate] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[sEmployeeId] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[dTotalHrs] [decimal](18, 4) NULL 
) ON [PRIMARY]

INSERT tmPunchtimeSummary (sCalldate, sEmployeeId, dTotalHrs)
VALUES('20060610', '1234', 4.5)

INSERT tmPunchtimeSummary (sCalldate, sEmployeeId, dTotalHrs)
VALUES('20060610', '1234', 4.5)

INSERT tmPunchtimeSummary (sCalldate, sEmployeeId, dTotalHrs)
VALUES('20060610', '2468', 8.0)

INSERT tmPunchtimeSummary (sCalldate, sEmployeeId, dTotalHrs)
VALUES('20060610', '1357', 9.0)

INSERT tmPunchtimeSummary (sCalldate, sEmployeeId, dTotalHrs)
VALUES('20060610', '2345', 8.5)

INSERT tmPunchtimeSummary (sCalldate, sEmployeeId, dTotalHrs)
VALUES('20060610', '2345', 8.5

How can I write a delete statement to only delete the duplicates which in
this case would be the 1st and 5th records?

Thanks,
Ninel

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-06-14 : 12:57:50

[code]
select * from tmPunchtimeSummary

delete from a
from
tmPunchtimeSummary a
join
(
select
sCalldate,
sEmployeeId,
dTotalHrs,
iTmPunchTimeSummaryId = max(iTmPunchTimeSummaryId)
from
tmPunchtimeSummary
group by
sCalldate,
sEmployeeId,
dTotalHrs
having
count(*) <> 1
) b
on
a.sCalldate = b.sCalldate and
a.sEmployeeId = b.sEmployeeId and
a.dTotalHrs = b.dTotalHrs and
a.iTmPunchTimeSummaryId <> b.iTmPunchTimeSummaryId

select * from tmPunchtimeSummary

[/code]

Results:

[code]
iTmPunchTimeSummaryId sCalldate sEmployeeId dTotalHrs
--------------------- -------------------- -------------------- --------------------
1 20060610 1234 4.5000
2 20060610 1234 4.5000
3 20060610 2468 8.0000
4 20060610 1357 9.0000
5 20060610 2345 8.5000
6 20060610 2345 8.5000

(6 row(s) affected)

(2 row(s) affected)

iTmPunchTimeSummaryId sCalldate sEmployeeId dTotalHrs
--------------------- -------------------- -------------------- --------------------
2 20060610 1234 4.5000
3 20060610 2468 8.0000
4 20060610 1357 9.0000
6 20060610 2345 8.5000

(4 row(s) affected)

[/code]

CODO ERGO SUM

ninel
Posting Yak Master

141 Posts

Posted - 2006-06-14 : 13:16:32

Thank you so much

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-06-14 : 13:32:53

If the combo of the columns (sCalldate, sEmployeeId) is your PK, be sure to put a unique constraint on those columns to disallow dups in the future.

Don't just add identities to all tables as primary keys, otherwise you will always need to deal with handling duplicates.

ninel
Posting Yak Master

141 Posts

Posted - 2006-06-14 : 13:43:06

The combination of employeeid, calldate and totalhours is what makes the table unique. How do I create the unique constraint?

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-06-14 : 13:53:48

So for a given calldate, and for a given employeeID, he/she can have more than 1 row?

If so, then how do you know that your example data is duplicates? Wouldn't it be possible to have two entries that legitmately had the same # of hours?

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources