Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Bex
Aged Yak Warrior
580 Posts |
Posted - 2005-03-03 : 10:09:21
|
| Afternoon allMy scenario: We perform a run of client files into a DB every day. If something goes wrong, the run has to be done again. I have a sproc to delete the records from the previous run, but alas, people forget to ask me to do it. I have now implemented a mechanism to prevent re-runs on the same day, but I still have duplicates to remove.Solution:After much grit, sweat and toil, I have finally written a solution that deletes the latest duplicate. However, there are at times more than two duplicates. So rather than using MAX(ID field), I would like to use TOP <number of duplicates>. Below is the query and the part to change is highlighted in red.Query:CREATE PROCEDURE [dbo].[usp_DeleteDuplicateRecords](@cycleInstanceDate SMALLDATETIME)ASDELETE LetterRepository FROM LetterRepository L INNER JOIN(SELECT MAX(RepositoryID) AS RepositoryID, T1.A000_Document, T1.MailDate, T1.CycleInstanceID, T1.LetterCodeID FROM LetterRepository T1 INNER JOIN (SELECT A000_Document, MailDate, R.CycleInstanceID, LetterCodeID FROM LetterRepository R INNER JOIN CycleInstance C on C.CycleInstanceID = R.CycleInstanceID GROUP BY A000_Document, MailDate, R.CycleInstanceID, LetterCodeID, C.CycleInstanceDate HAVING COUNT(A000_Document) > 1 AND C.CycleInstanceDate = @cycleInstanceDate ) AS T2 ON T1.A000_Document = T2.A000_Document AND T1.MailDate = T2.MailDate AND T1.CycleInstanceID = T2.CycleInstanceID AND T1.LetterCodeID = T2.LetterCodeIDGROUP BY T1.A000_Document, T1.MailDate, T1.CycleInstanceID, T1.LetterCodeID) as T3 ON L.RepositoryID = T3.RepositoryID AND L.A000_Document = T3.A000_Document AND L.MailDate = T3.MailDate AND L.CycleInstanceID = T3.CycleInstanceID AND L.LetterCodeID = T3.LetterCodeIDGOChange the text in red to:TOP ? RepositoryID(and include an ORDER BY clause)Problem:Because I do not know how many times the run has been done, I would like to dynamically count the number of duplicates, and delete accordingly:To count the duplicates:SELECT COUNT(*)FROM LetterRepository R INNER JOIN CycleInstance C on C.CycleInstanceID = R.CycleInstanceID GROUP BY A000_Document, MailDate, R.CycleInstanceID, LetterCodeID, CycleInstanceDate HAVING COUNT(A000_Document) > 1 AND C.CycleInstanceDate = @cycleInstanceDateIs there a way to assign this to a parameter, and then use this parameter in the TOP clause? Like so:DECLARE @counter INTSET @counter = SELECT......ETC..see aboveDELETE LetterRepository FROM LetterRepository L INNER JOIN(SELECT SELECT TOP @counter RepositoryID ......ETCI have tried the above and it doesn't work, I get the following error message:Server: Msg 170, Level 15, State 1, Line 1Line 1: Incorrect syntax near '@count'.Any suggestions?????Hearty head pats |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2005-03-03 : 10:39:13
|
| How about getting the MIN (ID field) and use an outer join to delete all dupes except the minimum one?Be One with the OptimizerTG |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2005-03-03 : 10:53:48
|
set rowcount @counterSELECT RepositoryID ......ETCinto #tempfrom...set rowcount 0DELETE LetterRepositoryFROM LetterRepository L INNER JOIN#temp on etc...Go with the flow & have fun! Else fight the flow |
 |
|
|
Bex
Aged Yak Warrior
580 Posts |
Posted - 2005-03-03 : 10:59:13
|
Fantastic! Thats just what I need!! Thank you both!Hearty head pats |
 |
|
|
|
|
|
|
|