We do this in one of two ways (using a staging table as nosepicker suggests):
1. Pre-delete any matching PKs, then INSERT the lot. For this we ONLY pull "newly dated" records into the Staging table (the source table has an Update Date column which is set whenever the record changes, and we remember what date we have already processed up to, and then just pull anything with a more recent date). This will NOT transfer any deletes
2. We get ALL the data into a staging table. Then we JOIN the staging table to the "Live" table for a set of statements:
a) Delete anything in Destination table that is absent from Source/Staging table
DELETE D
FROM MyDatabaseName_DST.dbo.[MyTable] AS D
WHERE NOT EXISTS
(
SELECT *
FROM MyDatabaseName_SRC.dbo.[MyTable] AS S
WHERE
D.[MyPK1] = S.[MyPK1]
AND D.[MyPK2] = S.[MyPK2]
)
the deletes are done in FK order - children first, then parents.
b) Update anything in Destination table that is different to Source/Staging table
We use a "large" WHERE clause covering every column; it is not possible to compare TEXT columns, so we just compare the lengths, and the first 8000 characters.
UPDATE D
SET
[MyPK1] = S.[MyPK1],
[MyPK2] = S.[MyPK2],
[MyCol1] = S.[MyCol1],
[MyStringCol2] = S.[MyStringCol2],
[MyTextCol3] = S.[MyTextCol3],
...
-- SELECT TOP 100 'D', D.*, CHAR(13)+CHAR(10)+'S ', S.*, CHAR(13)+CHAR(10)+'----------'
FROM MyDatabaseName_DST.dbo.[MyTable] AS D
JOIN MyDatabaseName_SRC.dbo.[MyTable] AS S
ON D.[MyPK1] = S.[MyPK1]
AND D.[MyPK2] = S.[MyPK2]
WHERE
(D.[MyPK1] <> S.[MyPK1]
OR (D.[MyPK1] IS NULL AND S.[MyPK1] IS NOT NULL)
OR (D.[MyPK1] IS NOT NULL AND S.[MyPK1] IS NULL))
OR (D.[MyPK2] <> S.[MyPK2]
OR (D.[MyPK2] IS NULL AND S.[MyPK2] IS NOT NULL)
OR (D.[MyPK2] IS NOT NULL AND S.[MyPK2] IS NULL))
OR (D.[MyCol1] <> S.[MyCol1]
OR (D.[MyCol1] IS NULL AND S.[MyCol1] IS NOT NULL)
OR (D.[MyCol1] IS NOT NULL AND S.[MyCol1] IS NULL))
OR (D.[MyStringCol2] COLLATE Latin1_General_BIN <> S.[MyStringCol2]
OR (D.[MyStringCol2] IS NULL AND S.[MyStringCol2] IS NOT NULL)
OR (D.[MyStringCol2] IS NOT NULL AND S.[MyStringCol2] IS NULL))
OR (DATALENGTH(D.[MyTextCol3]) <> DATALENGTH(S.[MyTextCol3])
OR CONVERT(varchar(8000), D.[MyTextCol3]) COLLATE Latin1_General_BIN <> CONVERT(varchar(8000), S.[MyTextCol3])
OR (D.[MyTextCol3] IS NULL AND S.[MyTextCol3] IS NOT NULL)
OR (D.[MyTextCol3] IS NOT NULL AND S.[MyTextCol3] IS NULL))
...
note that we use COLLATE Latin1_General_BIN for comparison so that even an Upper/Lower case difference is considered "significant"
c) Then we insert any new records:
SET IDENTITY_INSERT MyDatabaseName_DST.dbo.[MyTable] ON
INSERT INTO MyDatabaseName_DST.dbo.[MyTable]
(
[MyPK1],
[MyPK2],
[MyCol1],
[MyStringCol2],
[MyTextCol3],
...
)
SELECT S.*
FROM MyDatabaseName_SRC.dbo.[MyTable] AS S
WHERE NOT EXISTS
(
SELECT *
FROM MyDatabaseName_DST.dbo.[MyTable] AS D
WHERE
D.[MyPK1] = S.[MyPK1]
AND D.[MyPK2] = S.[MyPK2]
)
SET IDENTITY_INSERT MyDatabaseName_DST.dbo.[MyTable] OFF
We do the Updates and Inserts in opposite FK order to the deletes - i.e. Parents first, then Children.
Kristen