Author |
Topic |
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-10-01 : 18:56:42
|
Say we have an "UPSERT" proc that is subject to a large number of concurrent requests.We want to avoid deadlocks and key errors.Most of the code snippets I have seen for upserts only use updlock.Possibly with a holdlock as well. (which is really needed imo!).Why not just put a xlock straight away, since we are going toeither update or insert. Wouldn't that be more efficient?proc outline:create proc upsert @someKey int ,@someVal intasset nocount onbegin transaction/* now we don't want concurrent requests to be able toA. insert duplicate keysB. deadlockWe use the following hinsts:holdlock + [xlock,updlock]*/if exists( select * from targetTable with([xlock,updlock],holdlock) -- xlock or updlock ? where someKey = @someKey)begin -- row exists, do the update update targetTable set someVal = @someVal where someKey = @someKey -- handle error...endelsebegin -- row does not exists, do the insert insert targetTable ( someKey ,someVal ) values ( @someKey ,@someVal ) -- handle error...endcommit transaction We have a proc, somewhat along the above lines, but it still deadlocks itself from time to time.And I can't seem to understand why rockmoose |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-10-01 : 23:07:36
|
I usually avoid the IF EXISTS test, and just try the one first that I expect to be the most common, INSERT or UPDATE. This way, the check is built into the statement, so there should be very little chance of a deadlock.declare @error intdeclare @rowcount intbegin transactioninsert targetTable ( someKey, someVal )sekect a.someKey, a.someValfrom ( select someKey = @someKey, someVal = @someVal ) a left join targetTable b on a.someKey = b.someKeywhere -- Verify row does not exist b.someKey is nullselect @error = @@error, @rowcount = @@rowcountif @rowcount = 1 begin commit return 0 endupdate targetTableset someVal = @someValwhere someKey = @someKey and someVal <> @someValselect @error = @@error, @rowcount = @@rowcountcommitreturn 0 CODO ERGO SUM |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-10-02 : 02:49:43
|
Thanks, that makes a lot of sense.In my simple testrig I had to lock the table to avoid PK violations though!Which does seems strange, at least I think so atm!...insert targetTable ( someKey, someVal )sekect a.someKey, a.someValfrom ( select someKey = @someKey, someVal = @someVal ) a left join targetTable b with([updlock/xlock],holdlock) -- without this, got PK violations on a.someKey = b.someKeywhere -- Verify row does not exist b.someKey is null... Above I feel like using the xlock.EDIT:Just the HOLDLOCK hint is necessary, forget about UPDLOCK and XLOCK.rockmoose |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-10-02 : 06:11:28
|
How is the promary key being assigned?Ut sounds like the real problem is that the same key is being used for multiple inserts.CODO ERGO SUM |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-10-02 : 06:25:28
|
The key is assigned/created in another system.Then that other system uses the upsert proc like this:begin tran exec upsert @key = 1, @val = 'a' commitbegin tran exec upsert @key = 1, @val = 'b' commitbegin tran exec upsert @key = 1, @val = 'c' commitIt posts these concurrently. (the same ms in profiler).rockmoose |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-10-02 : 06:45:57
|
Is using the same key in the example a typo?I would want all the upserts done in the same transaction. or better yet, in the same INSERT or UPDATE. Get it done with one DB call.insert targetTable ( someKey, someVal )sekect a.someKey, a.someValfrom ( select someKey = @someKey_1, someVal = @someVal_1 union all select someKey = @someKey_2, someVal = @someVal_2 union all select someKey = @someKey_3, someVal = @someVal_3 ) a left join targetTable b on a.someKey = b.someKeywhere -- Verify row does not exist b.someKey is null CODO ERGO SUM |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-10-02 : 07:04:06
|
It's an automated process,it posts a bunch of transactions, and doesn't care if it posts the same key several times with the same posting.There is some date logic in the parameters as well, (like this data was last updated so&so).The proc has some logic as to just update the data where last_updated is > than the current last_updated date.I know, it's not optimal, or even very good.But for the day the processing engine is not going to be rewritten, but the procedure is.So, my question is solely on the PK violation issue and the HOLDLOCK hint.Speaking generally, the code you posted seems subject to PK violations if HOLDLOCK is not specified.And that struck me as strange in the first place.And in the second place I was wondering what people do in this kind of upsert procs.What kind of locking hints do you specify, if any? |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-10-02 : 08:19:44
|
I'm a little bit leary about rolling over from a failed-update to an insert (or vice-versa):Try UPDATEIF @@ROWCOUNT = 0 then do INSERTis perhaps OK butTry INSERTIF @@ERROR <> 0 then do UPDATEstrikes me as dangerous because it implies that the operator did not know that a record pre-existing and they may be overwriting some critical data created by the earlier INSERT (or possibly earlier insert and several subsequent UPDATEs!)We do allow updates where the previous state is not known, but these are rare (there is a flag on the SProc along the lines of @blnOverwriteDontCareWhatWasThereBefore !!) as we signal our UpSert Sprocs:Create new (fail if exists)Update from a known VersionNo/TIMESTAMP - fail if VersionNo is differentUpdate (record must pre-exist, but don't care what version)UpSert (don't care if exists or not)Kristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-10-02 : 08:55:20
|
my take on this:Don't use upsert sprocs.Use an Update sproc and an Insert sproc.Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-10-02 : 09:11:39
|
Spirit: Do you include in that an Sproc that can do either, but takes a parameter saying which it should do?We like the fact that all the logic (much of the validation is the same etc) to do with UpSert of a table is gathered in a single place ...Kristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-10-02 : 09:17:48
|
no i don't.plan caching messes itself up depending on what you do more.if that isn't a concern then i'm for it.Validation is DAL's job not sproc's.what do you consider validation? maybe business logic validation?Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-10-02 : 09:57:29
|
Why did I press REFRESH rather than SAVE? Damn "plan caching messes itself up depending on what you do more"I've been working on the assumption that SQLServer can't screw up a single record transaction based on the PK - never tested that assertion though!"what do you consider validation? maybe business logic validation?"Some of it is supplementing/replacing DAL - certainly nothing like "is this a valid date" - the DAL does that. But I would include stuff where data might be arriving "wonky" from some un-trusted source - but in general that goes through Staging tables and so on, so should arrive at UpSert in clean shape.Order Save: If VoucherCode present check that it is valid for this customer and for the circumstances of the order.Web registration: Check email address does not exist on another account.And we do pre-check some FK stuff - does the WidgetCode exist in the WidgetTable. In the main the DAL takes care of this (e.g. by displaying a SELECT list in the first place!) but there are some more complicated scenarios where the DAL doesn't/can't handle it [with its current abilities at least]. We prefer to return a friendly error where there is likely to be something to catch (rather than a totally unanticipated error, which we treat as terminal) because it is easier [for us] to process a return code than catch & interpret stuff in the ADO.errors() collection or similar.Kristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-10-02 : 10:29:17
|
well if you have atomic updates and insert then i see no problem.but if you have a xml insert etc then caching comes in handy.my personal prefernce is still 2 sprocs.maybe because i've never had a project that had a lot of business logic in sporcs. Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-10-02 : 11:57:46
|
Are the upserts being submitted on different connections? I guess that would explain the PK violations. If so, what is the mechanism to insure that the upserts are applied in the correct order? It's almost like the application is throwing data at the database to see which sticks.Using HOLDLOCK has the same effect as this, so your application may have low concurrency.SET TRANSACTION ISOLATION LEVEL SERIALIZABLE If this is mainly a data collection process with no need for real-time access to the data, you might consider just inserting the data into a transaction table with an IDENTITY key, and then use a scheduled batch job to process the data into your current table one row at a time in KEY sequence. That should eliminate PK violations, deadlocks, and other concurrency issues. Of course, ther are plenty of possible disadvantages to this approach, especially data-lag and throughput. CODO ERGO SUM |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-10-02 : 12:24:52
|
What happened to Microsoft Queue (if I've remembered the name correctly), that might pip you the data in a sensible data with a guarantee of no data loss (if I've remembered what the marketing man told me 10 years ago!)Kristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-10-02 : 12:35:43
|
forget MS Queue, Service broker is here. Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-10-02 : 12:40:22
|
But I'm trying to avoid Service broken ... |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-10-02 : 12:44:01
|
why? you don't like new technology? what kind of a geek are you? Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-10-02 : 13:07:47
|
Dyslexic geek ... |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-10-02 : 16:14:14
|
Yes I've been thinking of alternative approaches along those lines. (some sort of queued processing).I was watching profiler for a while 1200+ calls during 4 minutes, and many simultaneous ones (same ms).It comes in bursts, but < 1mil a day so far.It seems they just throw data, The ratio update:insert must be 90:10.It might well be that a large % of the updates are "unnecessary". I don't know yet.The initial problem was with rewriting the proc itself, to better handle the load, and thanks for all the input there.I will be disussing this further with the devs.We are looking into service-broker both here&there, but I am not seriously involved yet, we still have the upgrade (2K5) to do.This particular system that we are communicating with is one candidate.But just throwing another technology/product into the mix is not something I will allow until we sort some things out.If they can't handle a gun,I'm sure as hell not going to let them handle a machine-gun ;)Next thing I know, I come to work and there are 40GB of unhandled xml messages Actually I am a bit wary of "black-box" techniques such as SB.> What's the problem?> Stuck in a queue! I agree with Mladen,one update sprocone insert sprocLet the BL at least be savvy enough to know which one to call, how hard can it be!And also with KristenWe have lots of business logic in the procs.rockmoose |
 |
|
|