Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-17 : 07:47:11
|
| I'm trying to complete a sort of work-queue design, except that the queue items will be touched a number of times until a max is reached.We're seeing an exessive amount of blocking under heavy load - which *could* be beause the hard drive is badly fragmented (which our hosting company is currently working on), or because I've botched the locking hints.The locking mechanism for the rows happens in p_WorkQueueReserve.Anyone have any ideas on this?Here's a code subset to re-create the issue:CREATE TABLE [dbo].[WorkQueue] ( [QueueID] int IDENTITY(1,1) NOT NULL, [QueueGroupID] int NOT NULL, [OwnerID] int NOT NULL, [TouchDueCnt] int NOT NULL, [TouchCnt] int NULL, [GUID] uniqueidentifier NULL, [InProcessDttm] datetime NULL, [HaltDttm] datetime NULL, [TouchDttm] datetime NULL )INSERT INTO [dbo].[WorkQueue] ([QueueGroupID], [OwnerID], [TouchDueCnt], [TouchCnt], [HaltDttm], [TouchDttm]) VALUES (1, 7, 5, 1, NULL, DATEADD(d, -2, GETDATE()))INSERT INTO [dbo].[WorkQueue] ([QueueGroupID], [OwnerID], [TouchDueCnt], [TouchCnt], [HaltDttm], [TouchDttm]) VALUES (1, 7, 5, 0, NULL, DATEADD(mi, -20, GETDATE()))INSERT INTO [dbo].[WorkQueue] ([QueueGroupID], [OwnerID], [TouchDueCnt], [TouchCnt], [HaltDttm], [TouchDttm]) VALUES (1, 6, 5, 4, NULL, DATEADD(mi, -2, GETDATE()))INSERT INTO [dbo].[WorkQueue] ([QueueGroupID], [OwnerID], [TouchDueCnt], [TouchCnt], [HaltDttm], [TouchDttm]) VALUES (2, 4, 5, 0, NULL, DATEADD(d, -2, GETDATE()))CREATE TRIGGER [dbo].[updWorkQueue] ON [dbo].[WorkQueue] FOR UPDATEASBEGIN -- Halt any work queue rows that have reached their max touchcnt IF UPDATE([TouchCnt]) BEGIN UPDATE [dbo].[WorkQueue] WITH (ROWLOCK) SET [HaltDttm] = CASE WHEN i.[TouchCnt] >= wq.[TouchDueCnt] THEN GETDATE() WHEN wq.[HaltDttm] IS NOT NULL THEN wq.[HaltDttm] ELSE NULL END FROM inserted i JOIN [dbo].[WorkQueue] AS wq WITH (NOLOCK) ON wq.[QueueID] = i.[QueueID] WHERE i.[TouchCnt] >= wq.[TouchDueCnt] END ENDCREATE PROCEDURE [dbo].[p_WorkQueueReserve] @queueGroupID INT, @queueID INT OUTPUT, @ownerID INT OUTPUTASBEGIN DECLARE @guid UNIQUEIDENTIFIER SET @queueID = NULL -- Get a globally unique ID for this event to insure we get a single unique row SET @guid = NEWID() -- Flag and reserve the next spoke UPDATE [dbo].[WorkQueue] WITH (ROWLOCK) SET [GUID] = @guid, [InProcessDttm] = GETDATE() WHERE [InProcessDttm] IS NULL AND [QueueID] = ( SELECT TOP 1 [QueueID] FROM [dbo].[WorkQueue] WITH (UPDLOCK, ROWLOCK, READPAST) WHERE [InProcessDttm] IS NULL AND [HaltDttm] IS NULL AND [QueueGroupID] = @queueGroupID ORDER BY [TouchDttm] ASC) -- Get the queue information to pass back SELECT @queueGroupID = [QueueGroupID], @queueID = [QueueID], @ownerID = [OwnerID] FROM [dbo].[WorkQueue] WITH (NOLOCK) WHERE [GUID] = @guid RETURNENDGOCREATE PROCEDURE [dbo].[p_WorkQueueRelease] @queueID INT, @resultInd BITASBEGIN IF @resultInd = 1 -- Success UPDATE [dbo].[WorkQueue] WITH (ROWLOCK) SET [InProcessDttm] = NULL, [TouchCnt] = ISNULL([TouchCnt], 0) + 1, [TouchDttm] = GETDATE() WHERE [QueueID] = @queueID ELSE UPDATE [dbo].[WorkQueue] WITH (ROWLOCK) SET [InProcessDttm] = NULL WHERE [QueueID] = @queueID RETURNENDGOselect * from [dbo].[WorkQueue] -- see what's thereDECLARE @queueGroupID INT, @queueID INT, @ownerID INT, @SUCCESS bit, @FAILURE bit, @setResultInd BITSET @SUCCESS = 1SET @FAILURE = 0SET @queueGroupID = 1EXEC [dbo].[p_WorkQueueReserve] @queueGroupID, @queueID OUTPUT, @ownerID OUTPUTSELECT @queueID, @ownerIDSET @setResultInd = 1 -- 1=Success,0=Failure-- Do something that works with the queue and release with SUCCESS IF @setResultInd = 0 EXEC [dbo].[p_WorkQueueRelease] @queueID, @SUCCESS-- or "rollbacK" with FAILURE without updating the LastUpDttmELSE EXEC [dbo].[p_WorkQueueRelease] @queueID, @FAILURE* endAnd THANKS in advance. |
|
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2006-04-17 : 15:43:15
|
| Have you tried REMOVING all the hints, and just letting the optimizer do its job? |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-17 : 15:53:48
|
| LOL - riiiggghhhttt. Without hints, SQLServer likes to do a table or full database lock. A known "bug" called agressive lock escallation. We have millions of rows - without hints, everything locks up in a matter of seconds. |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2006-04-17 : 16:03:07
|
| Could you post a link to this "known bug"?Tara Kizeraka tduggan |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-17 : 16:21:37
|
| OK, Microsoft does not apparently consider this a bug. But it's a well known "feature"- even Books OnLine suggests that hints may be required in future releases. The problem is that for small databases, the optimzer does a pretty good job - but when you get a lot of rows, things happen. And yes, our transactions are as small as we can make them - we have tons of performance code here to keep it running. Here are a couple of references to lock escallation:http://www.sql-server-performance.com/lock_contention_tamed_article.asp"In practice and under high load, SQL Server's locking system, which is based on lock escalation, does not perform well. Why? Lock contention. "http://support.microsoft.com/default.aspx?scid=kb;en-us;323630&sd=tech |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2006-04-17 : 16:26:22
|
| How many rows do your update statements affect?Tara Kizeraka tduggan |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-17 : 16:33:36
|
| In theory, the update should be affecting only ONE row - it specifies: WHERE QueueID = (SELET TOP 1 QueueID WHERE...) Unless SQL Server is holding locks on every potential row in calculating the TOP 1. It doesn't appear to be doing that though.We've tried it every way we can think of, including splitting into a SELECT @queueID = QueueID WHERE and then doing the UPDATE as separate statements inside the TRANSACTION, with locking hints, without locking hints - all with no change in behavior (except without hints sometimes the entire DB locks up requiring a DB restart). Part of the issue is that SQL Server is holding locks on some of the rows - even after the BEGIN/COMMIT statement in the Reserve procedure and even after Release has been called. Why is still a mystery to me. Any clues, hints, or thoughts you have will be greatly appreciated. |
 |
|
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2006-04-17 : 17:14:35
|
| OK, forgive me but I have to ask some basic stuff first...The DDL script you supplied does not add any indexes to your WorkQueue table. You do have this indexed/primary-keyed, etc...right? And what columns are indexed? |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-17 : 17:37:09
|
| Excellent point - I'd left them out for brevity:ALTER TABLE [dbo].[WorkQueue] WITH NOCHECK ADD CONSTRAINT [PK_WorkQueue] PRIMARY KEY CLUSTERED ([QueueID]) ON [SECONDARY]GOCREATE INDEX [IX_WorkQueue_QGID] ON [dbo].[WorkQueue]([QueueGroupID]) WITH FILLFACTOR = 90 ON [SECONDARY]GOCREATE INDEX [IX_WorkQueue_GUID] ON [dbo].[WorkQueue]([GUID]) WITH FILLFACTOR = 90 ON [SECONDARY]GOCREATE INDEX [IX_WorkQueue_QGID_IPDTM] ON [dbo].[WorkQueue]([QueueGroupID],[InProcessDttm]) WITH FILLFACTOR = 90 ON [SECONDARY]GO |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-04-17 : 19:04:30
|
I would be interested to see what kind of query plan you get on this code from the subquery of the update in the p_WorkQueueReserve stored procedure. My guess is that you are getting a table scan, or at best an index scan. Since this is part of an update statement, I would not be surprised to see intense blocking/locking and deadlock problems.SELECT TOP 1 [QueueID] FROM [dbo].[WorkQueue] WITH (UPDLOCK, ROWLOCK, READPAST)WHERE [InProcessDttm] IS NULL AND [HaltDttm] IS NULL AND [QueueGroupID] = @queueGroupID ORDER BY [TouchDttm] ASC You might see an improvement in performance by creating an index to cover the subquery, something like this:CREATE INDEX [IX_WorkQueue_QGID_IPDTM]ON [dbo].[WorkQueue]([QueueGroupID], [TouchDttm], [InProcessDttm], [HaltDttm] ) Also, you have "[InProcessDttm] IS NULL" in both the where clause of the Update, and in the subquery. This is also very likely to lead to a table scan (or maybe 2), so I would remove it from the where clause of the update, and leave the condition in the subquery.CODO ERGO SUM |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-17 : 19:36:54
|
| Thanks - and the covering index doesn't seem to add anything:SELECT 0%Sort/TopN Sort 77%Filter 0%Bookmark Lookup 0%WorkQueue.IX_WorkQueue_QGID_IPDTM 22%We added the additional WHERE InprocessDttm IS NULL hoping it would improve the results (didn't have an effect). |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-04-18 : 10:19:30
|
| Our hosting company last night defragged the F drive (our data drive, indexes and temp on on other drives). One fragment remains on F for some reason - they're looking into that, and moved us to SP4 Build 2187 (we were on Build 2162, which Microsoft has pulled).So - this morning, we're still seeing:SQLServer is holding locks for some reason on the WorkQueue table - even after the BEGIN/COMMIT transaction in p_WorkQueueReserve. The only place WorkQueue is being updated is with the 2 services.I think that may be a clue to the issue. |
 |
|
|
LinuxLars
Starting Member
9 Posts |
Posted - 2006-05-03 : 10:25:44
|
| Just to close this issue - it appears the issue related to row locking - I don't believe the row locks are being released properly.As a work-around, we re-engineered the entire workflow process, so that all incoming requests are "staged" - and only one thread assigns the work queue rows. This works. |
 |
|
|
|
|
|
|
|