Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SQL Server Administration (2005)
 Identifying a rouge query

Author  Topic 

dmeagor
Starting Member

16 Posts

Posted - 2010-03-29 : 13:03:26
Hi

I hope some genius here can help me. For some reason queries that normally run fine (in a fraction of a second) are timing out (30 sec's+) This seems to happen for about 10 minutes until things return to normal. It would only appear to be one table and I believe you can still read from it but just can't use UPDATE/INSERT/DELETE.

I'm assuming profiler is the way to identify this but I'm not having much success configuring it.


My best (unverified) guess is that it's a delete/clear option we have on the site because it's the only thing that can take time to complete but it could be something else I've missed. There are there are lots of smaller deletes and updates going on every second. The big delete query is below...

DELETE FROM tblBigTable WHERE myId=xyz

In this case myId is the primary key. Would this query block write access to the whole table whilst it completed?

Help would be very much appreciated!


David Meagor

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-03-29 : 13:04:57
have you tried running sp_who2 to check any other process which is causing blocking whilst you're queries are run?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-29 : 13:17:57
quote:
Originally posted by visakh16

have you tried running sp_who2 to check any other process which is causing blocking whilst you're queries are run?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/





Thanks for replying!

I will try sp_who2 next time however because it's just one table I had assumed that it was some kind of write lock rather than another process running. There are no maintenance/backup/autoshrink scripts running during the time as far as I can see.

David Meagor
Go to Top of Page

namman
Constraint Violating Yak Guru

285 Posts

Posted - 2010-03-29 : 13:29:06
There are lots of smaller deletes and updates going on every second. The big delete query is below...

DELETE FROM tblBigTable WHERE myId=xyz

In this case myId is the primary key. Would this query block write access to the whole table whilst it completed?




Check the smaller deletes and updates queries. Do they delete/update table base on primary key? If NOT, sql server may lock page or even entire object. So your big delete query must wait for small queries to finish their jobs.
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-29 : 13:40:03
The small delete queries other than the one below contain either the primary key (form_id) or an index (session_id.) qu_id and page are not indexed.

Examples
DELETE FROM tblBigTable WHERE form_id=123456 AND qu_id=654654
DELETE FROM tblBigTable WHERE session_id = 123456789 AND page = 2
DELETE FROM tblBigTable WHERE form_id=123456 AND qu_id=615651

It is the smaller queries that are timing out.

Also I've been checking through the profiler results and I cannot see any mention of the big delete query. I am using the default setting and filtering duration for 10000. Would this show queries that timeout or just those that complete after a long time?

David Meagor
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-03-29 : 14:05:02
I think if you run sp_who/sp_who2 during the problem, you'll notice blocking. You need to find the parent blocker and figure out if an index is missing. I have a feeling that some DML operation is inefficient (missing index) or is processing too many rows in one batch. The query optimizer may be choosing to exclusively lock the table, hence the issue. So your first goal is to find the parent blocker.

You need to change your session_id index so that page is also included. You should also create a covering index for form_id and qu_id.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-29 : 15:23:37
I will do however there has not been a serious blockage occur in a few hours now.

I am sitting here watching the profiler with my finger on the trigger so to speak I don't suppose these is an automated way of doing this.

Also someone has told me that running profiler on the production server is dangerous although it's only reporting transactions which take over 15 seconds which isn't many.



David Meagor
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-29 : 17:32:56
quote:
Originally posted by tkizer

I think if you run sp_who/sp_who2 during the problem, you'll notice blocking. You need to find the parent blocker and figure out



OK It just went down so I ran the sp_who2 command several times.

There are 4 queries (all INSERTS and DELETES) listed with BlkBy numbers 100. When I check the SPID 100 process it has a BlkBy number of 61. SPID61 is the SELECT command below.

If that's right then they are actually waiting for a SELECT command to finish. However I just ran the select query again and it competed in 1 second so now I'm completely confused. Something must have stopped the SELECT query but it didn't show any BlkBy no.

I had profiler running at the same time picking up anything over 15seconds) and this is what it lists the SPID's and durations as (they both completed in the same second, the SELECT started first as you'd expect.)

SPID61
reads: 147526 cpu: 422 duration: 253745

SELECT tblBigTable.qId, tblBigTable.qu_sub_id, tblBigTable.numericAnswer, tblBigTable.textAnswer, tblBigTable.session_id, tblSession.dateStamp,tblSession.tracking,firstname,lastname,email,cust,lcode FROM tblBigTable ,tblSession LEFT JOIN tblParticipant on tblParticipant.session_id=tblSession.session_id WHERE tblBigTable.session_id=tblSession.session_id AND tblBigTable.form_id = 712587 ORDER BY session_id ASC

SPID 100.
Reads: 16 writes: 12: duration: 144233
DELETE FROM tblBigTable WHERE form_id=732707 AND qId=5910957

David Meagor
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-03-29 : 17:40:16
You should switch your isolation level to READ_COMMITTED_SNAPSHOT to avoid reads from blocking writes.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-29 : 18:19:25
quote:
Originally posted by tkizer

You should switch your isolation level to READ_COMMITTED_SNAPSHOT to avoid reads from blocking writes.


Thanks, I'll read up on that now. It sounds like something that could have various pros/cons?

Do you have any idea why the (61) SELECT query that normally runs in 1 second can be blocking the later (100) DELETE transactions for 5-10 minutes?

David Meagor
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-03-29 : 18:29:55
The only con I know about is the tempdb usage.

Maybe spid61 had a bad execution plan. Maybe the statistics were outdated. Maybe spid61 was blocked by another query.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-30 : 08:42:58
Thanks for all of your help so far. I think this query is definitly the cause as the error occurred twice more during the night and each time the same query had been run and appeared to be holding the rest up.

Do long running SELECT queries normally block tables whilst they complete or is it just the LEFT JOIN part causing an issue?

Looking at the execution plan it shows...

83% doing a KEY LOOKUP on a the LEFT JOIN to tblParticipant (In these cases there were no matching records)
8% doing KEY LOOKUP (clustered, primary) on tblBigTable
7% doing CLUSTERED INDEX SEEK on tblSession

Any ideas what I can do to stop this or is the READ_COMMITTED_SNAPSHOT setting the only option?



David Meagor
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2010-03-30 : 10:17:11
Post the SELECT query.

Key lookups are where the primary key has been found in the index but the other information you want to bring back isn't in the index. In that case the engine has to then read the required info by scanning each row in the KEY LOOKUP.

You can remove these by

1) returning only the data you need
2) if possible INCLUDE the columns you require in the INDEX that is used.

NB:This is quite a wooly description of a KEY LOOKUP.
There are a lot of good topics on this however, Here's one:
http://blog.sqlauthority.com/2009/11/09/sql-server-removing-key-lookup-seek-predicate-predicate-an-interesting-observation-related-to-datatypes/
Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION[/red]
Go to Top of Page

namman
Constraint Violating Yak Guru

285 Posts

Posted - 2010-03-30 : 13:55:39

SPID61
reads: 147526 cpu: 422 duration: 253745

SELECT tblBigTable.qId, tblBigTable.qu_sub_id, tblBigTable.numericAnswer, tblBigTable.textAnswer, tblBigTable.session_id, tblSession.dateStamp,tblSession.tracking,firstname,lastname,email,cust,lcode FROM tblBigTable ,tblSession LEFT JOIN tblParticipant on tblParticipant.session_id=tblSession.session_id WHERE tblBigTable.session_id=tblSession.session_id AND tblBigTable.form_id = 712587 ORDER BY session_id ASC

SPID 100.
Reads: 16 writes: 12: duration: 144233
DELETE FROM tblBigTable WHERE form_id=732707 AND qId=5910957

Do you have any idea why the (61) SELECT query that normally runs in 1 second can be blocking the later (100) DELETE transactions for 5-10 minutes?


Do long running SELECT queries normally block tables whilst they complete or is it just the LEFT JOIN part causing an issue?




Normal read is using share lock (except updlock). It does not block others. So I think the cause of problem is somewhere else.

How many rows have been deleted after you run
DELETE FROM tblBigTable WHERE form_id=732707 AND qId=5910957

If millions, of course, it takes long time to run.

If only few rows, I think that process has been blocked by other delete/update/insert processes. In this case, create some indexes (or INCLUDE) colums as people in this thread already suggested. That helps those modify processes to run faster and prevent lock escalation.
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-30 : 15:04:18
OK I've learned something new and irritating about SQL server/ADO which seems to be the cause of my problems (and likely other people's too.) It would seem that Profiler does not record the duration until the ADO Recordset is closed even if that's a long time after the query really finished so I've been barking up the wrong tree.

Basically the query does execute in under 1-4 seconds however the Recordset object in ASP is left open whilst other queries are run based on its results in a loop which takes up some time. When I run this script the table is write blocked.

rs1=cn.execute sql ' takes 1 second

do while not rs1.eof ' takes 10 minutes for loop to complete.
rs2=cn.execute anotherSql
do something...
loop
rs1.close ' profiler records duration for initial sql statement here.

I am also guessing that SQL server is maintaining the lock/block on the tblBigTable until the first recordset is closed. I have therefore tried using a #temp table to hold the results of the tblBigTable and tblSession first and then then in the same batch did the left join with the the #temp table instead. So far that 'appears' to have worked avoid the blocking issue. I guess it's just blocking the temp table now which doesn't effect anything else.

It's a poor ASP script I know but I'm still not sure why SQL server would block the whole table just because the recordset isn't closed.

I'm hoping this solution works whilst I rewrite the script itself properly.

David Meagor
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-03-30 : 17:18:29
quote:
Originally posted by namman

Normal read is using share lock (except updlock). It does not block others. So I think the cause of problem is somewhere else.



Reads do block writes in the default isolation level. That's the whole point of switching to READ_COMMITTED_SNAPSHOT isolation level so that reads take a snapshot of the data and do not lock anything.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-03-30 : 17:21:27
quote:
Originally posted by dmeagor

OK I've learned something new and irritating about SQL server/ADO which seems to be the cause of my problems (and likely other people's too.) It would seem that Profiler does not record the duration until the ADO Recordset is closed even if that's a long time after the query really finished so I've been barking up the wrong tree.



It doesn't show up in SQL Profiler until the ADO recordset is closed because you are inside a transaction. It isn't considered completed until the transaction completes.

quote:
Originally posted by dmeagor

It's a poor ASP script I know but I'm still not sure why SQL server would block the whole table just because the recordset isn't closed.



It really depends on how many rows you have locked inside the transaction. SQL Server may have chosen to take an exclusive lock on the table.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

DBA in the making
Aged Yak Warrior

638 Posts

Posted - 2010-03-30 : 20:35:26
quote:
Originally posted by dmeagor
Basically the query does execute in under 1-4 seconds however the Recordset object in ASP is left open whilst other queries are run based on its results in a loop which takes up some time. When I run this script the table is write blocked.

rs1=cn.execute sql ' takes 1 second

do while not rs1.eof ' takes 10 minutes for loop to complete.
rs2=cn.execute anotherSql
do something...
loop
rs1.close ' profiler records duration for initial sql statement here.



Try using a disconnected recordset, like this:

http://support.microsoft.com/kb/184397

I'm pretty sure that disconnecting the recordset will prevent it from locking the table.

There are 10 types of people in the world, those that understand binary, and those that don't.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-03-31 : 04:11:52
" It's a poor ASP script I know but I'm still not sure why SQL server would block the whole table just because the recordset isn't closed"

" Try using a disconnected recordset"

or you could use GetRows to "stream" the recordset into an Array - and then process from the array - obviously the array takes some memory, particularly if the recordset is large!

Possible upsides are: easier traversal of the recordset (go back, skip forwards, count rows [i.e. in the array]) and the ability to pre-process the data in the array - e.g. we have some "hints" for "data pre-processing" that are applied to the Array before the application gets to use it - so we can format dates ahead of application using them, etc
Go to Top of Page

dmeagor
Starting Member

16 Posts

Posted - 2010-03-31 : 15:16:50
Thanks for the ideas. I'm familiar with the getRows array method but I've not used a disconnected recordset before.

The disconnected recordsets sound like a nice way of working with large amounts of data. Does anyone know if there's a big performance drain on using them vs GetRows/arrays. I assume the drain is on the web server not the database server anyway which is far less important since this page is not often called?


David Meagor
Go to Top of Page

DBA in the making
Aged Yak Warrior

638 Posts

Posted - 2010-04-01 : 11:50:27
GetRows/arrays should work also, for pretty much the same reason. The recordset is not longer associated with the transaction. At a guess, I'd say it would perform better also. But the downside is that arrays only have a single data type, so each value would be converted to a string before you would be able to use them.

As a general rule, when you're using an ADO recordset object like this (Connect, populate via query, iterate and do work), you should always use a disconnected recordset. Once the recordset has been populated, there's no need to keep it connected to the server.

There are 10 types of people in the world, those that understand binary, and those that don't.
Go to Top of Page
    Next Page

- Advertisement -