Author |
Topic |
psetterfield
Starting Member
7 Posts |
Posted - 2010-02-02 : 16:20:17
|
Hello,I hope somebody can help or offer advice. My question has 2 parts.First part is on index use or over use. I have a SQL Server table called Documents which stores 3 types of document - Invoices, Estimates and Jobsheets. The Table has columns DocumentID and DocumentType plus many others - Customer, DocumentDate, Paid, Archived and a few more etc.I have specified Indexes on (DocumentID, DocumentType - clustered), Customer, DocumentDate, Paid, Archived. I am finding that on a 23,000 row table - if I run SELECT Customer FROM Documents WHERE Paid = 1, then the Paid Index or Customer Index does not get used - it just does a Clustered Scan instead - so the question is - should I just remove these redundant indexes which do not appear to be getting used? I would never just select Customer by itself, or Paid by itself.2nd Query is on a locking issue. With the Table as mentioned above - if I have one user who selects a single document (Jobsheet for example) with an UPDLOCK - i.e SELECT * FROM Documents WITH (UPDLOCK, ROWLOCK) WHERE DocumentID=1 AND DocumentType = 2 - then for some reason the following query will hang until the lock is released - UPDATE Documents SET Archived = 1 WHERE DocumentType = 0I'm not sure why the 2nd query would hang, because the DocumenType is an indexed value, and so any DocumentType "0" records should be updateable I would have thought?Any help or suggestions appreciated.Just to add after some more research myself...This statment fails if a record if DocumentType = 2 has an UpdateLockUPDATE Documents SET Paid = 1 WHERE Paid = 0 AND On_Hold = 0 AND DocumentType = 0 This statement runs fine - I think the IN is forcing SQL Server to make more sensible use of the (DocumentID,DocumentType) index.UPDATE Documents SET Paid = 1WHERE DocumentType = 0 AND DocumentID IN (SELECT DocumentID FROM Documents WHERE DocumentType = 0 AND On_Hold = 0 AND Paid = 0) |
|
Kristen
Test
22859 Posts |
Posted - 2010-02-03 : 01:31:35
|
" if I run SELECT Customer FROM Documents WHERE Paid = 1, then the Paid Index or Customer Index does not get used - it just does a Clustered Scan instead"An index on a simple Yes/No (or 1/0) value will very rarely be used. If Paid=1 was 10% of the values, or less, AND if the statistics were up to date for the index (not the case if Paid changes rapidly through the day, say), then the index might be used. Otherwise the cost of "Find it in the index, then do a lookup for the actual record" is too great - and the random-access nature of the request means that the disk heads would be jumping around all over the place so, normally, just scanning through all the rows is quicker."should I just remove these redundant indexes which do not appear to be getting used?"Yes.However, if you hadSELECT Customer FROM Documents WHERE Paid = 1 AND CustomerCode BETWEEN 'AAA' AND 'EEE' type queries then an index with keys CustomerCode, [Paid] would be beneficial (make sure CustomerCode is first key). This index is said to "Cover" the query because the WHERE clause (and in this case the SELECT) is covered by the fields in the index. SQL can use a "Range" of the index, from "AAA" to "EEE", and resolve whether the Document is Paid, or not, without having to get the actual record.Dunno about 2, I always program to avoid using those sorts of locks on SELECT statements. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-03 : 07:33:32
|
I will just add that adding locking hints like you do is rarely a good idea. You're basically saying that the query optimizers locking decisions are not good enough and well...I'm not gonna say that it never is the case but in more than 95% of them the query optimizer does a really good job.- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-03 : 07:46:58
|
The other hint I hate to see is (NOLOCK). Similar reasoning - its a bodge to work around actually coding it properly! |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-03 : 08:09:49
|
Whoaaa Kristen! That's another ballpark man...I *totally* disagree! You need to know where NOT to use it but I see no point whatsoever creating locks on data that is never changed. Like in a DWH the whole shabang could be read using NOLOCK (outside loading hours of course). If you think about how many selects one creates that actually has a chance of hitting data under modification I'd say it's less than 20%. I don't know about the applications you work on of course but I created an online trading system for odds a few years back and NOLOCK was used a lot of places. One second after a football game ended the data was by definition "old" and could be read using NOLOCK...- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
psetterfield
Starting Member
7 Posts |
Posted - 2010-02-03 : 08:23:31
|
Locking Hints are a must in my system. We have multiple users accessing jobsheets or invoices for Read or Update. We place an UPDLOCK on a record when someone hits Edit so that no one else can change the same document and have their changes lost etc.By using UPDLOCK we are then forced to use a NOLOCK when reading because someone might just decide they want to run a report on the Invoices in the system as they stand at present. If they couldn't run the report because someone was performing an Edit, it would be a bit silly of us to say "Oh, before you can run any reports, you must make sure everyone else logs out." Without NOLOCK, there is a chance the system just locks up and goes unresponsive until the Lock Timeout is reached. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-03 : 08:31:34
|
It seems like you have a decent grip on what your doing so if it works for you then thats great. I'd consider adding an "BeingUpdated"-column to the table instead though and have a job that runs every 5 mins or something that unlocks records that has been locked for 20 mins or something. Out of curiosity: what happens when another user tries to update a record that has an UPDLOCK? Does it get handled by your application with an error message or something or does it just hang there until the lock is released...?- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-02-03 : 08:32:03
|
your strategey is a performance nightmare from the get-go. update locks aren't truly necessary. and nolock can introduce bugs due to dirty reads and the same records being read multiple times. bad idea.what you should do is create a custom locking scheme. add a column or two to the table(s) in question. when a user *owns* those records you place their userid and perhaps date/time or a status bit in the new columns. when the user is finished u update those columns again.SELECT with UPDLOCK is always going to cause blocking and is almost never a good idea. |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-03 : 09:27:07
|
After reading the article I still believe that my point about using NOLOCK when reading historic/non-changing data is valid. It's like creating a read-only filegroup or partition...data that is rarely or never updated needs no locking, which I believe is the case for the majority of data in any given database. Please disagree, I'm very interested in a discussion on the topic (again )- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-02-03 : 09:42:32
|
Well, I doubt that using nolock improves performance for reading static data.by the way, I meant the OPs strategy, not yours Lumbago...not sure if that was clear. |
|
|
psetterfield
Starting Member
7 Posts |
Posted - 2010-02-03 : 09:55:13
|
I appreciate the comments. I will check out the links as well.The reason I have gone with the UPDLOCK lock is so that the database will look after the locking for me without the need to re-invent the wheel myself. SQL Server is very good at knowing when a client disconnects and then rolls back the lock. If I implemented a locking system myself, I would need a maintenance procedure then to "Clear All Locks" in case of system crash or whatever.To handle locks at present, I keep a short 3 seconds LOCK_TIMEOUT when I try to do a SELECT again with an UPDLOCK so I can then inform the user - "The record is locked by someone else at the moment".Why does SQL have locks if they are not meant to be used?! Microsoft implemented Row Locks at the request of users - in my opinion Locks are not just for the databases internal use only.I think that so long as you know what the consequences are then LOCKS can work very well in your favour. Sometimes I even use READPAST when performing some Tidy Up or when removing temporary records on redundant records and I know that I might not have exclusive access to the table. |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-03 : 14:50:43
|
The debate is religious! but nonetheless, and despite the fact that I have evolved "ways of working", I'm always interested to review them and make sure the way I do things is still sound."Like in a DWH the whole shabang could be read using NOLOCK (outside loading hours of course)"So do you specifically handle it when the data IS loading then? Or do you just ignore the problem and hope that the users don't trip over it, and if they do that they don't suffer too much?I see NOLOCK being used wholesale to work around performance / blocking / other issues. No attempt is made to handle "dirty read" scenarios, such scenarios are usually unrepeatable, and the DEVs never actually attempt to test them; unless they have good logging (designed to find user-issues) they probably have no idea how often (if ever) users are being inconvenienced by the side effects of dirty reads; users are probably going "Oh that always happens around this time I'll a) not use the system or b) press REFRESH and hope it works next time. Both are totally unacceptable IMHO!There is a place for NOLOCK, its just that the only place I've seen it used (except where I wrote the code and deliberately used it!) is where an application experienced blocking / deadlocks, and the solution adopted was to sprinkle the code with NOLOCKS. Problem Solved! ... but as the site scales up the side effects become more frequent, and then eventually to the point where people started complaining, and then the DEVs have no idea how to solve what had become a bad-habit.The only place we use NOLOCK is on diagnostic reports, for DEVs only, of logging tables that are basically WRITE-ONLY, and never intended to be read except in extremis. Any READs on those tables are incredibly intrusive on the WRITES so we use NOLOCK to prevent upsetting the Inserts, and we definitely have no issue with the fact that the reads may be dirty."We place an UPDLOCK on a record when someone hits Edit so that no one else can change the same document"Do you mean you place a database lock, and then allow the user to preform an action before the lock is released? (Just want to be sure I have understood your scenario correctly).If so: How is sorting out "User went to lunch with a record displayed (i.e. locked) on their screen" different to "I would need a maintenance procedure then to "Clear All Locks" in case of system crash"? Neither desirable, both are unfortunate outcomes of that choice. Using database locks is a blunt weapon. Using soft-locks allows user-driven solutions (Supervisor can clear a lock, user can see who's locking the record - and if sat next to them tell them to hurry up, whatever).We use what I think is called Optimistic Locking. Change the record however you like, but if someone else saves the record whilst you are editing then you lose your changes. For us the chances of that happening is tiny, so acceptable to the users. (At least we check! some system just write back the user's changes regardless so "last editor wins" )The problem I have with locking hints, in general, is that I don't see software written to handle the small number of cases where the opposite-happens. Detecting, and handling elegantly, that you have a Dirty Read when using NOLOCK;Using Update Lock and assuming it is still valid when writing back the record - i.e. not checking that the Update Lock is no longer held (and adding a test to the QA tests to ensure that that scenario works)"Why does SQL have locks if they are not meant to be used?!"That is a very good point ... |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2010-02-03 : 15:20:42
|
In many ways, this debate is obsolete for SQL 2005 and above.If you set the database to use row version isolation as the default (READ_COMMITTED_SNAPSHOT) the issues mentioned go away, and you do not have to deal with the problems that can be introduced by NOLOCK hints. Using UPDLOCK is just asking for trouble, because you do not know at what point SQL Server will decide to escalate the lock to a table lock. Remember that hints are just that; SQL Server is free to ignore them.Also, optimistic locking is a much better way of handling the possibility of two people trying to update a row at the same time. With the addition of a row version column, conflicts are easy to detect and give users a chance to retry.CODO ERGO SUM |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-04 : 03:05:00
|
Do you think READ_COMMITTED_SNAPSHOT puts much "strain" on the Server? (i.e. better to use nothing unless you need it? which would come back to my original point that I see NOLOCK being used for the wrong reasons, and using READ_COMMITTED_SNAPSHOT would still hide the underlying problems of poorly written queries, or poor Design. It would however stop the occasional dirty read that those poor programmers are inflicting on their users!) |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-04 : 03:31:34
|
Just so we're clear: I've never advocated the use of NOLOCK to prevent/solve locking/blocking problems. But in quite a few cases it doesn't really matter if the average temperature at the moon differs by 0.0002 degrees just because you did a dirty read if you catch my drift. The first thing I said about using NOLOCK is to know where NOT to use it and if you don't know this then you can't use it at all. I use it on data that could just as well be placed in a read-only filegroup.- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-04 : 10:00:17
|
"But in quite a few cases it doesn't really matter if the average temperature at the moon differs by 0.0002 degrees just because you did a dirty read if you catch my drift."I know exactly where you are coming from, but I still don't agree (with the nuance of your statement above). There is a fair chance that the application (or some modification to it in the future) will error on the fact that the data has Dirty Reads. Some calculation will need to re-query, or the user will find the results are not reproducible if they Refresh etc.I think it is "submarine-ing" a problem for the future, or so intermittent that the users get confused, or DEVs waste lots of time chasing the rare errors.But you are right about "Only use it if you understand it under-the-hood" |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2010-02-04 : 10:27:41
|
quote: Originally posted by Kristen Do you think READ_COMMITTED_SNAPSHOT puts much "strain" on the Server? (i.e. better to use nothing unless you need it? which would come back to my original point that I see NOLOCK being used for the wrong reasons, and using READ_COMMITTED_SNAPSHOT would still hide the underlying problems of poorly written queries, or poor Design. It would however stop the occasional dirty read that those poor programmers are inflicting on their users!)
I have implemented READ_COMMITTED_SNAPSHOT on a number of systems, mainly to eliminate deadlocks, but also to prevent updates from blocking reporting queries. I haven't really seen any difference in resource usage.I have seen it eliminate deadlocks completely where we were getting over 10,000 deadlocks per day in a vendor supplied system. And eliminate complaints about reporting queries being blocked.Truthfully, it's the tool I reach for first when I see deadlocking. Although the deadlocks could be eliminated by proper coding, indexing, etc., most vendors (and our own internal developers) don't seem to have anyone capable of doing this. Since it normally doesn't require any code change to implement, it's fairly painless.CODO ERGO SUM |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-02-04 : 11:57:34
|
One other caveat I want to point out about NOLOCK hint...In a replicated environment inserts (or updates) from selects with (nolock) will generate errors at the distributor. For example:INSERT replicatedTable (c1, c2, c3)SELECT a.c1, a.c2, b.c3FROM tableA a (nolock)JOIN tableB bOn a.pk = b.pk Eventually, you're almost certainly going to get Primary Key errors or Row not found at subscriber errors. |
|
|
|