elwoos
Master Smack Fu Yak Hacker
2052 Posts |
Posted - 2006-03-31 : 10:52:29
|
I've recently been reading about latch waits. There is a lot of good stuff on the web, but I wondered what people here thought about them. Microsoft suggest that if they are high you may have a problem [url]http://technet2.microsoft.com/WindowsServer/en/Library/9277f422-eb8c-4c14-89b5-9fe09f80fd191033.mspx[/url] but they don't indicate what constitutes a high value. FWIW mine appear to be around 1654.339 during a period of low to average use. For the same period my buffer cache hit ratio is 99.744%. The latter sounds pretty good to me, but I'm not sure about the latch waits. Not sure it matters, but this is SQL Server 7! I saw a suggestion that I should look at latch timeouts but can't see that option, so I have looked at Lock Timeouts instead, but I am not sure which instance to pick.
Any comments would be appreciated.
thanks
steve
-----------
Oh, so they have internet on computers now! |
|
schuhtl
Posting Yak Master
102 Posts |
Posted - 2006-04-03 : 08:51:38
|
I recently had a Microsoft SQL Server Engineer onsite to do a SQL Server Health Check on several of our larger servers. I mentioned to the engineer that it would be great if there was some/better documentation regarding preferred values for performance counters. As you know, many of the counters have great descriptions of what they are supposed to measure, but once you start measuring them you have no idea if the numbers are good or bad. I was told that within Microsoft many of the "experts" can't agree on a single threshold for every counter because the right value can differ a lot depending on the environment. They did pass along a document that the field engineers use as a starting point for most of the counters, and this is what it has for SQL Server:Latches:

Object: - SQL Server:Latches
Counter: - Average Latch Wait Time (ms)
Preferred Value: - < 300
Description and Threshold: - Average latch wait time (milliseconds) for latch requests that had to wait. |
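As a rough illustration of how the latch figure above could be applied, here is a minimal Python sketch that flags sampled values of Average Latch Wait Time (ms) at or above the 300 ms starting point. The function name and the sample values are hypothetical, and the threshold is only the guide value quoted in the post, not a hard limit.

```python
# Starting-point value quoted in the post above; a guide, not a hard limit.
LATCH_WAIT_PREFERRED_MS = 300.0


def latch_wait_ok(avg_latch_wait_ms, preferred_ms=LATCH_WAIT_PREFERRED_MS):
    """Return True when the sampled average latch wait time is under the preferred value."""
    return avg_latch_wait_ms < preferred_ms


# Hypothetical samples of "Average Latch Wait Time (ms)" taken from Perfmon.
samples_ms = [12.5, 48.0, 310.2]

# Keep only the samples that fall outside the preferred range.
flagged = [s for s in samples_ms if not latch_wait_ok(s)]
print(flagged)
```

A flagged sample is a prompt to compare against your own baseline rather than proof of a problem.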
 |
|
Jim77
Constraint Violating Yak Guru
440 Posts |
Posted - 2006-04-03 : 11:02:17
|
That is interesting stuff. Any chance you could post any more default preferred starting-point values, please, Schuhtl? |
 |
|
schuhtl
Posting Yak Master
102 Posts |
Posted - 2006-04-03 : 14:38:53
|
The information below is to be used as a guide. Just because the counters collected in your environment may be drastically different from those listed below does not mean that there is a problem. I am sure many of these could be debated, but I find the information helpful and you can be the judge of whether it is useful in your environment(s). I am not the author of the content below, so if you see something that you disagree with… don't blame me :)

Memory Bottleneck Analysis

Object: - Memory
Counter: - Available Mbytes
Preferred Value: - > 20MB
Description: -
Reference: - KB 889654

Object: - Memory
Counter: - Free System Page Table Entries
Preferred Value: - > 7000
Description: - Free System Page Table Entries is the number of page table entries not currently in use by the system. If < 7000, consider removing /3GB.
Reference: - KB 311901

Object: - Memory
Counter: - Pages/Sec
Preferred Value: - < 50
Description: - Pages/sec is the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays.
Reference: - Monitoring and Tuning Your Server

Object: - Memory
Counter: - Pages Input/Sec
Preferred Value: - < 10
Description: - Pages Input/sec is the rate at which pages are read from disk to resolve hard page faults.
Reference: - KB 889654

Object: - Paging File
Counter: - %Usage
Preferred Value: - < 70%
Description: - The amount of the Page File instance in use in percent.
Reference: - KB 889654

Object: - Paging File
Counter: - %Usage Peak
Preferred Value: - < 70%
Description: - The peak usage of the Page File instance in percent.
Reference: - KB 889654

Object: - SQL Server:Buffer Manager
Counter: - Page Life Expectancy
Preferred Value: - > 300
Description: - This performance monitor counter tells you, on average, how long data pages are staying in the buffer. If this value gets below 300 seconds, this is a potential indication that your SQL Server could use more memory in order to boost performance.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Lazy Writes/Sec
Preferred Value: - < 20
Description: - This counter tracks how many times a second the Lazy Writer process is moving dirty pages from the buffer to disk in order to free up buffer space. Generally speaking, this should not be a high value, say more than 20 per second or so. Ideally, it should be close to zero. If it is zero, this indicates that your SQL Server's buffer cache is plenty big and SQL Server doesn't have to free up dirty pages, instead waiting for this to occur during regular checkpoints. If this value is high, then a need for more memory is indicated.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Checkpoint Pages/Sec
Preferred Value: - This value is relative and varies from server to server; compare the average to a baseline capture to tell if the value is high or low.
Description: - When a checkpoint occurs, all dirty pages are written to disk. This is a normal procedure and will cause this counter to rise during the checkpoint process. What you don't want to see is a high value for this counter over time. This can indicate that the checkpoint process is running more often than it should, which can use up valuable server resources. If this has a high figure (and this will vary from server to server), consider adding more RAM to reduce how often the checkpoint occurs, or consider increasing the "recovery interval" SQL Server configuration setting.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Page reads/sec
Preferred Value: - < 90
Description: - Number of physical database page reads issued. 80 – 90 per second is normal; anything above that indicates an indexing or memory constraint.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Page writes/sec
Preferred Value: - < 90
Description: - Number of physical database page writes issued. 80 – 90 per second is normal. For anything more, check the Lazy Writes/sec and checkpoint counters; if these counters are also relatively high, then it's a memory constraint.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Free pages
Preferred Value: - > 640
Description: - Total number of pages on all free lists.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Stolen pages
Preferred Value: - Varies. Compare with baseline
Description: - Number of pages used for miscellaneous server purposes (including procedure cache).
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Buffer Cache hit ratio
Preferred Value: - > 90%
Description: - Percentage of pages that were found in the buffer pool without having to incur a read from disk.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Target Server Memory(KB)
Preferred Value: -
Description: - Total amount of dynamic memory the server can consume.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Total Server Memory(KB)
Preferred Value: -
Description: - Total amount of dynamic memory (in kilobytes) that the server is using currently.
Reference: -

Disk Bottleneck Analysis

Object: - PhysicalDisk
Counter: - Avg. Disk Sec/Read
Preferred Value: - < 8ms
Description: - Measure of disk latency. Avg. Disk sec/Read is the average time, in seconds, of a read of data from the disk.
More Info:
Reads or non cached Writes
Excellent < 08 Msec ( .008 seconds )
Good < 12 Msec ( .012 seconds )
Fair < 20 Msec ( .020 seconds )
Poor > 20 Msec ( .020 seconds )
Cached Writes Only
Excellent < 01 Msec ( .001 seconds )
Good < 02 Msec ( .002 seconds )
Fair < 04 Msec ( .004 seconds )
Poor > 04 Msec ( .004 seconds )
Reference: -

Object: - PhysicalDisk
Counter: - Avg. Disk sec/Write
Preferred Value: - < 8ms (non cached) < 1ms (cached)
Description: - Measure of disk latency. Avg. Disk sec/Write is the average time, in seconds, of a write of data to the disk.
Reference: -

Object: - PhysicalDisk
Counter: - Avg. Disk Read Queue Length
Preferred Value: - < 2 * spindles
Description: - Avg. Disk Read Queue Length is the average number of read requests that were queued for the selected disk during the sample interval.
More Info:
< (2 + no. of spindles) Excellent
< (2 * no. of spindles) Good
< (3 * no. of spindles) Fair
Note: If the array has, say, 20 disks and it is RAID 10, then no. of spindles = 20/2 = 10. If it is RAID 5, then no. of spindles = no. of disks = 20.
Reference: - Whitepaper "Performance Monitoring in Windows 2003: Best Practices" by Ben W. Christenbury

Object: - PhysicalDisk
Counter: - Avg. Disk Write Queue Length
Preferred Value: - < 2 * spindles
Description: - Avg. Disk Write Queue Length is the average number of write requests that were queued for the selected disk during the sample interval.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Page reads/sec
Preferred Value: - < 90
Description: - Number of physical database page reads issued. 80 – 90 per second is normal; anything above that indicates an indexing or memory constraint.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Page writes/sec
Preferred Value: - < 90
Description: - Number of physical database page writes issued. 80 – 90 per second is normal. For anything more, check the Lazy Writes/sec and checkpoint counters; if these counters are also relatively high, then it's a memory constraint.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Free pages
Preferred Value: - > 640
Description: - Total number of pages on all free lists.
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Stolen pages
Preferred Value: - Varies. Compare with baseline
Description: - Number of pages used for miscellaneous server purposes (including procedure cache).
Reference: -

Object: - SQL Server:Buffer Manager
Counter: - Buffer Cache hit ratio
Preferred Value: - > 90%
Description: - Percentage of pages that were found in the buffer pool without having to incur a read from disk.
Reference: -

Processor Bottleneck Analysis

Object: - Processor
Counter: - %Processor Time
Preferred Value: - < 80%
Description: - % Processor Time is the percentage of elapsed time that the processor spends to execute a non-Idle thread.
Reference: -

Object: - Processor
Counter: - %Privileged Time
Preferred Value: - < 30% of Total %Processor Time
Description: - % Privileged Time is the percentage of elapsed time that the process threads spent executing code in privileged mode.
Reference: -

Object: - Process (sqlservr)
Counter: - %Processor Time
Preferred Value: - < 80%
Description: -
Reference: -

Object: - Process (sqlservr)
Counter: - %Privileged Time
Preferred Value: - < 30% of %Processor Time (sqlservr)
Description: - Note: Divide the value by number of processors
Reference: -

Object: - System
Counter: - Context Switches/sec
Preferred Value: - < 3000
Description: -
1500 – 3000 per processor Excellent – Fair
> 6000 per processor Poor
Upper limit is about 40,000 at 90 % CPU per CPU
NOTE: Remember to divide by number of processors
Reference: -

Object: - System
Counter: - Processor Queue Length
Preferred Value: - < 4 per CPU
Description: - For standard servers with long Quantums
<= 4 per CPU Excellent
< 8 per CPU Good
< 12 per CPU Fair
Reference: -

Object: - SQLServer:Access Methods
Counter: - Full Scans / sec
Preferred Value: - < 1
Description: - If we see high CPU then we need to investigate this counter; otherwise, if the full scans are on small tables, we can ignore this counter. Values greater than 1 or 2 indicate that we are having table / index page scans. We need to analyze how this can be avoided.
Reference: -

Object: - SQLServer:Access Methods
Counter: - Worktables Created/Sec
Preferred Value: - < 20
Description: - Number of worktables created in tempdb per second. Worktables are used for queries that use various spools (table spool, index spool, etc).
Reference: -

Object: - SQLServer:Access Methods
Counter: - Workfiles Created/Sec
Preferred Value: - < 20
Description: - Number of work files created per second. Tempdb workfiles are used in processing hash operations when the amount of data being processed is too big to fit into the available memory. You may be able to reduce this number by making the queries more efficient by adding/changing indexes, adding additional memory, etc.
Reference: -

Object: - SQLServer:Access Methods
Counter: - Page Splits/sec
Preferred Value: - < 20
Description: - Interesting counter that can lead us to our table / index design. This value needs to be as low as possible. If you find out that the number of page splits is high, consider lowering the fillfactor of your indexes. A lower fillfactor helps to reduce page splits because it leaves more free room in each data page before it fills up and a page split has to occur.
Reference: -

Overall SQL Server Bottleneck Analysis

Object: - SQLServer:General Statistics
Counter: - User Connections
Preferred Value: -
Description: - The number of users currently connected to the SQL Server.
Reference: -

Object: - SQLServer:General Statistics
Counter: - Logins/sec
Preferred Value: - < 2
Description: - > 2 per second indicates that the application is not correctly using connection pooling.
Reference: -

Object: - SQLServer:General Statistics
Counter: - Logouts/sec
Preferred Value: - < 2
Description: - > 2 per second indicates that the application is not correctly using connection pooling.
Reference: -

Object: - SQLServer:SQL Statistics
Counter: - Batch Requests/Sec
Preferred Value: - < 1000
Description: - Over 1000 batch requests per second indicates a very busy SQL Server.
Reference: -

Object: - SQLServer:SQL Statistics
Counter: - SQL Compilations/sec
Preferred Value: - < 10% of the number of Batch Requests / sec
Description: - The number of times per second that SQL Server compilations have occurred. This value needs to be as low as possible. If you see a high value such as over 100, then it's an indication that there are lots of ad hoc queries that are running, might cause CPU
Reference: -

Object: - SQLServer:SQL Statistics
Counter: - SQL Re-Compilations/sec
Preferred Value: - < 10% of the number of SQL Compilations/sec
Description: - This needs to be nil in our system as much as possible. A recompile can cause deadlocks and compile locks that are not compatible with any locking type.
Reference: -

Object: - SQL Server:Latches
Counter: - Average Latch Wait Time (ms)
Preferred Value: - < 300
Description: - Average latch wait time (milliseconds) for latch requests that had to wait.
Reference: -

Transaction Management

Object: - SQL Server:Locks
Counter: - Number of Deadlocks/sec
Preferred Value: - < 1
Description: - The number of lock requests that resulted in a deadlock.
Reference: -

Object: - SQL Server:Locks
Counter: - Lock Requests/sec
Preferred Value: - < 1000
Description: - Number of requests for a type of lock per second. Lock Requests/sec > 1000 indicates that the queries are accessing a large number of rows; the next step is to review high-read queries. If you also see a high Average Wait Time, then it's an indication of blocking; review the blocking script output.
Reference: -

Object: - SQL Server:Locks
Counter: - Average Wait Time (ms)
Preferred Value: - < 500
Description: - This is the average wait time in milliseconds to acquire a lock. The lower the value, the better. If the value goes higher than 500, there may be blocking going on; we need to run the blocker script to identify blocking.
Reference: - |
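To make the list above a little more concrete, here is a minimal Python sketch that checks sampled counter values against a handful of the starting-point thresholds, plus the two "relative" rules (compilations as a share of batch requests, and dividing system-wide counters by the number of processors). The function names and sample numbers are hypothetical; only the limits come from the list, and they remain guide values rather than hard limits.

```python
import operator

# (object, counter) -> (comparison, limit). Limits are copied from the list above;
# only a few counters are included to keep the sketch short.
THRESHOLDS = {
    ("Memory", "Pages/Sec"): (operator.lt, 50),
    ("SQL Server:Buffer Manager", "Page Life Expectancy"): (operator.gt, 300),
    ("SQL Server:Buffer Manager", "Buffer Cache hit ratio"): (operator.gt, 90),
    ("SQL Server:Latches", "Average Latch Wait Time (ms)"): (operator.lt, 300),
    ("SQL Server:Locks", "Average Wait Time (ms)"): (operator.lt, 500),
}


def outside_preferred(samples):
    """Return the (object, counter) keys whose sampled value is outside the preferred range."""
    flagged = []
    for key, value in samples.items():
        rule = THRESHOLDS.get(key)
        if rule is None:
            continue  # no fixed threshold listed; compare with your own baseline instead
        compare, limit = rule
        if not compare(value, limit):
            flagged.append(key)
    return flagged


def compilations_within_guide(batch_requests_per_sec, compilations_per_sec):
    """SQL Compilations/sec should stay under 10% of Batch Requests/sec."""
    return compilations_per_sec < 0.10 * batch_requests_per_sec


def per_processor(total_value, processors):
    """Normalise a system-wide counter such as Context Switches/sec per processor."""
    return total_value / processors


# Hypothetical sample; the hit ratio is the figure from the first post in this thread.
sample = {
    ("Memory", "Pages/Sec"): 12,
    ("SQL Server:Buffer Manager", "Page Life Expectancy"): 180,
    ("SQL Server:Buffer Manager", "Buffer Cache hit ratio"): 99.744,
}
print(outside_preferred(sample))
```

Counters marked "Varies. Compare with baseline" are deliberately left out of the table, since the list gives no fixed number for them.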
 |
|
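The spindle arithmetic in the disk queue length entries of the post above can be sketched in Python as follows. This assumes only the two RAID cases mentioned in the note (RAID 10 counts half the disks, RAID 5 counts all of them); the function names are hypothetical, and the "Poor" label for anything beyond the Fair band is an assumption, since the list stops at Fair.

```python
def effective_spindles(disks, raid_level):
    """Spindle count as described in the note: RAID 10 -> disks / 2, RAID 5 -> disks."""
    if raid_level == "RAID 10":
        return disks // 2
    if raid_level == "RAID 5":
        return disks
    # Other RAID levels are not covered by the note, so refuse to guess.
    raise ValueError("RAID level not covered by the note: " + raid_level)


def rate_queue_length(avg_queue_length, spindles):
    """Rate an Avg. Disk Read/Write Queue Length sample against the spindle-based bands."""
    if avg_queue_length < 2 + spindles:
        return "Excellent"
    if avg_queue_length < 2 * spindles:
        return "Good"
    if avg_queue_length < 3 * spindles:
        return "Fair"
    return "Poor"  # assumption: the list does not name a band beyond Fair


spindles = effective_spindles(20, "RAID 10")  # the 20-disk example from the note
print(spindles, rate_queue_length(15, spindles))
```

With 10 effective spindles the bands work out to under 12, under 20, and under 30 queued requests respectively.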
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-04-03 : 15:23:46
|
Kristen, sticky this!

Go with the flow & have fun! Else fight the flow
Blog thingie: [URL="http://weblogs.sqlteam.com/mladenp"] |
 |
|
elwoos
Master Smack Fu Yak Hacker
2052 Posts |
Posted - 2006-04-04 : 03:18:27
|
Schuhtl, that's great, thanks for providing that.
steve
-----------
Oh, so they have internet on computers now! |
 |
|
eyechart
Master Smack Fu Yak Hacker
3575 Posts |
Posted - 2006-04-04 : 03:41:19
|
That was an unbelievably excellent post. That should go in a blog ASAP.
-ec |
 |
|
eyechart
Master Smack Fu Yak Hacker
3575 Posts |
Posted - 2006-04-04 : 03:42:53
|
quote: Originally posted by schuhtl
They did pass along a document that the field engineers use as a starting point for most of the counters and this is what it has for SQL Server:Latches:

Schuhtl, can you pass along the title of that doc? I would like to request it from our TAM. Again, thanks for the great post(s).
-ec |
 |
|
Jim77
Constraint Violating Yak Guru
440 Posts |
Posted - 2006-04-04 : 04:12:43
|
What a super post, thank you muchly. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-04-04 : 06:32:24
|
"Kristen, sticky this!"
Consider it done!
Great post schuhtl, thanks.
Kristen |
 |
|
bogey
Posting Yak Master
166 Posts |
Posted - 2006-07-12 : 14:59:49
|
Any news on getting that document? We are evaluating Idera Diagnostic MGR and there are a lot of numbers being displayed. |
 |
|
schuhtl
Posting Yak Master
102 Posts |
Posted - 2006-07-12 : 15:22:35
|
bogey,
There is no "official" document... I was given a spreadsheet and I posted everything that the spreadsheet contained. What counter(s) are you looking for that are not listed? |
 |
|
|