SQL Server 2008 slow, please help!

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-04-11 : 09:31:03

I originally posted this question at http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=155509, but it's still unresolved and extremely frustrating. Any help that steers me in the right direction will be really appreciated!

Basically, performance is very erratic even though the usual indicators, like high memory use or long-running queries, are absent. A query of the top waits returns this:

wait_type                                                    waiting_tasks_count  wait_time_ms         max_wait_time_ms     signal_wait_time_ms  
------------------------------------------------------------ -------------------- -------------------- -------------------- -------------------- 
SQLTRACE_INCREMENTAL_FLUSH_SLEEP                             128150               513769585            4040                 252
OLEDB                                                        189868779            509700990            563766               0
TRACEWRITE                                                   437071               509587033            2043                 596914
ASYNC_NETWORK_IO                                             2184542              1861763              71174                36551
CXPACKET                                                     58213                1430626              2304                 12872
PREEMPTIVE_OS_WAITFORSINGLEOBJECT                            2109753              1227058              894                  0
PAGEIOLATCH_SH                                               41866                284903               545                  613
LATCH_EX                                                     193257               232155               125                  14663
WRITELOG                                                     42163                18942                27                   10559
PAGEIOLATCH_EX                                               2892                 12628                151                  8

The server just seems... nonresponsive. Not overworked. Even just connecting to it takes longer than it should.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-04-11 : 09:44:58

You don't appear to have any I/O problems. I'd look at your network connectivity, since the OLEDB and ASYNC_NETWORK_IO waits are pretty high.

Are you running any traces to files over the network? If so, turn those off or redirect them to a local disk.

If you make any configuration changes, run DBCC SQLPERF("sys.dm_os_wait_stats" , CLEAR) immediately after. You won't be able to accurately measure the effects with the accumulated wait history you're showing here.
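To make before-and-after comparisons easier once the stats have been cleared, the pasted wait-stats output can be ranked outside the server. The following is a minimal Python sketch, not something from the thread. It assumes the output has the same five whitespace-separated columns shown in the first post, and it assumes that SQLTRACE_INCREMENTAL_FLUSH_SLEEP and TRACEWRITE can be set aside as trace background waits (they are commonly filtered out, but the ignore list here is illustrative only). The function names are hypothetical.

```python
# Hypothetical helper: rank wait types from pasted sys.dm_os_wait_stats output,
# skipping waits that are commonly treated as background noise.

# Assumption: this ignore list is illustrative, not exhaustive.
BACKGROUND_WAITS = {
    "SQLTRACE_INCREMENTAL_FLUSH_SLEEP",
    "TRACEWRITE",
}


def parse_wait_stats(text):
    """Parse rows of: wait_type, tasks, wait_ms, max_wait_ms, signal_wait_ms."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, the dashed separator, and blank lines.
        if len(parts) != 5 or not parts[1].isdigit():
            continue
        rows.append({
            "wait_type": parts[0],
            "waiting_tasks_count": int(parts[1]),
            "wait_time_ms": int(parts[2]),
            "max_wait_time_ms": int(parts[3]),
            "signal_wait_time_ms": int(parts[4]),
        })
    return rows


def rank_waits(rows, ignore=BACKGROUND_WAITS):
    """Return (wait_type, percent of remaining wait time, avg ms per wait)."""
    kept = [r for r in rows if r["wait_type"] not in ignore]
    total = sum(r["wait_time_ms"] for r in kept)
    ranked = []
    for r in sorted(kept, key=lambda r: r["wait_time_ms"], reverse=True):
        pct = 100.0 * r["wait_time_ms"] / total if total else 0.0
        tasks = r["waiting_tasks_count"]
        avg = r["wait_time_ms"] / tasks if tasks else 0.0
        ranked.append((r["wait_type"], round(pct, 1), round(avg, 2)))
    return ranked


# First five rows of the output posted at the top of the thread.
SAMPLE = """\
wait_type  waiting_tasks_count  wait_time_ms  max_wait_time_ms  signal_wait_time_ms
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 128150 513769585 4040 252
OLEDB 189868779 509700990 563766 0
TRACEWRITE 437071 509587033 2043 596914
ASYNC_NETWORK_IO 2184542 1861763 71174 36551
CXPACKET 58213 1430626 2304 12872
"""

if __name__ == "__main__":
    for wait_type, pct, avg_ms in rank_waits(parse_wait_stats(SAMPLE)):
        print(f"{wait_type:<20} {pct:>5}%  avg {avg_ms} ms")
```

Running the same ranking on a snapshot taken shortly after a clear, and again after a configuration change, gives two short lists that are easier to compare than the raw accumulated totals.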

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-04-11 : 09:55:40

That makes sense, given that in Activity Monitor nothing is high, except Network I/O under Resource Waits, and when running profiler traces the only thing that stands out is the AUDIT LOGON and LOGOFF events. There are no traces over the network.

One thing: Task Manager shows a dns.exe service consuming over 500,000 K of memory. It may be interfering with the network, even though Task Manager shows network utilization to be low.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-04-11 : 10:04:31

Yeah, sounds like someone included the DNS role/feature when they installed it. Probably should remove it.

You should also run "netstat -an" on the SQL Server and see how many connections you have and what their status is. If you have a lot of wait statuses (CLOSE_WAIT or FIN_WAIT) then your application is not closing its SQL connections properly and is starving the box of available sockets. Anything more than 500 connections should be investigated, especially if most of them are waits.
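Counting states by eye gets tedious when the list is long, so the check above can be sketched in Python. This is a rough illustration only: it assumes the Windows "netstat -an" row layout (Proto, Local Address, Foreign Address, State) and the default SQL Server port 1433, and the addresses in the sample are made up. The function name is hypothetical.

```python
from collections import Counter


def count_tcp_states(netstat_output, port=1433):
    """Count TCP states for rows whose local or remote address uses `port`."""
    suffix = f":{port}"
    counts = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Assumed Windows layout: Proto, Local Address, Foreign Address, State.
        # UDP rows and headers have fewer columns or a different protocol.
        if len(parts) < 4 or parts[0].upper() != "TCP":
            continue
        local, remote, state = parts[1], parts[2], parts[3]
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts


# Made-up sample output; the addresses are placeholders.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:1433           0.0.0.0:0              LISTENING
  TCP    192.168.1.10:1433      192.168.1.21:50112     ESTABLISHED
  TCP    192.168.1.10:1433      192.168.1.22:50344     ESTABLISHED
  TCP    192.168.1.10:1433      192.168.1.23:50391     CLOSE_WAIT
  TCP    192.168.1.10:445       192.168.1.21:50200     ESTABLISHED
  UDP    0.0.0.0:53             *:*
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(f"{state:<12} {n}")
```

The same function can be pointed at output saved from a client machine, where the SQL Server port appears in the Foreign Address column instead.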

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-05-11 : 10:12:55

There are no connections other than those ESTABLISHED when I do netstat. There are only about 20 ESTABLISHED connections to SQL, matching the user count.

The network admin had a fit when I asked about DNS. It will have to be a last resort after all other possibilities have been eliminated.

So a question about network waits and connection pooling and stuff. If UserA is running a query that isn't consuming its data fast enough (let's say due to a loop in app code), would that affect the speed with which UserB receives a batch result? Would UserB be put into a queue to wait until UserA consumes all data before UserB's data gets sent?

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-11 : 10:45:34

Have you cleared the stats and re-checked them since? It's been a month, has anything changed?

Are you doing a lot of linked server stuff? Or ADO connections with cursors? That may explain the high OLEDB waits. And are you running any traces besides the default trace? If it's just the default trace, I'd look at the disk it's being saved to, it might be getting pounded.

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-05-11 : 11:11:28

Nothing has changed. The problem is still there, although the stats have been cleared several times. The stats have changed, of course, but the Network I/O is still very high.

There are no linked servers. There are ADO connections with cursors. About 25 concurrent users. But most of the cursors are read-only and forward-only, or client-side and read-only, and there are few lengthy client-app loops or cursors with many rows. Most of it gets consumed pretty fast. The clients seem to freeze randomly, at different parts of the app, so it doesn't seem like a code issue in the client app.

There are no other traces right now, but I've run other DMV queries and all the disk stats are ridiculously low. This server is very underutilized for its hardware. Far from getting pounded, it's barely getting touched.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-11 : 11:23:36

quote:
The clients seem to freeze randomly, at different parts of the app, so it doesn't seem like a code issue in the client app.

That seems a little contradictory, since the SQL Server doesn't seem to be the problem. Are there any other operations on that SQL Server that don't use that app, and are they running slow too?

My next suggestion is to have your devs examine their code wherever they open or close a connection to the server. They should also examine the cursor settings and possibly test different variations. If the problem still exists then open a support case with Microsoft. They recently closed a case for us that fixed a problem with .Net network connections (framework bug, not released yet) and maybe you're experiencing the same problem.

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-05-11 : 11:40:46

There are no other operations on this server, other than it being a domain controller. Nothing else runs on this SQL Server besides my app.

I'm the dev :-) Connections are opened explicitly when clients log on to SQL Server and are closed explicitly when clients log off. This matches the results from Netstat showing only sensible ESTABLISHED connections. I've tested different cursor types - same issue. The cursors are the fastest ones possible.

If a cursor is slow, wouldn't it be consistently slow? Ditto for app code? What's happening is that the clients can work for, say, 3 minutes and all is hunky dory, and then someone will hang for a bit at some random point in the client app. The same scenario keeps repeating, for different users and different parts of the application.

The same application is in use in different environments and this problem isn't occurring anywhere. It's not likely to be bad coding, unless it's bad coding interacting with data access gone wrong. It only happened when this particular client site migrated to a newer, more powerful server with SQL 2008. On the old rattletrap of a box running SQL 7 this problem did NOT exist.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-11 : 12:01:27

quote:
On the old rattletrap of a box running SQL 7 this problem did NOT exist.

Was that box also a DC? Same OS too? Same .Net library? Any of these changes could be contributing, if not the single cause. You can't compare how an older version of SQL works vs. the newest one (as I'm finding out myself...stupid 3rd party app takes 3x longer to log out of the hot new server than the old...so much for progress)

It doesn't have to be a coding error, if this is indeed the same problem we had, it's a bug in the framework. I have a feeling it's not though, because you're not seeing waits on your netstats. Did you run that on the client machine(s) too?

Have you done any network monitoring to see if it's getting saturated? Or Wireshark? I can see a DC getting hammered every 3 minutes if it's also serving DNS. You may want to look at some .Net counters too, although I couldn't tell you which ones besides networking/TCP.

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-05-16 : 21:29:37

By bug in the framework, do you mean the network libraries? The clients are connecting using ADO 2.8.

The client machines are displaying plenty of TIME_WAIT entries. There's one ESTABLISHED entry, and every time there's a freeze, there's a SYN_SENT entry and then an increase in TIME_WAIT entries. For example, the first time it froze there were two TIME_WAIT entries on serveraddress:1433, then the second time it froze there were four, etc. Not necessarily a doubling of TIME_WAITs, but an increase. And even when the app resumes, the TIME_WAITs don't go away.
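A SYN_SENT entry at each freeze suggests the client is opening a new connection at that moment, so one way to narrow things down is to time connection setup from a client machine. The Python sketch below is an assumption-laden illustration rather than a diagnosis: it times name resolution and the TCP handshake separately, and the host name passed on the command line is whatever the clients use to reach the server. The function name and the five-second timeout are arbitrary choices.

```python
import socket
import sys
import time


def probe(host, port=1433, timeout=5.0):
    """Time name resolution and TCP connect separately; returns seconds."""
    t0 = time.perf_counter()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    resolve_s = time.perf_counter() - t0

    # Use the first address returned by the resolver.
    family, socktype, proto, _, sockaddr = infos[0]
    t1 = time.perf_counter()
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)  # raises OSError (including timeouts) on failure
    connect_s = time.perf_counter() - t1
    return resolve_s, connect_s


if __name__ == "__main__":
    # Usage: python probe.py <server-name> [attempts]
    if len(sys.argv) > 1:
        attempts = int(sys.argv[2]) if len(sys.argv) > 2 else 20
        for i in range(attempts):
            try:
                resolve_s, connect_s = probe(sys.argv[1])
                print(f"{i:>3}  resolve {resolve_s * 1000:8.1f} ms"
                      f"  connect {connect_s * 1000:8.1f} ms")
            except OSError as exc:
                print(f"{i:>3}  failed: {exc}")
            time.sleep(1)
```

If the occasional slow attempt shows up in the resolve column, that points toward name resolution; if it shows up in the connect column, it points toward the handshake itself. Each probe closes its own socket first, so it will leave a TIME_WAIT entry of its own on the client.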

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-16 : 22:30:35

Interestingly, I went to a session at SQLRally last week that discussed wait stats, and the presenter mentioned ASYNC_NETWORK_IO as an indication of network saturation, slow client processing, or both. Is it possible the clients are not using the full network bandwidth? (10 Mbps vs. 100?) Are you able to test it with only 1 or 2 client connections and see if the waits still occur (including the TIME_WAITs)?

simondeutsch
Aged Yak Warrior

547 Posts

Posted - 2011-05-18 : 23:45:49

Tested it with only six connections on the server, of which I presume all but one or two are inactive, and was able to get the same result with wait stats. It hangs every 15 requests or so, and netstat shows constant increases in the TIME_WAIT connections on this client machine. Slow client processing doesn't make sense as the cause, since that would be consistently slow.

I can't really see how it'd be possible for client computers not to utilize 100 Mbps bandwidth. The hardware is in place. This is happening on a LAN.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-19 : 07:30:11

We had a bad network cable knock a Gbit card down to 100 Mbit (found out JUST before rolling new server to production...PHEW!). I'm not saying all of these things are causes, I'm just suggesting based on my experience. If you haven't contacted Microsoft support yet, I think you should.
