SQL instance failing due to tempdb space issues

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2013-05-23 : 10:30:09
First, I should clarify that my instance is cluster-aware and running SQL Server 2012.

The issue started, or at least I first noticed it, after a few planned failovers for a scheduled cluster patch related to an iSCSI/SAN issue. Every time I try to move the whole SQL failover instance, all resources come online on the other node except the SQL Server one. It stays online for a few seconds and then goes down again. Here is the SQL error log:


2013-05-22 17:42:40.72 spid11s Error: 1205, Severity: 13, State: 35.
2013-05-22 17:42:40.72 spid11s Transaction (Process ID 11) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
2013-05-22 17:42:40.72 spid11s Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.
2013-05-22 17:42:40.72 spid11s SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
2013-05-22 17:42:41.11 Logon Error: 18456, Severity: 14, State: 38.
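
To make the sequence in the excerpt above easier to follow, the error headers can be pulled out with a short script. This is only a minimal sketch, assuming the log lines keep the `date time source Error: N, Severity: N, State: N.` layout shown above; `parse_error_headers` and `ErrorHeader` are hypothetical names for illustration, not part of any SQL Server tooling.

```python
import re
from typing import List, NamedTuple

# Matches header lines such as:
# 2013-05-22 17:42:40.72 spid11s Error: 1205, Severity: 13, State: 35.
HEADER = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2})\s+"
    r"(?P<source>\S+)\s+"
    r"Error:\s*(?P<error>\d+),\s*"
    r"Severity:\s*(?P<severity>\d+),\s*"
    r"State:\s*(?P<state>\d+)\."
)


class ErrorHeader(NamedTuple):
    timestamp: str
    source: str
    error: int
    severity: int
    state: int


def parse_error_headers(log_text: str) -> List[ErrorHeader]:
    """Return one ErrorHeader per 'Error: N, Severity: N, State: N.' line."""
    headers = []
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            headers.append(
                ErrorHeader(
                    match["ts"],
                    match["source"],
                    int(match["error"]),
                    int(match["severity"]),
                    int(match["state"]),
                )
            )
    return headers


# The excerpt from this post, copied verbatim.
SAMPLE_LOG = """\
2013-05-22 17:42:40.72 spid11s Error: 1205, Severity: 13, State: 35.
2013-05-22 17:42:40.72 spid11s Transaction (Process ID 11) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
2013-05-22 17:42:40.72 spid11s Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.
2013-05-22 17:42:40.72 spid11s SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
2013-05-22 17:42:41.11 Logon Error: 18456, Severity: 14, State: 38.
"""

for header in parse_error_headers(SAMPLE_LOG):
    print(header.timestamp, header.source, header.error)
```

On this excerpt the script finds two headers: 1205 (the deadlock victim error) on spid11s and 18456 (a failed login) from Logon. The "Could not create tempdb" text has no error header of its own here; it follows the 1205 header on the same spid and timestamp, which may mean the deadlock is worth investigating alongside free space.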


The tempdb_data LUN is 250 GB and currently uses 20 GB of that. The tempdb_logs LUN is 200 GB and uses only 12 GB. I kept an eye on those numbers for about a week and they stayed the same. The upper limit is set to unlimited, so if the files have to grow after a restart, they can. So a lack of tempdb space makes no sense to me.

I checked instant file initialization, and the SQL Server service account has been added to the local policy for it.

SQL Server 2012 is running on Windows Server 2008 R2.

Any hint?

ahmeds08
Aged Yak Warrior

737 Posts

Posted - 2013-05-23 : 10:42:27
If tempdb is growing, there might be a big transaction running in the background. Is there any data loading, such as heavy inserts, updates, or deletes?
Are any index rebuilds running?
If so, you need to break them into smaller batches.

mohammad.javeed.ahmed@gmail.com

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2013-05-23 : 10:47:35
quote:
Originally posted by ahmeds08

If tempdb is growing, there might be a big transaction running in the background. Is there any data loading, such as heavy inserts, updates, or deletes?
Are any index rebuilds running?
If so, you need to break them into smaller batches.

mohammad.javeed.ahmed@gmail.com



No data loading during the maintenance window, which is usually when I do the planned failover.

But we do have 500+ databases there. It is virtually impossible for me to stop any outstanding activity, if there is any. I do stop new client connections, though, at the web server and via the SQL connection string.

Also, the SQL failover should work no matter what. That is the purpose of the cluster setup. Of course, some transactions may be lost (rolled back or rolled forward), but the overall cluster functionality should work. As a matter of fact, this is not my first cluster setup, but it is the first time I have seen this tempdb issue.

By the way, thanks for the reply.

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2013-05-23 : 11:10:27
What do you see in the Windows application event logs?

And what do you mean by "try to move the failover instance"? Do you mean failover?

Is the tempdb path correct on the passive node?

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2013-05-23 : 11:44:04
quote:
Originally posted by russell

What do you see in the Windows application event logs?

And what do you mean by "try to move the failover instance"? Do you mean failover?

Is the tempdb path correct on the passive node?



Hi there Russell,

Yes, I mean failing over: moving from the active node to the passive node.

I'm not sure I understand your question about the tempdb path. This is a cluster-aware instance, so the path is the same. tempdb has its own LUN, and the tempdb disk (LUN) moves to the other node fine. Of course, the internal SQL path is still the same.

SQL Server eventually comes up (without me changing anything), but I usually need to "kill" or reboot the passive node and/or keep forcing the SQL resource online. This is unacceptable, and in the case of a real crash it will stay down.


russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2013-05-23 : 12:33:21
What are you seeing in the event logs?

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2013-05-23 : 16:10:28
quote:
Originally posted by russell

What are you seeing in the event logs?



This is what I can see in the system log, one second after the tempdb error:


The SQL Server service terminated with service-specific error The specified resource name cannot be found in the image file..

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2013-05-23 : 17:09:14
I would expect there would be more interesting info in the event logs than that.

Tried moving tempdb?


jackv
Master Smack Fu Yak Hacker

2179 Posts

Posted - 2013-05-24 : 01:54:41
Analyze the tempdb activity and develop some tactics for alleviating pressure. This post has some ideas:
http://www.sqlserver-dba.com/2011/04/tempdb-performance-and-strategy-checklist.html

Jack Vamvas
--------------------
http://www.sqlserver-dba.com

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2013-05-24 : 07:51:02
Well, I thought about that, but the only other place would be the C: drive, which is local to the ProLiant server. tempdb is on a SAN RAID 10 LUN.

I already looked at tempdb activity. It is normal or low and does not exceed the upper limits.

This is a really weird problem that I have never seen before. There are few or no errors in the system log, which does not help either.

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2013-06-07 : 13:19:07
How are the LUNs created? Are they iSCSI attached?

I have seen issues where the LUNs are not actually available when SQL Server tries to start, and this causes a failure very similar to what you are seeing. I would validate that SQL Server is dependent on all of the LUNs it uses, and that those LUNs do not show as available until they have actually been mounted on the server.
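
One rough way to test this idea is to probe the tempdb folders on the node that just took ownership, right after a failover. The following is a minimal sketch, assuming Python is available on the node; `check_paths` is a hypothetical helper and the drive paths in the usage comment are placeholders, not values from this thread.

```python
import os
import shutil
import tempfile
from typing import Dict, Iterable


def check_paths(paths: Iterable[str], min_free_bytes: int = 0) -> Dict[str, str]:
    """Report, for each folder, whether it exists, is writable, and has room.

    Possible values: "missing", "not writable", "low space", "ok".
    """
    results = {}
    for path in paths:
        # A LUN that is not mounted yet usually shows up as a missing folder.
        if not os.path.isdir(path):
            results[path] = "missing"
            continue
        # Try to create (and automatically delete) a scratch file.
        try:
            with tempfile.TemporaryFile(dir=path):
                pass
        except OSError:
            results[path] = "not writable"
            continue
        # Compare free space on the volume with the requested minimum.
        free = shutil.disk_usage(path).free
        results[path] = "ok" if free >= min_free_bytes else "low space"
    return results


# Hypothetical usage; replace the paths with the real tempdb folders:
# print(check_paths([r"T:\tempdb_data", r"L:\tempdb_logs"],
#                   min_free_bytes=20 * 1024**3))
```

Note that the probe runs under the account that starts the script, not under the SQL Server service account, so an "ok" result only shows that the volume is mounted and writable for that user.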

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2013-06-21 : 16:08:47
quote:
Originally posted by jeffw8713

How are the LUNs created? Are they iSCSI attached?

I have seen issues where the LUNs are not actually available when SQL Server tries to start, and this causes a failure very similar to what you are seeing. I would validate that SQL Server is dependent on all of the LUNs it uses, and that those LUNs do not show as available until they have actually been mounted on the server.



Well, I do not think that's the case here, because the dependencies are fine. The disk resources come online before SQL Server, and SQL Server depends on the disk resources (they go first).

I actually checked the disk resources one by one, manually, and they move back and forth without issues.