
TempDB is growing at an alarming rate!


NickStan
Starting Member

36 Posts

Posted - 2007-10-16 : 12:50:19
Hi

We are having problems with our SQL Server 2000 installation.
The problem is that on a daily basis we run out of disk space and I always have to run shrinkdatabase on tempdb.
Today we started with 160GB of free space and by the end of the day it was gone!

Yes, we do have many jobs running on our SQL Server pulling data in from many sources, but I don't know how to find out which job is causing this problem. I suspect it could be an hourly job that pulls data from Oracle (approximately 10,000 rows each time), but that job has been active since 28 August 2007 and we only started running out of space in the past 5 days. Any suggestions as to what is causing this, or how to diagnose it, would be appreciated.
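A quick way to see how much of tempdb is actually in use, as opposed to just allocated to the file, is something like the following (a rough sketch for SQL Server 2000, using the sysfiles catalog table and sp_spaceused):

-- How big are the tempdb files, and how much of that space is really used?
USE tempdb
GO
SELECT name, size / 128 AS allocated_mb   -- size is stored in 8 KB pages
FROM tempdb.dbo.sysfiles
GO
EXEC sp_spaceused                          -- database size and unallocated space
GO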

Thanks



X002548
Not Just a Number

15586 Posts

Posted - 2007-10-16 : 13:07:06
Are the processes all in separate transactions?

Or is it one massive process without commits?


Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam




Kristen
Test

22859 Posts

Posted - 2007-10-16 : 13:36:32
"I always have to run shrinkdatabase on tempdb"

Please note that this is a very dangerous process, and is likely to lead to corruption of the database.

See: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=55210&SearchTerms=Shrinking+TEMPDB,Database%20Shrinking,Shrink,Shrinking
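If the file does have to be brought back down, a less risky route is to let SQL Server recreate tempdb at the next restart and then set the size wanted while nothing is using it. A rough sketch, assuming the default logical data file name 'tempdev':

-- tempdb is recreated at its configured initial size when SQL Server restarts;
-- the starting size can then be adjusted while the server is quiet
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 1024MB)
GO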

Kristen

NickStan
Starting Member

36 Posts

Posted - 2007-10-17 : 05:39:07
quote:
Are the processes all in separate transactions?


Well, the processes that import from Oracle are all in DTS packages that do a bulk insert into SQL Server.
I cannot see a 'COMMIT' statement anywhere. Is one needed for a bulk insert?

The DTS package that I suspect is causing the problem does the following:

1) Truncates a SQL Server table.
2) Does a massive select on one of our tables, looking for certain records related to another table. The table being selected from has about 20 million records and is indexed; the select to find the related records takes approximately 1 hour.
3) Updates a table based on anything found in step 2.
4) Inserts records into a table based on anything found in step 2.
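In rough T-SQL terms it does something like the sketch below (the table and column names are made up, not the real ones). The join in step 2 is where I would expect the sort and hash work to happen, which is exactly the kind of thing that spills into tempdb.

-- 1) Truncate the working table
TRUNCATE TABLE dbo.ImportWork

-- 2) The 'massive' select: find the related rows in the 20-million-row table.
--    The sort/hash work behind a join this size can spill into tempdb.
INSERT INTO dbo.ImportWork (KeyCol, SomeValue)
SELECT b.KeyCol, b.SomeValue
FROM   dbo.BigTable AS b
       JOIN dbo.DriverTable AS d
         ON d.KeyCol = b.KeyCol

-- 3) Update based on anything found in step 2
UPDATE t
SET    SomeValue = w.SomeValue
FROM   dbo.TargetTable AS t
       JOIN dbo.ImportWork AS w
         ON w.KeyCol = t.KeyCol

-- 4) Insert anything found in step 2 that is not already there
INSERT INTO dbo.TargetTable (KeyCol, SomeValue)
SELECT w.KeyCol, w.SomeValue
FROM   dbo.ImportWork AS w
WHERE  NOT EXISTS (SELECT * FROM dbo.TargetTable AS t WHERE t.KeyCol = w.KeyCol)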

I think that the 'massive' select in step 2 could be causing the tempdb to grow so much. Any advice/views?

Thanks

Kristen
Test

22859 Posts

Posted - 2007-10-17 : 06:11:46
is (2) "joining" between the databases to work out what has changed?

'Coz in my experience that is a resource-heavy method of doing it (not relevant to the TEMPDB size issue though)

Kristen

NickStan
Starting Member

36 Posts

Posted - 2007-10-17 : 08:53:43
quote:
is (2) "joining" between the databases to work out what has changed?


No, I am not joining between databases.

All I am doing is a simple join from one table to another in the same SQL Server database.


X002548
Not Just a Number

15586 Posts

Posted - 2007-10-17 : 09:33:31
My guess is that the tempdb file has grown but the actual amount of data in it is small.

Now 160 GB is a lot, and the only reasons I know of for it to grow like that are that it is being hammered by:

1) Many operations being done without any commits
2) Many separate processes running asynchronously

In any case you should set things up so the processes let tempdb free the space as they go, so it doesn't have to keep growing the file.
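A couple of quick checks for work that is sitting in an open transaction and stopping tempdb from releasing space (a sketch; both commands are standard SQL Server):

DBCC OPENTRAN ('tempdb')   -- reports the oldest active transaction in tempdb, if there is one
GO
EXEC sp_who2               -- look for long-connected sessions doing heavy CPU / disk IO
GO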



Brett

8-)





NickStan
Starting Member

36 Posts

Posted - 2007-10-17 : 11:11:24
quote:
My guess is that the tempdb file has grown but the actual amount of data in it is small

Yes that is correct.

Yes, we do have many SQL jobs running asynchronously.
Yes, many operations are being executed without a commit.
I don't know of a way to execute a COMMIT statement when I am importing the data with a data pump task.

I will uncheck the option to automatically grow the size of tempdb and limit the database size to 1GB.
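For reference, the same change can be made in T-SQL rather than through the GUI. A sketch, assuming the default logical data file name 'tempdev' (two statements, since MODIFY FILE changes one property at a time):

-- Cap the tempdb data file at 1 GB and switch off autogrow
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, MAXSIZE = 1024MB)
GO
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILEGROWTH = 0)
GO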

Thanks

X002548
Not Just a Number

15586 Posts

Posted - 2007-10-17 : 11:42:40
My guess is it's not the load itself but the transformations you're no doubt doing if you've got DTS.

Do many files get brought in by any one package?

If there are many, you might want to do something like this in between:

BEGIN TRAN

-- ... the SQL for this step ...

COMMIT TRAN
GO
SELECT @@TRANCOUNT AS TRANCOUNT   -- should be 0 once everything has committed
GO
BACKUP LOG Tracker_Data WITH TRUNCATE_ONLY   -- discard the inactive log records to free the space
GO


Brett

8-)





Kristen
Test

22859 Posts

Posted - 2007-10-17 : 11:59:43
"BACKUP LOG Tracker_Data WITH TRUNCATE_ONLY"

Won't that break the backup chain? In which case some actions ought to be taken to allow recovery - such as a Full Backup as soon as the DTS stuff is all finished.
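Something along these lines once the DTS run has finished (a sketch; the path is just an example, and Tracker_Data is the database name from the snippet above):

-- Re-establish a recovery point after the log has been truncated
BACKUP DATABASE Tracker_Data
   TO DISK = 'D:\Backups\Tracker_Data_Full.bak'
   WITH INIT
GO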

X002548
Not Just a Number

15586 Posts

Posted - 2007-10-17 : 12:07:04
OK, then do a backup after each process

How big is the database?


Brett

8-)





NickStan
Starting Member

36 Posts

Posted - 2007-10-18 : 04:31:32
Thanks for the reply Brett.

What I did yesterday is disable the hourly process that does a bulk insert from Oracle into SQL Server.

Since last night the size of tempdb has not grown at all; it has stayed at 8 MB.
So this hourly process is obviously causing the hassle.
It does not import many records (maybe 10,000), but it is a very 'heavy', slow query that takes about 50 minutes to run.
The Oracle table that it accesses is massive and access to it is rather slow.



Kristen
Test

22859 Posts

Posted - 2007-10-18 : 04:34:39
Could you use a staging table at the Oracle end, or is all modification to that disallowed?

NickStan
Starting Member

36 Posts

Posted - 2007-10-18 : 04:51:36
Ah yes, I have thought of using a staging table.

It's not allowed on the Oracle database that I am selecting from; however, I have another Oracle database in which I can create a staging table, select the data into it first, and then pump it across into SQL Server.

I will give that a try.

Thanks

Kristen
Test

22859 Posts

Posted - 2007-10-18 : 05:15:04
What I was thinking of is to have a table close to the Source with all the data in it, and an "UpdatedDate" column. This should only contain the columns you are interested in (which may be less than the full set of columns available)

Then pull the data from Oracle and selectively update this staging table:

UPDATE D
SET    UpdatedDate = [Now],          -- [Now] = the timestamp of this run
       MyCol1 = S.MyCol1,
       ...                           -- and so on for the other columns
FROM   SourceTable AS S
       JOIN DestinationTable AS D
         ON D.MyPK = S.MyPK
WHERE  (   S.MyCol1 <> D.MyCol1
        OR (S.MyCol1 IS NULL AND D.MyCol1 IS NOT NULL)
        OR (S.MyCol1 IS NOT NULL AND D.MyCol1 IS NULL))
   OR  (   S.MyCol2 <> D.MyCol2
        ...                          -- the same three-way test for each column

that way only the rows that have changed will get a fresh UpdatedDate, and then you can just pull rows with a recent UpdatedDate across to SQL Server.
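The pull itself can then be along these lines (a sketch with the same illustrative names; how the time of the last successful pull is recorded is up to you):

-- Pull only the rows changed since the previous successful transfer
DECLARE @LastPull datetime
SET @LastPull = '20071017 23:00'   -- the last successful pull time, however it is stored

SELECT T.MyPK, T.MyCol1, T.UpdatedDate
FROM   DestinationTable AS T       -- the staging table from the UPDATE above
WHERE  T.UpdatedDate > @LastPull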

The Source database may have a reliable UpdatedDate column, which will reduce the amount you have to pull (and then you can still pull the whole lot in the middle of the night, or weekend, for belt & braces)

You need to Insert any new rows, and flag deleted rows too. (Need another column for that!)
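Roughly like this, with the same illustrative names as above, 'IsDeleted' as the extra flag column, and [Now] again standing for the timestamp of the run:

-- New rows that have appeared at the source
INSERT INTO DestinationTable (MyPK, MyCol1, UpdatedDate)
SELECT S.MyPK, S.MyCol1, [Now]
FROM   SourceTable AS S
WHERE  NOT EXISTS (SELECT * FROM DestinationTable AS D WHERE D.MyPK = S.MyPK)

-- Rows that have disappeared from the source: flag them rather than delete
UPDATE D
SET    IsDeleted = 1,
       UpdatedDate = [Now]
FROM   DestinationTable AS D
WHERE  D.IsDeleted = 0
  AND  NOT EXISTS (SELECT * FROM SourceTable AS S WHERE S.MyPK = D.MyPK)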

We work on the basis of having this "transfer table" really close to the Source database (ideally on the same server), keep the table structure the same as the source, but omit columns we are not interested in.

Then have an identical table as close to the Destination server as possible. Pull data between the two, based on the UpdatedDate. And then update the real table at the destination end - i.e. data transformation takes place at the Destination end.

Thus if the link between Source and Destination is narrow-bandwidth, and we need to change the transformation algorithm, we will already have the "raw" source data at the destination end, and can just do an Update of the Real table from there.

If we need an additional column that has to be got through by some other means ... usually a one-time
SELECT MyPK, MyNewColumn1, MyNewColumn2 FROM REMOTE
type thing!

Kristen