NickStan
Starting Member
36 Posts
Posted - 2007-10-16 : 12:50:19
Hi

We are having problems with our SQL Server 2000. On a daily basis we run out of disk space and I always have to run SHRINKDATABASE on tempdb. Today we started with 160GB of free space and by the end of the day it was gone!

Yes, we do have many jobs running on our SQL Server pulling data in from many sources, but I don't know how to find out which job is causing this problem. I have a suspicion that it could be a job that runs hourly and pulls data from Oracle (approximately 10,000 rows each time), but that job has been active since 28 August 2007, and we only started running out of space in the past 5 days.

Any suggestions would be appreciated as to what is causing this or how to diagnose the problem.

Thanks
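A minimal diagnostic sketch for SQL Server 2000, assuming the goal is to see whether an open transaction or a particular session is what keeps tempdb allocated (only system objects that ship with SQL 2000 are used here):

-- How much of tempdb is actually in use vs. allocated to the file
USE tempdb
EXEC sp_spaceused

-- Oldest open transaction in tempdb, if any
DBCC OPENTRAN ('tempdb')

-- Sessions holding open transactions (candidates for the runaway job)
SELECT spid, open_tran, status, hostname, program_name, cmd, last_batch
FROM master..sysprocesses
WHERE open_tran > 0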
NickStan
Starting Member
36 Posts
Posted - 2007-10-17 : 05:39:07
quote: Are the processes all in separate transactions?

Well, the processes that import from Oracle are all in DTS packages that do a bulk insert into SQL. I cannot see a 'COMMIT' statement at any point. Is one needed in a bulk insert?

The DTS package that I suspect is causing the problem does the following:
1) Truncates a SQL table.
2) Does a massive select on one of our tables, looking for certain records from another table. This table has about 20 million records and is indexed. The select to find the related records takes approximately 1 hour.
3) Updates a table based on anything found in step 2.
4) Inserts records into a table based on anything found in step 2.

I think that the 'massive' select in step 2 could be causing tempdb to grow so much. Any advice/views?

Thanks
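If the update in step 3 touches a large share of those 20 million rows in a single statement, one huge transaction (and its work tables) may be what balloons tempdb and the log. A hedged sketch of batching that update on SQL 2000 with SET ROWCOUNT; the table and column names below are invented for illustration:

-- Process the update in chunks so no single transaction has to hold
-- millions of rows' worth of log/tempdb space at once.
SET ROWCOUNT 10000            -- SQL 2000-era way to cap rows affected per pass

WHILE 1 = 1
BEGIN
    UPDATE t
    SET    SomeCol = s.SomeCol
    FROM   dbo.TargetTable  AS t
    JOIN   dbo.StagingTable AS s ON s.KeyCol = t.KeyCol
    WHERE  t.SomeCol <> s.SomeCol      -- only rows that still need changing

    IF @@ROWCOUNT = 0 BREAK            -- nothing left to do
END

SET ROWCOUNT 0                -- always reset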
Kristen
Test
22859 Posts
Posted - 2007-10-17 : 06:11:46
is (2) "joining" between the databases to work out what has changed?'Coz in my experience that is a resource-heavy method of doing it (not relevant to the TEMPDB size issue though)Kristen |
|
|
NickStan
Starting Member
36 Posts
Posted - 2007-10-17 : 08:53:43
quote: is (2) "joining" between the databases to work out what has changed?

No, I am not joining between databases. All I am doing is a simple join from one table to another in the same SQL Server database.
X002548
Not Just a Number
15586 Posts
Posted - 2007-10-17 : 09:33:31
My guess is that the tempdb file has grown but the actual amount of data in it is small.

Now, 160GB is a lot, and the only reasons I would know of for it to grow that much are it being banged on by:
1) Many operations being done without any commits
2) Many separate processes running asynchronously

In any case, you should set things up so the processes let tempdb free the space, so it doesn't grow the file.

Brett
NickStan
Starting Member
36 Posts
Posted - 2007-10-17 : 11:11:24
quote: My guess is that the tempdb file has grown but the actual amount of data in it is small

Yes, that is correct. We do have many SQL jobs running asynchronously, and many operations are being executed without a commit. I don't know of a way to execute a COMMIT statement if I am importing the data in a data pump task.

I will uncheck the option to automatically grow the size of tempdb and limit the database size to 1GB.

Thanks
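A hedged sketch of doing that cap in T-SQL rather than through Enterprise Manager; the logical file name tempdev is the default for tempdb's data file (check yours with sp_helpfile), and note that with growth capped, anything that needs more tempdb space will fail with error 1105 instead of growing the file:

USE master
GO
-- Cap the tempdb data file at 1GB (changing one file property per ALTER to be safe)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, MAXSIZE = 1024MB)
GO
-- Turn off automatic growth for the same file
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILEGROWTH = 0)
GO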
X002548
Not Just a Number
15586 Posts
Posted - 2007-10-17 : 11:42:40
My guess is it's not the load but the transformations I'm sure you're doing if you've got DTS.

How many files get brought in by any one package? If there are many, you might want to do this in between:

BEGIN TRAN
SQL
COMMIT TRAN
GO

SELECT @@TRANCOUNT AS TRANCOUNT
GO

BACKUP LOG Tracker_Data WITH TRUNCATE_ONLY
GO

Brett
Kristen
Test
22859 Posts
Posted - 2007-10-17 : 11:59:43
"BACKUP LOG Tracker_Data WITH TRUNCATE_ONLY"Won't that break the backup chain? In which case some actions ought to be taken to allow recovery - such as a Full Backup as soon as the DTS stuff is all finished. |
|
|
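A minimal sketch of restarting the backup chain once the DTS run has finished, assuming the database in question is Tracker_Data; the backup path is only an example:

-- Full backup taken after the DTS jobs complete, so log backups can resume from here
BACKUP DATABASE Tracker_Data
TO DISK = 'D:\Backups\Tracker_Data_full.bak'   -- example path
WITH INIT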
NickStan
Starting Member
36 Posts
Posted - 2007-10-18 : 04:31:32
Thanks for the reply Brett.

What I did yesterday was disable the hourly process that does a bulk insert from Oracle to SQL. The size of tempdb has not grown at all since last night - it stayed at 8MB - so obviously this hourly process is causing the hassle.

It does not import many records (maybe 10,000), however it is a very 'heavy', slow query that takes about 50 minutes to run. The Oracle table that it accesses is massive and access is rather slow.
Kristen
Test
22859 Posts
Posted - 2007-10-18 : 04:34:39
Could you use a staging table at the Oracle end, or is all modification to that disallowed?
NickStan
Starting Member
36 Posts
Posted - 2007-10-18 : 04:51:36
Ah yes, I have thought of using a staging table.

It's not allowed on the Oracle database that I am selecting from; however, I have another Oracle database in which I can create a staging table, select the data into it first, and then pump it across into SQL.

I will give that a try. Thanks
Kristen
Test
22859 Posts
Posted - 2007-10-18 : 05:15:04
What I was thinking of is to have a table close to the source with all the data in it, and an "UpdatedDate" column. This should only contain the columns you are interested in (which may be fewer than the full set of columns available).

Then pull the data from Oracle and selectively update this staging table:

UPDATE D
SET UpdatedDate = [Now],
    MyCol1 = S.MyCol1,
    ...
FROM SourceTable AS S
JOIN DestinationTable AS D
    ON D.MyPK = S.MyPK
WHERE (   S.MyCol1 <> D.MyCol1
       OR (S.MyCol1 IS NULL AND D.MyCol1 IS NOT NULL)
       OR (S.MyCol1 IS NOT NULL AND D.MyCol1 IS NULL))
   OR (S.MyCol2 <> D.MyCol2 ...

... that way only the rows that have changed will get a fresh UpdatedDate, and then you can just pull rows with a recent UpdatedDate across to SQL Server.

The source database may have a reliable UpdatedDate column of its own, which will reduce the amount you have to pull (and then you can still pull the whole lot in the middle of the night, or at the weekend, for belt & braces).

You need to insert any new rows, and flag deleted rows too (you need another column for that!).

We work on the basis of having this "transfer table" really close to the source database (ideally on the same server), keeping the table structure the same as the source but omitting columns we are not interested in.

Then have an identical table as close to the destination server as possible. Pull data between the two, based on the UpdatedDate, and then update the real table at the destination end - i.e. the data transformation takes place at the destination end.

Thus if the link between source and destination is narrow-bandwidth, and we need to change the transformation algorithm, we will already have the "raw" source data at the destination end, and can just do an update of the real table from there.

If we need an additional column that has to be got through by some other means ... usually a one-time SELECT MyPK, MyNewColumn1, MyNewColumn2 FROM REMOTE type thing!

Kristen
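Following on from the point about inserting new rows and flagging deletes, a hedged sketch using the same example names (MyPK, MyCol1, MyCol2, UpdatedDate); the IsDeleted column is an assumed addition:

-- New rows: in the source but not yet in the transfer table
INSERT INTO DestinationTable (MyPK, MyCol1, MyCol2, UpdatedDate, IsDeleted)
SELECT S.MyPK, S.MyCol1, S.MyCol2, GETDATE(), 0
FROM SourceTable AS S
LEFT JOIN DestinationTable AS D ON D.MyPK = S.MyPK
WHERE D.MyPK IS NULL

-- Deleted rows: in the transfer table but no longer in the source
UPDATE D
SET IsDeleted = 1,
    UpdatedDate = GETDATE()
FROM DestinationTable AS D
LEFT JOIN SourceTable AS S ON S.MyPK = D.MyPK
WHERE S.MyPK IS NULL
  AND D.IsDeleted = 0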