dvdb
Starting Member
5 Posts |
Posted - 2008-01-22 : 12:37:34
|
I think I may need help with query optimization. We have a customer table and an address table. Each customer can have one or more addresses in the address table.

The task is to synchronize customer information between two entirely separate systems. I have a stored procedure that reads a text file (exported from the 'master' system) and updates all the customer records in the second system (which is in the SQL Server db). The queries themselves work. The data is accurate. My issue is with how long the stored procedure takes to run. With over 11,000 records in the text file, it sometimes takes over 3 hours. That seems excessive to me (and probably to those of you snickering in the back).

As an overview, my stored procedure does the following:
1) creates a temp table
2) dumps the data from the text file into the temp table
3) updates the address table with data from the temp table (records already there)
4) inserts records into the address table from the temp table (records not already there)
5) updates the customer table with an address id
6) drops the temp table

Any help/suggestions are appreciated. If you need more info, let me know.
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2008-01-22 : 13:03:02
|
Probably the first thing to do is get a time breakdown of each of these steps so you know where to put your effort, either by adding some logging info or print statements with a timestamp, or even just by doing the logical equivalent of commenting out steps 3, 4, and 5. Then if the process still takes 2 hrs 59 min, you'll know there is a problem with the data transfer into SQL Server. If it runs quickly, then add just step 3 back in and check the time again. Hopefully you'll find a specific thing that is taking all the time.

Be One with the Optimizer

TG
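Something along these lines would do it (just a rough sketch; the commented-out steps stand in for your real queries):

    DECLARE @t DATETIME, @msg VARCHAR(200)

    SET @t = GETDATE()

    -- step 2: bulk insert into the temp table goes here
    -- BULK INSERT dbo.[#tempTable] FROM 'c:\addressimport.txt' ...

    SET @msg = 'Bulk insert took ' + CAST(DATEDIFF(ss, @t, GETDATE()) AS VARCHAR(10)) + ' seconds'
    PRINT @msg
    SET @t = GETDATE()

    -- step 3: update AddressTable from #tempTable goes here

    SET @msg = 'Address update took ' + CAST(DATEDIFF(ss, @t, GETDATE()) AS VARCHAR(10)) + ' seconds'
    PRINT @msg
    SET @t = GETDATE()

    -- ...and so on for steps 4 and 5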
 |
|
dvdb
Starting Member
5 Posts |
Posted - 2008-01-23 : 11:35:57
|
I broke it all down. The only things left in each time I ran it were the create temp table, bulk insert, and drop temp table (for obvious reasons). Nothing unusual resulted. Each update and insert section by itself ran between 3 and 5 minutes. Putting the SP back together, it ran about 17 minutes.

One note: this testing was done on my local machine and not the production server. I'm expecting a long run time again on the production server, as there are many other databases.

Any thoughts on this?
 |
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2008-01-24 : 07:26:33
|
Are you executing this with CURSORS in your code? Can you post sample code?
 |
|
jackv
Master Smack Fu Yak Hacker
2179 Posts |
Posted - 2008-01-25 : 10:06:15
|
One thing you could look at doing: when you create the temp table, make sure you create it with relevant indexes, and also make sure your destination table has relevant indexes. You can build all of this in programmatically, and it will probably speed things up drastically.

Jack Vamvas
--------------------
Search IT jobs from multiple sources - http://www.ITjobfeed.com
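For example, something like this after the bulk insert (a rough sketch; AddressID and CustomerID are only placeholders for whatever columns your joins actually use):

    -- index the temp table on the columns the update/insert steps join on
    CREATE INDEX IX_temp_AddressID  ON #tempTable (AddressID)
    CREATE INDEX IX_temp_CustomerID ON #tempTable (CustomerID)

    -- and make sure the permanent tables have matching indexes, e.g.
    -- CREATE INDEX IX_Address_ComparedValue ON AddressTable (comparedValue)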
 |
|
dvdb
Starting Member
5 Posts |
Posted - 2008-01-25 : 11:47:10
|
AndrewMurphy: No, I'm not using cursors. I am not familiar with cursors. Do you think it is worth the time to learn about them?

jackv: So basically I should try using CREATE INDEX index_name etc. (with proper syntax of course) in the stored procedure and then drop the indexes? I'll try to find time to put up the sample code. Thanks for the input.
 |
|
dvdb
Starting Member
5 Posts |
Posted - 2008-01-25 : 17:53:18
|
Below is the sample code. I tried to chop it up where the syntax would be obvious to keep it shorter. Again, the queries work and the data inserted and updated is accurate. My concern is the length of time the SP runs. Thanks!

CREATE TABLE [dbo].[#tempTable]
(
    ... [columns to hold data from tab-delimited text file] ...
)

BULK INSERT dbo.[#tempTable]
FROM 'c:\addressimport.txt'

update a
set [AddressTable values are set = #tempTable values]
from AddressTable a
inner join #tempTable b ...

insert into AddressTable ([each of the values needed])
select [values from the temp table as well as newid() and getdate()]
from #tempTable b
inner join CustomerTable ...
inner join Address ...
where [address id in temp table] not in
    (select b.[address id]
     from #tempTable b
     inner join AddressTable a on b.[address id] like a.comparedValue)
and [temp table customer id] in
    (select b.[customer id]
     from #tempTable b
     inner join CustomerTable c on b.[customer id] like c.comparedValue)

update c
set c.AddressID = a.AddressID
from CustomerTable c
inner join AddressTable a ...
inner join #tempTable b ...
and c.comparedValue like b.[customer id]

drop table #tempTable
 |
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2008-01-28 : 07:37:39
|
1. Stay away from cursors as much as you can... it's like filling a sugar bowl using tweezers instead of a spoon... it works, but it's painfully slow.

2. Can you post the execution plans for that query? For 11k rows, it should work a lot faster, unless you don't have indices to speed up performance... in which case you may be getting table scans.
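To capture the plans and some I/O numbers, something like this in Query Analyzer / Management Studio would show them (minimal sketch; run it against one slow statement at a time):

    -- show the estimated plan as text instead of running the query
    SET SHOWPLAN_TEXT ON
    GO
    -- paste one of the slow UPDATE / INSERT statements here
    GO
    SET SHOWPLAN_TEXT OFF
    GO

    -- or run the statement for real and report reads and timing per table
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
    -- slow statement here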
 |
|
dev2dev
Starting Member
48 Posts |
Posted - 2008-01-31 : 01:23:40
|
If I am correct, bcp is faster than bulk insert. Replace your bulk insert statement with xp_cmdshell 'bcp .....' and compare the difference.

Are the text files (being imported) always located on a SQL Server local drive? If it's a mapped drive, network speed and disk speed are a concern here.
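Roughly, that would look something like the following (MyDatabase, dbo.AddressStaging and MYSERVER are placeholders; note that bcp runs as a separate process, so it would have to load a permanent staging table rather than the procedure's #temp table):

    -- hypothetical example: load a permanent staging table with bcp
    EXEC master..xp_cmdshell
        'bcp MyDatabase.dbo.AddressStaging in "c:\addressimport.txt" -c -T -S MYSERVER'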
 |
|
dvdb
Starting Member
5 Posts |
Posted - 2008-01-31 : 09:33:09
|
I'm not familiar with bcp, so I'll have to read about that. The text files are on a mapped drive.
 |
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2008-01-31 : 10:19:30
|
quote: Originally posted by dvdb
I'm not familiar with bcp, so I'll have to read about that. The text files are on a mapped drive.

I'm sure your performance isn't getting affected by using bulk insert. You needn't change it to bcp.
 |
|
|