Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
photond
Starting Member
20 Posts |
Posted - 2013-10-18 : 17:40:39
|
Hi, so I'm new to SQL Server and I recently started at a new job. We currently have about 50 jobs that run throughout the night that pull from a linked server and insert into our data warehouse.Most of the jobs take a few minutes, but there are some that take several hours. The jobs are set up through SQL Agent as T-SQL jobs. We truncate the tables in our data warehouse daily and then load the records. Is there a more efficient way to do these loads in SSIS or a way to only load the changes that occur from the previous day?Here's an example:insert into dbo.claimdetail fieldsselect fieldsfrom LinkedServer.prod1.dbo.clmdetwhere exists (select * from LinkedServer.prod1.dbo.ClaimsToPull where blah blah) |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2013-10-20 : 03:50:58
|
problem might be related to logic you've implemented. Also linked server connection can also increase query response times.We might need to check queries for that.What you could so is to analyze execution plans and understand bottlenecks.a way to only load the changes that occur from the previous dayThis is possible if you've an audit column (like datecreated,datemodified) in table. You just need to retrieve latest audit column value from your destination table each day and use a query likeSELECT * FROM SourceTable WHERE AuditColumn > (Max Value retrieved) to retrieve only new records after last job execution.Or if you've a Unique sequential primary key (IDENTITY column) Then also you can capture MAX(ID) from destination and use logic ID > MaxValue------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/https://www.facebook.com/VmBlogs |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2013-10-21 : 16:38:15
|
Truncating and loading is probably the most inefficient way to do things depending on the amount of change that is happening in the source. Also, if you are using linked servers, hopefully, you are not doing any sort of join between the two servers, as that can lead to some major performance issues.As you mentioned SSIS can be used and it may offer better performance. Can Visakh has shown, if you have some value or date in your destination table that you can use to filter the source to get the changes rows, then that can be a decent option to only move new/changes rows instead of all the rows.My only real advice here is to test out several scenarios and see what shakes out based on your needs and environment. |
|
|
|
|
|