Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 SSIS and Import/Export (2008)
 SSIS 2008 SCD Performance Issues

Author  Topic 

BhaskarDasari
Starting Member

19 Posts

Posted - 2009-05-19 : 12:05:27
Hi

We are having 20 dimension tables and each table will be having around 20 million records.

These tables would be loaded on a daily frequency with 5 files, each of 3 million records.

We are currently using SCD transformation for TYPE2 load of data.( to maintain history in the dimension table.)

But SCD is taking a long time to insert the data and below are the statistics that I recorded when I executed the package with sample files:

Run1: File1(0.5 million records) —2 minutes (Dimension Table is empty)
Run2: File2(0.5 million records) —13 minutes (Table has 589,000 records)
Run3: File3(0.5 million records) —26 minutes (Table has 1,140,000 records)
Run4: File4(0.5 million records) —37 minutes (Table has 1,680,000 records)
Run5: File5(1 million records) —51 minutes (Table has 2,780,000 records)


Package elapsed time : 2 hr 9 min

1. How do i improve the performance of the SCD? If not, is there any way of loading a table parallely from file so that i can achive performance?

2. In informatica, we have a partitioning feature to load the data parallely which greatly improves performance. Is there any equivalent feature or workaround in SSIS?

Any help would be greatly appreciated.


Thanks,
Bhaskar



visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-05-19 : 13:28:13
you could try adjusting buffer size to get some performance boost. you could also try table difference component given here

http://www.sqlbi.eu/Home/tabid/36/ctl/Details/mid/374/ItemID/0/Default.aspx
Go to Top of Page

BhaskarDasari
Starting Member

19 Posts

Posted - 2009-05-20 : 14:20:55
Actually i have two staging tables for doing some column level (stage1)and lookup validations(stage2).
i have base and child table functionality and Currently one base table and 20 child table.
In base table SCD ,when the change happens to Variant column for set of businesskey all the child tables would be updated as we have enabled on update cascade option for all the child tables.

( Currently child table SCD is just inserting rows)

5 files i have tested with are not causing any updates so as i mensioned above all records are inserted.

here my doubt even for just inserting SCD taking so long time.worring about production if dimension grows in billion package may run for even 6 hrs.

i could not even able to convert SCD into lookup and conditional split since lot of changes are required in my package.(not allowed ).

questions:

How can i gain performance with Cach Lookup tranformation when this should also perform row by row operation?

Can u plz suggest me approach which required minimall changes to my package?

Are there any settings to be changed to SCD?

Is there any of loading the same table parallely in streams which might improve performance(i feel)?

i would not be allowed to use third party SCD components.


Please guide some good approach which would greatly help me in improving package performance.

Bhaskar
Go to Top of Page
   

- Advertisement -