Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 Development Tools
 Other Development Tools
 Iterations - Best Practices

Author  Topic 

CSharper
Starting Member

4 Posts

Posted - 2004-08-26 : 13:39:14
Hello everyone and thanks in advance.

Ok, here's the deal. I am 1 of 2 developers at my organization, the other is a COBOL programmer. Long story short, every night he sends me a 22MB XML file containing our company's inventory. This contains about 40,000 items. In C#, I'm basically looping through all the XML nodes and on each node I load up an object with the particular properties (manufacturer, SKU, serial number, etc.) and call a Stored Procedure to to save the item.

In saving the item, I'm checking to see if it was already in my inventory and I either perform an update or an insert on the table as approriate. The query does a couple of other things like retreiving the Purchase Order Number, Cubic Feet, etc. from other tables. All in all, the query takes about 1-2 seconds to complete. I've optimized the query as much as I know how (including the addition of indexes), but this still takes at least 6 hours to complete the 40,000 items. Is this the best approach? Any other suggestions?

So far the only things I've thought about changing is:
1. Break it out into multiple queries to avoid poor compilation of execution plans

2. Multithread the C# app so that I'm running like 10 queries at once.

Again, Thanks in advance!

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-08-26 : 16:08:16
1. Ask the COBOL programmer to send you a semicolon separeted text file instead of the XML
2. BULK INSERT the text file into SQL Server
3. Do Set Based SQL to insert the data

For 40.000 items we should be talking seconds and minutes.

/rockmoose
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2004-08-27 : 06:44:54
Or use your c# app to reformat to csv (or fixed length) for the import.
The problem will be that you are performing a database access for each item - you need to be able to import all the data in a batch then mainipulate it.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-08-27 : 07:47:22
The real scapegoat is as nr says, the one by one manipulation of each record.
Try to get the file into a sql server table. From there you can do anything you want to it with SQL.
This will be tons faster.
Unless of course you resolve to writing a cursor in sql... which is one by one manupulation :-(

/rockmoose
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2004-08-27 : 09:43:18
or even simpler, load the xml into a dataset and then simply Update that dataset.
you could also use OPENXML in sql server.


Go with the flow & have fun! Else fight the flow :)
Go to Top of Page

CSharper
Starting Member

4 Posts

Posted - 2004-08-27 : 09:46:25
Thank you both for your input. Defintely a great way to look at it, and I can't wait to try it out. I appreciate your prompt replies. I love the idea of doing a single insert and a single update. The results should be phenomenal!

Thanks again!

I hope others will browse this topic and get value from it as well
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-08-27 : 14:08:31
quote:
Originally posted by spirit1

or even simpler, load the xml into a dataset and then simply Update that dataset.
you could also use OPENXML in sql server.


Go with the flow & have fun! Else fight the flow :)



This will be s l o w e r though.
The dataset loops through all the datarows and performs appropriate insert/update/delete/no action per row.

Edit: also there is also business logic to take in account that most certainly will make this approach unfeasible.

/rockmoose
Go to Top of Page

CSharper
Starting Member

4 Posts

Posted - 2004-08-27 : 19:08:21
quote:
Originally posted by rockmoose
This will be s l o w e r though.
The dataset loops through all the datarows and performs appropriate insert/update/delete/no action per row.

Edit: also there is also business logic to take in account that most certainly will make this approach unfeasible.

/rockmoose



Agreed. That's trying to make C# do what is clearly SQL's job.

Thanks rockmoose!
Go to Top of Page
   

- Advertisement -