Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Barry Seymour
Starting Member
2 Posts |
Posted - 2015-05-06 : 20:25:22
|
I have a really STRANGE issue that's driving me nuts. I am getting inconsistent results when BULK INSERTING data from a tab-delimited text file. As part of my testing, I run the same code on the same file again and again, and I get different results every time! I get this on SQL 2005 and SQL 2012 R2.We have an application that imports data from a spreadsheet. The sheet contains section headers with account numbers and detail rows with transactions by date:AAAA.1234 /* (account number)*/1/1/2015 $150 First Transaction1/3/2015 $24.233 Second TransactionBBBB.56781/1/2015 $350 Third Transaction1/3/2015 $24.233 Fourth TransactionMy Import program saves this spreadsheet at tab-delimited text, then I use BULK INSERT to bring the data into a generic table full of varchar(255) fields. There are about 90,000 rows in each day's data; after the BULK INSERT about half of them are removed for various reasons.I then run a routine that converts and copies those records into another holding table that's a copy of the final destination table. That routine parses though the data, assigning the account number in the section header to each detail row. The goal is to have it end up looking like this:AAAA.1234 1/1/2015 $150 First PurchaseAAAA.1234 1/3/2015 $24.233 Second PurchaseBBBB.5678 1/1/2015 $350 Third PurchaseBBBB.5678 1/3/2015 $24.233 Fourth PurchaseMy technique: I add a RowID column to the table with the IDENTITY (1,1) property. This gives my raw data unique row numbers.I use a cursor to get the starting RowID for each Account Number: I then use the upper and lower RowIDs to do an INSERT into the final table. The query looks like this:SELECT RowID, SUBSTRING(RowHeader, 6,4) + '.UBC1' AS AccountNumberFROM GenericTableWHERE RowHeader LIKE '____.____%'Results look like this:But every time I run the routine, I get different numbers!Needless to say, my results are not accurate. I get inconsistent results EVERY TIME. It's maddening! Here is my code, with table, field and account names changed for business confidentiality. I’m open to suggestions; I’m truly stuck. This is a high profile project at my company; any help would be deeply appreciated.TRUNCATE TABLE GenericImportTable;ALTER TABLE GenericImportTable DROP COLUMN RowID;BULK INSERT GenericImportTable FROM '\\SERVER\General\Appname\DataFile.2015.05.04.tab.txt'WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 6)ALTER TABLE GenericImportTable ADD RowID int IDENTITY(1,1) NOT NULLSELECT RowID, SUBSTRING(RowHeader, 6,4) + '.UBC1' AS AccountNumberFROM GenericImportTableWHERE RowHeader LIKE '____.____%' |
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2015-05-06 : 20:36:30
|
Are you relying on the ALTER TABLE ADD IDENTITY to always give you the same order? That's the issue right there. The order in a table is meaningless. Instead, drop the table and then recreate with the identity column. It'll be in the order of the file now.Tara KizerSQL Server MVP since 2007http://weblogs.sqlteam.com/tarad/ |
|
|
Barry Seymour
Starting Member
2 Posts |
Posted - 2015-05-07 : 00:08:31
|
Thanks, Tara. Yeah, that was my assumption. Now that I know my assumption is incorrect, I know what to do. My VB app opens the worksheet and saves it as Tab-delimited text -- I can add a bit that inserts a row number to the text file. At that point my BULK INSERT will result in a table with properly numbered rows. The rest will work as originally desired.Thanks! |
|
|
|
|
|
|
|