Author |
Topic |
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-26 : 04:37:21
|
Hi all, I have just started a new data migration project where I'd like to use BULK INSERT instead of DTS (ref: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=68124). I have a 500MB txt file to start testing on with fixed-width columns without any delimiters. I have never imported such a file programatically before so if anyone has any pointers or best-practices I'm all ears... --Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-26 : 09:54:23
|
Here's a simple example file with the columns id, created and last_updated:000000154872006062020060623000000154882006062020060625000000154902006062021060622 What would the format-file look like for this kind of data?--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-06-26 : 10:31:05
|
i have no idea on the format file thing, but have you considered just bulk inserting into one column and the splitting the values from there?Go with the flow & have fun! Else fight the flow |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-06-26 : 10:47:59
|
Have a look athttp://www.nigelrivett.net/SQLTsql/BCP_quoted_CSV_Format_file.htmlIt has format files for delimitted files - fixed width are similar and it will show you how to easily test that you are on the right track.Have a go and post what you get.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-26 : 10:49:05
|
Yes I did consider it but it seems like it's possible using BULK INSERT and a format file and that might perform better...? The alternative would probably be to insert everything into a single column staging database and then select from there using substring? The thing is that I might have to use a staging database after all because I would get quite a few extra characters, I would f.ex. have to remove "000000" from "00000015487 in order to get usable data.--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-26 : 10:55:23
|
OK, I must admit I made a cross-post at a competing forum because I was kind of eager to get a test-solution up and running and I managed to get something out of it but it still doesn't work. Here's what I got:Error message:---------------------------------------------"Invalid data type for column number 1 in the format file "C:\Temp\bulk\bulk_format.txt"." ---------------------------------------------Format File:---------------------------------------------8.031 sqlchar 0 11 \0 1 id2 sqlchar 0 08 \0 2 created3 sqlchar 0 08 \0 3 last_updated---------------------------------------------Data file:---------------------------------------------000000154872006062020060623000000154882006062020060625000000154902006062021060622---------------------------------------------Database table---------------------------------------------CREATE TABLE [dbo].[bulktest]( [ID] [char](11), [Created] [char](8), [Last_updated] [char](8))---------------------------------------------SQL statement:---------------------------------------------BULK INSERT testing.dbo.bulktestFROM 'C:\Temp\bulk\bulk.txt' WITH (FORMATFILE='C:\Temp\bulk\bulk_format.txt');--------------------------------------------- --Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-06-26 : 11:42:07
|
Have a look at the link I posted.Use of format fileAnother way is to use a format file to omit columnsFormat file structure (see bol)8.0 version1 number of columns1 SQLCHAR 0 1 "\r\n" 1 col1 Latin1_General_CI_AS col number, file data type, prefix length, data length, terminator, table col order, table col name, file collationNote - can use version 7.0 and omit the collation.You need the collation for v8.0.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-27 : 03:54:28
|
Ok, it seems I can't get this to work if my life depended on it! Making me furious because I belive this is simple but it just doesn't work. This is what I got:SQL:BULK INSERT bulktestFROM 'C:\Temp\bulk\bulk.txt' WITH (FORMATFILE='C:\Temp\bulk\bulk_format.txt');Bulk.txt:000000154872006062020060623000000154882006062020060625000000154902006062021060622Bulk_format.txt:8.031 SQLCHAR 0 11 "" 1 id Danish_Norwegian_CI_AS2 SQLCHAR 0 08 "" 2 created Danish_Norwegian_CI_AS3 SQLCHAR 0 08 "\r\n" 3 updated Danish_Norwegian_CI_AS I have tried setting version 7 and removing collation, terminators \r and \n together and by themselves, separating the settings in the format file with single-space and tab...is it possible that someone do me the favor and try the code at your end? I have tried every single tweak I can think of and I just get error messages for source column 3 no matter what I do.--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-27 : 04:15:06
|
AAAAAAAAAAAAAAAAARRRRRRRRRRRRRRRRRRRRRRRGGGGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHH!!!I just found the problem and it the most annoying thing ever!! I kept getting error messages no matter what I did and now, after several hours, for some reason I hit "enter" and put a blank line after the last column in the format file and that made the difference. To be honest it pisses me off...and it almost ruins the pleasure of finally making it work. But still; thanx alot for your help --Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-06-27 : 05:03:06
|
Think you mean you hadn't terminated the last line i.e. it needs a crlf at the end of the last line.Anyway glad you've solved it and I'm not the only one to spot their stupid mistake today .==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-27 : 06:43:42
|
Yup, but it seems I might have found another stupid mistake...I have converted the format file to match my real data file and when I run it I get the following error:Bulk load: An unexpected end of file was encountered in the data file.Googeling the error gave me a kb-article from MS (http://support.microsoft.com/default.aspx/kb/271344) but that was in relation to blob-data, and then another sqlteam-guy having the same problem (http://sqlteam.com/forums/topic.asp?TOPIC_ID=67988) but no usable resolutions I belive. Any clues?--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-06-27 : 08:27:33
|
Don't think that applies.You have a crlf terminator according to the format file.Check there is one at the end of the file.Try it with a single line then two.If they fail then with a single column.Just find something that works and build up from there.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2006-06-27 : 09:20:03
|
Stupid mistake but I found out what happened. The original data-file was transferred using binary ftp instead of ascii so when I got a new file all worked out well. However, I decided to try the substring-method also and it actually performed alot better. Format file took about 6 mins and the substring took 2 mins. So I guess it all worked out in the end and I'm back to beeing a happy camper. I belive the rest (read: convincing the boss to use BULK INSERT instead of DTS) will be a walk in the park --Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-06-27 : 09:50:45
|
Another evangelist.Eventually the world will be converted to the true path.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Imrana
Starting Member
1 Post |
Posted - 2010-04-26 : 02:18:30
|
Hi Lumbago,pls help me with the Bulk insertI need to bulk insert a .txt file sayABC1234into a table wherecolumn 1 = ABColumn 2 = CColumn 3 = 1234through bulk insert.u have worked on it pls help me i am new to it |
|
|
|