Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
kce1900
Starting Member
2 Posts |
Posted - 2004-08-19 : 05:32:32
|
| My ultimate goal is to import some mix of Chinese and English froma set of files into a database. Since I have multiple files and since I'm not good at writing iterations in DTS, I'd like to use stored procs with BULK INSERT. The input format will be csv with quoted strings and there is no guarantee that there are no commas in the strings. Therefore I'll need a format file to strip the quotes. But for now, I'm trying very simple things... To make the long story short: as long as I am using ASCII files, I don't have any problem, but if I save the input file as UNICODE, things start falling apart. Here are the details:/* Contents of the the input files xx-a.txt and xx-u.txt: */one,two,threeun,deux,troixeins,zwei,drei/* Contents of the the input files yy-a.txt and yy-u.txt: */one two three <CR><LF>un deux troix <CR><LF>eins zwei drei <CR><LF>/* Input files were written in Notepad, -a saved as ANSI and -u saved as UNICODE *//* Contents of format file xx.fmt: */8.031 SQLCHAR 0 10 "," 1 a SQL_Latin1_General_CP1_CI_AS2 SQLCHAR 0 10 "," 2 b SQL_Latin1_General_CP1_CI_AS3 SQLCHAR 0 10 "\r\n" 3 c SQL_Latin1_General_CP1_CI_AS/* Contents of format file yy.fmt: */8.031 SQLCHAR 0 10 "" 1 a SQL_Latin1_General_CP1_CI_AS2 SQLCHAR 0 10 "" 2 b SQL_Latin1_General_CP1_CI_AS3 SQLCHAR 0 10 "" 3 c SQL_Latin1_General_CP1_CI_AS/* the test table */CREATE TABLE [dbo].[xx] ( [a] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [b] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [c] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ) /* test 1: ASCII input with FORMATFILE */BULK INSERT xxFROM 'E:\test\xx-a.txt'WITH ( FORMATFILE = 'e:\test\xx.fmt')/* test 2: UNICODE input with FORMATFILE */BULK INSERT xxFROM 'E:\test\xx-u.txt'WITH ( DATAFILETYPE = 'widechar', FORMATFILE = 'e:\test\xx.fmt')/* test 3: UNICODE input with FIELDTERMINATOR */BULK INSERT xxFROM 'E:\test\xx-u.txt'WITH ( DATAFILETYPE = 'widechar', FIELDTERMINATOR = ',')/* test 4: ASCII input, fixed length */BULK INSERT xxFROM 'E:\test\yy-a.txt'WITH ( FORMATFILE = 'e:\test\yy.fmt')/* test 5: UNICODE input, fixed length */BULK INSERT xxFROM 'E:\test\yy-u.txt'WITH ( DATAFILETYPE = 'widechar', FORMATFILE = 'e:\test\yy.fmt')/*------------------------------------------------------------------*/Results:Test 1, Test3 and Test 4 work fine.Test 2 gives the following error message:Bulk insert data conversion error (truncation) for row 1, column 2 (b).Test 5 produces a checker board: line 1: a='one', b='', c='two' line 2: a='', b='three', c='' etcSystem used: English Windows 2000 AS SP4, SQL Server 2000 EE 8.00.760 (SP3)/*------------------------------------------------------------------*/ |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2004-08-20 : 13:12:08
|
| Try it with SQLNCHAR in the format file.>> Therefore I'll need a format file to strip the quotesNot necessarily.Depending on how much data have a look athttp://www.nigelrivett.net/ImportTextFiles.htmlhttp://www.nigelrivett.net/f_GetEntryDelimiitted.htmlFor format files for quote delimitted files seehttp://www.nigelrivett.net/BCP_quoted_CSV_Format_file.html==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
|
kce1900
Starting Member
2 Posts |
Posted - 2004-08-21 : 02:45:53
|
| Thanks for all the hints on your website! Been there before and found quite a few helpful things.I tried using SQLNCHAR in the format file, but that won't work either. Depending on the settings for the prefix and the length I get different error messages.But here is something else: I tried Test 2 (that's the only relevant one anyway) in bcp, and I got the following:--------------------------------------------------E:\test>bcp testDB..xx in xx-u.txt -f xx.fmt -Psecret -wWarning: -w overrides -f.Starting copy...SQLState = S1000, NativeError = 0Error = [Microsoft][ODBC SQL Server Driver]Unexpected EOF encountered in BCP data-file0 rows copied.--------------------------------------------------That warning clearly tells us that in bcp you can't have Unicode data and format files at the same time. That leads me to the next question: Are BULK INSERT and bcp two different implementations or just different interfaces for the same thing? If it is the latter, then I can give up. In DTS, on the other hand, I don't have any problem doing the imports... |
 |
|
|
|
|
|
|
|