Author |
Topic |
Crow555
Starting Member
5 Posts |
Posted - 2007-12-24 : 08:20:32
|
Hi Everyone, I've tried to find the answer to this problem on this forum and elsewhere but no success. I've got a CSV file with a number of columns exported from another system to be imported by ours. However, when one of the text columns contains a comma (,) the exporting system will wrap that particular field of data with double quotes. This of course means that Bulk Insert finds an extra column and that BCP with a format file can't look for '",' as a terminator as the double quote isn't always there.Has anyone encountered a problem like this and if so, how was it dealt with?Many thanks in advance,Dave |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-12-24 : 21:43:30
|
You should use other character as column delimiter, | for instance. |
|
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2007-12-27 : 14:31:43
|
putting the fields within "....." might also help |
|
|
Crow555
Starting Member
5 Posts |
Posted - 2007-12-31 : 08:13:03
|
Unfortunately, the CSV file format is beyond our control as it is exported by a different system therefore we are stuck with this particular type of format.I'm considering amending our import stored procedure to read the file and change any commas (,) in between double quotes (") to pipes (|). Then after the import, updating the affected columns to change the pipes back to commas.However, my SQL isn't that good and any suggestions on how to achieve this (or even a better solution to this problem) would be very much appreciated. |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2007-12-31 : 08:25:23
|
You can achieve this using REPLACE function. look into BOL for syntax and usage. |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2007-12-31 : 09:20:44
|
How big are the files? If they are small you can use a function to get the olumns after importing. It's very slow though so you wouln't want to do it on large fiels.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Crow555
Starting Member
5 Posts |
Posted - 2007-12-31 : 09:44:26
|
Hi NR,The files vary but are usually sitting around 400-500kb as .csv files.I've been scratching my head and worked out that the openrowset functions can give me the results I want.For example, if these were lines in my file...1,Sam Smith,123,abc2,Tom Turner,456,def3,"Bill Brannigan, Developer",789,ghi4,Paul Palmer,910,jkl.. line 3 has the problem entry. Bulk insert picks this up as 5 columns instead of 4 and errors out. However, an openrowset query will return 4 columns and strip the double quotes. Now all I have to do is delimit that into a different file with a different delimiter but I'm stuck as to how to do that in T-SQL. Like I said, I'm not to hot at SQL. :( |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2007-12-31 : 09:58:29
|
Why not import using the openrowset?==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Crow555
Starting Member
5 Posts |
Posted - 2007-12-31 : 10:11:57
|
Like I said, I've hopeless at SQL and hoping to get better at it. :)The thing I've found about openrowset is that it takes the first row and makes that the column header meaning in my above example, the columns would be called '1', 'Sam Smith', '123', 'abc' and there would be only 3 lines. It's a small niggle but would mean one record would be missing.Given the position I'm in what would you recommend? |
|
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2008-01-02 : 05:59:01
|
Can you sort the files to put the double quote records to the top?Can you run a 2-pass extract to deal with the regualr recrods 1st and then the double-quote ones?Can you write a VB/C++/C# program to convert the "double-quote" to standard records and then pipe through the ETL process? |
|
|
Crow555
Starting Member
5 Posts |
Posted - 2008-01-02 : 06:14:37
|
Hi Andrew and all,I think I've worked out the problem but wanted to test it fully first before replying.WhereasSELECT * FROM OPENROWSET('MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=D:\Store;', 'select * from RolesTest.csv') Rowset_1 has no headers, this one includes column headers, albeit in F1, F2, F3 etc aliases (sp?)SELECT * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0', 'Text;Database=D:\Store;HDR=NO', 'select * from RolesTest.csv') Rowset_3 Having used this, I simply replaced the Bulk Insert clause in the stored procedure and added the above code into an insert statement (We create a hash table before this so SELECT INTO isn't used).And there we have it, all column displayed correctly and no missing rows. Thanks to everyone who replies, especially NR and I hope this thread will be of use to someone in the future. |
|
|
|