Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SSIS and Import/Export (2005)
 Importing Flat File with CR Carriage Return

Author  Topic 

jmill130
Starting Member

26 Posts

Posted - 2009-02-16 : 09:39:38
I am having some trouble importing a flat text file. I am hoping there is a way around my problem since the group i receive this data from is not so helpful in modifying the format.

Source Data File:
-----------------
-Text file with comma delmited and double quotes around all fields
-File has 15 columns, some are dates, some text, some are names with commas in them.
-27000 records/lines in file

Below is a sample of one line of data:
--------------------------------------
"45435435","345435435345","JHCP at White Marsh","Smith, John","19-NOV-08","","","","","","","","","","Indication: headache{CR}
Specialist may order lab test"{CR}{LF}

Problem:
--------
I cannot get this file to import because of the last field. I have tried all of the settings(ignoring errors and still will not work) in the import wizard within SQL08. I have also tried MS ACCESS, and sql2000. Access will import it but then for some reason i have about 27 records that i get "Unparsable Record" as error. I look at the lines in notepad++ and i dont see anything out of the ordinary on any of these records it indicates has problems. I would like to create a saved import in SSIS for this and not have to use MS Access but access seems to import the file at least. I can only get about 1000 records in SQL and then it gives me errors.

Is there a way to get SQL server to import this file even though there is a carriage return inside the field. I have indicated that a
{CR}{LF} as row delimiter.

Does anyone have any suggestions or experience working with similar problem?

maybe i am missing a setting inside the wizard? Is there a preferred file format to use when columns contain {CR}??


your help is greatly appreciated

-J

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2009-02-16 : 09:53:30
What errors does it give you? Also what is that {CR} in there after headache?
Go to Top of Page

jmill130
Starting Member

26 Posts

Posted - 2009-02-16 : 10:06:45
The {CR} i assume is a carriage return from someone who hits the "enter" key while entering data into the system, its a free text field, its not always there just on some of the rows, which is my problem. I am working on rerunning my package and will post errors soon.
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-02-16 : 11:15:35
Hi jmill130

you can use bcp with a custom format file to handle this: Here's an example:

-- Table def
CREATE TABLE _foo (
[foo] NVARCHAR(255)
, [bar] NVARCHAR(255)
)

Import file:

"abc","123
456"
"foo","bar"
"woo","Hello
World"

Unless this site has nerfed the data that is a single CHR(13) between 123-CHR(13)-456 and Hello-CHAR(13)-World. If this doesn't work you can make your own sample file by using ALT (hold down alt and type 013 -- release ALT and it will insert a carrage return.

Here's the required format file for this data

8.0
3
1 SQLCHAR 0 255 "\"" 0 skip SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 255 "\",\"" 1 foo SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 255 "\"\r\n" 2 bar SQL_Latin1_General_CP1_CI_AS


Running all of these with a bcp string like

bcp XXXXX.dbo._foo in chr13_import.csv -Sdevdb2\SQLSERVER2005 -Usa -PXXXXXXXX -fbcp.fmt -eout.log

Give the required data.

NB - edit to tidy up the format file
Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-02-16 : 11:23:49
Sorry forgot to post the final data!


SELECT * FROM _foo

Results:

foo bar
abc 123 456
foo bar
woo Hello World

(the spaces are really CHR(13)).

If you need any help making your format file reply to the thread -- you should have everything you need to be going forward now though.


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2009-02-16 : 17:33:22
bcp calls and format files are one option

Another try is as follows

/* Build the SQL string one time.*/
SET @SQLString =
N'BULK INSERT dbo.StagingTable
FROM ''' + @Path2File + 'FileFromWhichToBulkInsert.txt''
WITH
(
FIELDTERMINATOR =''\t'',
ROWTERMINATOR =''\n'',
FIRSTROW = 1,
BATCHSIZE = 1000000,
CODEPAGE = ''RAW'',
DATAFILETYPE = ''char'',
ERRORFILE = ''' + @Path2File+ 'Staging.ERR'',
TABLOCK
);'

EXECUTE sp_executesql @SQLString
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-02-17 : 04:18:22
don't think that will strip the " from his file will it? The bcp format file is set up to ignore the first (") of a line and then treat further (",") as delimiters. Finally to think of ("<cr><lf>) as the row delimiter.


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

jmill130
Starting Member

26 Posts

Posted - 2009-02-17 : 09:28:46
Charlie,

Thanks for your reply, i think this is definitely what i need to do. For now i just imported into ms access then DTS the data from there to sql server for futher processing. Its amazing how well access works. Even the new import/export wizard in sql 05/08 doesnt compare. It took me less than a minute to get all of the data imported, where the <cr> was it just put a square box character. At least now i use the rest of the data which is the main concern. Plus I can run queries to look at the data. In sql import/export wizard it wouldnt even import, i tried every option and converting fields and just nothing, didnt like the fact that the carriage return was there, kept getting error that column15 was not defined but it clearly could see it in the preview.
I still find it amazing that access will import it and sql cant, go figure. All i did in access was import column15, one with the <cr> as a memo field.

I find it so frustrating when SQL server wont import data, if you can import you cant analyze the problem. I know sql08 has some data profiler tool but i think it can only be used for data that is already in a sql table. It will show you what the longest length data element is for each field, etc. Look fairly

I do want to understand your format file futher since i eventually want to automate this process and have a sql job run the process.

Where would i save this format file?


Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-02-17 : 10:53:58
Hi jmill130,

It doesn't really matter where the format file lives as long as whichever computer you are using BCP from can see it. If you are going to go down the BULK INSERT route to automate this process then the format file has to be somewhere the db server can access it

I generally just have thme in the same folder as the import file and I generally just name them <importfileName>.fmt

Here's a good starting point for making the format files.

http://msdn.microsoft.com/en-us/library/ms191516.aspx
http://msdn.microsoft.com/en-us/library/ms191479.aspx

And heres a good article on BULK INSERT
http://msdn.microsoft.com/en-us/library/ms188365.aspx

I'll explain how the one I posted works -- you should be able to get the rest from that. (see the second link I posted to a guide to the format)

The line
quote:

1 SQLCHAR 0 255 "\"" 0 skip SQL_Latin1_General_CP1_CI_AS


tells bcp to treat anything up to the first " (\") as the first logical column. Therefore as the first character in each of your lines is a " it will be marked as a field delimiter. because the server column order (the number before the collation is 0 bcp will ignore it and not try and insert an empty string into the database

the line
quote:

2 SQLCHAR 0 255 "\",\"" 1 foo SQL_Latin1_General_CP1_CI_AS


Tells bcp to take everything (up to the next 255 characters) until it hits a "," (\",\") then treat that as the delimiter -- this will then go into the server column order 1 (foo)

the line
quote:

3 SQLCHAR 0 255 "\"\r\n" 2 bar SQL_Latin1_General_CP1_CI_AS


tells bco to take everything after that (again to 255 character) until it hits a "<CR><LF> (\"\r\n) as the row delimiter. that will go into server column order 2 (bar)

then it loops again.

Using BCP from my computer connecting to another machine where the database lives I run

bcp XXXXX.dbo._foo in chr13_import.csv -Sdevdb2\SQLSERVER2005 -Usa -PXXXXXXXX -fbcp.fmt -eout.log

That's

bcp <databaseName>.<schema>.<table> in <filename> -S<serverName> -U<user> -P<password> -f<format file> -e<errorLogFilename>


If you get stuck making up a format file for your table post the DDL of the table and how far you got with the format file.


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

GTaber
Starting Member

3 Posts

Posted - 2015-02-02 : 21:33:09
I am having a very similar problem, only difference i think really is, I am running BCP from dos/batch file as we only have READ only access to the SQLExpress Database. Trying to import .csv file but with extra CR and LF in the larger fields where users enter HArd Reteurns while typing paragraphs of data, messes things up on the import. See my example here (I have posted a new topic on this as well but no response, thought it might come up here with better feedback)..

I am trying to export data from SQL Express into a .CSV file, without having to ommit the columns that have Extra CR LF's in them. I can generate the .csv file but it won't import or parse properly into SQL 2012. All the problem columns are free text and quite large so users have hit CR plenty of times within problematic fields.

My working BCP Command Looks like this below i am using "," as the normal column delimiter:



bcp "SELECT quotename(ID,CHAR(34)), quotename(Name,CHAR(34)), quotename(Date,CHAR(34)), quotename(Smallcomment,CHAR(34)) FROM database.dbo.table ORDER BY Date" queryout C:\filename.csv -c -CACP -t, -S servername\instance -U readonlyuser -P readonlyuserpass



The results are perfect, and even this 'Smallcomment' field is a free text string and users are putting ',' commas, '/' slashes, ':' colons all kinds of junk, but it never has an issue. The next group of about 5 more columns after this one, are much larger in size and all have the same commas, slashes and colons and junk in them, but i see consistent breaks in what appears to be SPACES in Notepad, but I See CR LF symbols using Notepad++

Statements/Questions: I am pretty sure there are CR LF's being hand entered in the larger comments columns breaking my logic?

How can I ignore those specific CR and LF's in those specific columns which I have identified to keep my Carriage Return and or Line Feed for my Row Delimiter?

Below is an example of good results coming out:

ID Col "1111","name1, name2 (AAA: 01/01/01 ID: 1111)","Jan O1 1001 10:00AM",,"","","","","","","","","","3","Name Name, Title",""CRLF

The bad results basically show Several CR LF characters within those larger columns and starts putting all the rows in separate lines, thus making it impossible to separate the columns when we import the file.

How can I make those trouble columns ignore CR and LF and only utilize the ones after the last column I need to go to the next row?

I have screenshots to compare the result differences in Notepad ++ w/All Char View Below:

good data http://tinyurl.com/obj8pgf

bad data http://tinyurl.com/owzbwje
Go to Top of Page
   

- Advertisement -