Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Mamsaac
Starting Member
9 Posts |
Posted - 2011-05-25 : 18:32:46
|
So, tests I had been doing on importing where on a copy of the files I would be receiving. Funny enough, the copies I had where on ANSI encoding. However, the real files are on UCS-2 (LE) format. When trying to work with it, I received an "Unexpected EOF" problem (which is understandable).I'm using a format file (since I only import a subset of the table, other columns being generated by the DB itself). My question relies on how I can tell bcp that the imported data is in UCS-2 format and not ANSI.Thanks! |
|
robvolk
Most Valuable Yak
15732 Posts |
|
Mamsaac
Starting Member
9 Posts |
Posted - 2011-05-25 : 19:21:10
|
Well, I got most of it working... All lines are imported with no issues EXCEPT for the first one. It seems that UCS-2 files start with 1 (or 2) bytes of extra data and bcp is not being able to say "oh, this is just the start of the file, I can ignore it", and those throws a message like the following:Starting copy...SQLState = 22018, NativeError = 0Error = [Microsoft][SQL Native Client]Invalid character value for cast specification2432110 rows copied. Notice it only happens for the first row. If I manually change first row to any other row of the file, the new line that is at first row will fail. |
 |
|
Mamsaac
Starting Member
9 Posts |
Posted - 2011-05-25 : 20:21:14
|
As hacky as it may seem, my "solution" was to preppend a dummy line to the file. That way the dummy line (which I don't care about) is not inserted and no data is lost. It works, at least... but I'm surprised I couldn't find a way to just do it with the tool. |
 |
|
Mamsaac
Starting Member
9 Posts |
Posted - 2011-05-25 : 22:01:21
|
Just in case one day someone has the same issue with Unicode, this should save such person some time:string tempFileName = prefix + Path.GetRandomFileName();StreamReader reader = new StreamReader(fileName);StreamWriter newFile = new StreamWriter(tempFileName, false, Encoding.Unicode);string line = reader.ReadLine();newFile.WriteLine(line);reader.DiscardBufferedData();reader.BaseStream.Seek(2, SeekOrigin.Begin);reader.BaseStream.Position = 2;while ((line = reader.ReadLine()) != null){ newFile.WriteLine(line);}reader.Close();newFile.Close();Using Unicode because bcp does NOT support UTF8, which is .NET default encoding. Seeking to position 2 because you don't want to write those 2 characters (that indicate the encoding) for a second time in the file, but only once (when copied first line).newFile is the file I can process without losing any data. |
 |
|
|
|
|
|
|