Author |
Topic |
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 06:58:31
|
My client wants a table exported using Pipe delimiter "|"They haven't given any brief on what I do with embedded delimiter characters, but I'd like to be prepared But lets suppose I need to escape them to "\|", and thus I would have to also change the Escape character "\" to "\\".Is there a way to do this for all String columns easily?I have made a view to do the column name conversions they want:CREATE VIEW dbo.EXPORT_RegistrationsASSELECT [CreateDate] = xxx_yyy_CrDt, [Name1] = xxx_yyy_Name1, [Name2] = xxx_yyy_Name2, [Name3] = xxx_yyy_Name3, ...FROM dbo.XXX_YYY_RegistrationWHERE xxx_yyy_Status NOT IN (-1, 0, 99) -- -1=Imported, 0=Incomplete, 99=CancelledGO and then I'm exporting from each VIEW using:DECLARE @strBCP varchar(8000)PRINT 'EXPORT_Registrations'SELECT @strBCP = 'bcp "SELECT *, '''' FROM MyDatabase.dbo.EXPORT_Registrations"' + ' queryout' + ' "C:\ClientName_Registrations.TXT"' + ' -e "C:\ClientName_Registrations.ERR"' + ' -o "C:\ClientName_Registrations.OUT" -T -t"|" -c'SELECT [@strBCP] = @strBCP and that works a treat. And more importantly its fast, and doesn't involve yards of programmingSo, is there some neat way I can have some substitution for any embedded delimiter in any column (well, only needs to be String datatype I suppose)?For example:REPLACE Escape character "\" with "\\"The REPLACE pipe delimiter "|" with "\|"I could use a UDF and surround every possible String datatype column in the VIEW? (There are loads of tables, with loads of columns, so that would be quite tedious, but doable). Would that be dog-slow at Runtime?Something else?Thanks,Kristen |
|
X002548
Not Just a Number
15586 Posts |
|
X002548
Not Just a Number
15586 Posts |
|
dinakar
Master Smack Fu Yak Hacker
2507 Posts |
Posted - 2007-09-28 : 14:46:56
|
KristenAre you asking how to deal with rows that have the pipe as part of the data?Dinakar Nethi************************Life is short. Enjoy it.************************http://weblogs.sqlteam.com/dinakar/ |
|
|
X002548
Not Just a Number
15586 Posts |
|
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 14:58:54
|
Thanks guys"Where is the data going?"Files. Then one time conversion to new system. Except that there needs to be a "final" conversion at the time that they switch-over."What's the volume?"10 tables, 20 columns each (avg), range from 10,000 to 250,000 rows per table."Do you know if you even have pipes in the data?"No, but there is risk that the final conversion, which will be very time critical, might discover that some bright spark has added one that's the specific thing I'm thinking I should be helpful about and guard against."Are you asking how to deal with rows that have the pipe as part of the data?"Well columns actually, but other than that "yes please" "http://weblogs.sqlteam.com/brettk/archive/2005/02/08/4144.aspx"Cut&Pasted, went to save as GlobalSearchOfAllTablesAndColumns.SQL and guess what? I already did that last time I needed a global search, so I already had that solution and just didn't realise it. Thanks for that, its running now.Kristen |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 15:02:01
|
Yup, we definitely have embedded pipes.Folk are registering with names like|/bin/ls -alI guess their SQL injection techniques are going to need a bit more work!Kristen |
|
|
dinakar
Master Smack Fu Yak Hacker
2507 Posts |
Posted - 2007-09-28 : 15:05:18
|
And you HAVE to use a pipe delimiter? or can you use something else?Dinakar Nethi************************Life is short. Enjoy it.************************http://weblogs.sqlteam.com/dinakar/ |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 15:08:53
|
It would seem so. They have stipulated the PIPE character, despite the fact I have told them that the data includes embedded TAB, CR/LF, and all most probably all possible ASCII characters. |
|
|
X002548
Not Just a Number
15586 Posts |
|
X002548
Not Just a Number
15586 Posts |
|
dinakar
Master Smack Fu Yak Hacker
2507 Posts |
Posted - 2007-09-28 : 15:14:52
|
quote: Originally posted by Kristen It would seem so. They have stipulated the PIPE character, despite the fact I have told them that the data includes embedded TAB, CR/LF, and all most probably all possible ASCII characters.
I did something similar a while back. We have some junk data in one of our tables I Was trying to transfer the data from. No matter what delimiter I used it was taken. So I wrote a view as a wrapper for the table and did a REPLACE in the view. Replace the pipe (the delimiter I was using) with something else. After the data got imported into the new table, I ran an update to put the pipe back.Another way this can be done is to bcp out in 2 phases. In 1st phase, select all rows that do not have the pipe (hopefully this wil be 95% of the data), for the remaining, just print results to file and copy the data into the original txt file. Not exactly neat but if the rows are few some kind of manual work can be done.Dinakar Nethi************************Life is short. Enjoy it.************************http://weblogs.sqlteam.com/dinakar/ |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2007-09-28 : 15:23:17
|
ooo, ooo,Write a cursor for every table, then echo out the data to the filesThe let them try and load the dataSince it will blow up...Hold on here, what's the issueYou can do what they want right?Are you worried that it won't load?I think they will then want thave a meetingBrett8-)Hint: Want your questions answered fast? Follow the direction in this linkhttp://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspxAdd yourself!http://www.frappr.com/sqlteam |
|
|
X002548
Not Just a Number
15586 Posts |
|
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 15:45:12
|
I like the idea of not selecting any rows that have embedded PIPEs. Seeing as I have no specification for what to do with embedded Pipes, and they are bound to break the import, then "5 row(s) missing because embedded delimiter found" will be good. If I never hear anything they ain't got no stinking Referential Integrity in the new system ."Are you worried that it won't load?"Actually I think our erstwhile client is being taken for a ride by their new provider.The new provider has an accounts package and an "attached" e-commerce web application. I think their accounts package is probably a good choice for this client. There is clearly some manager who thinks that using a web site from the accounting system provider Saves Work. Whilst I hold the view that we are specialists in the e-commerce field and they are going to lose some market from this route "Saves Work" route, I also realise that this client has been VERY hard to persuade that to increase their web sales they need to put in hard work. (IMHO that mostly means setting up really good content, not just Cut&Paste from Paper-based catalogue, adding lots of additional pictures [Rear view, front view, close up view, all that sort of stuff]; and Alternatives, Accessories, Up/Cross sell. But that's basically it. Those of our clients who do that see huge improvements in sales.)But this client doesn't want to follow our advice and has ideas like duplicating the category info on every product details page. That will push down the product detail data so the user has to scroll, expose to Google some repeating content [high up on the page] and ultimately reduce their Search Engine ranking IMHO. But of course there is NO WORK INVOLVED in this, as that data already exists, so they like this idea .Client is making lots of $000,000 from the existing site, I can't figure why spending some money on content is hard to budget for. But there we are.And the Data Import folk just want the data, in what I am sure they perceive is industry-standard-pip-delimited-format. They don't want a meeting to discuss the "nuance" of the data. They are, presumably, just going to throw it into the new package. Now, when you next move house I recommend you go through the crud in your attic. I strongly recommend that you do NOT just get the removals people to put all that Crud in the Attic of your new house ...All the while we are trying to act professionally in the client's interest. And the client is very impressed with what we do, sings our praises, etc. all very bizarre.I reckon the boss's nephew three-times-removed got the data-migration-intermediary role. I mean ... no possible alternative explanation, is there? I'll be interested to see how they get on with the new package though. But that doesn't remove the fact that IF they had worked on the content as I've done my best to persuade them I genuinely believe that they would now be doing $0,000,000 (as our other "content-is-king" clients do) rather than just $000,000, despite the fact that they are thrilled that they have gone from $0,000 to $000,000 in a relatively short time.Ho! Hum!Kristen |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 15:55:33
|
So what's the collective wisdom then chaps?1. Exclude all rows that include embedded pipe2. UDF to "escape" the PIPE. (although I now realise I have no idea whether the Importer can UPDATE that back again once imported)3. Something else that does the REPLACEs which is less work to implement?Only thing I can think of is to output with a guaranteed unique delimiter - either "" as Brett said, or something like CHAR(0), and then post-process the file to REPLACE pipes with escaped-pipes and CHAR(0) with pipes. That avoids having to surrounds all columns with UDF (which will a) be a pain to type and b) slow to run) and the one-solution will work for any number of tables / columns / export files.Kristen |
|
|
|