
 All Forums
 SQL Server 2000 Forums
 Import/Export (DTS) and Replication (2000)
 Data Loading question?

Author  Topic 

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2003-07-11 : 07:16:52
John writes "I'm loading data from CSV files which are exports from a mainframe. Our database has numerous lookup tables with foreign key relationships pointing to one- and two-character codes, and the data being loaded references those codes. Whenever a new code shows up, which is fairly often, we get a foreign key violation and the insert fails. What I need to do is insert all of the "good" records and identify the bad ones, so that they can be dealt with. My thought was to add the additional step of loading the files into a set of unconstrained load tables, and then querying the "good" records out of those tables for insert into the existing framework. This seems like a lot of work for the 150+ tables we have. Is there an easier way to do this that I am just unaware of?

Thanks,

John"
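
The unconstrained load-table approach described in the question can be sketched like this (all table, column, and database names below are invented for illustration):

```sql
-- Staging table loaded with no FK constraints (hypothetical names throughout)
-- Good rows: codes that exist in the lookup table
INSERT INTO dbo.Orders (OrderID, StatusCode)
SELECT s.OrderID, s.StatusCode
FROM Staging.dbo.Orders_Load s
JOIN dbo.StatusCodes c ON c.StatusCode = s.StatusCode

-- Bad rows: codes with no match, set aside to be dealt with the next day
SELECT s.*
FROM Staging.dbo.Orders_Load s
LEFT JOIN dbo.StatusCodes c ON c.StatusCode = s.StatusCode
WHERE c.StatusCode IS NULL
```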

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-07-11 : 08:55:46
Use bcp with the -h (hint) option:

-h "CHECK_CONSTRAINTS"
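
In a full command line that hint sits alongside the other switches; the server, database, and file names here are placeholders, and -e writes rows that bcp rejects to an error file:

```
bcp myDB.dbo.myTable in D:\my.txt -c -SmyServer -Usa -P -h "CHECK_CONSTRAINTS" -e D:\my.err
```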

- Vit

dukey07
Starting Member

16 Posts

Posted - 2003-07-11 : 11:05:58
Thanks, but this is not exactly what I mean. By turning on "Check_Constraints" I can certainly get the insert to fail with an FK violation. My question is how do I get around that so that I can load the good ones, and cull the bad ones from the load so that they can be addressed. This is important, because an entire file that fails to load in the middle of the night is a show stopper, but a few bad records that need to be addressed the next day are no problem. I just don't want to let those bad records into my tables before they are fixed. Once again, I can see a path to do this, but it seems kind of a long way around.

Thanks Again

John



Edited by - dukey07 on 07/11/2003 11:06:54

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-07-12 : 19:28:45
... oh, yes, John... I see now what you mean.
Maybe something like my code below. I think this is the best way out.

-- Load the file with bcp; constraints are NOT checked by default,
-- so the bad rows get in along with the good ones
exec master..xp_cmdshell 'bcp myDB.dbo.myTable in D:\my.txt -Sz -Usa -P -c',
no_output

-- Walk the table and "touch" each row: a no-op update still fires
-- the FK check, so rows with an unknown code fail with error 547
-- and get deleted
declare abc cursor forward_only for
select 0 from myTable
for update
open abc
fetch next from abc
while @@fetch_status = 0
begin
    update myTable set ColumnInDoubt = ColumnInDoubt where current of abc
    if @@error = 547 delete from myTable where current of abc
    fetch next from abc
end
close abc
deallocate abc
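
A set-based alternative to the cursor, assuming the column in doubt and its lookup table are known (both names below are invented), is to copy the orphans aside and then delete them in one pass:

```sql
-- Keep the rejects for review the next day (hypothetical names)
SELECT t.*
INTO myDB.dbo.myTable_Rejects
FROM myDB.dbo.myTable t
WHERE NOT EXISTS (SELECT * FROM myDB.dbo.myLookup l
                  WHERE l.Code = t.ColumnInDoubt)

-- Remove them so the surviving rows all satisfy the FK
DELETE t
FROM myDB.dbo.myTable t
WHERE NOT EXISTS (SELECT * FROM myDB.dbo.myLookup l
                  WHERE l.Code = t.ColumnInDoubt)
```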

- Vit

nr
SQLTeam MVY

12543 Posts

Posted - 2003-07-12 : 21:32:45
Staging tables in a staging database.
You can load all the data then run an SP to merge with the production data.

In this situation you may want to add the data as dummy entries to the parent tables for the constraints so that they can be investigated later. Then you can leave the staging tables to be truncated before every load. If you rely on the staging tables to hold data then you may have problems with backup strategies as the data then has to be in step with the active tables.
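
Adding the unknown codes to the parent tables as flagged dummy entries, as suggested above, might look something like this (all names are invented for the sketch):

```sql
-- Any code present in staging but missing from the parent table
-- goes in with a marker description so it can be investigated later
INSERT INTO dbo.StatusCodes (StatusCode, Description)
SELECT DISTINCT s.StatusCode, 'UNKNOWN - investigate'
FROM Staging.dbo.Orders_Load s
WHERE NOT EXISTS (SELECT * FROM dbo.StatusCodes c
                  WHERE c.StatusCode = s.StatusCode)

-- The whole load now goes in without FK violations
INSERT INTO dbo.Orders (OrderID, StatusCode)
SELECT OrderID, StatusCode
FROM Staging.dbo.Orders_Load
```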

I think I might post something about this as I've been saying it in so many threads lately.


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-07-13 : 07:49:24
General solutions are very often useless, or at least ineffective...
We don't know all the details of John's trouble, so we can't decide
what the best workaround is.

- Vit

robvolk
Most Valuable Yak

15732 Posts

Posted - 2003-07-13 : 10:00:24
quote:
General solutions are very often useless or at least ineffective...
Only when people don't take a step back and look at their problem from a more general perspective. Treating every problem as a specialized one-off that has only one solution is too inflexible. When you look at what's common about a problem, instead of what's different about it, you'll find it easier to modify an existing solution to work with it.

There's nothing wrong with the idea of staging tables, and the scenario John described in his original post is a good way to do it. So what if it's long? I've done 30-40 table loads the same way; 150 is more but not crushing. And once the procedure is set up it can run automatically and be almost entirely error- and maintenance-proof.


Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-07-13 : 11:16:18
Certainly, general solutions are a good thing at the starting point of a
project. But what if John got his new job just several days ago and is
now forced to fight with his predecessor's defects? :)

Obviously he was looking for some magic undocumented switch in the BCP
utility that would make it ignore not only format violations but constraint
violations too, and save both kinds in its error log...

- Vit

dukey07
Starting Member

16 Posts

Posted - 2003-07-13 : 19:29:48
Thanks all, sounds like the staging database is the direction I need to be moving. Thanks Stoad, I was indeed on some level looking for that magic bullet, although I had a pretty good idea that I was gonna have to go the staging DB route. nr, thanks for the point about dealing with the leftover "bad" records. I like the idea of adding the new keys to the parent tables and marking them "Hey, what the heck am I?". Also, I purposely wanted to keep it kind of general, because currently this data loads via DTS (legacy), and I was looking for an approach that would allow me to scrap that and revert to BULK INSERT or bcp. By letting it go to the staging DB, I can replace the DTS table by table as I have more time.

John



Edited by - dukey07 on 07/13/2003 19:31:38

nr
SQLTeam MVY

12543 Posts

Posted - 2003-07-13 : 19:58:28
Yep - I've got the same thing here copying data from Oracle daily.
I've got an SP that I give the table name, and it does the following (it's done like this so I can use as much of the generic daily routine as possible to test that things have been set up properly):

Creates the production table from the oracle structure.
Adds the PK from the Oracle structure.
Creates the staging table in the staging database
Calls the data transfer proc (flagged initial download) to get all the data into staging table. (This logs the start and end time of the transfer).
Calls the merge proc which deletes all existing data which matches on the primary key then inserts all data (and logs the insert and update).
Adds an entry to the TableTransfer table to cause the transfer to be scheduled daily.

The daily transfer just gets all data with a created or modified date of the previous day - the proc fails with an error message if there is no create date or it is going to transfer more than half a million recs.
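
The previous-day filter and the half-million-row sanity check could be sketched inside the proc like this (the linked server, table, and column names are invented; only the threshold comes from the description above):

```sql
-- Count yesterday's rows before transferring anything (hypothetical names)
DECLARE @cnt int
SELECT @cnt = COUNT(*)
FROM ORASRV..SCOTT.MYTABLE
WHERE ModifiedDate >= CONVERT(datetime, CONVERT(char(8), GETDATE() - 1, 112))
  AND ModifiedDate <  CONVERT(datetime, CONVERT(char(8), GETDATE(), 112))

IF @cnt > 500000
BEGIN
    RAISERROR('Refusing to transfer %d rows', 16, 1, @cnt)
    RETURN
END
```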

Quite fun because they still think it's a long process to include another table in the transfer whereas usually it's a 5 minute job.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
   
