Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Fastest method for data formatting

Author  Topic 

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-05-15 : 12:10:06
I have a dts package that takes data from a csv and loads it into a table. (actually, we have several.. this is the key one giving me so many troubles). The csv actually has 2147000 + lines.

Now the data is inserted into a dts table. Now these values moved to the dts table are all numeric that represent another value. 4 other tables exist that contain values which link these numeric fields to the actual data they should be.

Currently it is set up to run 5 different update statements on the dts table then an export over to a new table. This exeutes perfectly and no errors are found. However the run time on this is about 1.5 hours. the updates are as follows

UPDATE strat_pick_Dts
SET Strat_unit = s.strat_unit
from strat_pick_dts p, strat_unit s
where p.strat_unit = s.strat_unit_legacy_formation
go

update strat_pick_dts
set pick_depth = null
where pick_depth = '000000'
go

UPDATE strat_pick_Dts
SET resource_interpretation_id = r.resource_interpretation_seq
from strat_pick_dts p, resource_interpretation_event_dts r
where p.bwdgtmk_dbkey = r.bwdgtmk_dbkey
go

UPDATE strat_pick_Dts
SET pick_quality = q.pick_quality
from strat_pick_dts p, r_pick_quality q
where p.pick_quality = q.legacy_code
go

UPDATE strat_pick_Dts
SET pick_description = d.pick_description
from strat_pick_dts p, r_pick_description d
where p.pick_description = d.legacy_code
go

With the exception of the one that does the set null, as you can see its formatting data in the strat_pick_dts, replacing numbers such as 1045 with it's actual text value taken from a different table.
After this is done a simple dts transfer is used to populate the final table.

Still with me?

Now using some ideas rob gave me from other posts I made on here. I decided to try creating a view that links these tables together and then running the insert from the view to the new table.

create view Strat_picks_load as
select r.Resource_Interpretation_seq,
p.Strat_Pick_seq,
p.Strat_name_set,
s.Strat_Unit,
p.Pick_depth,
q.Pick_quality,
d.Pick_description
from strat_pick_dts p left join strat_unit s on p.strat_unit = s.strat_unit_legacy_formation
left join resource_interpretation_event_dts r on p.bwdgtmk_dbkey = r.bwdgtmk_dbkey
left join r_pick_quality q on p.pick_quality = q.legacy_code
left join r_pick_description d on p.pick_description = d.legacy_code

--Editors note
update strat_pick_dts
set pick_depth = null
where pick_depth = '000000'
go

Must still be run in either case
--

this to runs fine and in about 30 mins (much much better)... however I begin getting errors as some fields (mostly pick_description) are null and the table their being moved to dsoesn't allow nulls. They were not null when just the regular updates were run and I'm confused as to why it's null for the view but has a value when updates are run on it. Any ideas? Its hard to run tests on this table as the thing just has so many lines



Edited by - M.E. on 05/15/2002 12:22:56

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-05-15 : 13:00:39
The reason you have nulls using the view and not your previous method is that you view left joins while the previous method inner joins. I may have to re-read you post, but maybe you can define the view like this . . .


create view Strat_picks_load as
select
coalesce(r.Resource_Interpretation_seq,p.resource_interpretation_id) as resource_interpretation_seq,
p.Strat_Pick_seq,
p.Strat_name_set,
coalesce(s.Strat_Unit, p.strat_unit) as strat_unit,
case
when p.Pick_depth = '000000' then null
else p.pick_depth
end as pick_depth,
coalesce(q.Pick_quality, p.pick_quality) as pick_quality,
coalesce(d.Pick_description, p.pick_description) as pick_description
from
strat_pick_dts p
left join strat_unit s
on p.strat_unit = s.strat_unit_legacy_formation
left join resource_interpretation_event_dts r
on p.bwdgtmk_dbkey = r.bwdgtmk_dbkey
left join r_pick_quality q
on p.pick_quality = q.legacy_code
left join r_pick_description d
on p.pick_description = d.legacy_code


 
By the way, the case statement in the select clause for the view definition should take care of the pick_depth = '000000' issue . . . If you don't understand the difference between the view and the way you where doing it b4 (I know you had some previous question on joins), I would suggest you study up a bit before you move on to your next task . . .

<O>
Go to Top of Page

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-05-15 : 13:13:56
I';ve used the case statement like that before and I would have used it, but not until the other bugs are fixed.


I understand quite well the difference between the view and the joins method. I'll try doing the method you've got here, other then the case statement it seems very much the same.

what I'm wondering is why when you go through with updates, no nulls are found. When you do joins and create a view a couple nulls pop up where the update would have removed them (which should produce the exact same thing... the update is altering the table to these new values and the view is just showing me what these values are with out making changes to the table) and I am extremely confused as to why this is.

don't worry about my previous posts, they had to do with a different problem with another dts package which has been worked out (actually.. differenet database too)

Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-05-15 : 13:44:00
quote:
what I'm wondering is why when you go through with updates, no nulls are found.

Because there's a WHERE condition (JOIN condition, same thing in that case) that would exclude nulls. The big thing to remember about nulls is that they don't EQUAL anything, even another null. I'm sure you know this already but it's easy to forget (I do it a lot!) So unless you include an IS NULL condition in your code, nulls are completely ignored. If you use the JOIN syntax in the UPDATE (why AREN'T you using it, BTW?) the condition is still using the = sign, and nulls will be ignored; it's as if they aren't in the table at all.

Now, in the case of a LEFT JOIN in the view, the rows in the table(s) on the left side show up no matter what, but the unmatched rows on the right throw out nulls that don't really exist in the tables. However, in trying to update or insert from the view, you trickle down to the tables, and are trying to insert a null from the view, a null that was created by the LEFT JOIN. Not to mention that you have several LEFT JOINs...

I'm wondering, have you tried the view using INNER JOINs? Did it work? I don't think you need LEFT JOINs, since if there is no match in the right-side table, you can't update the left side with a value anyway!

Edited by - robvolk on 05/15/2002 13:48:07
Go to Top of Page

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-05-15 : 13:55:55
quote:
If you use the JOIN syntax in the UPDATE (why AREN'T you using it, BTW?)


This is how it was when I came in.



quote:
I'm wondering, have you tried the view using INNER JOINs? Did it work? I don't think you need LEFT JOINs, since if there is no match in the right-side table, you can't update the left side with a value anyway!



Yes, I did try inner join's first... Left joins second... your seeing my second attempt.. heh. It seemed to make no difference

do you think just taking all those updates making it into one update statement using joins would make it any fast? I guess it'd look like this (missing a bit... just did it quickly):

UPDATE strat_pick_Dts
pick_description = d.pick_description ,
resource_interpretation_id = r.resource_interpretation_seq ,
pick_description = d.pick_description ,
Strat_unit = s.strat_unit ,
from
strat_pick_dts p
left join strat_unit s
on p.strat_unit = s.strat_unit_legacy_formation
left join resource_interpretation_event_dts r
on p.bwdgtmk_dbkey = r.bwdgtmk_dbkey
left join r_pick_quality q
on p.pick_quality = q.legacy_code
left join r_pick_description d
on p.pick_description = d.legacy_code



And thanks alot guys.. your input on these forums saves me



Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-05-15 : 14:08:18
Try this:

UPDATE strat_pick_Dts
pick_description = IsNull(d.pick_description, pick_description) ,
resource_interpretation_id = IsNull(r.resource_interpretation_seq, resource_interpretation_id) ,
pick_description = IsNull(d.pick_description, pick_description) ,
Strat_unit = IsNull(s.strat_unit, Strat_unit)
from
strat_pick_dts p
left join strat_unit s
on p.strat_unit = s.strat_unit_legacy_formation
left join resource_interpretation_event_dts r
on p.bwdgtmk_dbkey = r.bwdgtmk_dbkey
left join r_pick_quality q
on p.pick_quality = q.legacy_code
left join r_pick_description d
on p.pick_description = d.legacy_code


That should work but it will most likely be slower. I'm still thinking that you should use INNER JOINs, but unfortunately you'd have to stick with individual updates for each table joined.

FYI, when I suggested in the other threads that you use a view, I didn't mean for the purposes of an UPDATE. I meant that instead of updating the table with the info from the joins, and then SELECTing from that table, you should create the view with the proper joins, and then SELECT from the view instead. Then you wouldn't have to update anything.

Is that a possibility? If you've already tried that and posted so, I missed it somewhere. Sorry.

Go to Top of Page

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-05-15 : 14:39:40
quote:

...I understand quite well the difference between the view and the joins method...



M.E., don't take this as an attack cause I am really trying to be helpful . . . but, every post on this thread indicates you don't fully understand the difference between an inner and an outer join. In your orginal post, the UPDATE statements use inner joins while the view definition does an outer join. That explains ...
quote:

why when you go through with updates, no nulls are found....


in your last post, you are using an outer join in the FROM clause of your update, but you are not accounting for NULLs (which will result from rows on the left-side which have no match on the right-side). This will result in an error if you targeted updated table has a NOT NULL constraint (which you said it did).
Rob and I are both telling you to do the same thing. For example, if there is no corresponding strat_unit in the strat_unit table to match the strat_unit you have in strat_pick_dts, then leave what is in strat_pick_dts (or said differently set it equal to itself). The Rob's "isnull(strat_unit.strat_unit,strat_pick_dts.strat_unit)" and my "coalesce(strat_unit.strat_unit,strat_pick_dts.strat_unit)" both do the this.



<O>
Go to Top of Page

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-05-16 : 11:04:22
Rob
quote:
FYI, when I suggested in the other threads that you use a view, I didn't mean for the purposes of an UPDATE. I meant that instead of updating the table with the info from the joins, and then SELECTing from that table, you should create the view with the proper joins, and then SELECT from the view instead. Then you wouldn't have to update anything.




Yes I knew what you ment and that actually solved the problem in those scenarios... Thatnkyou for that. Unfortunately theres a third party program involved in this one that for some unknown reason that me and a DBA here have puzzled over for a while, won't accept a view instead of the table. So view although preferable just isn't working in this case.

Oh, antoher note.. in the update statement you gave me there, you forgot to put in a set statement ;P lol, like it matters. Thanks for that


Page47's bit
quote:
. . . but, every post on this thread indicates you don't fully understand the difference between an inner and an outer join. In your orginal post, the UPDATE statements use inner joins while the view definition does an outer join. That explains ...



Did you see this?

quote:
Yes, I did try inner join's first... Left joins second... your seeing my second attempt.. heh. It seemed to make no difference



What was strange about it to me was the fact that innerjoin and left join created the exact same nulls. and I just couldn't understand why this was. Although I'm closer to finding out why now... One of the tables was unproperly loaded and was creating umm.. I think the term for it is cartesian join or cross join or something like that. So I'm taking a look into other tables that might be causing simular problems. Funny enough the final solution I'm going with is the pretty little update statements you guys gave me there... Thanks.



quote:
M.E., don't take this as an attack cause I am really trying to be helpful


ack!!! Please don't ever think I'm anything less then completely grateful for your help. And I'm very sorry if I accidentally showed otherwise



Edited by - M.E. on 05/16/2002 11:08:12
Go to Top of Page
   

- Advertisement -