Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 view efficiency with NOT IN (SUBQUERY)?

Author  Topic 

CorpDirect
Yak Posting Veteran

92 Posts

Posted - 2005-10-10 : 00:34:44
Good day,

More fun with data from a proprietary app's tables! We're returning data with a view from 'current' and 'history' transaction tables. The way this app is set up, each month all 'closed' transactions are moved to the history table. That part's not so bad; using UNION we can get data from both as the columns are similar.

However, open transactions also get copied to history -- but are left in the current table too, with a flag of 'H' in the history column. These records may be updated in the 'current' table during the month before they get moved or copied to 'history', so may exist with a newer 'version' in the 'current' table. Unfortunately, the history flag is not set on the copies of those records in the history table!

I'm looking for an efficient way to union tables in the view but exclude from the 'history' recordset those that still exist in 'current'.

Example tables and data:
SET NOCOUNT ON

-- the current transaction table
CREATE TABLE trans_curr (
order_no INT NOT NULL
, line_no INT NOT NULL
, history VARCHAR(1) NULL
, status VARCHAR(10) NOT NULL
)
GO

-- the transaction history table
CREATE TABLE trans_hist (
order_no INT NOT NULL
, line_no INT NOT NULL
, status VARCHAR(10) NOT NULL
)
GO

-- some sample data for the current table
INSERT INTO trans_curr VALUES (41797,3,'H','P')
INSERT INTO trans_curr VALUES (42650,13,'','P')
INSERT INTO trans_curr VALUES (41894,1,'H','C')
INSERT INTO trans_curr VALUES (42890,4,'','N')
INSERT INTO trans_curr VALUES (39895,7,'H','P') -- this is a new item for order_no 39895
INSERT INTO trans_curr VALUES (42810,6,'','P')
INSERT INTO trans_curr VALUES (42892,8,'','N')
INSERT INTO trans_curr VALUES (43172,11,'','P')
INSERT INTO trans_curr VALUES (43153,3,'','P')
INSERT INTO trans_curr VALUES (39895,3,'H','P')

-- some sample data for the history table
INSERT INTO trans_hist VALUES (41797,3,'P')
INSERT INTO trans_hist VALUES (21916,1,'C')
INSERT INTO trans_hist VALUES (27043,3,'C')
INSERT INTO trans_hist VALUES (39895,3,'P')
INSERT INTO trans_hist VALUES (37508,1,'C')
INSERT INTO trans_hist VALUES (26327,5,'C')
INSERT INTO trans_hist VALUES (11394,3,'C')
INSERT INTO trans_hist VALUES (6039,2,'C')
INSERT INTO trans_hist VALUES (14767,1,'C')
INSERT INTO trans_hist VALUES (41894,1,'P')

SET NOCOUNT OFF

The status column unfortunately is not very useful here; of more importance is that new line items may be added for an order_no in trans_curr before it is completed, so if we exclude those with a 'history' value of 'H' from the current table we'll miss some records. We have to exclude from the history table those order numbers with a value of 'H' in the current table.

For example, you will see that I commented the record for order_no 39895, line_no 7; this is a new line item that was added to that order after the monthly move/copy batch. However, the app sets its 'history' flag to 'H' so that it matches the other items on the order, some of which have been moved to history. So, if we exclude items marked history = 'H', we miss those items.

This is unneccessarily complex and difficult, but I didn't write the app... Here's what I've got:

SELECT
order_no
, line_no
-- , etc...
FROM
trans_curr
UNION ALL
SELECT
order_no
, line_no
-- , etc...
FROM
trans_hist
WHERE
order_no NOT IN (
SELECT
order_no
FROM
trans_curr
WHERE
history = 'H' )

-- clean up after ourselves
DROP TABLE trans_curr
GO

DROP TABLE trans_hist
GO

This works, but really is not aesthetically pleasing. The real history table in particular is pretty big, several million records, and the actual tables have many dozens of columns each. Four our views we're pulling data from several sets of current/history tables at once, for each set having to do this WHERE ?? NOT IN (SUBQUERY). However, I can't think of any other way to write this. Can anyone offer suggestions (other than finding the developers and kicking them a bit)?

Thanks,

Daniel

Kristen
Test

22859 Posts

Posted - 2005-10-10 : 01:39:22
Make a LEFT OUTER JOIN where the "LEFT" side PK is NULL??

FROM trans_hist H
LEFT OUTER JOIN trans_curr T
ON T.order_no = H.order_no
AND T.history = 'H'
WHERE T.order_no IS NULL

Kristen
Go to Top of Page

CorpDirect
Yak Posting Veteran

92 Posts

Posted - 2005-10-10 : 01:58:18
Umm, it's 2am here; I'll have to look at this and think about it after some sleep!
Go to Top of Page

CorpDirect
Yak Posting Veteran

92 Posts

Posted - 2005-10-10 : 10:21:12
Thought about it some more and I wonder if simply using UNION may do the trick (in place of UNION ALL). The application is very rigid. It allows new items to be added to open orders (= new record in trans_curr table for order_no), but does not allow the details of an item to be edited after it is added. So maybe selecting from trans_curr UNION trans_hist would be the ticket?
SELECT
order_no
, line_no
-- , etc...
FROM
trans_curr
UNION
SELECT
order_no
, line_no
-- , etc...
FROM
trans_hist

This should automatically leave out duplicates of the columns selected, right?

Daniel
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-10-10 : 10:32:10
UNION is only going to remove duplicates if all columns match - presumably there is some difference for the amount / transactions etc. in the "new version" ?

A UNION will have to perform a sort on the selected data (in order to find & remove the duplicates), so if you are selecting a large number of rows this will slow things down a bit.

Provided you have INDEXES on the various columns used in the LEFT OUTER JOIN it should be pretty quick.

Kristen
Go to Top of Page
   

- Advertisement -