Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Full Outer Join

Author  Topic 

SamC
White Water Yakist

3467 Posts

Posted - 2003-02-06 : 23:19:35
Here's an example query doing a full outer join on itself

SELECT * FROM TableA A
FULL OUTER JOIN TableA B
on A.UserID = B.UserID

Now, If I restrict a few rows in A, shouldn't all rows of B remain with NULL in the columns where A is eliminated?

SELECT * FROM TableA A
FULL OUTER JOIN TableA B
on A.UserID = B.UserID
WHERE A.UserID<5555

What I get is just the rows in A that meet the condition. No nulls.

Sam

ValterBorges
Master Smack Fu Yak Hacker

1429 Posts

Posted - 2003-02-07 : 01:18:48
If you place your query into QA. and then the following

SELECT * FROM TableA A
FULL OUTER JOIN TableA B
on A.UserID = B.UserID AND A.UserID < 5555

you will see that in the first case it determines that since you're filtering on the left table based on a full outer join that it can optimize it to a left join.
Join is performed first giving you a mirror image since it's the same table joined on the key. Next the filter is applied and rows where A.UserID >= 5555 are removed in which case since it's performed after the join the columns from b which are on the same row are also filtered out.

In the second case shows a merge full outer join.

However because the way that a full outer join behaves combining every row whether or not it has a match in the table it will return exactly the same as

SELECT * FROM TableA A
FULL OUTER JOIN TableA B
on A.UserID = B.UserID

BOL
quote:

FULL [OUTER]

Specifies that a row from either the left or right table that does not meet the join condition is included in the result set, and output columns that correspond to the other table are set to NULL. This is in addition to all rows usually returned by the INNER JOIN.




Just curious in what scenerio or for what purpose would you need a full outer join to the same table using the primary key to join on??????



Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-02-07 : 09:48:23
Hi Valter,

I don't follow the logic in the example about why rows in B are eliminated, but I do get that it has something to do with the fact that B and A are the same table. This was just an example. My problem uses two tables: Modules and a derived table.

Modules is a fixed table with about 20 rows

CREATE TABLE Modules (
ModuleID INT NOT NULL
ModuleName VARCHAR (100)
)

Table B is a derived table

SELECT ModuleID, URL from PageHits

But often, ModuleID is NULL in the derived table, the URL is not a Module. I want all the URLs in the JOIN and all the Modules, even if there are no match. Seemed like a good application for a full outer join.

SELECT M.ModuleID, M.ModuleName, P.ModuleID as test, P.URL
FROM Modules M
FULL OUTER JOIN
(SELECT ModuleID, URL from PageHits) P
ON P.ModuleID=M.ModuleID

The result set has only the rows with valid M.ModuleID, eliminating all P.ModuleID that do not match M.ModuleID.

Guessing that the NULL ModuleIDs in P were a problem, I fixed that with a CASE statement that set the NULLs to zero. Same result.

Sam


Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-02-08 : 09:40:03
I don't know if anyone has followed this topic, but here's the resolution that worked for me:

It's a common syntax (for me) to do a LEFT OUTER JOIN like this

SELECT * FROM MyTable M
LEFT OUTER JOIN TableB B on M.Key = B.Key
WHERE M.Column = .. some condition ..

----------------------------

It is a habit of mine that WHERE conditions usually apply to the left side of JOINs. Take this habit, and apply it to the full outer join and watch..

SELECT * FROM TableA A
FULL OUTER JOIN TableB B ON A.UserID = B.UserID
WHERE A.UserID < 5555

We've restriced Table A rows, and the FULL OUTER JOIN should not restrict the rows in Table B, but surprisingly it does. This post is all about finding out why and correcting the problem.

The missing rows in Table B have A.UserID=NULL (as it should). These rows do not satisfy

WHERE A.UserID<5555

There are two solutions, I like one better than the other

SELECT * FROM TableA A
FULL OUTER JOIN TableB B ON A.UserID = B.UserID
WHERE A.UserID < 5555 OR A.UserID IS NULL

This solution asks the question: Why are we testing for NULL? The following solution avoids this and translates better (to me):

SELECT * FROM (SELECT * FROM TableA WHERE UserID<5555) A
FULL OUTER JOIN TableB B on A.UserID = B.UserID

That's it.

Sam

--- I see I omitted the key WHERE condition from my previous post. Not enough information to solve the problem... Sam

Edited by - SamC on 02/08/2003 09:43:47
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2003-02-08 : 13:18:17
The point is that (conceptually, at least) the WHERE condition is applied to the rowset that results from the join. Consequently, if you say WHERE A.UserID<5555, you can be certain there won't be any NULLs in the A.UserID column of the result.

Valter's assertion that:
quote:

SELECT * FROM TableA A
FULL OUTER JOIN TableA B
on A.UserID = B.UserID AND A.UserID < 5555

[...] will return exactly the same as

SELECT * FROM TableA A
FULL OUTER JOIN TableA B
on A.UserID = B.UserID


is incorrect. The reason for this is that the rows for the OUTER part of the join are determined after the join condition has been satisfied. This means that if there is a UserID value >= 5555 in both TableA and TableB, two rows will be produced by the outer join: one with the value from TableA and one from TableB!

Your two ways of 'correcting' this problem produce different results.
For example, if both TableA and TableB have a UserID of 6666,
Using the derived table A, one row will result with A.UserID = NULL, B.UserID = 6666
Using the OR...IS NULL, the outer join will generate a row with A.UserID = 6666, B.UserID = 6666, which is then filtered out by the WHERE clause.

Canned example:

CREATE TABLE TableA (UserID int PRIMARY KEY)
CREATE TABLE TableB (UserID int PRIMARY KEY)

GO

DECLARE @i int
SET @i = 0
SET NOCOUNT ON
BEGIN TRANSACTION
WHILE @i < 100
BEGIN
IF RAND() < 0.5 INSERT INTO TableA VALUES (@i)
IF RAND() < 0.5 INSERT INTO TableB VALUES (@i)
SET @i = @i + 1
END
COMMIT TRANSACTION
SET NOCOUNT OFF

GO

SELECT *
FROM TableA AS A
FULL OUTER JOIN TableB AS B ON A.UserID = B.UserID
ORDER BY COALESCE(A.UserID, B.UserID), B.UserID

SELECT *
FROM TableA AS A
FULL OUTER JOIN TableB AS B ON A.UserID = B.UserID
WHERE A.UserID < 50
ORDER BY COALESCE(A.UserID, B.UserID), B.UserID

SELECT *
FROM TableA AS A
FULL OUTER JOIN TableB AS B ON A.UserID = B.UserID AND A.UserID < 50
ORDER BY COALESCE(A.UserID, B.UserID), B.UserID

SELECT *
FROM (SELECT * FROM TableA WHERE UserID < 50) AS A
FULL OUTER JOIN TableB AS B ON A.UserID = B.UserID
ORDER BY COALESCE(A.UserID, B.UserID), B.UserID

SELECT *
FROM TableA A
FULL OUTER JOIN TableB B ON A.UserID = B.UserID
WHERE A.UserID < 50 OR A.UserID IS NULL
ORDER BY COALESCE(A.UserID, B.UserID), B.UserID



Go to Top of Page
   

- Advertisement -