Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Bitten by a NULL

Author  Topic 

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-09 : 21:44:12
I don't have many Columns that allow NULLs, but datetime Columns are the exception in our databases.

The table 'Course' has a row added every time a user takes a course. The FinishDate is NULL initially, until the user finishes taking the course.

Users can take the course more than once, and dates may have nulls. I use the following subquery to get the MOST RECENTLY finished course:

SELECT UserID, Max(FinishDate) AS CompletedDate FROM Quiz GROUP BY UserID

If a user had multiple records, some with NULL FinishDates, that this subquery returns only the most recent FinishDate.

Problem is: The above subquery doesn't return the Primary Key, so it's necessary to use the UserID, CompletedDate together to uniquely identify the row, and join it with the associated row in the User table.

Here's an abbreviated SELECT where I try to return the MOST RECENT results for a user

SELECT Firstname, Lastname, FinishDate from Users U
LEFT OUTER JOIN Course C
ON U.UserID=C.UserID -- we have a SET OF ALL COURSE ROWS
LEFT OUTER JOIN
(SELECT UserID, Max(FinishDate) AS CompletedDate FROM Quiz
GROUP BY UserID) M
ON U.UserID=M.UserID AND Q.FinishDate=M.CompletedDate

This query returns more than one row for a user IF there is a course row with a NULL finish date.

How can I modify this query to return 1 row for each user, even if the FinishDate is NULL?

SamC

r937
Posting Yak Master

112 Posts

Posted - 2002-09-10 : 01:22:22
correleted subquery?

SELECT Firstname, Lastname, FinishDate 
from Users U
LEFT OUTER
JOIN Course C
ON U.UserID=C.UserID
LEFT OUTER
JOIN Quiz Q
on U.UserID=Q.UserID
where FinishDate =
( SELECT Max(FinishDate)
FROM Quiz
where UserID = Q.UserID )

might still give funny results, as i don't see how Course is related to Quiz, so maybe that's where your extra rows are coming from

rudy
http://rudy.ca/
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-10 : 08:38:15
I owe an apology to is group by posting a bogus "simplified" subquery in my original post.

SELECT Firstname, Lastname, FinishDate from Users U
LEFT OUTER JOIN Course C
ON U.UserID=C.UserID -- we have a SET OF ALL COURSE ROWS
LEFT OUTER JOIN
(SELECT UserID, Max(FinishDate) AS CompletedDate FROM Course
GROUP BY UserID) M
ON U.UserID=M.UserID AND C.FinishDate=M.CompletedDate

I'll try the correlated quiz later today. Thanks for the feedback.

SamC

Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-10 : 10:09:20
I tried the correlated subquery below:

SELECT Firstname, Lastname, FinishDate
from Users U
LEFT OUTER
JOIN Course C
ON U.UserID=C.UserID
where C.FinishDate =
( SELECT Max(FinishDate)
FROM Course
where UserID = C.UserID )

For a User that has completed the course it returns the latest course record, and not any incompleted course records. This is exactly what I want.

Problems remain for users who have only 1 course record with a NULL FinishDate. The query returns no record at all. (This makes it hard to report on this user.

Maybe the most reasonable fix is to replace NULL datetime with something like Jan 1 1900??

Help !?

SamC

Go to Top of Page

joshb
Yak Posting Veteran

52 Posts

Posted - 2002-09-10 : 12:28:37
Haven't tested it but if I understand you correctly I think this should do what you want:

SELECT Firstname, Lastname, MaxFinish
FROM Users U LEFT OUTER JOIN
(SELECT UserID, MAX(FinishDate) AS MaxFinish
FROM Course
GROUP BY UserID) F
ON U.UserID = F.UserID

Go to Top of Page

r937
Posting Yak Master

112 Posts

Posted - 2002-09-10 : 12:31:53
aack! no, null is much better than some funny date

i notice that the Quiz table has quietly dropped from the picture... :)

i have never written a subquery into a join condition, don't even know if it's legal, but try replacing the keyword WHERE in your last query with AND

if that doesn't work, split it up --

  select Firstname, Lastname
, cast(null as datetime) as FinishDate
from Users U
where not exists
( select 1
from Course
where UserID = U.UserID
and FinishDate is not null )
union all
select Firstname, Lastname, FinishDate
from Users U
inner
join Course C
on U.UserID=C.UserID
where C.FinishDate =
( select Max(FinishDate)
from Course
where UserID = C.UserID )

rudy
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-10 : 22:57:36
I've read page 82 of Ken Henderson's Guru's Guide to Transact SQL. There he proposes solving this problem by turning off ANSI_NULL.

Here's my latest query, which is a derived query to test ANSI_NULL OFF.

SET ANSI_NULLS OFF
go
CREATE PROCEDURE dbo.GetUserQuizRow
AS
SELECT NUMBER, Q.USERID, FINISHDATE, MAXFINISHDATE FROM Quiz Q
LEFT OUTER JOIN
(SELECT UserID, MAX(FinishDate)AS MaxFinishDate
FROM quiz GROUP BY UserID) M
ON Q.USERID=M.USERID
where finishdate = maxfinishdate
DECLARE @V1 DATETIME, @V2 DATETIME
SET @V1 = NULL
SET @V2 = NULL
select 1 where @V1=@V2 -- Returns '1' as a sanity check.
GO
SET ANSI_NULLS ON
GO
EXEC dbo.GetUserQuizRow
GO

The above query WILL NOT return rows when finishdate and maxfinishdate are both null. (Yes, the SELECT 1 returns 1 !)

If I modify the WHERE as follows:

SELECT NUMBER, Q.USERID, FINISHDATE, MAXFINISHDATE FROM Quiz Q
LEFT OUTER JOIN
(select userid, max(finishdate)AS MAXFINISHDATE from quiz group by userid) M
ON Q.USERID=M.USERID
where finishdate = maxfinishdate
or (finishdate is null and maxfinishdate is null)

Then the query works perfectly. I could stop and develop on this foundation, but the query shouldn't require this extension.

Why doesn't the query work properly with both finishdate and maxfinishdate are null? ANSI_NULL is OFF, and the "sanity"
SELECT 1 WHERE NULL=NULL
works.

SamC


Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-09-11 : 00:00:00
A LEFT JOIN will produce nulls that don't actually exist; they represent missing values in the right-hand table that don't match rows in the left-hand table. Technically, you can't compare them even to another null.

By any chance, does this do what you want? :

SELECT Firstname, Lastname, NullIf(MaxFinish, CAST '19000101' as datetime) MaxFinish
FROM Users U LEFT OUTER JOIN
(SELECT UserID, MAX(IsNull(FinishDate, CAST '19000101' as datetime)) AS MaxFinish
FROM Course
GROUP BY UserID) F
ON U.UserID = F.UserID


I haven't been able to test it but it allows the null dates to remain but substitutes a dummy value when no non-nulls exist. The NullIf function in the outer query is purely optional.

Edited by - robvolk on 09/11/2002 00:03:44
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-11 : 07:25:48
Rob,

The IsNull function gives another tool to chase down the quarry. Yes, the results are correct if I have a where that states

where IsNull(P.FinishDate,cast('19000101' as datetime))=IsNull(P.MaxFinish, cast('19000101' as datetime))

This WHERE is TRUE when both dates are equal, or both NULL.

This is also what I understand should happen with

SET ANSI_NULLS OFF
SELECT ...
WHERE DATE1=DATE2

I'm still bothered that with ANSI_NULLS OFF, date1=date2 or date1<>date2 will return NO NULLS in the result set. Meanwhile,

SELECT 'ANSI_NULLS ARE OFF' WHERE NULL=NULL

always returns true.

Just to check sanity: when ANSI_NULLS ARE ON

SELECT 'ANSI_NULLS ARE OFF' WHERE NULL=NULL
SELECT 'ANSI_NULLS ARE ON' WHERE not NULL=NULL
go

As expected, with ANSI_NULLS on, neither of the above SELECT statements return values - the WHERE cannot be resolved to true in either case. Point being - I'm pretty sure I've got ANSI_NULLS OFF !

Go figure.

SamC


Go to Top of Page

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2002-09-11 : 14:29:14
my advice is to replace the NULL date with some date in the future far enough out, not to cause operational difficulties (by this I mean, that it could technically be a "business valid date" within a reasonable short period of time (you don't want another Y2K style problem hitting you too fast, unless there's a country-villa somewhere to be paid for!!!!)....and not 01-01-1900 which has already expired....


I use 31/12/2099....as my infinity date....ie records with this as a finish date have not ended yet. When they do end, I replace 31/12/2099 with Todays date.


As such I can make the end date, part of a key....because it won't have nulls. I know some part of some DateBase Law (or normal form)...says you can't/shouldn't change a key value...


To the purists I say tough...this is a boundary condition....and I'm comfortable with it working....I've had no problems with this style of solution in 10 years of relational database programming.

Go to Top of Page

r937
Posting Yak Master

112 Posts

Posted - 2002-09-11 : 17:48:36
good solution, andrew, and i don't see how it could violate any database law

when you change it, it's no violation either, because it's not a primary key

even if it were a primary key, you could still change it -- i mean, primary keys are allowed to change, even if most databases have not implemented the automatic ON UPDATE CASCADE to foreign keys

nothing wrong with a business rule that says "no course is allowed to remain unfinished until the year 2100"

rudy
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-11 : 18:54:59
I appreciate these solutions, but I confess I'm retentive when it comes to something like ANSI_NULL OFF not appearing to work properly.

I'll run another set of experiments soon and postback my findings.

If no luck there, then a "future date" or "fictional date" may be the answer.

SamC

Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-09-11 : 19:21:44
The best way to deal with nulls is to strictly use IS NULL and IS NOT NULL, always. Don't rely on system settings to test for null equality...even if such a setting is part of the standard...because ultimately it is not logically sound.

Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-11 : 23:17:38
This test script and output below tell a story. I wish it was wrong.

The results show that SET ANSI_NULLS OFF is not symmetric for very similar equality tests.

Datecol1 = NULL
Datecol2 = @value
both test true when Datecol1 is NULL and @value is NULL

BUT (here comes the bite)
Datecol1 = Datecol2
always tests "unknown" when both Datecol1 and Datecol2 are NULL

Seems very inconsistent to me.

SamC

script and output follow
----------------------------------------

SET NOCOUNT ON
DROP TABLE Datetable
GO
-- Build a test table with two datetime columns
create table Datetable (date1 datetime null, date2 datetime null)
go
-- Populate the table with various combinations of dates and nulls
INSERT INTO Datetable VALUES (NULL, NULL)
INSERT INTO Datetable VALUES (NULL, NULL)
INSERT INTO Datetable VALUES (cast('jan 1 1900 12:23:04 AM' as datetime), cast('jan 1 1900 12:23:04 AM' as datetime))
INSERT INTO Datetable VALUES (cast('jan 1 1900 12:23:04 AM' as datetime), cast('jan 1 1900 12:23:04 AM' as datetime))
INSERT INTO Datetable VALUES (cast('jan 1 1900 12:23:04 AM' as datetime), NULL)
INSERT INTO Datetable VALUES (cast('jan 1 1900 12:23:04 AM' as datetime), NULL)

-- Show the table
select 'Contents of Table Datetable'
select * from Datetable

SET ANSI_NULLS OFF
DECLARE @DVAL DATETIME
SET @DVAL = NULL
-- Demonstrate we can select a column having a null value
SELECT 'Datetable rows where Date1 = NULL'
select 'Date1= ', Date1, ' Date2= ', Date2 from Datetable where date1=@DVAL
-- Demonstrate we cannot select two columns being equal when both are null
SELECT 'Datetable rows where Date1=Date2'
SELECT * FROM Datetable WHERE DATE1=DATE2
-- My sanity check
SELECT 'Sanity check returns 1 if ANSI_NULLS OFF'
SELECT 1 WHERE NULL=NULL

select cast(NULL as datetime)

---------------------------------------------

Contents of Table Datetable

date1,date2
,
,
1900-01-01 00:23:04.000,1900-01-01 00:23:04.000
1900-01-01 00:23:04.000,1900-01-01 00:23:04.000
1900-01-01 00:23:04.000,
1900-01-01 00:23:04.000,


Datetable rows where Date1 = NULL

,Date1,,Date2
Date1= ,, Date2= ,
Date1= ,, Date2= ,


Datetable rows where Date1=Date2

date1,date2
1900-01-01 00:23:04.000,1900-01-01 00:23:04.000
1900-01-01 00:23:04.000,1900-01-01 00:23:04.000

Sanity check returns 1 if ANSI_NULLS OFF
1


Go to Top of Page

sanjnep
Posting Yak Master

191 Posts

Posted - 2002-09-12 : 00:16:23
Hi I don't understand the topic clearly pls resend it.


"Heaven's light is our guide"


Sanjeevshrestha
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-09-12 : 09:30:17
quote:
The results show that SET ANSI_NULLS OFF is not symmetric for very similar equality tests.
Ummmmmmm, did I not say that:
quote:
The best way to deal with nulls is to strictly use IS NULL and IS NOT NULL, always. Don't rely on system settings to test for null equality...even if such a setting is part of the standard...because ultimately it is not logically sound.
I agree that it SHOULD work, but I've had so many problems with this that I abandoned ANSI_NULLS a long time ago.

IS NULL and IS NOT NULL, that is the only way to go with null tests.

Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2002-09-12 : 10:00:41
Rob,

I had to stick with ANSI_NULLS to the end believing it ought to work properly.

It's a hard way to learn, but the lesson sticks.

SamC



Go to Top of Page
   

- Advertisement -