Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Show only changes for one data field temporal data

Author  Topic 

michael.appleton
Posting Yak Master

160 Posts

Posted - 2011-01-26 : 11:36:27
Hello,

I am stuck on this seemingly simple query. Basically I'm dealing with temporal data storing a row for each change in the data and I want to show only the changes for one data field for each object. In the example below this is status, and each object has a different object id.


DECLARE @TemporalData TABLE(
Id int,
Status varchar(50),
SomeOtherData varchar(max),
ValidFrom datetime,
ValidTo datetime,
Primary Key(Id,ValidFrom),
Unique(Id,ValidTo)
)

--Sample data
--each row shows some change in the data
--valid to dates are the next validfrom date
INSERT INTO @TemporalData
SELECT 1,'online','some other stuff','20000101','20010101'
UNION ALL
SELECT 1,'offline','some other stuff','20010101','20050105'
UNION ALL
SELECT 1,'offline','some other stuff changed','20050105','20100101'
UNION ALL
SELECT 1,'online','some other stuff changed','20100101','20110101'
UNION ALL
SELECT 1,'destroyed','some other stuff changed','20110101','20130101'
UNION ALL
SELECT 1,'destroyed','some other stuff changed again','20130101','99991231'
UNION ALL
SELECT 2,'online','some other stuff','20000101','20100101'
UNION ALL
SELECT 2,'offline','some other stuff','20100101','20130501'
UNION ALL
SELECT 2,'online','some other stuff changed','20130501','20130506'
UNION ALL
SELECT 2,'offline','some other stuff changed','20130506','20150501'
UNION ALL
SELECT 2,'online','some other stuff','20150501','99991231'
UNION ALL
SELECT 3,'online','some other stuff','20050101','20070101'
UNION ALL
SELECT 3,'offline','some other stuff','20070101','20080101'
UNION ALL
SELECT 3,'offline','some other stuff changed','20080101','20090101'
UNION ALL
SELECT 3,'offline','some other stuff changed again','20090101','20150101'
UNION ALL
SELECT 3,'destroyed','some other stuff changed again','20150101','99991231'




--required output

--shows date range statuses were relevant

SELECT Id=1,Status='online',ValidFromDate=CAST('20000101' as datetime),ValidToDate=CAST('20010101' as datetime)
UNION ALL
SELECT 1,'offline','20010101','20100101'
UNION ALL
SELECT 1,'online','20100101','20110101'
UNION ALL
SELECT 1,'destroyed','20110101','99991231'
UNION ALL
SELECT 2,'online','20000101','20100101'
UNION ALL
SELECT 2,'offline','20100101','20130501'
UNION ALL
SELECT 2,'online','20130501','20130506'
UNION ALL
SELECT 2,'offline','20130506','20150501'
UNION ALL
SELECT 2,'online','20150501','99991231'
UNION ALL
SELECT 3,'online','20050101','20070101'
UNION ALL
SELECT 3,'offline','20070101','20150101'
UNION ALL
SELECT 3,'destroyed','20150101','99991231'




I've tried using the row_number function and self joining the result to get "gaps" in the data showing the change point, but it seems very long winded and I'm sure I'm missing something rather simple! If someone wants me to post my solution, I can do that, but I think there's probably a much easier way so don't want to post it and confuse people! I've also tried using grouping, but then if the status changes back to something it has been already, that change is missed.

Thanks for the help,

Michael

Ifor
Aged Yak Warrior

700 Posts

Posted - 2011-01-26 : 12:19:32
Try using the order difference/sum grouping method which I think is attributable to Steve Kass of Drew University.

Something like:

;WITH GrpTemporalData
AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY Id ORDER BY ValidFrom)
- ROW_NUMBER() OVER (PARTITION BY Id, [status] ORDER BY ValidFrom) AS Grp
FROM @TemporalData
)
,GrpResults
AS
(
SELECT Id, grp, [status]
,MIN(ValidFrom) AS ValidFrom
,MAX(ValidTo) AS ValidTo
FROM GrpTemporalData
GROUP BY Id, grp, [status]
)
SELECT Id, [status], ValidFrom, ValidTo
FROM GrpResults
ORDER BY Id, ValidFrom

Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-01-26 : 12:44:13
seems like this

SELECT Id,Status,MIN(ValidFromDate) AS ValidFromDate,MAX(ValidToDate) AS ValidToDate
FROM
(
SELECT t.*,t1.ValidFromDate AS BorderDate
FROM YourTable t
OUTER APPLY (SELECT TOP 1 ValidFromDate
FROM YourTable
WHERE Id = t.Id
AND Status <> t.Status
AND ValidFromDate> t.ValidToDate
ORDER BY ValidFromDate ASC)t1
)p
GROUP BY Id,Status,BorderDate


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

michael.appleton
Posting Yak Master

160 Posts

Posted - 2011-01-27 : 12:18:48
Thanks for that Ifor, that's the sort of thing I was after. Useful things, those CTEs. Still seems fairly complicated though for what seems like a simple requirement and that definitely feels like it should be done in a set based manner. Still, it's all done in one transaction at least, unlike my version.

visakh16, that query doesn't produce the results that I'm after. Perhaps a little modification can get the expected results though. I'll have a little look and see if I can figure out what you were trying to do.

I wonder if someone else has a simpler method.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-01-27 : 12:32:02
it was a small typo. see edited suggestion below


DECLARE @TemporalData TABLE(
Id int,
Status varchar(50),
SomeOtherData varchar(max),
ValidFrom datetime,
ValidTo datetime,
Primary Key(Id,ValidFrom),
Unique(Id,ValidTo)
)

--Sample data
--each row shows some change in the data
--valid to dates are the next validfrom date
INSERT INTO @TemporalData
SELECT 1,'online','some other stuff','20000101','20010101'
UNION ALL
SELECT 1,'offline','some other stuff','20010101','20050105'
UNION ALL
SELECT 1,'offline','some other stuff changed','20050105','20100101'
UNION ALL
SELECT 1,'online','some other stuff changed','20100101','20110101'
UNION ALL
SELECT 1,'destroyed','some other stuff changed','20110101','20130101'
UNION ALL
SELECT 1,'destroyed','some other stuff changed again','20130101','99991231'
UNION ALL
SELECT 2,'online','some other stuff','20000101','20100101'
UNION ALL
SELECT 2,'offline','some other stuff','20100101','20130501'
UNION ALL
SELECT 2,'online','some other stuff changed','20130501','20130506'
UNION ALL
SELECT 2,'offline','some other stuff changed','20130506','20150501'
UNION ALL
SELECT 2,'online','some other stuff','20150501','99991231'
UNION ALL
SELECT 3,'online','some other stuff','20050101','20070101'
UNION ALL
SELECT 3,'offline','some other stuff','20070101','20080101'
UNION ALL
SELECT 3,'offline','some other stuff changed','20080101','20090101'
UNION ALL
SELECT 3,'offline','some other stuff changed again','20090101','20150101'
UNION ALL
SELECT 3,'destroyed','some other stuff changed again','20150101','99991231'

SELECT Id=1,Status='online',ValidFromDate=CAST('20000101' as datetime),ValidToDate=CAST('20010101' as datetime)
UNION ALL
SELECT 1,'offline','20010101','20100101'
UNION ALL
SELECT 1,'online','20100101','20110101'
UNION ALL
SELECT 1,'destroyed','20110101','99991231'
UNION ALL
SELECT 2,'online','20000101','20100101'
UNION ALL
SELECT 2,'offline','20100101','20130501'
UNION ALL
SELECT 2,'online','20130501','20130506'
UNION ALL
SELECT 2,'offline','20130506','20150501'
UNION ALL
SELECT 2,'online','20150501','99991231'
UNION ALL
SELECT 3,'online','20050101','20070101'
UNION ALL
SELECT 3,'offline','20070101','20150101'
UNION ALL
SELECT 3,'destroyed','20150101','99991231'

SELECT Id,Status,MIN(ValidFrom) AS ValidFromDate,MAX(ValidTo) AS ValidToDate
FROM
(
SELECT t.*,t1.ValidFrom AS BorderDate
FROM @TemporalData t
OUTER APPLY (SELECT TOP 1 ValidFrom
FROM @TemporalData
WHERE Id = t.Id
AND Status <> t.Status
AND ValidFrom> t.ValidFrom
ORDER BY ValidFrom ASC)t1
)p
GROUP BY Id,Status,BorderDate
ORDER BY Id,ValidFromDate


your output
---------------------------
Id Status ValidFromDate ValidToDate
1 online 2000-01-01 00:00:00.000 2001-01-01 00:00:00.000
1 offline 2001-01-01 00:00:00.000 2010-01-01 00:00:00.000
1 online 2010-01-01 00:00:00.000 2011-01-01 00:00:00.000
1 destroyed 2011-01-01 00:00:00.000 9999-12-31 00:00:00.000
2 online 2000-01-01 00:00:00.000 2010-01-01 00:00:00.000
2 offline 2010-01-01 00:00:00.000 2013-05-01 00:00:00.000
2 online 2013-05-01 00:00:00.000 2013-05-06 00:00:00.000
2 offline 2013-05-06 00:00:00.000 2015-05-01 00:00:00.000
2 online 2015-05-01 00:00:00.000 9999-12-31 00:00:00.000
3 online 2005-01-01 00:00:00.000 2007-01-01 00:00:00.000
3 offline 2007-01-01 00:00:00.000 2015-01-01 00:00:00.000
3 destroyed 2015-01-01 00:00:00.000 9999-12-31 00:00:00.000


my output
----------------------------
Id Status ValidFromDate ValidToDate
1 online 2000-01-01 00:00:00.000 2001-01-01 00:00:00.000
1 offline 2001-01-01 00:00:00.000 2010-01-01 00:00:00.000
1 online 2010-01-01 00:00:00.000 2011-01-01 00:00:00.000
1 destroyed 2011-01-01 00:00:00.000 9999-12-31 00:00:00.000
2 online 2000-01-01 00:00:00.000 2010-01-01 00:00:00.000
2 offline 2010-01-01 00:00:00.000 2013-05-01 00:00:00.000
2 online 2013-05-01 00:00:00.000 2013-05-06 00:00:00.000
2 offline 2013-05-06 00:00:00.000 2015-05-01 00:00:00.000
2 online 2015-05-01 00:00:00.000 9999-12-31 00:00:00.000
3 online 2005-01-01 00:00:00.000 2007-01-01 00:00:00.000
3 offline 2007-01-01 00:00:00.000 2015-01-01 00:00:00.000
3 destroyed 2015-01-01 00:00:00.000 9999-12-31 00:00:00.000



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

michael.appleton
Posting Yak Master

160 Posts

Posted - 2011-01-28 : 04:52:40
Thanks visakh16, that looks fine now. I Should have been able to work out where your typo was, but I've had to move on to other problems at work so haven't had much time to spend on this one since posting. I'm got loads of places I'm going to need this kind of query so your help is much appreciated and it's great to have two different solutions to the same problem. Thanks!
Go to Top of Page
   

- Advertisement -