Convoluted MAX sql query - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Convoluted MAX sql query

Author

Topic

vr8ce
Starting Member

23 Posts

Posted - 2006-06-09 : 15:09:30

I want to get the most recent row for each foreign key from a table that has multiple rows for each of the foreign keys. I think I can do it with a three-tiered correlated query, but I'm hoping it can be done simpler.

create table foo (key1 int, fk int, dt smalldatetime, amt money)
create unique index foo_key1 on foo (key1)

insert into foo values (100,1,'5/18/2006',10)
insert into foo values (200,1,'5/18/2006',20)
insert into foo values (300,2,'5/1/2006',30)
insert into foo values (400,4,'5/8/2006',40)
insert into foo values (500,4,'6/2/2006',50)
insert into foo values (600,4,'6/4/2006',60)
insert into foo values (700,4,'6/4/2006',70)
insert into foo values (800,5,'6/7/2006',80)
insert into foo values (900,5,'6/1/2006',90)

The row with the most recent date for each fk should be returned. If there is more than one "most recent", i.e. multiple rows for that date, the one with the largest key should be returned. So, the return from the query for the above data would be

200,1,'5/18/2006',20
300,2,'5/1/2006',30
700,4,'6/4/2006',70
800,5,'6/7/2006',80

I can get the max(dt), but since there's multiple possible rows for a given fk, I can't just join to that. I also have to get the max(key) for the given max(date) (for each fk), then join on the max(key) back to the table to get the data. But that's sounding very convoluted.

The real "foo" table will be a large table (multi-million rows), so while I'm interested in any solution, an efficient one would be even better. :)

Thanks!

Vince

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-06-09 : 15:23:41

[code]-- prepare test data
declare @foo table (key1 int, fk int, dt smalldatetime, amt money)

insert into @foo values (100,1,'5/18/2006',10)
insert into @foo values (200,1,'5/18/2006',20)
insert into @foo values (300,2,'5/1/2006',30)
insert into @foo values (400,4,'5/8/2006',40)
insert into @foo values (500,4,'6/2/2006',50)
insert into @foo values (600,4,'6/4/2006',60)
insert into @foo values (700,4,'6/4/2006',70)
insert into @foo values (800,5,'6/7/2006',80)
insert into @foo values (900,5,'6/1/2006',90)

-- do the work
SELECT f.key1,
f.fk,
f.dt,
f.amt
FROM @foo f
INNER JOIN (
SELECT f.fk,
f.dt,
MAX(f.amt) amtmax
FROM (
SELECT fk,
MAX(dt) dtmax
FROM @foo f
GROUP BY fk
) ld
INNER JOIN @foo f ON f.fk = ld.fk AND f.dt = ld.dtmax
GROUP BY f.fk,
f.dt
) z ON z.fk = f.fk AND z.dt = f.dt AND z.amtmax = f.amt
ORDER BY f.fk[/code]

vr8ce
Starting Member

23 Posts

Posted - 2006-06-09 : 16:39:40

Thanks very much for the reply.

That's what I was afraid of. That was the three-tiered query I referred to, and I was really hoping it could be done simpler than that. I shudder to think what that's going to do when there are 25 million rows in the table.

Thanks for the confirmation, anyway!

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-06-09 : 16:57:00

What if you have a covered index? Try to run and post result times here.

25 million rows? I would estimate a run time of about 8-12 seconds.

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-06-10 : 05:02:14

[code]SELECT *
FROM @foo
WHERE key1 IN (
SELECT (
SELECT TOP 1 key1
FROM @foo i
WHERE i.fk = o.fk
ORDER BY dt DESC, key1 DESC) ikey1
FROM @foo o
GROUP BY o.fk
)[/code]or maybe with a join instead, not sure if that really will change anything[code]SELECT f.*
FROM @foo f
INNER JOIN (
SELECT (
SELECT TOP 1 key1
FROM @foo i
WHERE i.fk = o.fk
ORDER BY dt DESC, key1 DESC) ikey1
FROM @foo o
GROUP BY o.fk
) AS dt
ON f.key1 = dt.ikey1[/code]the execution plan on my server (2005) favors the first, but Peso´s even more.
With small amount of data, indexes (or lack here off) doesnt really come into play. So try out all three and see which one suits you best.

-- This one's tricky. You have to use calculus and imaginary numbers for this. You know, eleventeen, thirty-twelve and all those.

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-06-10 : 05:17:40

this gives the same plan as my first proposal, but doesnt it look neat

SELECT *
FROM @foo
WHERE key1 IN (
   SELECT (
      SELECT TOP 1 key1
      FROM @foo i
      WHERE i.fk = o.fk
      ORDER BY dt DESC, key1 DESC) ikey1
   FROM (
      SELECT fk
      FROM @foo
      GROUP BY fk
      ) o
   )

-- This one's tricky. You have to use calculus and imaginary numbers for this. You know, eleventeen, thirty-twelve and all those.

RyanRandall
Master Smack Fu Yak Hacker

1074 Posts

Posted - 2006-06-12 : 05:44:07

Something else to try...

--data
if object_id('#foo') is not null drop table #foo
go
create table #foo (key1 int, fk int, dt smalldatetime, amt money)
create unique index #foo_key1 on #foo (key1)
set dateformat mdy

insert into #foo values (100,1,'5/18/2006',10)
insert into #foo values (200,1,'5/18/2006',20)
insert into #foo values (300,2,'5/1/2006',30)
insert into #foo values (400,4,'5/8/2006',40)
insert into #foo values (500,4,'6/2/2006',50)
insert into #foo values (600,4,'6/4/2006',60)
insert into #foo values (700,4,'6/4/2006',70)
insert into #foo values (800,5,'6/7/2006',80)
insert into #foo values (900,5,'6/1/2006',90)

--calculation
select * from #foo a where key1 =
  (select top 1 key1 from #foo where fk = a.fk order by dt desc, key1 desc)

/*results
key1        fk          dt                                                     amt                   
----------- ----------- ------------------------------------------------------ --------------------- 
200         1           2006-05-18 00:00:00                                    20.0000
300         2           2006-05-01 00:00:00                                    30.0000
700         4           2006-06-04 00:00:00                                    70.0000
800         5           2006-06-07 00:00:00                                    80.0000
*/

Ryan Randall
www.monsoonmalabar.com London-based IT consultancy

Solutions are easy. Understanding the problem, now, that's the hard part.

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-06-12 : 11:42:35

Sigh, way the best one so far, so simple (only two table scans). Why didn't I see that, I was sooooo close, guess I was to fixated on the need of an explecite join. This has to be the right answer.

-- This one's tricky. You have to use calculus and imaginary numbers for this. You know, eleventeen, thirty-twelve and all those.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources