| Author |
Topic |
|
vr8ce
Starting Member
23 Posts |
Posted - 2006-06-09 : 15:09:30
|
I want to get the most recent row for each foreign key from a table that has multiple rows for each of the foreign keys. I think I can do it with a three-tiered correlated query, but I'm hoping it can be done simpler.create table foo (key1 int, fk int, dt smalldatetime, amt money)create unique index foo_key1 on foo (key1)insert into foo values (100,1,'5/18/2006',10)insert into foo values (200,1,'5/18/2006',20)insert into foo values (300,2,'5/1/2006',30)insert into foo values (400,4,'5/8/2006',40)insert into foo values (500,4,'6/2/2006',50)insert into foo values (600,4,'6/4/2006',60)insert into foo values (700,4,'6/4/2006',70)insert into foo values (800,5,'6/7/2006',80)insert into foo values (900,5,'6/1/2006',90) The row with the most recent date for each fk should be returned. If there is more than one "most recent", i.e. multiple rows for that date, the one with the largest key should be returned. So, the return from the query for the above data would be200,1,'5/18/2006',20300,2,'5/1/2006',30700,4,'6/4/2006',70800,5,'6/7/2006',80 I can get the max(dt), but since there's multiple possible rows for a given fk, I can't just join to that. I also have to get the max(key) for the given max(date) (for each fk), then join on the max(key) back to the table to get the data. But that's sounding very convoluted.The real "foo" table will be a large table (multi-million rows), so while I'm interested in any solution, an efficient one would be even better. :)Thanks!Vince |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-09 : 15:23:41
|
| [code]-- prepare test datadeclare @foo table (key1 int, fk int, dt smalldatetime, amt money)insert into @foo values (100,1,'5/18/2006',10)insert into @foo values (200,1,'5/18/2006',20)insert into @foo values (300,2,'5/1/2006',30)insert into @foo values (400,4,'5/8/2006',40)insert into @foo values (500,4,'6/2/2006',50)insert into @foo values (600,4,'6/4/2006',60)insert into @foo values (700,4,'6/4/2006',70)insert into @foo values (800,5,'6/7/2006',80)insert into @foo values (900,5,'6/1/2006',90)-- do the workSELECT f.key1, f.fk, f.dt, f.amtFROM @foo fINNER JOIN ( SELECT f.fk, f.dt, MAX(f.amt) amtmax FROM ( SELECT fk, MAX(dt) dtmax FROM @foo f GROUP BY fk ) ld INNER JOIN @foo f ON f.fk = ld.fk AND f.dt = ld.dtmax GROUP BY f.fk, f.dt ) z ON z.fk = f.fk AND z.dt = f.dt AND z.amtmax = f.amtORDER BY f.fk[/code] |
 |
|
|
vr8ce
Starting Member
23 Posts |
Posted - 2006-06-09 : 16:39:40
|
| Thanks very much for the reply.That's what I was afraid of. That was the three-tiered query I referred to, and I was really hoping it could be done simpler than that. I shudder to think what that's going to do when there are 25 million rows in the table.Thanks for the confirmation, anyway! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-09 : 16:57:00
|
| What if you have a covered index? Try to run and post result times here.25 million rows? I would estimate a run time of about 8-12 seconds. |
 |
|
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
Posted - 2006-06-10 : 05:02:14
|
| [code]SELECT *FROM @fooWHERE key1 IN ( SELECT ( SELECT TOP 1 key1 FROM @foo i WHERE i.fk = o.fk ORDER BY dt DESC, key1 DESC) ikey1 FROM @foo o GROUP BY o.fk )[/code]or maybe with a join instead, not sure if that really will change anything[code]SELECT f.*FROM @foo fINNER JOIN ( SELECT ( SELECT TOP 1 key1 FROM @foo i WHERE i.fk = o.fk ORDER BY dt DESC, key1 DESC) ikey1 FROM @foo o GROUP BY o.fk ) AS dt ON f.key1 = dt.ikey1[/code]the execution plan on my server (2005) favors the first, but Peso´s even more.With small amount of data, indexes (or lack here off) doesnt really come into play. So try out all three and see which one suits you best.-- This one's tricky. You have to use calculus and imaginary numbers for this. You know, eleventeen, thirty-twelve and all those. |
 |
|
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
Posted - 2006-06-10 : 05:17:40
|
this gives the same plan as my first proposal, but doesnt it look neat  SELECT *FROM @fooWHERE key1 IN ( SELECT ( SELECT TOP 1 key1 FROM @foo i WHERE i.fk = o.fk ORDER BY dt DESC, key1 DESC) ikey1 FROM ( SELECT fk FROM @foo GROUP BY fk ) o ) -- This one's tricky. You have to use calculus and imaginary numbers for this. You know, eleventeen, thirty-twelve and all those. |
 |
|
|
RyanRandall
Master Smack Fu Yak Hacker
1074 Posts |
Posted - 2006-06-12 : 05:44:07
|
Something else to try...--dataif object_id('#foo') is not null drop table #foogocreate table #foo (key1 int, fk int, dt smalldatetime, amt money)create unique index #foo_key1 on #foo (key1)set dateformat mdyinsert into #foo values (100,1,'5/18/2006',10)insert into #foo values (200,1,'5/18/2006',20)insert into #foo values (300,2,'5/1/2006',30)insert into #foo values (400,4,'5/8/2006',40)insert into #foo values (500,4,'6/2/2006',50)insert into #foo values (600,4,'6/4/2006',60)insert into #foo values (700,4,'6/4/2006',70)insert into #foo values (800,5,'6/7/2006',80)insert into #foo values (900,5,'6/1/2006',90)--calculationselect * from #foo a where key1 = (select top 1 key1 from #foo where fk = a.fk order by dt desc, key1 desc)/*resultskey1 fk dt amt ----------- ----------- ------------------------------------------------------ --------------------- 200 1 2006-05-18 00:00:00 20.0000300 2 2006-05-01 00:00:00 30.0000700 4 2006-06-04 00:00:00 70.0000800 5 2006-06-07 00:00:00 80.0000*/Ryan Randallwww.monsoonmalabar.com London-based IT consultancy Solutions are easy. Understanding the problem, now, that's the hard part. |
 |
|
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
Posted - 2006-06-12 : 11:42:35
|
| Sigh, way the best one so far, so simple (only two table scans). Why didn't I see that, I was sooooo close, guess I was to fixated on the need of an explecite join. This has to be the right answer.-- This one's tricky. You have to use calculus and imaginary numbers for this. You know, eleventeen, thirty-twelve and all those. |
 |
|
|
|
|
|