| Author |
Topic |
|
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2000-12-14 : 16:10:25
|
| Tim writes "Okay, this shouldn't be hard, but 6 hours is enough. I have a table with multiple entries per ID. I want to get the top 10 values for each ID. Please, make me feel stupid." |
|
|
GreatInca
Posting Yak Master
102 Posts |
Posted - 2003-12-09 : 13:47:50
|
| There any way to get fields not in the group by or aggragate without adding a second self join?I need to get the top 3 notes for each application (the 'ID' is the FK, the QTY/Rank Field is the date), rank by date and output the title/description. I also can't have it have doing a table scan on the production server as there are too many rows (currently Does scan on 'A' and seek on 'B' in a loop join but could end up as a double scan / hash join on the production server). |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-09 : 13:54:16
|
| Here is how I do it:SELECT Column1, Column2, Column3FROM Table1 t1INNER JOIN (SELECT Column1, MAX(Column2) AS Column2 FROM Table1 t2 GROUP BY Column1) t2ON t1.Column1 = t2.Column1Tara |
 |
|
|
GreatInca
Posting Yak Master
102 Posts |
Posted - 2003-12-09 : 14:01:33
|
| Thats Top 1 for each. It can't do Top 2+. Also could do a corrlated subquery in the select statement but that can only output 1 field but not 2+. It performs much better though but I need 2 fields outputted.The method in the 2nd reply works best but I don't want a double self join or any table scans as there is a typical max of 20 Applications to get the top 3 notes each for but there are 100,000+ notes in this table for 10,000+ applications (on the production server; only a few hundered on dev). |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-09 : 14:03:32
|
| I posted the answer as you were editing your reply, so I answered the question based on only the first sentence in your now edited post.Tara |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-09 : 14:09:07
|
| GreatInca, could you provide DDL for your table and DML for some sample data plus the expected result set using your sample data? I need to see it in code in order to help out. Also, others might join in on this thread if provided this information.Tara |
 |
|
|
GreatInca
Posting Yak Master
102 Posts |
Posted - 2003-12-09 : 14:26:18
|
| Heres the schema:Applications:PK_ApplicationID (Nonclustered)2 FKs (Clustered Unique Index)Tracking system FKsDupe CounterDatesNotes:PK_NoteID (Nonclustered)FK_ApplicationID (Clustered Index)FK_NoteTypeIDTitleDescriptionDateusing the first of the unique key FKs in the App table in the where clause, I need the Top 3 notes for each App that comes up without crunching the whole notes table. Using only the first FK usually gets up to 20 apps, less than .2% of the table's rows. |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-09 : 15:40:05
|
| We need the info provided in the form of CREATE TABLE (DDL) and INSERT INTO (DML for sample data). Also expected result set using the sample data is needed. The reason we need this information is to make it easier for us. There are a bunch of people here who are willing to help if the person asking a question gives us everything that we need.Tara |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2003-12-09 : 18:59:16
|
It would definiteily be nice to have DDl and DML, but in the meantime try this technique:select A.* from( select t1.*, (select count(*) from tbl t2 where t1.id = t2.id and t1.date >= t2.date) as Rank from tbl t1) awhere Rank <= 5 in the above, ID is the PK of tbl. The calculation of "Rank" is the key to all this; examine it to see how it works.This problem has been solved MANY times at this site -- do search of my posts (jsmith8858) for the word "Rank" and you'll probably find tons of them ....- Jeff |
 |
|
|
GreatInca
Posting Yak Master
102 Posts |
Posted - 2003-12-11 : 12:04:07
|
| It works better. One of table scan eliminated. The one for the join to the applications remains. The self join table scan for the rank calculation was eliminated. It is still effectively processing all 100,000+ notes for 10,000+ apps when only 200 notes for 20 apps are needed. I think the derived table structure is preventing the clustered index on the FK_ApplicationID from being used for the join.Check out my pictures at http://www.pbase.com/GreatIncaCheck out my job's site!http://www.jobing.com (Market picker for first visit)http://phoenix.jobing.com (corporate home market if you don't live in a covered market) |
 |
|
|
GreatInca
Posting Yak Master
102 Posts |
Posted - 2003-12-11 : 12:23:44
|
| Oh yeah I found a solution that performs better but has its own drawbacks. I used an IN() in the derived table that has a SELECT TOP .. Order By. It can only have 1 ranking field and it runs a sort for each outside row but this query rarely has more than 20 and is being very selective relative to the whole table (20 out of 10,000+) and there is rarely more than 10 notes per app - usually 0-4, so it is the fastest of the 3. But woudn't be very fast for a less selective reporting query though, and it can't rank on multiple fields.SELECT ...FROMApplications... More JoinsJOIN (SELECT FK_ApplicationID, NoteFROM NotesWHERE DateStamp IN (SELECT TOP 3 DateStamp FROM Notes AS Notes2 WHERE Notes.FK_ApplicaitonID=Notes2.FK_ApplicationID ORDER BY DateStamp DESC)) AS Notes ON Applications.PK_ApplicationID=Notes.FK_ApplicationWHERE ... (Limits to 10-20 applications)I would like an even better solution than has no loop join sorts or entire-table crunching table scans to get top 3 notes for each application. IN(SELECT TOP) works fine for this but removing the sort would make a universal solution that might perform well for reporting too.****************************************Check out my pictures at http://www.pbase.com/GreatIncaCheck out my company's site!http://www.jobing.com (Market picker for first visit)http://phoenix.jobing.com (corporate home market if you don't live in a covered market) |
 |
|
|
|