Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Can anyone help me optimise this query?

Author  Topic 

spib
Starting Member

9 Posts

Posted - 2002-09-22 : 10:47:03
Hi,

I have two tables
 
FILES
=====
SEQ_NO (primary key)
NAME

EVENTS
=======
SEQ_NO (primary key)
FILE_ID (points to FILES.SEQ_NO)
PLAY (boolean)
AUTOPLAY (boolean)
TIMESTAMP (datetime)


A row is added to the events table with either the PLAY or AUTOPLAY field set to true. The
FILE_ID column refers to a valid row in the FILES table. What I am trying to show in my query, is a list of files ordered by the number of times PLAYED or AUTOPLAYED within a date range specified in the timestamp.

At present I am selecting the appropriate date range from the EVENTS table into an intermediate table called COUNTS

COUNTS
=======
FILE_ID (points to FILES.SEQ_NO)
PLAYS (count)
AUTOPLAYS (count)


COUNTS contains one row per FILE which I then join to my FILES table on the FILE_ID field.

At present I have about 4000 rows in the FILES table and about 30,000 rows in the EVENTS table. This means that to execute the above query currently takes about 40 secs which is far too long.

Is there a way I could acheive the above with out using the intermediate COUNTS table thus speeding up the query?



Edited by - spib on 09/22/2002 10:48:13

Tim
Starting Member

392 Posts

Posted - 2002-09-22 : 22:38:51
try this

-- assume "boolean" columns are implemented as bit data type
-- assume your file name is unique

SELECT
a.name,
sum(b.plays) AS PlayCount,
sum(b.autoplays) AS AutoPlayCount
FROM
files a JOIN events b ON a.seq_no = b.file_id
WHERE
b.timestamp BETWEEN @fromdate AND @todate
GROUP BY
a.name


You can further improve performance of this with appropriate indexes.


PS: Why use two columns when Autoplay = not Play ??



Edited by - tim on 09/22/2002 22:41:40
Go to Top of Page

spib
Starting Member

9 Posts

Posted - 2002-09-23 : 05:56:00
Hi Tim and thanks for the reply.

That query worked great, thanks. I realise that the PLAY and AUTOPLAY columns are redundant but I actually have about 6 other events which are recorded in there but I wanted to simplify the question as much as possible.

* edit *
The FILES table actually contains more columns than I mentioned above. When I tried to include these columns in the select I get the following error

You tried to execute a query that does not include the specified expression 'FILES.FILESIZE' as part of an aggregate function.

I've seen this error before but I can't remember what I need to do to fix it. I have 15 other columns which are in the FILES table which I need to select in the above query.

Thanks again

James



Edited by - spib on 09/23/2002 06:58:42
Go to Top of Page

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2002-09-23 : 12:08:46
"select a,b,c, SUM(field1), COUNT(*) from table1 GROUP BY a,b,c"


Where a SELECT statement contains a SUM or COUNT (aggregate functions) and you are displaying some fields a,b,c (etc)....
you also need to repeat the a,b,c in the GROUP BY statement....

Go to Top of Page

Tim
Starting Member

392 Posts

Posted - 2002-09-24 : 22:44:39
In some cases Andrew's solution will give more records than you want.

An alternative is like this:

SELECT
*
FROM
files
JOIN

(
SELECT
a.seq_no,
sum(b.plays) AS PlayCount,
sum(b.autoplays) AS AutoPlayCount
FROM
files a JOIN events b ON a.seq_no = b.file_id
WHERE
b.timestamp BETWEEN @fromdate AND @todate
GROUP BY
a.seq_no
) as filestats

ON files.seq_no = filestats.seq_no


----
Nancy Davolio: Best looking chick at Northwind 1992-2000
Go to Top of Page
   

- Advertisement -