Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 More efficient query

Author  Topic 

pwcphoto
Yak Posting Veteran

69 Posts

Posted - 2005-07-17 : 08:02:28
Is there a more efficient way to join 2 tables then shown below (I call out the actual fields, the * is there for brevity in this posting)

Select *
FROM (UF2_rt_Images INNER JOIN UF2_rt_friends ON UF2_rt_friends.to_id=UF2_rt_Images.user_id AND UF2_rt_Friends.From_ID = "& MyID &")

I tried moving the "AND UF2_rt_Friends.From_ID = "& MyID &")" from the JOIN to the WHERE portion of the query, but it made no difference in speed. I have indexes set on the Image table User_ID as well as the Friends table To_ID fields.

There is about 1,000,000 records in the images table total and there are about 100 records in the UF2_rt_Friends table that meet the criteria of MyID. Does this mean that the DB has to go through all 1,000,000 records 100 times to look for matching friends user_ID

I can get through the first bunch of pages real fast but if i jump to the last page of which there are #285 the last 30 of 8545 returned records it times out.

The paging is using a stored procedure that returns only the rows requested not everything as in ado. Is there some other way using a subselect that may be more efficient?

Thanks for any info,

Phil

-----------------------
I used to be a rocket scientist. Now I'm just a space cadet...

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-07-17 : 10:35:16
You need an index on UF2_rt_Friends.fom_id column and you need to replace * with explicit column names.

quote:
There is about 1,000,000 records in the images table total and there are about 100 records in the UF2_rt_Friends table that meet the criteria of MyID. Does this mean that the DB has to go through all 1,000,000 records 100 times to look for matching friends user_ID
No, you have index on user_id so only matching rows will be accessed.
Go to Top of Page

pwcphoto
Yak Posting Veteran

69 Posts

Posted - 2005-07-17 : 15:40:17
Well there is an index on Friends.from_ID and as mentioned in the original posting above, the actual query lists out each field individually I don't use *, I just used a * here in this posting instead of typing out 25 fields since they are irrelavent as far as the problem goes.

Any other suggestions?

Phil

-----------------------
I used to be a rocket scientist. Now I'm just a space cadet...
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2005-07-17 : 18:36:36
The sample SQL You posted is the most efficient way to join the tables, make sure You have indexes on the joined columns.

Run the same queries in QA, and look at the Execution Plan.
UPDATE STATISTICS might help.

Are You doing a paging procedure ? ... and just showing us part of the query.

rockmoose
Go to Top of Page

pwcphoto
Yak Posting Veteran

69 Posts

Posted - 2005-07-17 : 21:04:08
Yes I am using a stored procedure for the paging, I pass the query to it and it then perfoms the paging, it seems to work great on other queries, but it is having a real hard time with this query for some reason.

REATE PROCEDURE UFSP_Record_Paging (
@strFields VARCHAR(4000) ,
@strPK VARCHAR(100),
@strTables VARCHAR(4000),
@strFilter VARCHAR(8000) = NULL,
@strSort VARCHAR(8000) = NULL,
@strGroup VARCHAR(8000) = NULL,
@intPageNo INT = 1,
@intPageSize INT = NULL,
@blnGetRecordCount BIT = 0,
@strCountTables VARCHAR(500)
)

/*
PURPOSE: executes a select statement as defined by the parameters and returns a particular page of data (or all rows) efficiently

*/

AS
DECLARE @blnBringAllRecords BIT
DECLARE @strPageNo VARCHAR(50)
DECLARE @strPageSize VARCHAR(50)
DECLARE @strSkippedRows VARCHAR(50)

DECLARE @strFilterCriteria VARCHAR(8000)
DECLARE @strSimpleFilter VARCHAR(8000)
DECLARE @strSortCriteria VARCHAR(8000)
DECLARE @strGroupCriteria VARCHAR(8000)

DECLARE @intRecordcount INT
DECLARE @intPagecount INT


--******** NORMALIZE THE PAGING CRITERIA
--if no meaningful inputs are provided, we can avoid paging and execute a more efficient query, so we will set a flag that will help with that (blnBringAllRecords)

IF @intPageNo < 1
SET @intPageNo = 1

SET @strPageNo = CONVERT(VARCHAR(50), @intPageNo)

IF @intPageSize IS NULL OR @intPageSize < 1 -- BRING ALL RECORDS, DON'T DO PAGING
SET @blnBringAllRecords = 1
ELSE
BEGIN
SET @blnBringAllRecords = 0
SET @strPageSize = CONVERT(VARCHAR(50), @intPageSize)
SET @strPageNo = CONVERT(VARCHAR(50), @intPageNo)
SET @strSkippedRows = CONVERT(VARCHAR(50), @intPageSize * (@intPageNo - 1))
END





--******** NORMALIZE THE FILTER AND SORTING CRITERIA
--if they are empty, we will avoid filtering and sorting, respectively, executing more efficient queries

IF @strFilter IS NOT NULL AND @strFilter != ''
BEGIN
SET @strFilterCriteria = ' WHERE ' + @strFilter + ' '
SET @strSimpleFilter = ' AND ' + @strFilter + ' '
END
ELSE
BEGIN
SET @strSimpleFilter = ''
SET @strFilterCriteria = ''
END

IF @strSort IS NOT NULL AND @strSort != ''
SET @strSortCriteria = ' ORDER BY ' + @strSort + ' '
ELSE
SET @strSortCriteria = ''

IF @strGroup IS NOT NULL AND @strGroup != ''
SET @strGroupCriteria = ' GROUP BY ' + @strGroup + ' '
ELSE
SET @strGroupCriteria = ''



--************************** NOW START DOING THE REAL WORK
--!NOTE: for potentially improved performance, use sp_executesql instead of EXEC

IF @blnBringAllRecords = 1 --ignore paging and run a simple select
BEGIN

EXEC (
'SELECT ' + @strFields + ' FROM ' + @strTables + @strFilterCriteria + @strGroupCriteria + @strSortCriteria
)

END-- WE HAD TO BRING ALL RECORDS
ELSE --BRING ONLY A PARTICULAR PAGE
BEGIN
IF @intPageNo = 1 --in this case we can execute a more efficient query, with no subqueries
EXEC (
'SELECT TOP ' + @strPageSize + ' ' + @strFields + ' FROM ' + @strTables + @strFilterCriteria + @strGroupCriteria + @strSortCriteria
)
ELSE --execute a structure of subqueries that brings the correct page
EXEC (
'SELECT ' + @strFields + ' FROM ' + @strTables + ' WHERE ' + @strPK + ' IN ' + '
(SELECT TOP ' + @strPageSize + ' ' + @strPK + ' FROM ' + @strTables +
' WHERE ' + @strPK + ' NOT IN ' + '
(SELECT TOP ' + @strSkippedRows + ' ' + @strPK + ' FROM ' + @strTables + @strFilterCriteria + @strGroupCriteria + @strSortCriteria + ') ' +
@strSimpleFilter +
@strGroupCriteria +
@strSortCriteria + ') ' +
@strSimpleFilter +
@strGroupCriteria +
@strSortCriteria
)

END --WE HAD TO BRING A PARTICULAR PAGE


--IF WE NEED TO RETURN THE RECORDCOUNT
IF @blnGetRecordCount = 1
IF @strGroupCriteria != ''
EXEC (
'SELECT COUNT(*) AS RECORDCOUNT FROM (SELECT COUNT(*) FROM ' + @strCountTables + @strFilterCriteria + @strGroupCriteria + ') AS tbl (id)'
)
ELSE
EXEC (
'SELECT COUNT(*) AS RECORDCOUNT FROM ' + @strCountTables + @strFilterCriteria + @strGroupCriteria
)
GO


-----------------------
I used to be a rocket scientist. Now I'm just a space cadet...
Go to Top of Page

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-07-17 : 22:50:47
Nice code! :) I think you may slightly improve performance if you return record count as a value of output parameter instead of returning it as the only row of the query select count(*) from .... In case when you return all rows you will have significantly better performance (i think) if you set @recordCount = @@rowCount.

You may also try composite index on from_id, user_id + columns in the select list from that table (in that order). I hope there are just 2,3 columns from that table selected.

p.s. Sorry for the star comment, I overlooked your comment in parentheses.
Go to Top of Page

pwcphoto
Yak Posting Veteran

69 Posts

Posted - 2005-07-17 : 23:57:13
The problem seems to be with using the inner join as a way to limit the records returned. In other queries using the same stored procedure, there are plenty of joins, but not used to limit the returned records, only to join names to id's that sort of thing. I tried a LEFT JOIN and moved the filtering to a WHERE statement, but it didn't seem to make much of a difference.

The friends table only has 3 columns, an ID Ident, From_ID which is me, and To_ID which is my friends, not real complicated. The image table has an image_ID field which is the PK and user_ID field which is what I join on the Friends_From_ID.

When it comes to the indexes, maybe you can tell me if I did it right. There is an index of the Image_ID only which is the PK. Then there are indexes on Image_ID & User_ID, Since User_ID is not going to be unique as there may be multiple images by the same user, to make it a unique key I added the Image_ID to the index. Is there a difference as to which should be first and asscending or descending? Should I create indexes in both directions? Still not real clear on that part.

Thanks,

Phil

-----------------------
I used to be a rocket scientist. Now I'm just a space cadet...
Go to Top of Page

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2005-07-18 : 05:01:50
"Is there a difference as to which should be first and asscending or descending? Should I create indexes in both directions"....it does matter.
If an index is COLA+COLB...and in a query you supply COLB....then the index is no good (for that query)...It's like searching a phone book for everybody named John, when it's order by surname.
Go to Top of Page

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-07-18 : 05:09:49
quote:
The friends table only has 3 columns, an ID Ident, From_ID which is me, and To_ID which is my friends, not real complicated. The image table has an image_ID field which is the PK and user_ID field which is what I join on the Friends_From_ID.
Then it's clear, you should crate index on from_id, to_id with columns in that order. Column order in composite index index is important. Imagine index as list sorted using order of the first column values first, being order by second column on top of that when there are multiple entries of first column values. The last element of index entry is pointer to actual data. Consider multiple column index as a string made by concatenation of columns in order they are specified. Also be aware that you can jump on first string that is bigger or equal (in case ascending order) to specified string prefix.
quote:
When it comes to the indexes, maybe you can tell me if I did it right. There is an index of the Image_ID only which is the PK. Then there are indexes on Image_ID & User_ID, Since User_ID is not going to be unique as there may be multiple images by the same user, to make it a unique key I added the Image_ID to the index. Is there a difference as to which should be first and asscending or descending? Should I create indexes in both directions? Still not real clear on that part.
You don't have to include image_id in your index on user_id if your pk is clustered. In case you have clustered index on the table pointers to data in non-clustered indexes are clustered index values. In case columns making clustered index are not unique sql server automatically adds internal autogenerated column to make it unique (which is not your case). What killed performance is the fact (if i got your explanation right) that image_id is first column in your index. Drop that index and create one on user_id only.

Go to Top of Page
   

- Advertisement -