Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
pwcphoto
Yak Posting Veteran
69 Posts |
Posted - 2005-07-17 : 08:02:28
|
| Is there a more efficient way to join 2 tables then shown below (I call out the actual fields, the * is there for brevity in this posting)Select * FROM (UF2_rt_Images INNER JOIN UF2_rt_friends ON UF2_rt_friends.to_id=UF2_rt_Images.user_id AND UF2_rt_Friends.From_ID = "& MyID &")I tried moving the "AND UF2_rt_Friends.From_ID = "& MyID &")" from the JOIN to the WHERE portion of the query, but it made no difference in speed. I have indexes set on the Image table User_ID as well as the Friends table To_ID fields. There is about 1,000,000 records in the images table total and there are about 100 records in the UF2_rt_Friends table that meet the criteria of MyID. Does this mean that the DB has to go through all 1,000,000 records 100 times to look for matching friends user_IDI can get through the first bunch of pages real fast but if i jump to the last page of which there are #285 the last 30 of 8545 returned records it times out.The paging is using a stored procedure that returns only the rows requested not everything as in ado. Is there some other way using a subselect that may be more efficient?Thanks for any info,Phil-----------------------I used to be a rocket scientist. Now I'm just a space cadet... |
|
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-07-17 : 10:35:16
|
You need an index on UF2_rt_Friends.fom_id column and you need to replace * with explicit column names.quote: There is about 1,000,000 records in the images table total and there are about 100 records in the UF2_rt_Friends table that meet the criteria of MyID. Does this mean that the DB has to go through all 1,000,000 records 100 times to look for matching friends user_ID
No, you have index on user_id so only matching rows will be accessed. |
 |
|
|
pwcphoto
Yak Posting Veteran
69 Posts |
Posted - 2005-07-17 : 15:40:17
|
| Well there is an index on Friends.from_ID and as mentioned in the original posting above, the actual query lists out each field individually I don't use *, I just used a * here in this posting instead of typing out 25 fields since they are irrelavent as far as the problem goes.Any other suggestions?Phil-----------------------I used to be a rocket scientist. Now I'm just a space cadet... |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2005-07-17 : 18:36:36
|
| The sample SQL You posted is the most efficient way to join the tables, make sure You have indexes on the joined columns.Run the same queries in QA, and look at the Execution Plan.UPDATE STATISTICS might help.Are You doing a paging procedure ? ... and just showing us part of the query.rockmoose |
 |
|
|
pwcphoto
Yak Posting Veteran
69 Posts |
Posted - 2005-07-17 : 21:04:08
|
| Yes I am using a stored procedure for the paging, I pass the query to it and it then perfoms the paging, it seems to work great on other queries, but it is having a real hard time with this query for some reason.REATE PROCEDURE UFSP_Record_Paging ( @strFields VARCHAR(4000) , @strPK VARCHAR(100), @strTables VARCHAR(4000), @strFilter VARCHAR(8000) = NULL, @strSort VARCHAR(8000) = NULL, @strGroup VARCHAR(8000) = NULL, @intPageNo INT = 1, @intPageSize INT = NULL, @blnGetRecordCount BIT = 0, @strCountTables VARCHAR(500))/* PURPOSE: executes a select statement as defined by the parameters and returns a particular page of data (or all rows) efficiently */ AS DECLARE @blnBringAllRecords BIT DECLARE @strPageNo VARCHAR(50) DECLARE @strPageSize VARCHAR(50) DECLARE @strSkippedRows VARCHAR(50) DECLARE @strFilterCriteria VARCHAR(8000) DECLARE @strSimpleFilter VARCHAR(8000) DECLARE @strSortCriteria VARCHAR(8000) DECLARE @strGroupCriteria VARCHAR(8000) DECLARE @intRecordcount INT DECLARE @intPagecount INT --******** NORMALIZE THE PAGING CRITERIA --if no meaningful inputs are provided, we can avoid paging and execute a more efficient query, so we will set a flag that will help with that (blnBringAllRecords) IF @intPageNo < 1 SET @intPageNo = 1 SET @strPageNo = CONVERT(VARCHAR(50), @intPageNo) IF @intPageSize IS NULL OR @intPageSize < 1 -- BRING ALL RECORDS, DON'T DO PAGING SET @blnBringAllRecords = 1 ELSE BEGIN SET @blnBringAllRecords = 0 SET @strPageSize = CONVERT(VARCHAR(50), @intPageSize) SET @strPageNo = CONVERT(VARCHAR(50), @intPageNo) SET @strSkippedRows = CONVERT(VARCHAR(50), @intPageSize * (@intPageNo - 1)) END--******** NORMALIZE THE FILTER AND SORTING CRITERIA --if they are empty, we will avoid filtering and sorting, respectively, executing more efficient queries IF @strFilter IS NOT NULL AND @strFilter != '' BEGIN SET @strFilterCriteria = ' WHERE ' + @strFilter + ' ' SET @strSimpleFilter = ' AND ' + @strFilter + ' ' END ELSE BEGIN SET @strSimpleFilter = '' SET @strFilterCriteria = '' END IF @strSort IS NOT NULL AND @strSort != '' SET @strSortCriteria = ' ORDER BY ' + @strSort + ' ' ELSE SET @strSortCriteria = '' IF @strGroup IS NOT NULL AND @strGroup != '' SET @strGroupCriteria = ' GROUP BY ' + @strGroup + ' ' ELSE SET @strGroupCriteria = ''--************************** NOW START DOING THE REAL WORK--!NOTE: for potentially improved performance, use sp_executesql instead of EXECIF @blnBringAllRecords = 1 --ignore paging and run a simple selectBEGIN EXEC ( 'SELECT ' + @strFields + ' FROM ' + @strTables + @strFilterCriteria + @strGroupCriteria + @strSortCriteria )END-- WE HAD TO BRING ALL RECORDSELSE --BRING ONLY A PARTICULAR PAGEBEGIN IF @intPageNo = 1 --in this case we can execute a more efficient query, with no subqueries EXEC ( 'SELECT TOP ' + @strPageSize + ' ' + @strFields + ' FROM ' + @strTables + @strFilterCriteria + @strGroupCriteria + @strSortCriteria ) ELSE --execute a structure of subqueries that brings the correct page EXEC ( 'SELECT ' + @strFields + ' FROM ' + @strTables + ' WHERE ' + @strPK + ' IN ' + ' (SELECT TOP ' + @strPageSize + ' ' + @strPK + ' FROM ' + @strTables + ' WHERE ' + @strPK + ' NOT IN ' + ' (SELECT TOP ' + @strSkippedRows + ' ' + @strPK + ' FROM ' + @strTables + @strFilterCriteria + @strGroupCriteria + @strSortCriteria + ') ' + @strSimpleFilter + @strGroupCriteria + @strSortCriteria + ') ' + @strSimpleFilter + @strGroupCriteria + @strSortCriteria )END --WE HAD TO BRING A PARTICULAR PAGE--IF WE NEED TO RETURN THE RECORDCOUNTIF @blnGetRecordCount = 1 IF @strGroupCriteria != '' EXEC ( 'SELECT COUNT(*) AS RECORDCOUNT FROM (SELECT COUNT(*) FROM ' + @strCountTables + @strFilterCriteria + @strGroupCriteria + ') AS tbl (id)' ) ELSE EXEC ( 'SELECT COUNT(*) AS RECORDCOUNT FROM ' + @strCountTables + @strFilterCriteria + @strGroupCriteria )GO-----------------------I used to be a rocket scientist. Now I'm just a space cadet... |
 |
|
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-07-17 : 22:50:47
|
| Nice code! :) I think you may slightly improve performance if you return record count as a value of output parameter instead of returning it as the only row of the query select count(*) from .... In case when you return all rows you will have significantly better performance (i think) if you set @recordCount = @@rowCount.You may also try composite index on from_id, user_id + columns in the select list from that table (in that order). I hope there are just 2,3 columns from that table selected.p.s. Sorry for the star comment, I overlooked your comment in parentheses. |
 |
|
|
pwcphoto
Yak Posting Veteran
69 Posts |
Posted - 2005-07-17 : 23:57:13
|
| The problem seems to be with using the inner join as a way to limit the records returned. In other queries using the same stored procedure, there are plenty of joins, but not used to limit the returned records, only to join names to id's that sort of thing. I tried a LEFT JOIN and moved the filtering to a WHERE statement, but it didn't seem to make much of a difference.The friends table only has 3 columns, an ID Ident, From_ID which is me, and To_ID which is my friends, not real complicated. The image table has an image_ID field which is the PK and user_ID field which is what I join on the Friends_From_ID. When it comes to the indexes, maybe you can tell me if I did it right. There is an index of the Image_ID only which is the PK. Then there are indexes on Image_ID & User_ID, Since User_ID is not going to be unique as there may be multiple images by the same user, to make it a unique key I added the Image_ID to the index. Is there a difference as to which should be first and asscending or descending? Should I create indexes in both directions? Still not real clear on that part.Thanks,Phil-----------------------I used to be a rocket scientist. Now I'm just a space cadet... |
 |
|
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2005-07-18 : 05:01:50
|
| "Is there a difference as to which should be first and asscending or descending? Should I create indexes in both directions"....it does matter.If an index is COLA+COLB...and in a query you supply COLB....then the index is no good (for that query)...It's like searching a phone book for everybody named John, when it's order by surname. |
 |
|
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-07-18 : 05:09:49
|
quote: The friends table only has 3 columns, an ID Ident, From_ID which is me, and To_ID which is my friends, not real complicated. The image table has an image_ID field which is the PK and user_ID field which is what I join on the Friends_From_ID.
Then it's clear, you should crate index on from_id, to_id with columns in that order. Column order in composite index index is important. Imagine index as list sorted using order of the first column values first, being order by second column on top of that when there are multiple entries of first column values. The last element of index entry is pointer to actual data. Consider multiple column index as a string made by concatenation of columns in order they are specified. Also be aware that you can jump on first string that is bigger or equal (in case ascending order) to specified string prefix.quote: When it comes to the indexes, maybe you can tell me if I did it right. There is an index of the Image_ID only which is the PK. Then there are indexes on Image_ID & User_ID, Since User_ID is not going to be unique as there may be multiple images by the same user, to make it a unique key I added the Image_ID to the index. Is there a difference as to which should be first and asscending or descending? Should I create indexes in both directions? Still not real clear on that part.
You don't have to include image_id in your index on user_id if your pk is clustered. In case you have clustered index on the table pointers to data in non-clustered indexes are clustered index values. In case columns making clustered index are not unique sql server automatically adds internal autogenerated column to make it unique (which is not your case). What killed performance is the fact (if i got your explanation right) that image_id is first column in your index. Drop that index and create one on user_id only. |
 |
|
|
|
|
|
|
|