Need new type of hint -- Rowcount

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

Need new type of hint -- Rowcount

Author

Topic

GreatInca
Posting Yak Master

102 Posts

Posted - 2001-12-17 : 16:48:33

The optimizer is often way off on the rowcount estimate (its usually good directly out of an index, if the range isn't too small -- such as 5 minutes span on a datetime field of a table with 2 years worth of data -- but out of joins and filters, it can get way off). They should have a Estimated row count ovveride hint. Currently on a query of mine, the optimizer is using an unspooled nested loop join on a subquery with 3 joins and a bookmark lookup, thinking there is only 1 row on the left side of the join but there is really 350. The first table read int he query returned 1 row, but 350 came out of the first join and it expected 1, leading to a bad query plan after that.

Alos the Query optimizer sometimes ignores join hints in stored procedures (more common if join has multiple constraints) and on this query, it makes a more efficient execution plan when the query is executed adhoc in query analyzer than when run as a stored procedure. The query runs in 2 seconds in ad-hoc with hash join hints for the subquery modules, 3 seconds adhoc with merge join hints, 4 seconds adhoc with no hints (it uses 1 hash join and 2 loop joins with worktables), and in 17 seconds as a stored procedure (3 loop joins, 2 with hash match caches, 1 with no spoolin at all).

robvolk
Most Valuable Yak

15732 Posts

Posted - 2001-12-17 : 17:46:09

It seems you've tried all of the available hints. Even if a Rowcount hint was available, I can't see what good it would do; regardless of the # of rows returned, it still has to retrieve them. If your SELECT list is only pulling column(s) that are part of an index, it can simply use the index and not hit the table at all. A rowcount hint wouldn't help (in fact it would slow it down).

Are you sure that the index statistics are up to date? If so, have you done a DBCC REINDEX recently, or a DBCC CHECKTABLE or CHECKDB? BTW, did you DROPCLEANBUFFERS before you ran each type of hint? If you don't clear the buffers you might get better performance than is warranted, because the data might still be in cache.

GreatInca
Posting Yak Master

102 Posts

Posted - 2001-12-18 : 12:49:22

Its pulling rows from a subquery with 4 joins (has abookmark lookup in it's plan and all the compenent joins were loop/index seek joins). The Rowcounts went off on the 2nd join - when 1 row cam from the initial index seek and then 320 rows came out of the join (there were 320 survey submissions for the companyid).

Whats happening is that when run as a stored procedure it does an uncached / no work table nested loop join to this derived table, and therefore reexcuting the subquery (gets 'your answers', 4 loop/index joins, 1 bookmark lookup) 320 times. It ignores my join hints. There are also 2 other derived tables, one with 3 joins and the other 4 (survey aggragates, choice summary). It uses loop join / hash-match cache on these to derived tables in stored rocedure form and ignores join hints.

When run adhoc it is compiled better and spools the results of the subquery (running it only once) into a worktable and using a loop join with the worktable. It does not ingnore join hints when run adhoc and runs much faster when using hash-match join hint to the derived table.

I'm thinking if I force it to think there are 500 rows on the left side of the joins to the derived tables, it would use a hash match join all by itself.

The statistics can't be too out of date becuase auto-update and auto-create are on. Although there are 2 temp tables being accessed by all 3 subquery modules (derived tables). They probably don't have any statistics. One is for a list of company IDs from the application's multiselect box to restrict the aggragates and choice summaries, and the other is a list of PositionIDs from the application's multiselect box to resttrict all 3 derived tables (aggragates, your answers, choice summary). They are created in the query prior to executing the stored procedure and dropped in the query batch after the stored procedure finishes.

DBCC ReIndex is run every sunday morning. Statistics are manually updated every night. Didn't do anything with clearing the buffer cache. CheckDB checks out ok.

I think its the optimizer -- it seems to work differently with stored procedures than adhoc queries -- becuase of parameters. And there might be bugs becuase its ignoring my hints in stroed procedure and which it honers them when run as an adhoc query. It might be thrown off by the 3 or 4 predicate joins. If it honored my join hints or allowed a rowcount override then my query, or better yet, compiled it right in the first place, it would be running in 2 seconds instead of 17. (2 adhoc with hash join hints, 4 adhoc with no hints, 17 as stored procedure).

robvolk
Most Valuable Yak

15732 Posts

Posted - 2001-12-18 : 15:42:53

I've had the same thing happen to me!

On more than one occasion I've had a stored procedure that runs with a bad plan, even when I do a recompile on it. If I drop the procedure and recreate it, then it runs as fast as a Query Analyzer run of the same code. It seems that sometimes the cached plan never gets cleared unless you drop and recreate the SP. See if that helps.

On a side note, can any of the subqueries be re-written as JOINs? You might be able to use more hints then.

Edited by - robvolk on 12/18/2001 15:43:46

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2001-12-18 : 16:52:11

The plan that is generated for a stored procedure is based on the parameters passed to it on its first execution.

If you have parameters with NULL defaults and use NULL for your first execution then the plan used will be based on a NULL. Obviously this will not be the most efiicent plan.

When first executing the proc try to use realistic values...

DavidM

"Why are you crying?"

"Because I love a Microsoft product"

GreatInca
Posting Yak Master

102 Posts

Posted - 2001-12-18 : 16:59:52

NULLs are extinct in my databases

Recompiling does not fix this procedure - same plan every time. Only running it directly in QA produces better plan and alos the only way to get it not to ignore join hints.

It recieves 2 single value parameters - Login Company ID and Survey ID. There are 2 set parameters being passed through temp tables -- Company IDs for the aggragates and choice summmaries, and a list of positions to report on.

The Query structure is

Select
Fields with lots of Case Statements
FROM
SELECT
Fields with Outer Join NULL Filters
FROM
(SubQuery to get Poisitions, Questions, and Choices; uses #Positions)
LEFT OUTER JOIN
(Subquery to get aggragates; Uses #Positions and #Companies, @SurveyID)
LEFT OUTER JOIN
(Subquery to get 'your answers'; Uses #Positions, @SurveyID, @CompanyID)
LEFT OUTER JOIN
(Subquery to get choice summaries (has some aggrates, different group by; Uses #Positions, #Companies, and @SurveyID))

The 'Your answers' derived table is the real bad one - the optimizer uses no table spool or not even a hash match cash. But it will use a table spool when run the QA instead of a stored procedure.

Edited by - GreatInca on 12/18/2001 17:05:17

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources