migil
Starting Member
4 Posts
Posted - 2004-09-08 : 08:27:26
This is an odd one. Almost certainly related to query plans, index use, etc.

I have a simple query (part of a large stored procedure) returning a single row from a two-table join. The tables have ~1,000,000 and ~5,000,000 rows respectively. The join is on a single column (an integer, and the indexed primary key in one table), and one further column on each table is matched against a local @variable. The two tables have several possible indexes to use.

This morning the query was giving me a massive performance hit (~8 sec instead of ~20 ms). Performance was restored if I replaced the local variables with specific values (integers all) - !! So I pulled this query out of the procedure and put it three times (query with values/@variables/values) into (a) its own proc and (b) a script. I.e.:

begin
    declare @v integer, @n integer
    select @v = 1, @n = 2
    select * from a, b where a_key = b_key and a_value = 1 and b_number = 2
    select * from a, b where a_key = b_key and a_value = @v and b_number = @n
    select * from a, b where a_key = b_key and a_value = 1 and b_number = 2
end

[typical elapsed times 13 ms, 7500 ms, 0 ms - so not a caching issue]

The results were inconsistent, odd and annoying... On two out of three servers the same performance hit was observed in the procedure using local @variables compared to values. The performance hit was NOT observed in the script. Nor was it observed in either the proc or the script on the third server...

Using Query Analyzer (which I am not as familiar with as I ought to be) it is clear that slightly different query plans are being used, with the introduction of a 'bookmark/filter' (?) in the versions that ran faster. The third server (which is an older version, 8.00.194) uses the same query plan each time.

What is going on?! And more importantly, how do I fix it to run the faster version of the query plan!

Thanks for your help.

mike
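[One way to see the plan difference described above, keeping the same a/b placeholder names, is to capture the estimated plans for the constant and @variable forms side by side. SET SHOWPLAN_TEXT is available in SQL Server 2000; this is a sketch against the placeholder tables, not the real schema:]

```sql
-- Return the estimated operator tree instead of executing the batch.
SET SHOWPLAN_TEXT ON
GO
declare @v integer, @n integer
select @v = 1, @n = 2
-- Constant form vs. @variable form: compare the two plan trees
-- (e.g. index seek + bookmark lookup vs. a scan) in the output.
select * from a, b where a_key = b_key and a_value = 1 and b_number = 2
select * from a, b where a_key = b_key and a_value = @v and b_number = @n
GO
SET SHOWPLAN_TEXT OFF
GO
```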
mr_mist
Grunnio
1870 Posts
Posted - 2004-09-08 : 08:45:58
Smallint would be faster than int if you can afford the fewer digits.

-------
Moo. :)
 |
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts
Posted - 2004-09-08 : 08:47:11
sounds an unusual problem... however... some simple stuff to begin with...

1... use a tool like Red Gate to compare server definitions... are all identical in structure/load?
2... "(which is an older version, 8.00.194)"... version of what?
3... can you supply DDL for the affected tables?... including indices?
4... "select * from a inner join b on a_key = b_key where a_value = 1 and b_number = 2" is a more readable format... and it also allows specifying of index hints.
5... you can/should clear/reset query plans before each phase of the queries/scripts... for benchmarking.
6... can you post expected/actual execution plans?
7... SQL Profiler is another tool for your problem investigation toolkit.
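[For point 5, a minimal benchmarking preamble might look like the following. Both DBCC commands are server-wide and need sysadmin rights, so run them only on a test box; 'my_proc' is a placeholder name:]

```sql
-- Flush cached plans and clean buffer pages so each timing run starts cold.
DBCC FREEPROCCACHE        -- drop all compiled plans from the procedure cache
DBCC DROPCLEANBUFFERS     -- drop clean pages from the buffer pool
GO
-- Per-procedure alternative: mark one proc for recompile on its next execution.
EXEC sp_recompile 'my_proc'
```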
 |
migil
Starting Member
4 Posts
Posted - 2004-09-08 : 09:31:35
Originally posted by AndrewMurphy:

1... are all identical in structure/load?
Similar - same database reloaded into the other two servers - all effectively single user.

2... "(which is an older version, 8.00.194)"... version of what?
Microsoft SQL Server 2000 - 8.00.194 (Intel X86) Aug 6 2000 00:57:48 etc.

3... can you supply DDL for the affected tables?... including indices?
Posted below.

4... and it also allows specifying of index hints.
I use 'table with(index(index_name))' OK in my current format. In this case using the obvious (unique key) indexes has no effect.

5... you can/should clear/reset query plans... for benchmarking.
Is this possible in a stored procedure? How do you do it?

6... can you post expected/actual execution plans?
Need to work out how to get them in a postable form...

7... SQL Profiler is another tool...
Will investigate.

DDLs for tables:

------------------------------------------------
create table sequences
(
    seq_id        ID,
    seq_type      CODE,
    seq_source    CODE,
    seq_obj_id    ID,
    seq_end       tinyint null,
    ::: (other columns)
    seq_masked    char(1) null,
    seq_revcomp   char(1) null,
    seq_tag       varchar(16) null,
    seq_sequence  text
)
go
create unique nonclustered index seq_unique on sequences ( seq_id )
create nonclustered index seq_source on sequences ( seq_obj_id )
create unique nonclustered index seq_object on sequences ( seq_obj_id, seq_end, seq_version, seq_revcomp, seq_masked, seq_tag )
go
------------------------------------------------
create table project_sequences
(
    psq_prj_id         ID,
    psq_version        integer default 1,
    psq_seq_id         ID,
    psq_parent_seq_id  ID null,
    psq_offset         integer null
    ::: (other columns)
)
create unique nonclustered index psq_unique on project_sequences ( psq_prj_id, psq_version, psq_seq_id )
create nonclustered index psq_parent on project_sequences ( psq_prj_id, psq_version, psq_parent_seq_id )
create nonclustered index psq_seq on project_sequences ( psq_seq_id )
go
------------------------------------------------
 |
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts
Posted - 2004-09-08 : 10:09:21
in your query what corresponds to what?

"select * from a, b where a_key = b_key and a_value = 1 and b_number = 2"

a = sequences
b = project_sequences
a_key = seq_id
b_key = psq_seq_id
a_value = ?
b_number = ?

is this a parent-child relationship... based on seq_id -> psq_seq_id?

if so... reversing the columns of index psq_unique to (psq_seq_id, psq_version, psq_prj_id) would give the same uniqueness as before... with less/no need for index psq_seq... at no obvious cost... (as a non-clustered index it won't re-order the data).

ideally you should be including the "where cols" in the indices as well... for top-of-the-range performance... and avoiding the convention "select *".
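[As a sketch of the reordering suggestion above, using the table and index names from the DDL posted earlier - the column order is an illustration of the idea, not a tested recommendation:]

```sql
-- Drop the old unique index and the now-redundant single-column index.
drop index project_sequences.psq_unique
drop index project_sequences.psq_seq
go
-- Reversed column order: enforces the same uniqueness as before, but
-- the leading column is now psq_seq_id, so this one index can also
-- serve lookups by psq_seq_id alone.
create unique nonclustered index psq_unique
    on project_sequences ( psq_seq_id, psq_version, psq_prj_id )
go
```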
 |
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2004-09-08 : 10:11:49
1) please show us what your user-defined datatypes are (CODE, ID)
2) you still didn't show us which columns you are using -- your example does not indicate which of the columns in these tables you are trying to filter on

- Jeff
 |
migil
Starting Member
4 Posts
Posted - 2004-09-08 : 10:33:25
Actual query is:

select seq_id, seq_masked, seq_version, psq_sense, psq_offset
from project_sequences, sequences
where psq_prj_id = @prj_id
  and psq_version = @version
  and psq_seq_id = seq_id
  and seq_obj_id = @obj_id
  and seq_end = @end
  and seq_strand = 1    -- (I didn't show this column before)

I appreciate some folk like a different format.

Remarks so far most helpful, thanks.

Now I've also found something interesting: if I remove the reference to seq_strand (un-indexed) it's really fast... I think the indexes now 'cover' the query. This has (partially) solved the problem, but not really answered the question of why using local @variables produced a (relatively) inefficient query plan in stored procedures... Time to review some indexes anyway.

mike
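[One hedged illustration of the 'covering' observation above: if seq_strand and the selected columns are added to an index that leads on the search columns, the seek can be answered without a bookmark lookup. SQL Server 2000 has no INCLUDE clause, so everything must be a key column; the exact column list below is an assumption based on the query shown, not a tested index:]

```sql
-- Hypothetical covering index for the query above: the where-clause
-- columns (seq_obj_id, seq_end, seq_strand) lead, and the selected
-- columns (seq_id, seq_masked, seq_version) follow so the whole
-- query can be satisfied from the index alone.
create nonclustered index seq_covering
    on sequences ( seq_obj_id, seq_end, seq_strand, seq_id, seq_masked, seq_version )
go
```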
 |
migil
Starting Member
4 Posts
Posted - 2004-09-08 : 10:35:08
1) please show us what your user-defined datatypes are (CODE, ID)

CODE = varchar(4)
ID = integer

2) you still didn't show us which columns you are using --

(above)
 |
kselvia
Aged Yak Warrior
526 Posts
Posted - 2004-09-08 : 10:43:12
This article on "parameter sniffing" from Bart Duncan @ Microsoft may be helpful.

Copied from: http://www.examnotes.net/archive79-2002-7-48335.html

--------------------------------------------------------------

Brian,

A couple things come to mind here. First add SET NOCOUNT ON to the beginning of the sp and see if that helps. But more than likely you're running into this:

The reason for the performance difference stems from a feature called "parameter sniffing". Consider a stored proc defined as follows:

CREATE PROC proc1 @p1 int AS
SELECT * FROM table1 WHERE c1 = @p1
GO

Keep in mind that the server has to compile a complete execution plan for the proc before the proc begins to execute. In 6.5, at compile time SQL didn't know what the value of @p1 was, so it had to make a lot of guesses when compiling a plan. Suppose all of the actual parameter values for "@p1 int" that a user ever passed into this stored proc were unique integers greater than 0, but suppose 40% of the [c1] values in [table1] were, in fact, 0. SQL would use the average density of the column to estimate the number of rows that this predicate would return; this would be an overestimate, and SQL might choose a table scan over an index seek based on the rowcount estimates. A table scan would be the best plan if the parameter value was 0, but unfortunately it happens that users will never or rarely pass @p1=0, so performance of the stored proc for more typical parameters suffers.

In SQL 7.0 or 2000, suppose you executed this proc for the first time (when the sp plan is not in cache) with the command "EXEC proc1 @p1 = 10". Parameter sniffing allows SQL to insert the known value of parameter @p1 into the query at compile time, before a plan for the query is generated. Because SQL knows that the value of @p1 is not 0, it can compile a plan that is tailored to the class of parameters that is actually passed into the proc, so for example it might select an index seek instead of a table scan based on the smaller estimated rowcount -- this is a good thing if most of the time 0 is not the value passed as @p1. Generally speaking, this feature allows more efficient stored proc execution plans, but a key requirement for everything to work as expected is that the parameter values used for compilation be "typical".

In your case, the problem is that you have default NULL values for your parameters ("@Today DATETIME = NULL, ...") that are not typical, because the parameter values are changed inside the stored proc before they are used -- as a result NULL will never actually be used to search the column. If the first execution of this stored proc doesn't pass in an explicit value for the @Today parameter, SQL believes that its value will be NULL. When SQL compiles the plan for this sp it substitutes NULL for each occurrence of @Today that is embedded within a query. Unfortunately, after execution begins the first thing the stored proc does is change @Today to a non-NULL value if it is found to be NULL, but SQL doesn't know about this at compile time. Because NULL is a very atypical parameter value, the plan that SQL generates may not be a good one for the new value of the parameter that is assigned at execution time.

So, the bottom line is that if you assign defaults to your sp parameters and later use those same parameters in a query, the defaults should be "typical" because they will be used during plan generation. If you must use defaults and business logic dictates that they be atypical (as may be the case here if app modifications are not an option), there are two possible solutions if you determine that the substitution of atypical parameter values is causing bad plans:

1. "Disable" parameter sniffing by using local DECLARE'd variables that you SET equal to the parameters inside the stored proc, and use the local variables instead of the offending parameters in the queries. This is the solution that you found yourself. SQL can't use parameter sniffing in this case so it must make some guesses, but in this case the guess based on average column density is better than the plan based on a specific but "wrong" parameter value (NULL).

2. Nest the affected queries somehow so that they run within a different context that will require a distinct execution plan. There are several possibilities here, for example:

a. Put the affected queries in a different "child" stored proc. If you execute that stored proc within this one *after* the parameter @Today has been changed to its final value, parameter sniffing will suddenly become your friend, because the value SQL uses to compile the queries inside the child stored proc is the actual value that will be used in the query.

b. Use sp_executesql to execute the affected queries. The plan won't be generated until the sp_executesql stmt actually runs, which is of course after the parameter values have been changed.

c. Use dynamic SQL ("EXEC (@sql)") to execute the affected queries. An equivalent approach would be to put the query in a child stored proc just like 2.a, but execute it within the parent proc with EXEC WITH RECOMPILE.

Option #1 seems to have worked well for you in this case, although sometimes one of the options in #2 is a preferable choice. Here are some guidelines, although when you're dealing with something as complicated as the query optimizer, experimentation is often the best approach <g>:

- If you have only one "class" (defined as values that have similar density in the table) of actual parameter value that is used within a query (even if there are other classes of data in the base table that are never or rarely searched on), 2.a or 2.b is probably the best option. This is because these options permit the actual parameter values to be used during compilation, which should result in the most efficient query plan for that class of parameter.

- If you have multiple "classes" of parameter value (for example, for the column being searched, half the table data is NULL, the other half are unique integers, and you may do searches on either class), 2.c can be effective. The downside is that a new plan for the query must be compiled on each execution, but the upside is that the plan will always be tailored to the parameter value being used for that particular execution. This is best when there is no single execution plan that provides acceptable execution time for all classes of parameters.

HTH -
Bart

------------
Bart Duncan
Microsoft SQL Server Support
Please reply to the newsgroup only - thanks.
This posting is provided "AS IS" with no warranties, and confers no rights.

--
Ken
I want to die in my sleep like my grandfather, not screaming in terror like his passengers.
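[As a sketch of workarounds #1 and #2.b from the article above, applied to the article's own proc1/table1 example - the proc names and the fix-up value are illustrative, not from the thread:]

```sql
-- Workaround #1: copy the parameter into a local variable, so the plan
-- for the query is based on average column density rather than a
-- sniffed (possibly atypical) parameter value.
CREATE PROC proc1_local @p1 int AS
    DECLARE @local_p1 int
    SET @local_p1 = @p1              -- @local_p1 cannot be sniffed at compile time
    SELECT * FROM table1 WHERE c1 = @local_p1
GO

-- Workaround #2.b: defer compilation with sp_executesql, so the plan is
-- built only after the atypical default has been fixed up.
CREATE PROC proc1_deferred @p1 int = NULL AS
    IF @p1 IS NULL SET @p1 = 10      -- replace the atypical NULL default first
    EXEC sp_executesql
        N'SELECT * FROM table1 WHERE c1 = @x',
        N'@x int',
        @x = @p1                     -- compiled with the corrected value
GO
```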
 |