| Author | Topic |
mrzcrn
Starting Member
15 Posts |
Posted - 2004-10-30 : 14:35:17
|
| Hello everybody, I have a question. On a Windows 2000 computer with 512 MB RAM I have a SQL Server 2000 database. I have a table with many fields (82), all defined as long. This table is filled with samples of 80 measures for each meter. The key is the length. Normally the table holds about 400 records. I need to calculate the MIN, MAX, AVG and STDEV of each measure (80 fields).

If I calculate them with the following query:

SELECT MIN(field1), MIN(field2), ..., MIN(field80) FROM table

I need 1 minute to get the results. The same time for MAX, but only a few seconds for AVG and STDEV. I cannot understand why MIN and MAX (very easy operations) take so long compared with AVG and STDEV (more difficult operations).

But I was really surprised when I split my query into 80 single queries like:

SELECT MIN(field1) FROM table
SELECT MIN(field2) FROM table
:
:
SELECT MIN(field80) FROM table

and the calculation time dropped to only a few seconds. The same result for the MAX operation; no change for AVG and STDEV.

Can someone explain the reason for these results? To me the first solution (with the MINs or MAXes computed in one shot) should be the faster one, but the results say I am wrong: the one-shot solution accesses the table only once, while the split queries must access the table 80 times. I would really like to understand. Thanks in advance.

maurizio carnesecchi |
|
|
elwoos
Master Smack Fu Yak Hacker
2052 Posts |
Posted - 2004-10-30 : 19:11:24
|
| What does the execution plan show?

steve

To alcohol! The cause of - and solution to - all of life's problems |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2004-10-31 : 06:29:38
|
| Is it the execution of the query, or the creation of the execution plan that takes the time? |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-10-31 : 12:34:15
|
| Thanks for your questions ... I will check tomorrow at work ... |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 05:49:43
|
| Hello, I tried to get more information. I ran the following query:

SELECT MIN(MEA1), MIN(MEA2), MIN(MEA3), ..., MIN(MEA79), MIN(MEA80) FROM QUALITY_DATA_LF_TMP

The execution plan shows the following:

COST SELECT = 0%
COST STREAM AGGREGATE = 0%
COST TABLE SCAN = 100%

The total time to get the result is about 4 minutes even though the table holds only 74 rows, and it doesn't change much if the number of rows becomes 500. I saw in Task Manager that the system uses a lot of memory to compute the query: memory usage increases by about 450 MB during query execution. |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 06:04:30
|
| If I reduce the number of MIN aggregates to only 20 fields, the query needs less than 1 second.

With 30 fields it needs 4 seconds
With 40 fields it needs 5 seconds
With 50 fields it needs 10 seconds
With 60 fields it needs 17 seconds
With 70 fields it needs 48 seconds
With 80 fields it needs 6 minutes and 34 seconds !!!!! |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-11-01 : 09:22:04
|
| Can you do

SET SHOWPLAN_TEXT ON

and run the query in QA, then post the plan here - it may be a bit huge though!

Are any of the columns indexed (e.g. the first 40, but not the second 40)?

If you run the SECOND 40 fields on their own, does it take 48 seconds, or 6m34s - 48 seconds? If it's quick, then just UNION the two queries together.

EDIT: It might also help to post the results of

SET STATISTICS IO ON
SET STATISTICS TIME ON

and run the query (then turn those off), and post the results here (should just be a couple of lines indicating the number of logical I/Os etc.)

Kristen |
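A minimal sketch of the diagnostic session Kristen describes, shortened to two columns (table and column names follow the earlier posts):

```sql
-- 1. Show the estimated plan WITHOUT executing the query
SET SHOWPLAN_TEXT ON
GO
SELECT MIN(MEA1), MIN(MEA2) FROM QUALITY_DATA_LF_TMP
GO
SET SHOWPLAN_TEXT OFF
GO

-- 2. Actually run the query, measuring I/O and time
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(MEA1), MIN(MEA2) FROM QUALITY_DATA_LF_TMP
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```

SET STATISTICS TIME reports the parse/compile time separately from the execution time, which is exactly what distinguishes a slow plan compilation from a slow table scan.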
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-11-01 : 09:56:31
|
| Someone has to say it, so I'll be the "bad cop": it might help to normalize your data. I doubt that a table with 80 numeric columns is properly normalized.

http://www.datamodel.org/NormalizationRules.html

- Jeff |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:03:42
|
| Kristen, first, thanks for your help. Below is the information you requested.

This is the result of SET SHOWPLAN_TEXT ON:

|--Stream Aggregate(DEFINE:([Expr1002]=MIN([QUALITY_DATA_LF_TMP].[MEA1]), [Expr1003]=MIN([QUALITY_DATA_LF_TMP].[MEA2]), [Expr1004]=MIN([QUALITY_DATA_LF_TMP].[MEA3]), [Expr1005]=MIN([QUALITY_DATA_LF_TMP].[MEA4]), [Expr1006]=MIN([QUALITY_DATA_LF_TMP].[MEA5
     |--Table Scan(OBJECT:([HISTORICAL].[dbo].[QUALITY_DATA_LF_TMP]))

No indexes are defined on those fields.

If I run the query for the last 40 fields only, it takes just 4 seconds (the same as with the first 40 fields).

This is the result after SET STATISTICS IO ON and SET STATISTICS TIME ON:

SQL Server execution time: CPU time = 0 ms, elapsed time = 17 ms.
SQL Server parse and compile time: CPU time = 51734 ms, elapsed time = 185508 ms.
Table 'QUALITY_DATA_LF_TMP'. Scan count 1, logical reads 40, physical reads 0, read-ahead reads 47.
SQL Server execution time: CPU time = 0 ms, elapsed time = 357 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 36 ms.
SQL Server execution time: CPU time = 0 ms, elapsed time = 0 ms.

maurizio. |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:08:56
|
| If the English of the previous results is not clear, it is because I have an Italian version of SQL Server ... so it is just a problem with my translation ... sorry for that ... |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:10:46
|
| Thanks to you too, Jeff. I don't know a lot about normalization, but I will read the link you suggested.

maurizio. |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:12:19
|
| Kristen, if I split the query into 2 queries, one with the first 40 fields and one with the other 40 ... I need only 8 seconds instead of several minutes ...

It looks so strange to me ...

maurizio. |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-11-01 : 12:41:43
|
| I think it is the calculation of the execution plan that is the cause of the problem. I ran some tests with 100 columns / 100 rows. Performance starts to deteriorate after some 50 columns, especially with the MIN and MAX aggregates, but not with the AVG aggregate. A normalized solution of course does not incur that penalty on the calculation of the execution plan.

This takes about 20-30 sec on my laptop; the execution plan for the 100-column table is really slow to compute.

Edit: I did not crosstab the normalized solution, so I don't know what the execution plan and execution time would be then.

set nocount on

create table #tally(nr int)
insert #tally
select 10*p1.number + p2.number + 1
from master..spt_values p1
cross join master..spt_values p2
where p1.type = 'P' and p2.type = 'P'
  and p1.number < 10 and p2.number < 10
order by 1

create table meter(meter int)
create table measure(m varchar(10))
create table meter_measure(meter int, m varchar(10), value float)

create table denormalized(meter int,
  m1 float,m2 float,m3 float,m4 float,m5 float,m6 float,m7 float,m8 float,m9 float,m10 float,m11 float,m12 float,m13 float,m14 float,m15 float,m16 float,m17 float,m18 float,m19 float,m20 float,
  m21 float,m22 float,m23 float,m24 float,m25 float,m26 float,m27 float,m28 float,m29 float,m30 float,m31 float,m32 float,m33 float,m34 float,m35 float,m36 float,m37 float,m38 float,m39 float,m40 float,
  m41 float,m42 float,m43 float,m44 float,m45 float,m46 float,m47 float,m48 float,m49 float,m50 float,m51 float,m52 float,m53 float,m54 float,m55 float,m56 float,m57 float,m58 float,m59 float,m60 float,
  m61 float,m62 float,m63 float,m64 float,m65 float,m66 float,m67 float,m68 float,m69 float,m70 float,m71 float,m72 float,m73 float,m74 float,m75 float,m76 float,m77 float,m78 float,m79 float,m80 float,
  m81 float,m82 float,m83 float,m84 float,m85 float,m86 float,m87 float,m88 float,m89 float,m90 float,m91 float,m92 float,m93 float,m94 float,m95 float,m96 float,m97 float,m98 float,m99 float,m100 float)

insert denormalized
select nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr
from #tally

insert measure select 'm' + replace(str(nr,3),' ','0') from #tally
insert meter_measure select nr, m, nr from measure cross join #tally

drop table #tally

print '--------------------------------------------------------------'
print 'DENORMALIZED START OF EXECUTION.....'
print '--------------------------------------------------------------'
go
select min(m1),min(m2),min(m3),min(m4),min(m5),min(m6),min(m7),min(m8),min(m9),min(m10),min(m11),min(m12),min(m13),min(m14),min(m15),min(m16),min(m17),min(m18),min(m19),min(m20),
  min(m21),min(m22),min(m23),min(m24),min(m25),min(m26),min(m27),min(m28),min(m29),min(m30),min(m31),min(m32),min(m33),min(m34),min(m35),min(m36),min(m37),min(m38),min(m39),min(m40),
  min(m41),min(m42),min(m43),min(m44),min(m45),min(m46),min(m47),min(m48),min(m49),min(m50),min(m51),min(m52),min(m53),min(m54),min(m55),min(m56),min(m57),min(m58),min(m59),min(m60),
  min(m61),min(m62),min(m63),min(m64),min(m65),min(m66),min(m67),min(m68),min(m69),min(m70),min(m71),min(m72),min(m73),min(m74),min(m75),min(m76),min(m77),min(m78),min(m79),min(m80),
  min(m81),min(m82),min(m83),min(m84),min(m85),min(m86),min(m87),min(m88),min(m89),min(m90),min(m91),min(m92),min(m93),min(m94),min(m95),min(m96),min(m97),min(m98),min(m99),min(m100)
from denormalized
print '--------------------------------------------------------------'
print 'NORMALIZED START OF EXECUTION.....'
print '--------------------------------------------------------------'
go
select m, min(value) from meter_measure group by m

drop table meter_measure
drop table meter
drop table measure
drop table denormalized

rockmoose |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 15:12:09
|
| Rockmoose, I ran your SQL commands on my computer and the complete execution takes only 5 seconds, not several minutes like my query. All the time goes into the denormalized select. What is the difference? I saw you used float instead of int for the 100 fields, but I tried int in your script and the time is still 5 seconds. So I tried to fill your denormalized table with the same data I have in my table (a set of assorted integers), and the result is that the query again takes 4 minutes to execute. So I think the execution time depends on the data too ... |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-11-01 : 15:35:52
|
| I don't have much of a clue what is going on. In my script, when an int is inserted into a float column, an implicit int->float conversion takes place.

One difference between the aggregate functions MIN, MAX vs SUM, AVG, STDEV is that MIN and MAX accept character data, whilst the others only accept numeric data. Could it be that some datatype conversion is taking place, making things slow?!

If you have time to post sample code so that others can run the same tests as you do, then maybe someone can figure out what is happening.

rockmoose |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-11-01 : 16:52:39
|
| mrzcrn, forgetting about the "cause" of the problem and thinking about a "solution" (for now), you could do:

SELECT MIN(field1), MIN(field2), ..., MIN(field40), NULL, ..., NULL FROM table
UNION ALL
SELECT NULL, ..., NULL, MIN(field41), MIN(field42), ..., MIN(field80) FROM table

You'd then have to add some logic that takes the 40 MINs from the first row, and the second lot of 40 from row No. 2 ...

Kristen |
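A concrete sketch of Kristen's UNION ALL workaround, scaled down to 4 columns so it can be written out in full (table and column names are placeholders, not the poster's real schema):

```sql
-- Each branch aggregates half of the columns and pads the rest with NULL,
-- so both SELECTs have the same column list and can be combined.
SELECT MIN(field1) AS f1, MIN(field2) AS f2, NULL AS f3, NULL AS f4
FROM measurements
UNION ALL
SELECT NULL, NULL, MIN(field3), MIN(field4)
FROM measurements
-- Row 1 carries the MINs for field1..field2, row 2 for field3..field4;
-- the caller picks the non-NULL value from each column.
```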
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-02 : 04:42:15
|
| @rockmoose: I will add the sample code to test the function asap. Thanks anyway for your help.

@kristen: Yes Kristen, I already tried that and it works. But the fastest solution is to compute 80 separate MINs and 80 separate MAXes, save each result in a local variable, and then insert all the local variables into a table in one shot. I know it is very ugly SQL code, but it is the fastest. Still, I think it should now be interesting for everybody to understand the reason for this problem.

maurizio |
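A minimal sketch of the local-variable approach maurizio describes, reduced to 3 columns (table, column and variable names are placeholders):

```sql
-- One cheap single-aggregate query per column...
DECLARE @min1 int, @min2 int, @min3 int
DECLARE @max1 int, @max2 int, @max3 int

SELECT @min1 = MIN(field1) FROM measurements
SELECT @min2 = MIN(field2) FROM measurements
SELECT @min3 = MIN(field3) FROM measurements
SELECT @max1 = MAX(field1) FROM measurements
SELECT @max2 = MAX(field2) FROM measurements
SELECT @max3 = MAX(field3) FROM measurements

-- ...then save all the results in one shot.
INSERT INTO stats (min1, min2, min3, max1, max2, max3)
VALUES (@min1, @min2, @min3, @max1, @max2, @max3)
```

Each single-aggregate query compiles a trivial plan, which sidesteps the expensive compilation of one query with 80 aggregates, at the cost of scanning the (small) table once per statement.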
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2004-11-02 : 16:24:56
|
| I wonder; does replacing

MAX(n) with -MIN(-n)

and

MIN(n) with -MAX(-n)

make a difference? |
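Applied to the query from this thread, Arnold's rewrite would look like this (a sketch using the column names posted earlier):

```sql
-- Original form: many MIN aggregates in one statement
SELECT MIN(MEA1), MIN(MEA2) FROM QUALITY_DATA_LF_TMP

-- Rewritten form: each MIN is hidden behind negations, so the optimizer
-- may treat the aggregate more like AVG/SUM when building the plan
SELECT -MAX(-MEA1), -MAX(-MEA2) FROM QUALITY_DATA_LF_TMP
```

Both forms return the same values for numeric columns; the question in the thread is only whether the second compiles faster.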
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-11-03 : 01:20:13
|
| If most of the time is spent building the query plan, surely it will be fine in production - put the code in an SProc and the query plan will be cached ... <g>

Kristen |
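A sketch of what Kristen suggests - wrapping the query in a stored procedure so the expensive plan is compiled once and then reused from the plan cache (procedure name is a placeholder, columns shortened to two):

```sql
CREATE PROCEDURE dbo.GetMeasureMins
AS
BEGIN
    -- The plan for this statement is compiled on the first execution
    -- and cached for subsequent calls.
    SELECT MIN(MEA1) AS min1, MIN(MEA2) AS min2
    FROM QUALITY_DATA_LF_TMP
END
GO

-- First call pays the compile cost; later calls should reuse the cached plan.
EXEC dbo.GetMeasureMins
```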
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-04 : 03:19:58
|
| @Arnold: I didn't try that ... do you think it can really change the performance?

@kristen: The code is already in a stored procedure ... but the results are always the same. |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2004-11-04 : 03:41:58
|
quote: I didn't try that ... do you think it can really change the performance?

It won't run the query any quicker (assuming it produces the same execution plan), but it certainly speeded up the creation of the execution plan in rockmoose's example.

My hypothesis is that the query optimizer goes off down a number of blind alleys trying to do something seek-based with MAX(n). And because any calculation of AVG(n), STDEV(n) or SUM(n) has to look at all the values irrespective of indexes, the optimizer doesn't look for clever shortcuts there. So since the optimizer doesn't know how to transform -MIN(-n) into MAX(n), it gets treated similarly to AVG, etc.

As to why it might go off on these wild-goose chases when there aren't any indexes on the columns in question, I have no answers.

But as I say, this is just my hypothesis. |
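The seek-based shortcut Arnold alludes to can be illustrated directly: when a column is indexed, SQL Server can answer MIN by reading a single row from one end of the index instead of scanning the table, much like a TOP 1 over an ordered index. A sketch of the equivalence (the index name is hypothetical, and NULL handling is ignored - MIN skips NULLs while TOP 1 ... ORDER BY would return one):

```sql
-- With this index in place...
CREATE INDEX IX_MEA1 ON QUALITY_DATA_LF_TMP (MEA1)

-- ...both of these can be satisfied by touching a single index row:
SELECT MIN(MEA1) FROM QUALITY_DATA_LF_TMP

SELECT TOP 1 MEA1 FROM QUALITY_DATA_LF_TMP ORDER BY MEA1
```

The hypothesis above is that the optimizer spends its compile time exploring such index strategies for every one of the 80 MIN/MAX aggregates, even though no useful indexes exist here.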
 |
|
|