| Author | Topic |
mrzcrn
Starting Member
15 Posts |
Posted - 2004-10-30 : 14:35:17
|
| Hello everybody, I have a question. On a Windows 2000 computer with 512 MB RAM I have a SQL Server 2000 database. I have a table with many fields (82), all defined as long. This table is filled with samples of 80 measures for each meter. The key is the length. Normally the table holds about 400 records. I need to calculate the MIN, MAX, AVG and STDEV of each measure (80 fields).

If I calculate them with the following query:

SELECT MIN(field1), MIN(field2), ..., MIN(field80) FROM table

I need 1 minute to get the results. The same time for MAX, but only a few seconds for AVG and STDEV. I cannot understand why MIN and MAX (very easy operations) take so long compared with AVG and STDEV (more difficult operations).

But I was really surprised when I split my query into 80 single queries like:

SELECT MIN(field1) FROM table
SELECT MIN(field2) FROM table
:
:
SELECT MIN(field80) FROM table

and the calculation time dropped to only a few seconds. The same result for the MAX operation; no change for AVG and STDEV.

Can someone explain the reason for these results? To me the first solution (with the MINs or MAXes computed in one shot) should be the faster one, but the results say I am wrong: the one-shot solution accesses the table only once, while the split queries must access the table 80 times. I would really like to understand. Thanks in advance.

maurizio carnesecchi |
|
|
elwoos
Master Smack Fu Yak Hacker
2052 Posts |
Posted - 2004-10-30 : 19:11:24
|
| What does the execution plan show?

steve

To alcohol! The cause of - and solution to - all of life's problems |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2004-10-31 : 06:29:38
|
| Is it the execution of the query, or the creation of the execution plan that takes the time? |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-10-31 : 12:34:15
|
| Thanks for your questions ... I will check tomorrow at work ... |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 05:49:43
|
| Hello, I tried to get more information. I ran the following query:

SELECT MIN(MEA1), MIN(MEA2), MIN(MEA3), ..., MIN(MEA79), MIN(MEA80) FROM QUALITY_DATA_LF_TMP

The execution plan shows the following:

COST SELECT = 0%
COST STREAM AGGREGATE = 0%
COST TABLE SCAN = 100%

The total time to get the result is about 4 minutes even though the table holds only 74 rows, and it doesn't change much if the number of rows becomes 500. I saw in Task Manager that the system uses a lot of memory to compute the query: memory usage increases by about 450 MB during query execution. |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 06:04:30
|
| If I reduce the number of MIN aggregates to only 20 fields, the query needs less than 1 second.

With 30 fields it needs 4 seconds
With 40 fields it needs 5 seconds
With 50 fields it needs 10 seconds
With 60 fields it needs 17 seconds
With 70 fields it needs 48 seconds
With 80 fields it needs 6 minutes and 34 seconds !!!!! |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-11-01 : 09:22:04
|
| Can you do

SET SHOWPLAN_TEXT ON

and run the query in QA, then post the plan here - it may be a bit huge though!

Are any of the columns indexed (e.g. the first 40, but not the second 40)?

If you run the SECOND 40 fields on their own, does it take 48 seconds, or 6m34s - 48 seconds? If it's quick, then just UNION the two queries together.

EDIT: It might also help to post the results of

SET STATISTICS IO ON
SET STATISTICS TIME ON

and run the query (then turn those off), and post the results here (should just be a couple of lines indicating the number of logical I/Os etc.)

Kristen |
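A minimal sketch of the diagnostic session Kristen describes, shortened to two columns (table and column names follow the earlier posts):

```sql
-- 1. Show the estimated plan WITHOUT executing the query
SET SHOWPLAN_TEXT ON
GO
SELECT MIN(MEA1), MIN(MEA2) FROM QUALITY_DATA_LF_TMP
GO
SET SHOWPLAN_TEXT OFF
GO

-- 2. Actually run the query, measuring I/O and time
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(MEA1), MIN(MEA2) FROM QUALITY_DATA_LF_TMP
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```

SET STATISTICS TIME reports the parse/compile time separately from the execution time, which is exactly what distinguishes a slow plan compilation from a slow table scan.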
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-11-01 : 09:56:31
|
| Someone has to say it, so I'll be the "bad cop": it might help to normalize your data. I doubt that a table with 80 numeric columns is properly normalized.

http://www.datamodel.org/NormalizationRules.html

- Jeff |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:03:42
|
| Kristen, first, thanks for your help. Below is the information you requested.

This is the result of SET SHOWPLAN_TEXT ON:

|--Stream Aggregate(DEFINE:([Expr1002]=MIN([QUALITY_DATA_LF_TMP].[MEA1]), [Expr1003]=MIN([QUALITY_DATA_LF_TMP].[MEA2]), [Expr1004]=MIN([QUALITY_DATA_LF_TMP].[MEA3]), [Expr1005]=MIN([QUALITY_DATA_LF_TMP].[MEA4]), [Expr1006]=MIN([QUALITY_DATA_LF_TMP].[MEA5
     |--Table Scan(OBJECT:([HISTORICAL].[dbo].[QUALITY_DATA_LF_TMP]))

No indexes are defined on those fields.

If I run the query for the last 40 fields only, it takes just 4 seconds (the same as with the first 40 fields).

This is the result after SET STATISTICS IO ON and SET STATISTICS TIME ON:

SQL Server execution time: CPU time = 0 ms, elapsed time = 17 ms.
SQL Server parse and compile time: CPU time = 51734 ms, elapsed time = 185508 ms.
Table 'QUALITY_DATA_LF_TMP'. Scan count 1, logical reads 40, physical reads 0, read-ahead reads 47.
SQL Server execution time: CPU time = 0 ms, elapsed time = 357 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 36 ms.
SQL Server execution time: CPU time = 0 ms, elapsed time = 0 ms.

maurizio. |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:08:56
|
| If the English of the previous results is not clear, it is because I have an Italian version of SQL Server ... so it is just a problem with my translation ... sorry for that ... |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:10:46
|
| Thanks to you too, Jeff. I don't know a lot about normalization, but I will read the link you suggested.

maurizio. |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 12:12:19
|
| Kristen, if I split the query into 2 queries, one with the first 40 fields and one with the other 40 ... I need only 8 seconds instead of several minutes ...

It looks so strange to me ...

maurizio. |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-11-01 : 12:41:43
|
| I think it is the calculation of the execution plan that is the cause of the problem. I ran some tests with 100 columns / 100 rows. Performance starts to deteriorate after some 50 columns, especially with the MIN and MAX aggregates, but not with the AVG aggregate. A normalized solution of course does not incur that penalty on the calculation of the execution plan.

This takes about 20-30 sec on my laptop; the execution plan for the 100-column table is really slow to compute.

Edit: I did not crosstab the normalized solution, so I don't know what the execution plan and execution time would be then.

set nocount on

create table #tally(nr int)
insert #tally
select 10*p1.number + p2.number + 1
from master..spt_values p1
cross join master..spt_values p2
where p1.type = 'P' and p2.type = 'P'
  and p1.number < 10 and p2.number < 10
order by 1

create table meter(meter int)
create table measure(m varchar(10))
create table meter_measure(meter int, m varchar(10), value float)

create table denormalized(meter int,
  m1 float,m2 float,m3 float,m4 float,m5 float,m6 float,m7 float,m8 float,m9 float,m10 float,m11 float,m12 float,m13 float,m14 float,m15 float,m16 float,m17 float,m18 float,m19 float,m20 float,
  m21 float,m22 float,m23 float,m24 float,m25 float,m26 float,m27 float,m28 float,m29 float,m30 float,m31 float,m32 float,m33 float,m34 float,m35 float,m36 float,m37 float,m38 float,m39 float,m40 float,
  m41 float,m42 float,m43 float,m44 float,m45 float,m46 float,m47 float,m48 float,m49 float,m50 float,m51 float,m52 float,m53 float,m54 float,m55 float,m56 float,m57 float,m58 float,m59 float,m60 float,
  m61 float,m62 float,m63 float,m64 float,m65 float,m66 float,m67 float,m68 float,m69 float,m70 float,m71 float,m72 float,m73 float,m74 float,m75 float,m76 float,m77 float,m78 float,m79 float,m80 float,
  m81 float,m82 float,m83 float,m84 float,m85 float,m86 float,m87 float,m88 float,m89 float,m90 float,m91 float,m92 float,m93 float,m94 float,m95 float,m96 float,m97 float,m98 float,m99 float,m100 float)

insert denormalized
select nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,
  nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr,nr
from #tally

insert measure select 'm' + replace(str(nr,3),' ','0') from #tally
insert meter_measure select nr, m, nr from measure cross join #tally

drop table #tally

print '--------------------------------------------------------------'
print 'DENORMALIZED START OF EXECUTION.....'
print '--------------------------------------------------------------'
go
select min(m1),min(m2),min(m3),min(m4),min(m5),min(m6),min(m7),min(m8),min(m9),min(m10),min(m11),min(m12),min(m13),min(m14),min(m15),min(m16),min(m17),min(m18),min(m19),min(m20),
  min(m21),min(m22),min(m23),min(m24),min(m25),min(m26),min(m27),min(m28),min(m29),min(m30),min(m31),min(m32),min(m33),min(m34),min(m35),min(m36),min(m37),min(m38),min(m39),min(m40),
  min(m41),min(m42),min(m43),min(m44),min(m45),min(m46),min(m47),min(m48),min(m49),min(m50),min(m51),min(m52),min(m53),min(m54),min(m55),min(m56),min(m57),min(m58),min(m59),min(m60),
  min(m61),min(m62),min(m63),min(m64),min(m65),min(m66),min(m67),min(m68),min(m69),min(m70),min(m71),min(m72),min(m73),min(m74),min(m75),min(m76),min(m77),min(m78),min(m79),min(m80),
  min(m81),min(m82),min(m83),min(m84),min(m85),min(m86),min(m87),min(m88),min(m89),min(m90),min(m91),min(m92),min(m93),min(m94),min(m95),min(m96),min(m97),min(m98),min(m99),min(m100)
from denormalized
print '--------------------------------------------------------------'
print 'NORMALIZED START OF EXECUTION.....'
print '--------------------------------------------------------------'
go
select m, min(value) from meter_measure group by m

drop table meter_measure
drop table meter
drop table measure
drop table denormalized

rockmoose |
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-01 : 15:12:09
|
| Rockmoose, I ran your SQL commands on my computer and the complete execution takes only 5 seconds, not several minutes like my query. All the time goes into the denormalized select. What is the difference? I saw you used float instead of int for the 100 fields, but I tried int in your script and the time is still 5 seconds. So I tried to fill your denormalized table with the same data I have in my table (a set of assorted integers), and the result is that the query again takes 4 minutes to execute. So I think the execution time depends on the data too ... |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-11-01 : 15:35:52
|
| I don't have much of a clue what is going on. In my script, when an int is inserted into a float column, an implicit int->float conversion takes place.

One difference between the aggregate functions MIN, MAX vs SUM, AVG, STDEV is that MIN and MAX accept character data, whilst the others only accept numeric data. Could it be that some datatype conversion is taking place, making things slow?!

If you have time to post sample code so that others can run the same tests as you do, then maybe someone can figure out what is happening.

rockmoose |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-11-01 : 16:52:39
|
| mrzcrn, forgetting about the "cause" of the problem and thinking about a "solution" (for now), you could do:

SELECT MIN(field1), MIN(field2), ..., MIN(field40), NULL, ..., NULL FROM table
UNION ALL
SELECT NULL, ..., NULL, MIN(field41), MIN(field42), ..., MIN(field80) FROM table

You'd then have to add some logic that takes the 40 MINs from the first row, and the second lot of 40 from row No. 2 ...

Kristen |
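A concrete sketch of Kristen's UNION ALL workaround, scaled down to 4 columns so it can be written out in full (table and column names are placeholders, not the poster's real schema):

```sql
-- Each branch aggregates half of the columns and pads the rest with NULL,
-- so both SELECTs have the same column list and can be combined.
SELECT MIN(field1) AS f1, MIN(field2) AS f2, NULL AS f3, NULL AS f4
FROM measurements
UNION ALL
SELECT NULL, NULL, MIN(field3), MIN(field4)
FROM measurements
-- Row 1 carries the MINs for field1..field2, row 2 for field3..field4;
-- the caller picks the non-NULL value from each column.
```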
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-02 : 04:42:15
|
| @rockmoose: I will add the sample code to test the function asap. Thanks anyway for your help.

@kristen: Yes Kristen, I already tried that and it works. But the fastest solution is to compute 80 separate MINs and 80 separate MAXes, save each result in a local variable, and then insert all the local variables into a table in one shot. I know it is very ugly SQL code, but it is the fastest. Still, I think it should now be interesting for everybody to understand the reason for this problem.

maurizio |
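A minimal sketch of the local-variable approach maurizio describes, reduced to 3 columns (table, column and variable names are placeholders):

```sql
-- One cheap single-aggregate query per column...
DECLARE @min1 int, @min2 int, @min3 int
DECLARE @max1 int, @max2 int, @max3 int

SELECT @min1 = MIN(field1) FROM measurements
SELECT @min2 = MIN(field2) FROM measurements
SELECT @min3 = MIN(field3) FROM measurements
SELECT @max1 = MAX(field1) FROM measurements
SELECT @max2 = MAX(field2) FROM measurements
SELECT @max3 = MAX(field3) FROM measurements

-- ...then save all the results in one shot.
INSERT INTO stats (min1, min2, min3, max1, max2, max3)
VALUES (@min1, @min2, @min3, @max1, @max2, @max3)
```

Each single-aggregate query compiles a trivial plan, which sidesteps the expensive compilation of one query with 80 aggregates, at the cost of scanning the (small) table once per statement.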
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2004-11-02 : 16:24:56
|
| I wonder; does replacing

MAX(n) with -MIN(-n)

and

MIN(n) with -MAX(-n)

make a difference? |
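Applied to the query from this thread, Arnold's rewrite would look like this (a sketch using the column names posted earlier):

```sql
-- Original form: many MIN aggregates in one statement
SELECT MIN(MEA1), MIN(MEA2) FROM QUALITY_DATA_LF_TMP

-- Rewritten form: each MIN is hidden behind negations, so the optimizer
-- may treat the aggregate more like AVG/SUM when building the plan
SELECT -MAX(-MEA1), -MAX(-MEA2) FROM QUALITY_DATA_LF_TMP
```

Both forms return the same values for numeric columns; the question in the thread is only whether the second compiles faster.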
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-11-03 : 01:20:13
|
| If most of the time is spent building the query plan, surely it will be fine in production - put the code in an SProc and the query plan will be cached ... <g>

Kristen |
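A sketch of what Kristen suggests - wrapping the query in a stored procedure so the expensive plan is compiled once and then reused from the plan cache (procedure name is a placeholder, columns shortened to two):

```sql
CREATE PROCEDURE dbo.GetMeasureMins
AS
BEGIN
    -- The plan for this statement is compiled on the first execution
    -- and cached for subsequent calls.
    SELECT MIN(MEA1) AS min1, MIN(MEA2) AS min2
    FROM QUALITY_DATA_LF_TMP
END
GO

-- First call pays the compile cost; later calls should reuse the cached plan.
EXEC dbo.GetMeasureMins
```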
 |
|
|
mrzcrn
Starting Member
15 Posts |
Posted - 2004-11-04 : 03:19:58
|
| @Arnold: I didn't try that ... do you think it can really change the performance?

@kristen: The code is already in a stored procedure ... but the results are always the same. |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2004-11-04 : 03:41:58
|
quote: I didn't try that ... do you think it can really change the performance?

It won't run the query any quicker (assuming it produces the same execution plan), but it certainly speeded up the creation of the execution plan in rockmoose's example.

My hypothesis is that the query optimizer goes off down a number of blind alleys trying to do something seek-based with MAX(n). And because any calculation of AVG(n), STDEV(n) or SUM(n) has to look at all the values irrespective of indexes, the optimizer doesn't look for clever shortcuts there. So since the optimizer doesn't know how to transform -MIN(-n) into MAX(n), it gets treated similarly to AVG, etc.

As to why it might go off on these wild-goose chases when there aren't any indexes on the columns in question, I have no answers.

But as I say, this is just my hypothesis. |
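The seek-based shortcut Arnold alludes to can be illustrated directly: when a column is indexed, SQL Server can answer MIN by reading a single row from one end of the index instead of scanning the table, much like a TOP 1 over an ordered index. A sketch of the equivalence (the index name is hypothetical, and NULL handling is ignored - MIN skips NULLs while TOP 1 ... ORDER BY would return one):

```sql
-- With this index in place...
CREATE INDEX IX_MEA1 ON QUALITY_DATA_LF_TMP (MEA1)

-- ...both of these can be satisfied by touching a single index row:
SELECT MIN(MEA1) FROM QUALITY_DATA_LF_TMP

SELECT TOP 1 MEA1 FROM QUALITY_DATA_LF_TMP ORDER BY MEA1
```

The hypothesis above is that the optimizer spends its compile time exploring such index strategies for every one of the 80 MIN/MAX aggregates, even though no useful indexes exist here.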
 |
|
|