Author |
Topic |
raxbat
Yak Posting Veteran
52 Posts |
Posted - 2007-12-08 : 10:51:19
|
Hello all! I have a table with the rates for a currency pair for the last 12 hours. I wish to calculate the simple moving average and exponential moving average for some period with a SQL query. Please help me with this query. My table:

Time      Rate
----------------
08:00:06  1.4378
08:00:12  1.4377
...etc...

PS: moving average: http://en.wikipedia.org/wiki/Moving_average

Thank You |
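For reference, the usual definitions from the linked Wikipedia article can be written out as follows (a sketch in LaTeX notation; here x_t is the rate at time t, n is the window length, and alpha is a smoothing factor between 0 and 1):

[code]
% Simple moving average over the last n observations
SMA_t = \frac{1}{n} \sum_{i=0}^{n-1} x_{t-i}

% Exponential moving average: each new rate is blended with the previous EMA
EMA_t = \alpha \, x_t + (1 - \alpha) \, EMA_{t-1}
[/code]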
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-09 : 02:23:48
|
[code]
declare @t table (date datetime not null, rate float not null)

insert @t
select getdate()-10, 1 union all
select getdate()-9, 2 union all
select getdate()-8, 4 union all
select getdate()-7, 4 union all
select getdate()-6, 5 union all
select getdate()-5, 6 union all
select getdate()-4, 6 union all
select getdate()-3, 8 union all
select getdate()-2, 9 union all
select getdate()-1, 10 union all
select getdate()-0, 11 union all
select getdate()+1, 9

select * from @t

-- change the upper limit of the between clause to
-- specify how far back the moving average looks. also if your moving average looks
-- back minutes rather than days, you'll have to change the first arg to datediff.
select t.date, avg(tt.rate) as simpleMovingAvg
from @t AS t
join @t tt on DATEDIFF(day, tt.date, t.date) between 0 and 2
group by t.date
order by t.date

-- add more when clauses to extend how far back the weighted moving average looks.
-- adjust weights however you like, with the constraint that they all sum to 1.
-- note that you should exclude the first 2 values returned by this since they don't have
-- a full set of history - a minor boundary effect.
select t.date,
       sum(case when DATEDIFF(day, tt.date, t.date) = 0 then 0.7*tt.rate
                when DATEDIFF(day, tt.date, t.date) = 1 then 0.2*tt.rate
                when DATEDIFF(day, tt.date, t.date) = 2 then 0.1*tt.rate
           end) as weightedMovingAvg
from @t t
join @t tt on DATEDIFF(day, tt.date, t.date) between 0 and 2
group by t.date
order by t.date
[/code]

elsasoft.org |
|
|
raxbat
Yak Posting Veteran
52 Posts |
Posted - 2007-12-09 : 05:03:04
|
Thanks, it works! But it took a long time to process all my data! I know that it is possible to compute the current MA value knowing the previous value! How can I realize this? Thanks a lot! |
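The incremental update raxbat is referring to can be written as a recurrence, so each step only needs the previous result (standard formulas, same notation as above):

[code]
% Simple moving average: add the newest value, drop the one that leaves the window
SMA_t = SMA_{t-1} + \frac{x_t - x_{t-n}}{n}

% Exponential moving average: only the previous EMA and the new rate are needed
EMA_t = EMA_{t-1} + \alpha \left( x_t - EMA_{t-1} \right)
[/code]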
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-09 : 12:44:40
|
yea, moving averages (like running totals) are a case where the set-based approach is slower than a procedural approach. that is, a cursor may be faster (gasp!) since you would only have to read each row once, and there would be no join. To do the moving average as I have done above, each row is read 3 times, plus you have the extra overhead of a join.

I'm too lazy to give you the cursor-based solution though. that's left as an exercise for you.

elsasoft.org |
|
|
raxbat
Yak Posting Veteran
52 Posts |
Posted - 2007-12-09 : 12:52:38
|
How can I implement this? Thank You |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-09 : 15:50:39
|
I think the wikipedia link you posted explains it well enough... did you try it? elsasoft.org |
|
|
raxbat
Yak Posting Veteran
52 Posts |
Posted - 2007-12-09 : 16:13:22
|
If I could ... :( I'm a simple user of SQL, I'm far away from SQL programming and difficult functions. Help me please |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-09 : 18:10:53
|
well, ok. this code compares a cursor-based method with the one I gave above. cursor is much faster if you have thousands of rows.

On my system, with 5000 rows of sample data, the cursor took 796 ms and the other method took 12016 ms. with 100k rows of sample data, the cursor took 19 sec. the other method I killed after letting it run 5 minutes.

have a look at this article, it explains the problem with queries like this: http://www.sqlservercentral.com/articles/T-SQL/61539

[code]
declare @t table (date datetime not null, rate float not null)
declare @movingAverages table (date datetime not null, simpleMovingAvg float not null, weightedMovingAvg float not null)

--
-- prepare some random sample data
--
declare @i int
set @i = 0
while @i < 5000 --increase this number to make the perf difference even more stark
begin
    insert @t select getdate()+@i, rand()
    set @i = @i + 1
end

--
-- use an ugly cursor to calculate moving avgs
--
declare @date datetime
declare @rate float
declare @rateCurrent float
declare @rateMinusOne float
declare @rateMinusTwo float
declare @then datetime

set @then = getdate()

declare ugly cursor for select date, rate from @t
open ugly
while 1=1
begin
    FETCH NEXT FROM ugly INTO @date, @rate
    if @@FETCH_STATUS <> 0 break

    set @rateMinusTwo = @rateMinusOne
    set @rateMinusOne = @rateCurrent
    set @rateCurrent  = @rate

    if @rateMinusTwo is not null and @rateMinusOne is not null and @rateCurrent is not null
    begin
        insert @movingAverages
        select @date
              ,(@rateCurrent + @rateMinusOne + @rateMinusTwo) / 3.0
              ,0.7 * @rateCurrent + 0.2 * @rateMinusOne + 0.1 * @rateMinusTwo
    end
end
CLOSE ugly
DEALLOCATE ugly

-- see what we came up with
select t.date, t.rate, m.simpleMovingAvg, m.weightedMovingAvg
from @movingAverages m
join @t t on t.date = m.date

-- how long did the cursor take?
select datediff(ms, @then, getdate()) as [ugly cursor time]

--
-- now see how long the set-based way takes
--
set @then = getdate()

select t.date
      ,avg(tt.rate) as simpleMovingAvg
      ,sum(case when DATEDIFF(day, tt.date, t.date) = 0 then 0.7*tt.rate
                when DATEDIFF(day, tt.date, t.date) = 1 then 0.2*tt.rate
                when DATEDIFF(day, tt.date, t.date) = 2 then 0.1*tt.rate
           end) as weightedMovingAvg
from @t AS t
join @t tt on DATEDIFF(day, tt.date, t.date) between 0 and 2
group by t.date
order by t.date

-- how long did set-based take?
select datediff(ms, @then, getdate()) as [set based time]
[/code]

elsasoft.org |
|
|
raxbat
Yak Posting Veteran
52 Posts |
Posted - 2007-12-10 : 00:28:23
|
Thanx a lot! You are the real sql guru :) |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-10 : 09:45:27
|
Jezemine, will you timetest this suggestion with CROSS JOIN?

[code]
select t1.date, t1.rate, avg(t2.rate)
from @t AS t1
cross join @t as t2
where t2.date between t1.date - 2 and t1.date
group by t1.date, t1.rate
[/code]

E 12°55'05.25"N 56°04'39.16" |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-10 : 10:28:44
|
quote: Originally posted by Peso

Jezemine, will you timetest this suggestion with CROSS JOIN?

[code]
select t1.date, t1.rate, avg(t2.rate)
from @t AS t1
cross join @t as t2
where t2.date between t1.date - 2 and t1.date
group by t1.date, t1.rate
[/code]

E 12°55'05.25"N 56°04'39.16"

on my system this was about 10x slower than the cursor on 10k rows. cursor took 3030 ms, cross join took 36376 ms.

EDIT: and that's with the cursor calculating BOTH the simple and weighted moving avg.

elsasoft.org |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-10 : 10:43:57
|
So 36 seconds is still faster than the original inner join (killed after 5 minutes)? E 12°55'05.25"N 56°04'39.16" |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-10 : 11:48:31
|
no, the one I killed had 100k sample rows. I tested yours with 10k sample rows above.

Running the test again on 100k rows: the cursor takes 15 sec; the cross join was still churning after 10 mins, so I killed it.

I know it's hard to take - we all harp on doing things in a set-based way. but sometimes a loop is more efficient!

elsasoft.org |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-10 : 13:46:19
|
I think set-based code has the fastest time. And no need for a staging table either!

[code]
-- Prepare sample data
DECLARE @Sample TABLE (dt DATETIME, Rate FLOAT)

INSERT @Sample
SELECT CURRENT_TIMESTAMP - 10, 1 UNION ALL
SELECT CURRENT_TIMESTAMP - 9, 2 UNION ALL
SELECT CURRENT_TIMESTAMP - 8, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 7, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 6, 5 UNION ALL
SELECT CURRENT_TIMESTAMP - 5, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 4, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 3, 8 UNION ALL
SELECT CURRENT_TIMESTAMP - 2, 9 UNION ALL
SELECT CURRENT_TIMESTAMP - 1, 10 UNION ALL
SELECT CURRENT_TIMESTAMP - 0, 11 UNION ALL
SELECT CURRENT_TIMESTAMP + 1, 9

-- Peso 2
SELECT  DATEADD(DAY, dt, '19000101') AS dt,
        AVG(Rate) AS SimpleMovingAvg,
        SUM(wr) AS WeightedMovingAvg
FROM    (
            SELECT DATEDIFF(DAY, '18991230', dt) AS dt, 0.1 * Rate AS wr, Rate, 0 AS Actualdate FROM @Sample
            UNION ALL
            SELECT DATEDIFF(DAY, '18991231', dt), 0.2 * Rate, Rate, 0 FROM @Sample
            UNION ALL
            SELECT DATEDIFF(DAY, '19000101', dt), 0.7 * Rate, Rate, 1 FROM @Sample
        ) AS k
GROUP BY dt
HAVING  MAX(Actualdate) = 1
ORDER BY dt
[/code]

E 12°55'05.25"N 56°04'39.16" |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-10 : 14:09:30
|
Jezemine, for your 5,000 record test above I get these timings:

CURSOR              6,813 ms
Jezemine set-based 20,577 ms
Peso set-based        127 ms

E 12°55'05.25"N 56°04'39.16" |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-10 : 14:11:57
|
very nice! on 100k rows, your new query finishes in 5 sec, whereas cursor takes 15 sec (EDIT: on my dev box). @raxbat: is it clear now who the real sql guru is? elsasoft.org |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-11 : 08:40:12
|
Further optimized for SQL Server 2005 (only a third of the reads for the SQL Server 2000 approach).

Jezemine, if you have SQL Server 2005, can you timetest this approach for SQL Server 2005 too?

[code]
DECLARE @Sample TABLE (dt DATETIME, Rate FLOAT)

INSERT @Sample
SELECT CURRENT_TIMESTAMP - 10, 1 UNION ALL
SELECT CURRENT_TIMESTAMP - 9, 2 UNION ALL
SELECT CURRENT_TIMESTAMP - 8, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 7, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 6, 5 UNION ALL
SELECT CURRENT_TIMESTAMP - 5, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 4, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 3, 8 UNION ALL
SELECT CURRENT_TIMESTAMP - 2, 9 UNION ALL
SELECT CURRENT_TIMESTAMP - 1, 10 UNION ALL
SELECT CURRENT_TIMESTAMP - 0, 11 UNION ALL
SELECT CURRENT_TIMESTAMP + 1, 9

-- SQL Server 2005
SELECT  DATEADD(DAY, p.ActualDate, '19000101') AS dt,
        AVG(p.Rate) AS SimpleMovingAvg,
        SUM(CASE p.dt
                WHEN 'dt0' THEN p.r0
                WHEN 'dt1' THEN p.r1
                WHEN 'dt2' THEN p.r2
                ELSE 0
            END) AS WeightedMovingAvg
FROM    (
            SELECT  Rate,
                    DATEDIFF(DAY, '19000101', dt) AS dt0, 0.7 * Rate AS r0,
                    DATEDIFF(DAY, '18991231', dt) AS dt1, 0.2 * Rate AS r1,
                    DATEDIFF(DAY, '18991230', dt) AS dt2, 0.1 * Rate AS r2
            FROM    @Sample
        ) AS y
UNPIVOT (
            ActualDate FOR dt IN (y.dt0, y.dt1, y.dt2)
        ) AS p
GROUP BY p.ActualDate
HAVING  MIN(p.dt) = 'dt0'
ORDER BY p.ActualDate

-- SQL Server 2000
SELECT  DATEADD(DAY, k.dt, '19000101') AS dt,
        AVG(k.Rate) AS SimpleMovingAvg,
        SUM(k.wr) AS WeightedMovingAvg
FROM    (
            SELECT DATEDIFF(DAY, '18991230', dt) AS dt, 0.1 * Rate AS wr, Rate, 0 AS Actualdate FROM @Sample
            UNION ALL
            SELECT DATEDIFF(DAY, '18991231', dt), 0.2 * Rate, Rate, 0 FROM @Sample
            UNION ALL
            SELECT DATEDIFF(DAY, '19000101', dt), 0.7 * Rate, Rate, 1 FROM @Sample
        ) AS k
GROUP BY k.dt
HAVING  MAX(k.Actualdate) = 1
ORDER BY k.dt
[/code]

E 12°55'05.25"N 56°04'39.16" |
|
|
madhivanan
Premature Yak Congratulator
22864 Posts |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-11 : 09:00:12
|
Hard to tell. With the moving average, you have a finite number of elements (records) to compare with. With a running total, there can be any number of records to compare with.

E 12°55'05.25"N 56°04'39.16" |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-11 : 09:07:10
|
This method is still useful in SQL Server 2005.

[code]
CREATE TABLE #Sample
(
    dt DATETIME,
    Rate FLOAT,
    running FLOAT
)

INSERT #Sample (dt, Rate)
SELECT CURRENT_TIMESTAMP - 10, 1 UNION ALL
SELECT CURRENT_TIMESTAMP - 4, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 3, 8 UNION ALL
SELECT CURRENT_TIMESTAMP - 7, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 6, 5 UNION ALL
SELECT CURRENT_TIMESTAMP - 5, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 2, 9 UNION ALL
SELECT CURRENT_TIMESTAMP - 1, 10 UNION ALL
SELECT CURRENT_TIMESTAMP - 0, 11 UNION ALL
SELECT CURRENT_TIMESTAMP - 9, 2 UNION ALL
SELECT CURRENT_TIMESTAMP - 8, 4 UNION ALL
SELECT CURRENT_TIMESTAMP + 1, 9

DECLARE @rt FLOAT
SET @rt = 0

CREATE CLUSTERED INDEX IX_Peso ON #Sample (dt)

UPDATE #Sample
SET @rt = running = @rt + Rate

DROP INDEX #Sample.IX_Peso

SELECT dt, Rate, running
FROM #Sample
ORDER BY dt

DROP TABLE #Sample
[/code]

E 12°55'05.25"N 56°04'39.16" |
|
|