 Moving average in TSQL


raxbat
Yak Posting Veteran

52 Posts

Posted - 2007-12-08 : 10:51:19
Hello all!
I have a table with the rates for a currency pair for the last 12 hours. I wish to calculate, with a SQL query, the simple moving average and the exponential moving average for some period. Please help me with this query.


My table:
Time Rate
----------
08:00:06 1.4378
08:00:12 1.4377
...
etc...

PS: moving average: http://en.wikipedia.org/wiki/Moving_average
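For reference, the two averages in question, as defined at the linked page (N is the window length, a the smoothing factor):
[code]
-- Simple moving average over the last N rates:
--   SMA(t) = ( rate(t) + rate(t-1) + ... + rate(t-N+1) ) / N
-- Exponential moving average with smoothing factor a (0 < a <= 1),
-- commonly seeded with EMA(1) = rate(1):
--   EMA(t) = a * rate(t) + (1 - a) * EMA(t-1)
[/code]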

Thank You

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-09 : 02:23:48
[code]
declare @t table (date datetime not null, rate float not null)

insert @t
select getdate()-10, 1 union all
select getdate()-9, 2 union all
select getdate()-8, 4 union all
select getdate()-7, 4 union all
select getdate()-6, 5 union all
select getdate()-5, 6 union all
select getdate()-4, 6 union all
select getdate()-3, 8 union all
select getdate()-2, 9 union all
select getdate()-1, 10 union all
select getdate()-0, 11 union all
select getdate()+1, 9

select * from @t

-- Change the upper limit of the BETWEEN clause to specify how far back the
-- moving average looks. Also, if your moving average looks back minutes rather
-- than days, you'll have to change the first argument to DATEDIFF.
select   t.date,
         avg(tt.rate) as simpleMovingAvg
from     @t as t
join     @t as tt on datediff(day, tt.date, t.date) between 0 and 2
group by t.date
order by t.date

-- Add more WHEN clauses to extend how far back the weighted moving average looks.
-- Adjust the weights however you like, with the constraint that they all sum to 1.
-- Note that you should exclude the first 2 values returned by this, since they
-- don't have a full set of history - a minor boundary effect.
select   t.date,
         sum(case when datediff(day, tt.date, t.date) = 0 then 0.7 * tt.rate
                  when datediff(day, tt.date, t.date) = 1 then 0.2 * tt.rate
                  when datediff(day, tt.date, t.date) = 2 then 0.1 * tt.rate end) as weightedMovingAvg
from     @t as t
join     @t as tt on datediff(day, tt.date, t.date) between 0 and 2
group by t.date
order by t.date
[/code]
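The question also asked for an exponential moving average, which neither query above computes. Here is a minimal cursor-style sketch of that, assuming a smoothing factor @alpha of 0.5 and seeding the average with the first rate (both are assumptions; tune them as needed):
[code]
-- sample data just for the sketch
declare @s table (date datetime not null, rate float not null)

insert @s
select getdate()-2, 1.4378 union all
select getdate()-1, 1.4380 union all
select getdate()-0, 1.4375

declare @ema table (date datetime not null, expMovingAvg float not null)

declare @alpha float, @prev float, @date datetime, @rate float
set @alpha = 0.5   -- assumed smoothing factor; larger = more weight on recent rates

declare ema_cur cursor for select date, rate from @s order by date
open ema_cur
fetch next from ema_cur into @date, @rate
while @@fetch_status = 0
begin
    -- EMA(t) = @alpha * rate(t) + (1 - @alpha) * EMA(t-1); the first row seeds the average
    set @prev = case when @prev is null then @rate
                     else @alpha * @rate + (1.0 - @alpha) * @prev end
    insert @ema select @date, @prev
    fetch next from ema_cur into @date, @rate
end
close ema_cur
deallocate ema_cur

select * from @ema order by date
[/code]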


elsasoft.org

raxbat
Yak Posting Veteran

52 Posts

Posted - 2007-12-09 : 05:03:04
Thanks, it works! But it took a long time to process all my data.
I know that it is possible to compute the current MA value from the previous value. How can I do this?

Thanks a lot!

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-09 : 12:44:40
Yeah, moving averages (like running totals) are a case where the set-based approach is slower than a procedural approach.

That is, a cursor may be faster (gasp!) since you would only have to read each row once, and there would be no join. To compute the moving average as I have done above, each row is read 3 times, plus you have the extra overhead of a join. I'm too lazy to give you the cursor-based solution though; that's left as an exercise for you.
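(One way to read the "previous value" idea above, for an N-period simple moving average:)
[code]
-- SMA(t) = SMA(t-1) + ( rate(t) - rate(t-N) ) / N
-- i.e. add the newest rate entering the window and drop the oldest one leaving it,
-- so each new value costs a constant amount of work instead of re-averaging the window.
[/code]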




elsasoft.org

raxbat
Yak Posting Veteran

52 Posts

Posted - 2007-12-09 : 12:52:38
How can I implement this?
Thank You

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-09 : 15:50:39
I think the wikipedia link you posted explains it well enough... did you try it?


elsasoft.org

raxbat
Yak Posting Veteran

52 Posts

Posted - 2007-12-09 : 16:13:22
If only I could ... :(
I'm just an ordinary SQL user; I'm a long way from SQL programming and complicated functions.

Help me, please.

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-09 : 18:10:53
Well, OK.

This code compares a cursor-based method with the one I gave above. The cursor is much faster if you have thousands of rows.

On my system, with 5000 rows of sample data, the cursor took 796 ms and the other method took 12016 ms.

With 100k rows of sample data, the cursor took 19 seconds; the other method I killed after letting it run for 5 minutes.

Have a look at this article; it explains the problem with queries like this: http://www.sqlservercentral.com/articles/T-SQL/61539


declare @t table (date datetime not null, rate float not null)
declare @movingAverages table (date datetime not null,
simpleMovingAvg float not null,
weightedMovingAvg float not null)

--
-- prepare some random sample data
--
declare @i int
set @i=0
while @i < 5000 -- increase this number to make the perf difference even more stark
begin
    insert @t select getdate() + @i, rand()
    set @i = @i + 1
end
--
-- use an ugly cursor to calculate moving avgs
--
declare @date datetime
declare @rate float
declare @rateCurrent float
declare @rateMinusOne float
declare @rateMinusTwo float
declare @then datetime
set @then=getdate()

-- order by date so the window is built in time order
declare ugly cursor for select date, rate from @t order by date
open ugly
while 1=1
begin
    fetch next from ugly into @date, @rate
    if @@fetch_status <> 0 break

    set @rateMinusTwo = @rateMinusOne
    set @rateMinusOne = @rateCurrent
    set @rateCurrent  = @rate

    if @rateMinusTwo is not null and @rateMinusOne is not null and @rateCurrent is not null
    begin
        insert @movingAverages
        select @date
              ,(@rateCurrent + @rateMinusOne + @rateMinusTwo) / 3.0
              ,0.7 * @rateCurrent + 0.2 * @rateMinusOne + 0.1 * @rateMinusTwo
    end
end
close ugly
deallocate ugly

-- see what we came up with
select t.date, t.rate, m.simpleMovingAvg, m.weightedMovingAvg
from @movingAverages m
join @t t on t.date=m.date

-- how long did the cursor take?
select datediff(ms, @then, getdate()) as [ugly cursor time]

--
-- now see how long the set-based way takes
--
set @then=getdate()
select   t.date
        ,avg(tt.rate) as simpleMovingAvg
        ,sum(case when datediff(day, tt.date, t.date) = 0 then 0.7 * tt.rate
                  when datediff(day, tt.date, t.date) = 1 then 0.2 * tt.rate
                  when datediff(day, tt.date, t.date) = 2 then 0.1 * tt.rate end) as weightedMovingAvg
from     @t as t
join     @t as tt on datediff(day, tt.date, t.date) between 0 and 2
group by t.date
order by t.date

-- how long did set-based take?
select datediff(ms, @then, getdate()) as [set based time]



elsasoft.org

raxbat
Yak Posting Veteran

52 Posts

Posted - 2007-12-10 : 00:28:23
Thanks a lot! You are the real SQL guru :)

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-10 : 08:47:15
hardly.




elsasoft.org

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-10 : 09:45:27
Jezemine, will you time-test this suggestion with a CROSS JOIN?
select     t1.date,
           t1.rate,
           avg(t2.rate)
from       @t as t1
cross join @t as t2
where      t2.date between t1.date - 2 and t1.date
group by   t1.date,
           t1.rate



E 12°55'05.25"
N 56°04'39.16"

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-10 : 10:28:44
quote:
Originally posted by Peso

Jezemine, will you timetest this suggestion with CROSS JOIN?
select		t1.date,
t1.rate,
avg(t2.rate)
from @t AS t1
cross join @t as t2
where t2.date between t1.date - 2 and t1.date
group by t1.date,
t1.rate



E 12°55'05.25"
N 56°04'39.16"




On my system this was about 10x slower than the cursor on 10k rows.

The cursor took 3030 ms; the cross join took 36376 ms.

EDIT: and that's with the cursor calculating BOTH the simple and weighted moving averages.



elsasoft.org

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-10 : 10:43:57
So 36 seconds is still faster than the original inner join (killed after 5 minutes)?



E 12°55'05.25"
N 56°04'39.16"

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-10 : 11:48:31
No, the one I killed had 100k sample rows. I tested yours with 10k sample rows above.

Running the test again on 100k rows: the cursor takes 15 seconds; the cross join was still churning after 10 minutes, so I killed it.

I know it's hard to take - we all harp on doing things in a set-based way - but sometimes a loop is more efficient!


elsasoft.org

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-10 : 13:46:19
I think set-based code still has the fastest time, and there is no need for a staging table either!
The trick below is that every rate is emitted three times via UNION ALL, shifted forward by 0, 1 and 2 days and pre-weighted, so grouping by the shifted day collects the current day plus the two previous days; the Actualdate flag keeps only groups anchored on a real row.
-- Prepare sample data
DECLARE @Sample TABLE (dt DATETIME, Rate FLOAT)

INSERT @Sample
SELECT CURRENT_TIMESTAMP - 10, 1 UNION ALL
SELECT CURRENT_TIMESTAMP - 9, 2 UNION ALL
SELECT CURRENT_TIMESTAMP - 8, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 7, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 6, 5 UNION ALL
SELECT CURRENT_TIMESTAMP - 5, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 4, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 3, 8 UNION ALL
SELECT CURRENT_TIMESTAMP - 2, 9 UNION ALL
SELECT CURRENT_TIMESTAMP - 1, 10 UNION ALL
SELECT CURRENT_TIMESTAMP - 0, 11 UNION ALL
SELECT CURRENT_TIMESTAMP + 1, 9

-- Peso 2
SELECT   DATEADD(DAY, dt, '19000101') AS dt,
         AVG(Rate) AS SimpleMovingAvg,
         SUM(wr) AS WeightedMovingAvg
FROM     (
             SELECT DATEDIFF(DAY, '18991230', dt) AS dt,
                    0.1 * Rate AS wr,
                    Rate,
                    0 AS Actualdate
             FROM   @Sample

             UNION ALL

             SELECT DATEDIFF(DAY, '18991231', dt),
                    0.2 * Rate,
                    Rate,
                    0
             FROM   @Sample

             UNION ALL

             SELECT DATEDIFF(DAY, '19000101', dt),
                    0.7 * Rate,
                    Rate,
                    1
             FROM   @Sample
         ) AS k
GROUP BY dt
HAVING   MAX(Actualdate) = 1
ORDER BY dt
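As a small worked check against the sample data above: the row for CURRENT_TIMESTAMP - 0 (rate 11) groups together with the two previous days (rates 10 and 9), so its group should come out as
[code]
SELECT (9 + 10 + 11) / 3.0        AS SimpleMovingAvg,   -- = 10.0
       0.7*11 + 0.2*10 + 0.1*9    AS WeightedMovingAvg  -- = 10.6
[/code]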

E 12°55'05.25"
N 56°04'39.16"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-10 : 14:09:30
Jezemine, for your 5,000 record test above I get these timings

CURSOR 6,813 ms
Jezemine set-based 20,577 ms
Peso set-based 127 ms



E 12°55'05.25"
N 56°04'39.16"

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-12-10 : 14:11:57
Very nice! On 100k rows, your new query finishes in 5 seconds, whereas the cursor takes 15 seconds (EDIT: on my dev box).

@raxbat: is it clear now who the real SQL guru is?


elsasoft.org

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-11 : 08:40:12
Further optimized for SQL Server 2005 (only a third of the reads of the SQL Server 2000 approach, because UNPIVOT lets one scan of the table produce the three weighted rows per day instead of three separate scans).
Jezemine, if you have SQL Server 2005, can you time-test this approach too?
DECLARE	@Sample TABLE (dt DATETIME, Rate FLOAT)

INSERT @Sample
SELECT CURRENT_TIMESTAMP - 10, 1 UNION ALL
SELECT CURRENT_TIMESTAMP - 9, 2 UNION ALL
SELECT CURRENT_TIMESTAMP - 8, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 7, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 6, 5 UNION ALL
SELECT CURRENT_TIMESTAMP - 5, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 4, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 3, 8 UNION ALL
SELECT CURRENT_TIMESTAMP - 2, 9 UNION ALL
SELECT CURRENT_TIMESTAMP - 1, 10 UNION ALL
SELECT CURRENT_TIMESTAMP - 0, 11 UNION ALL
SELECT CURRENT_TIMESTAMP + 1, 9

-- SQL Server 2005
SELECT   DATEADD(DAY, p.ActualDate, '19000101') AS dt,
         AVG(p.Rate) AS SimpleMovingAvg,
         SUM(
             CASE p.dt
                 WHEN 'dt0' THEN p.r0
                 WHEN 'dt1' THEN p.r1
                 WHEN 'dt2' THEN p.r2
                 ELSE 0
             END
         ) AS WeightedMovingAvg
FROM     (
             SELECT Rate,
                    DATEDIFF(DAY, '19000101', dt) AS dt0,
                    0.7 * Rate AS r0,
                    DATEDIFF(DAY, '18991231', dt) AS dt1,
                    0.2 * Rate AS r1,
                    DATEDIFF(DAY, '18991230', dt) AS dt2,
                    0.1 * Rate AS r2
             FROM   @Sample
         ) AS y
UNPIVOT  (
             ActualDate
             FOR dt IN (y.dt0, y.dt1, y.dt2)
         ) AS p
GROUP BY p.ActualDate
HAVING   MIN(p.dt) = 'dt0'
ORDER BY p.ActualDate

-- SQL Server 2000
SELECT   DATEADD(DAY, k.dt, '19000101') AS dt,
         AVG(k.Rate) AS SimpleMovingAvg,
         SUM(k.wr) AS WeightedMovingAvg
FROM     (
             SELECT DATEDIFF(DAY, '18991230', dt) AS dt,
                    0.1 * Rate AS wr,
                    Rate,
                    0 AS Actualdate
             FROM   @Sample

             UNION ALL

             SELECT DATEDIFF(DAY, '18991231', dt),
                    0.2 * Rate,
                    Rate,
                    0
             FROM   @Sample

             UNION ALL

             SELECT DATEDIFF(DAY, '19000101', dt),
                    0.7 * Rate,
                    Rate,
                    1
             FROM   @Sample
         ) AS k
GROUP BY k.dt
HAVING   MAX(k.Actualdate) = 1
ORDER BY k.dt



E 12°55'05.25"
N 56°04'39.16"

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2007-12-11 : 08:52:58
Peso, can you do the same testing for a running total?
http://sqljunkies.com/WebLog/amachanic/archive/2006/02/28/18286.aspx

Madhivanan

Failing to plan is Planning to fail

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-11 : 09:00:12
Hard to tell.
With the moving average, you have a finite number of elements (records) to compare with.
With a running total, there can be any number of records to compare with.



E 12°55'05.25"
N 56°04'39.16"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-11 : 09:07:10
This method is still useful in SQL Server 2005.
CREATE TABLE #Sample
(
    dt DATETIME,
    Rate FLOAT,
    running FLOAT
)

INSERT #Sample
(
    dt,
    Rate
)
SELECT CURRENT_TIMESTAMP - 10, 1 UNION ALL
SELECT CURRENT_TIMESTAMP - 4, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 3, 8 UNION ALL
SELECT CURRENT_TIMESTAMP - 7, 4 UNION ALL
SELECT CURRENT_TIMESTAMP - 6, 5 UNION ALL
SELECT CURRENT_TIMESTAMP - 5, 6 UNION ALL
SELECT CURRENT_TIMESTAMP - 2, 9 UNION ALL
SELECT CURRENT_TIMESTAMP - 1, 10 UNION ALL
SELECT CURRENT_TIMESTAMP - 0, 11 UNION ALL
SELECT CURRENT_TIMESTAMP - 9, 2 UNION ALL
SELECT CURRENT_TIMESTAMP - 8, 4 UNION ALL
SELECT CURRENT_TIMESTAMP + 1, 9

DECLARE @rt FLOAT
SET @rt = 0

CREATE CLUSTERED INDEX IX_Peso ON #Sample (dt)

UPDATE #Sample
SET @rt = running = @rt + Rate

DROP INDEX #Sample.IX_Peso

SELECT dt,
Rate,
running
FROM #Sample
ORDER BY dt

DROP TABLE #Sample
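For the sample data above, the final SELECT should return the running sums below - assuming the UPDATE walks the rows in clustered-index (dt) order, which is the undocumented behavior this trick relies on:
[code]
-- dt                     Rate   running
-- CURRENT_TIMESTAMP - 10    1         1
-- CURRENT_TIMESTAMP -  9    2         3
-- CURRENT_TIMESTAMP -  8    4         7
-- CURRENT_TIMESTAMP -  7    4        11
-- CURRENT_TIMESTAMP -  6    5        16
-- CURRENT_TIMESTAMP -  5    6        22
-- CURRENT_TIMESTAMP -  4    6        28
-- CURRENT_TIMESTAMP -  3    8        36
-- CURRENT_TIMESTAMP -  2    9        45
-- CURRENT_TIMESTAMP -  1   10        55
-- CURRENT_TIMESTAMP -  0   11        66
-- CURRENT_TIMESTAMP +  1    9        75
[/code]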



E 12°55'05.25"
N 56°04'39.16"
