Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 calculate average datetime

Author  Topic 

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-18 : 04:24:22
Hi,
My question is regarding calculation of average time...

Brief explanation is as follows:

To calculate the frequency of files coming in, I do an average by looking at the min and max of the arrival date of the files. I only look until 30 days ago to calculate the average so the min arrival date will be 30 days ago.
The above calculation seems to be working correctly except if a file has been coming is at say once a week say on fridays and when it does come in on that day, it arrives several times during that day.
It seems my present average calculation does not work correctly in this scenario.

The way I calculate the average arrivaldate of a file is something like:
(1.0 * datediff(second,minArrivalDateTime,maxArrivalDateTime)/(num))

If for example, a file comes in only today at say 9:15, 10:12, 12:22, 15:44, 17:01
and the last arrival time was a week ago with around the same time and the same for the previous week, then using my formula to calculate the average arrival datetime will show something like 15 hours which indicates that the files should be coming in every 15 hours which is not the case because they are coming in weekly. There could be another occurance which happens every three days for instance.
But most of the time the files are coming in at different times and my formula works only for those ones correctly.

How would I go about this please?

Thanks

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-01-18 : 05:27:30
Your requirements are a little confusing. Do you want the average days between each file coming in, and if one file comes in several times in one day it should only count as one? In that case you could probably do something like this (I think this will work...but it's untested):
WITH cte AS (
SELECT FileID, ArrivalDateTime, RowNum = ROW_NUMBER() OVER (PARTITION BY FileID ORDER BY ArrivalDateTime)
FROM (
SELECT FileID, ArrivalDateTime = DATEADD(DAY,DATEDIFF(day,0,ArrivalDateTime),0), COUNT(*)
FROM myTable
GROUP BY FileID, DATEADD(DAY,DATEDIFF(day,0,ArrivalDateTime),0)
) AS x
)
SELECT a.FileID, AVG(DATEDIFF(day, a.ArrivalDateTime, b.ArrivalDateTime))
FROM cte a
LEFT OUTER JOIN cte b
ON a.FileID = b.FileID
AND a.RowNum = b.RowNum - 1
GROUP BY a.FileID


- Lumbago

My blog (yes, I have a blog now! just not that much content yet)
-> www.thefirstsql.com
Go to Top of Page

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-18 : 05:40:04
Hello again,
I think you mis-understood. So I explain alittle further.
let's say for the past 30 days fileA has been coming in at different times during each day. Using the formula I sent earlier, let's say the average arrival time is about 2:05 hours each day. This is ok.
But say FileB comes in only twice a day i.e. at 9:32 and 14:22
But using my formula it does not calculate the correct arrival dates of fileB.
Any thoughts please?
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-01-18 : 05:55:54
[code]DECLARE @Sample TABLE
(
DT DATETIME NOT NULL
)

INSERT @Sample
SELECT '20110118 09:15' UNION ALL
SELECT '20110118 10:12' UNION ALL
SELECT '20110118 12:22' UNION ALL
SELECT '20110118 15:44' UNION ALL
SELECT '20110118 17:01' UNION ALL
SELECT '20110111 08:15' UNION ALL
SELECT '20110111 09:12' UNION ALL
SELECT '20110111 14:44' UNION ALL
SELECT '20110111 16:01'

SELECT CONVERT(CHAR(8), DATEADD(SECOND, SUM(Value) / SUM(Items), 0), 8)
FROM (
SELECT SUM(DATEDIFF(SECOND, DATEDIFF(DAY, 0, DT), DT)) AS Value,
COUNT(*) AS Items
FROM @Sample
WHERE DT >= DATEDIFF(DAY, 30, GETDATE())
GROUP BY DATEDIFF(DAY, 0, DT)
) AS d[/code]


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-18 : 06:20:44
Hi Peso,
It returns 12:31:46 in your sample table.
If it is ok, please explain what you are doing/achieving as I do not understand it.

Thanks
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-01-18 : 08:23:30
It depends on the result is correct or not.



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-18 : 09:38:20
I don't think the result is correct because the file has been arriving on 11th and 18th. i.e. several times one day a week. And I do not see what the result from @Sample (12:31:46 ) is referring to.
Thanks
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-01-18 : 09:52:24
Are you able to produce proper sample data and expected output so that we are better equipped to aid you is this matter?



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-18 : 11:23:49
quote:
Originally posted by Peso

Are you able to produce proper sample data and expected output so that we are better equipped to aid you is this matter?



N 56°04'39.26"
E 12°55'05.63"



Example 1

...
18 Jan 2011 19:22:13
18 Jan 2011 18:19:13
18 Jan 2011 17:25:13
18 Jan 2011 16:25:13
18 Jan 2011 16:25:13
...
...
...
18 Jan 2011 01:25:13

...
17 Jan 2011 19:22:13
17 Jan 2011 18:19:13
17 Jan 2011 17:25:13
17 Jan 2011 16:25:13
17 Jan 2011 16:25:13
...
...
...
18 Jan 2011 01:25:13

Example 2

18 Jan 2011 14:58:32
18 Jan 2011 12:58:56
18 Jan 2011 10:58:32
18 Jan 2011 08:58:32

17 Jan 2011 14:45:00
17 Jan 2011 12:55:12
17 Jan 2011 10:58:44
17 Jan 2011 08:50:22

16 Jan 2011 14:45:02
16 Jan 2011 12:55:55
16 Jan 2011 10:58:22
16 Jan 2011 08:50:08

15 Jan 2011 14:45:21
15 Jan 2011 12:55:33
15 Jan 2011 10:58:22
15 Jan 2011 08:50:11

...
...
...

Example 1 shows FileA appearing once every hour whereas in Example 2, FileA only appears four times a day at around those mentioned times.

I would like to indicate by true or false if a file is late or not.
For example, looking at the example above for FileA, the system should know that on each day, only from around 8:50:11 to 14:58:32 the file appears, and if at present the time is say 17:30:22, the system knows that the file is NOT late.
If the file has not appeared since say 12 O'clock, the system knows that it is late.
hope you see what I mean.

Thanks
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-01-19 : 03:45:00
I think you need to create an additional table that holds meta information about each file transfer. Trending this on the fly is very complex...something like this maybe:
DECLARE @FileTransferMeta table(
FileID int,
Filename varchar(50),
DaysOfWeek varchar(50),
FirstOccurrence datetime,
LastOccurrence datetime,
FilesPerDay int
)
INSERT INTO @FileTransferMeta
SELECT 1, 'FileA', '1,2,3,4,5', '19000101 01:20:00', '19000101 23:00:00', 21 UNION ALL
SELECT 2, 'FileB', '1,4,6,7', '19000101 10:00:00', '19000101 14:00:00', 4
Otherwise it will be very hard to make something sensible out of this.

- Lumbago

My blog (yes, I have a blog now! just not that much content yet)
-> www.thefirstsql.com
Go to Top of Page

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-20 : 08:43:59
I think you are right about having a separate table and place the filenames which have different arrival times...
FileC has the average arrival time of around 20 mins.
BUT, this is not true for the following times :
7:39 - 9:02
9:12 - 10:29
10:41 - 12:09

The above times show that FileC arrived at 7:39, 9:02, 9:12, 10:29, 10:41, 12:09
As you see, the average arrival time now is more than 20 mins.

May be I can create a new table to hold some information about odd files such as FileC which do not come in regularly.
Question:
How can I design this extra fields (Which fields are necessary to be placed in this new table) ?

Thanks
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-01-20 : 11:28:04
quote:
Originally posted by arkiboys

FileC has the average arrival time of around 20 mins.
BUT, this is not true for the following times :
7:39 - 9:02
9:12 - 10:29
10:41 - 12:09

The above times show that FileC arrived at 7:39, 9:02, 9:12, 10:29, 10:41, 12:09
As you see, the average arrival time now is more than 20 mins.

May be I can create a new table to hold some information about odd files such as FileC which do not come in regularly.
Question:
How can I design this extra fields (Which fields are necessary to be placed in this new table) ?

Thanks

Why are you talking about an AVERAGE when you really mean a TIMESPAN or INTERVAL?



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

arkiboys
Master Smack Fu Yak Hacker

1433 Posts

Posted - 2011-01-20 : 11:36:48
yes, you are right. I mean Interval.
Go to Top of Page
   

- Advertisement -