Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-18 : 04:24:22
|
Hi,My question is regarding calculation of average time...Brief explanation is as follows:To calculate the frequency of files coming in, I do an average by looking at the min and max of the arrival date of the files. I only look until 30 days ago to calculate the average so the min arrival date will be 30 days ago.The above calculation seems to be working correctly except if a file has been coming is at say once a week say on fridays and when it does come in on that day, it arrives several times during that day.It seems my present average calculation does not work correctly in this scenario.The way I calculate the average arrivaldate of a file is something like:(1.0 * datediff(second,minArrivalDateTime,maxArrivalDateTime)/(num))If for example, a file comes in only today at say 9:15, 10:12, 12:22, 15:44, 17:01and the last arrival time was a week ago with around the same time and the same for the previous week, then using my formula to calculate the average arrival datetime will show something like 15 hours which indicates that the files should be coming in every 15 hours which is not the case because they are coming in weekly. There could be another occurance which happens every three days for instance.But most of the time the files are coming in at different times and my formula works only for those ones correctly.How would I go about this please?Thanks |
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-01-18 : 05:27:30
|
Your requirements are a little confusing. Do you want the average days between each file coming in, and if one file comes in several times in one day it should only count as one? In that case you could probably do something like this (I think this will work...but it's untested):WITH cte AS ( SELECT FileID, ArrivalDateTime, RowNum = ROW_NUMBER() OVER (PARTITION BY FileID ORDER BY ArrivalDateTime) FROM ( SELECT FileID, ArrivalDateTime = DATEADD(DAY,DATEDIFF(day,0,ArrivalDateTime),0), COUNT(*) FROM myTable GROUP BY FileID, DATEADD(DAY,DATEDIFF(day,0,ArrivalDateTime),0) ) AS x )SELECT a.FileID, AVG(DATEDIFF(day, a.ArrivalDateTime, b.ArrivalDateTime))FROM cte a LEFT OUTER JOIN cte b ON a.FileID = b.FileID AND a.RowNum = b.RowNum - 1GROUP BY a.FileID - LumbagoMy blog (yes, I have a blog now! just not that much content yet) -> www.thefirstsql.com |
 |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-18 : 05:40:04
|
Hello again,I think you mis-understood. So I explain alittle further.let's say for the past 30 days fileA has been coming in at different times during each day. Using the formula I sent earlier, let's say the average arrival time is about 2:05 hours each day. This is ok.But say FileB comes in only twice a day i.e. at 9:32 and 14:22But using my formula it does not calculate the correct arrival dates of fileB.Any thoughts please? |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-18 : 05:55:54
|
[code]DECLARE @Sample TABLE ( DT DATETIME NOT NULL )INSERT @SampleSELECT '20110118 09:15' UNION ALLSELECT '20110118 10:12' UNION ALLSELECT '20110118 12:22' UNION ALLSELECT '20110118 15:44' UNION ALLSELECT '20110118 17:01' UNION ALLSELECT '20110111 08:15' UNION ALLSELECT '20110111 09:12' UNION ALLSELECT '20110111 14:44' UNION ALLSELECT '20110111 16:01'SELECT CONVERT(CHAR(8), DATEADD(SECOND, SUM(Value) / SUM(Items), 0), 8)FROM ( SELECT SUM(DATEDIFF(SECOND, DATEDIFF(DAY, 0, DT), DT)) AS Value, COUNT(*) AS Items FROM @Sample WHERE DT >= DATEDIFF(DAY, 30, GETDATE()) GROUP BY DATEDIFF(DAY, 0, DT) ) AS d[/code] N 56°04'39.26"E 12°55'05.63" |
 |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-18 : 06:20:44
|
Hi Peso,It returns 12:31:46 in your sample table.If it is ok, please explain what you are doing/achieving as I do not understand it.Thanks |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-18 : 08:23:30
|
It depends on the result is correct or not. N 56°04'39.26"E 12°55'05.63" |
 |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-18 : 09:38:20
|
I don't think the result is correct because the file has been arriving on 11th and 18th. i.e. several times one day a week. And I do not see what the result from @Sample (12:31:46 ) is referring to.Thanks |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-18 : 09:52:24
|
Are you able to produce proper sample data and expected output so that we are better equipped to aid you is this matter? N 56°04'39.26"E 12°55'05.63" |
 |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-18 : 11:23:49
|
quote: Originally posted by Peso Are you able to produce proper sample data and expected output so that we are better equipped to aid you is this matter? N 56°04'39.26"E 12°55'05.63"
Example 1...18 Jan 2011 19:22:1318 Jan 2011 18:19:1318 Jan 2011 17:25:1318 Jan 2011 16:25:1318 Jan 2011 16:25:13.........18 Jan 2011 01:25:13...17 Jan 2011 19:22:1317 Jan 2011 18:19:1317 Jan 2011 17:25:1317 Jan 2011 16:25:1317 Jan 2011 16:25:13.........18 Jan 2011 01:25:13Example 218 Jan 2011 14:58:3218 Jan 2011 12:58:5618 Jan 2011 10:58:3218 Jan 2011 08:58:3217 Jan 2011 14:45:0017 Jan 2011 12:55:1217 Jan 2011 10:58:4417 Jan 2011 08:50:2216 Jan 2011 14:45:0216 Jan 2011 12:55:5516 Jan 2011 10:58:2216 Jan 2011 08:50:0815 Jan 2011 14:45:2115 Jan 2011 12:55:3315 Jan 2011 10:58:2215 Jan 2011 08:50:11.........Example 1 shows FileA appearing once every hour whereas in Example 2, FileA only appears four times a day at around those mentioned times.I would like to indicate by true or false if a file is late or not.For example, looking at the example above for FileA, the system should know that on each day, only from around 8:50:11 to 14:58:32 the file appears, and if at present the time is say 17:30:22, the system knows that the file is NOT late.If the file has not appeared since say 12 O'clock, the system knows that it is late.hope you see what I mean.Thanks |
 |
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-01-19 : 03:45:00
|
I think you need to create an additional table that holds meta information about each file transfer. Trending this on the fly is very complex...something like this maybe:DECLARE @FileTransferMeta table( FileID int, Filename varchar(50), DaysOfWeek varchar(50), FirstOccurrence datetime, LastOccurrence datetime, FilesPerDay int)INSERT INTO @FileTransferMetaSELECT 1, 'FileA', '1,2,3,4,5', '19000101 01:20:00', '19000101 23:00:00', 21 UNION ALLSELECT 2, 'FileB', '1,4,6,7', '19000101 10:00:00', '19000101 14:00:00', 4 Otherwise it will be very hard to make something sensible out of this.- LumbagoMy blog (yes, I have a blog now! just not that much content yet) -> www.thefirstsql.com |
 |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-20 : 08:43:59
|
I think you are right about having a separate table and place the filenames which have different arrival times...FileC has the average arrival time of around 20 mins.BUT, this is not true for the following times :7:39 - 9:029:12 - 10:2910:41 - 12:09The above times show that FileC arrived at 7:39, 9:02, 9:12, 10:29, 10:41, 12:09As you see, the average arrival time now is more than 20 mins.May be I can create a new table to hold some information about odd files such as FileC which do not come in regularly.Question:How can I design this extra fields (Which fields are necessary to be placed in this new table) ?Thanks |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-20 : 11:28:04
|
quote: Originally posted by arkiboys FileC has the average arrival time of around 20 mins.BUT, this is not true for the following times :7:39 - 9:029:12 - 10:2910:41 - 12:09The above times show that FileC arrived at 7:39, 9:02, 9:12, 10:29, 10:41, 12:09As you see, the average arrival time now is more than 20 mins.May be I can create a new table to hold some information about odd files such as FileC which do not come in regularly.Question:How can I design this extra fields (Which fields are necessary to be placed in this new table) ?Thanks
Why are you talking about an AVERAGE when you really mean a TIMESPAN or INTERVAL? N 56°04'39.26"E 12°55'05.63" |
 |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2011-01-20 : 11:36:48
|
yes, you are right. I mean Interval. |
 |
|
|
|
|
|
|