| Author |
Topic |
|
Chris Orlando
Starting Member
9 Posts |
Posted - 2005-07-06 : 12:52:09
|
| Here is the challenge: I want to detrmine the duration in time between two times in different records. I don't have any problems with Date Diff and Convert syntax. I want to do this with one transaction or Select statement and avoid using a cursor. I have millions of rows to process nightly. It would probably be easier get this metric upon import, no?Any ideas on this? Read Below: Table: ID , Date_ColumnCurrent Data:1, 1/1/2005 10:59.0002, 1/1/2005 11:00.0003, 1/1/2005 11:02.0004, 1/1/2005 11:04.000What I want to Do:Table:ID, Date_Column, DurationProjected Data:1, 1/1/2005 10:59.000, 00:01.002, 1/1/2005 11:00.000, 00:02.003, 1/1/2005 11:02.000, 00:02.000Thanks |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2005-07-06 : 13:05:18
|
| based on your example you could do a self join on a.ID = b.ID-1That assumes your ID column is sequential. Since you have millions of rows you may want to do it in batchesBe One with the OptimizerTG |
 |
|
|
Chris Orlando
Starting Member
9 Posts |
Posted - 2005-07-06 : 13:14:05
|
| In a perfect world that would be great. These are for stock quotes and even though they have an identity, they are not exactly sequential. The order is based on time AND Exchange and Ticker symbol which I failed to mention. I usually run things in batches by symbol on a particular date using cursors and temp tables. Using your example, I guess I would have to create a temp table for each symbol, and date worth of quotes and then update. Maybe it is easier with a cursor?any thoughts? |
 |
|
|
KLang23
Posting Yak Master
115 Posts |
Posted - 2005-07-06 : 13:17:01
|
Hi,By its nature, this may not perform very well, but I think it will give you what you want. This assumes that the Ids have been assigned sequentially. You will of course have to customize the date calculation to give you what you want.Select T1.Id, T2.Id ,T1.DateColumn, T2.DateColumn ,DateDiff(SS,T1.DateColumn,T2.DateColumn)From Table1 T1, Table1 T2Where T2.Id = (Select Top 1 T1Next.Id From Table1 T1Next Where T1Next.Id > T1.Id Order by T1Next.Id)Order By T1.Id |
 |
|
|
KLang23
Posting Yak Master
115 Posts |
Posted - 2005-07-06 : 13:19:53
|
| By the time I posted there were other responses.You can add other predicates to join on symbol, etc.All that is required is that the id's be sequentially increasing - there can be gaps. |
 |
|
|
Chris Orlando
Starting Member
9 Posts |
Posted - 2005-07-06 : 13:28:29
|
| I did that... thanks to everyone for their help... I am testing the self join with subquery method and it is pretty slow but that may be an optimization issue on my end. I'll keep you posted. |
 |
|
|
Chris Orlando
Starting Member
9 Posts |
Posted - 2005-07-06 : 15:06:13
|
| hmmm.... that's a bit slow too. I am taking a step back here. Maybe someone can help me with this. There are three main columns to index for. exchange, quote time, and symbol. We use data for one exchange but we house all of the data... (no one seems to us the other data but, this is what they want... sigh) The process is the following:1) Bulk Loading a 5GB file to a holding table.2) Indexing the holding table (Presently with symbol only)3) Inserting the records to the warehouse.... about 1GB4) Insert the records by joining to another tableIf I can index the holding table in a manner where the sort will be Exchange, symbol, tradetime - then I could do the self join get my time duration on the exchange I need and then insert the rest in another transaction. Does this make sense to anyone? What kind of indexing would I need onthe holding table to get the sort I described above? |
 |
|
|
Chris Orlando
Starting Member
9 Posts |
Posted - 2005-07-06 : 15:07:07
|
| to clarify... steps 3 and 4 above are the same |
 |
|
|
KLang23
Posting Yak Master
115 Posts |
Posted - 2005-07-06 : 16:53:14
|
Hi,Don't forget to index the "Id".I'd recommend: Exchange, Symbol, Id, and DateColumn(just to cover the query).Select T1.Id, T2.Id ,T1.DateColumn, T2.DateColumn ,DateDiff(SS,T1.DateColumn,T2.DateColumn)From Table1 T1, Table1 T2Where T1.Exchange = @Exchange and T1.Symbol = @Symbol and T1.Exchange = T2.Exchange and T1.Symbol = T2.Symbol and T2.Id = (Select Top 1 T1Next.Id From Table1 T1Next Where T1Next.Exchange = T1.Exchange and T1Next.Symbol = T1.Symbol and T1Next.Id > T1.Id Order by T1Next.Id)Order By T1.Id If "Id" isn't indexed the query will scan and perform terribly.Good luck - this shouldn't be too bad. |
 |
|
|
|