Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 13:31:14

Looking to speed up this dog of a SQL...

Q1 = main query
J1 = inner join on main query
Q2 = query for the TimeIn column of the previous record for the previous module, based on TimeStamp
Q3 = query for the TempIn column of the previous record for the previous module, based on TimeStamp (Q2 and Q3 will return the same record)

SELECT Q1.Module, J1.Description, Q1.TimeStamp AS OutTime, Q1.DataVal AS OutTemp,
       (SELECT TOP 1 TimeStamp
        FROM TraxData Q2
        WHERE Q2.Customer_No = Q1.Customer_No
          AND Q2.TimeStamp <= Q1.TimeStamp
          AND Q2.Module = Q1.Module - 1
        ORDER BY Q2.TimeStamp DESC) AS InTime,
       (SELECT TOP 1 DataVal
        FROM TraxData Q3
        WHERE Q3.Customer_No = Q1.Customer_No
          AND Q3.TimeStamp <= Q1.TimeStamp
          AND Q3.Module = Q1.Module - 1
        ORDER BY Q3.TimeStamp DESC) AS InTemp
FROM TraxData Q1
INNER JOIN TraxMods J1
    ON J1.Customer_No = Q1.Customer_No
    AND J1.Module = Q1.Module
WHERE Q1.Customer_No = 12186
  AND Q1.TimeStamp > '01/01/2006'
  AND Q1.Module = 5
ORDER BY 2, 3

Is there any way I can get the InTime and InTemp fields with one query? They would be coming from the same record. I already have indexes set up for TimeStamp and Module on the TraxData and TraxMods tables.

TIA!
Srinika
Master Smack Fu Yak Hacker
1378 Posts
Posted - 2006-02-14 : 13:45:52

Didn't you try inner joining that Q2 (or Q3)?
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 14:03:52

Do you mean changing the WHERE clause of Q2 and Q3 to a JOIN on Q1? I'll try it, but why would that help? I don't think I understand. Can you show me what you mean?
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2006-02-14 : 14:21:05

Ken,

In order to help, we'll need the DDL for all tables involved, INSERT INTO statements for sample data, and the expected result set using that sample data. The sample data should contain enough rows to illustrate your problem.

Tara Kizer
aka tduggan
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 15:31:30

OK, I can easily generate the script to create the tables, but how should I easily create the INSERT statements? Do I have to do that by hand? Can I send you an export?
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2006-02-14 : 15:36:39

It typically is done by hand. When I post questions, I provide between 5 and 10 rows in each of the tables, making sure that the rows can be joined together and that the sample rows illustrate my problem. Also make sure to exclude any columns that are not relevant to your problem so that the DDL and DML are short and simple. We answer these questions for free; sending us an export of your data, analyzing it, and coming up with a solution would not be free.

Tara Kizer
aka tduggan
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 16:27:30

The question is: how do I get the previous timestamp and dataval from the previous module (21) based on the module and timestamp from module 22?

OK, here's the code to create the tables with some very simple data in them... Note the indexes. They are clustered on Customer_No & ChemWatch_ID for lookup/insertion purposes.

CREATE TABLE [dbo].[traxdata] (
    [module] [int] NULL,
    [timestamp] [datetime] NULL,
    [dataval] [numeric](11, 2) NULL,
    [chemwatch_id] [int] NOT NULL,
    [customer_no] [numeric](8, 0) NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[traxmods] (
    [module] [int] NULL,
    [description] [char](40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [chemwatch_id] [int] NOT NULL,
    [customer_no] [numeric](8, 0) NOT NULL
) ON [PRIMARY]
GO

CREATE INDEX [IX_traxdata_1] ON [dbo].[traxdata]([customer_no], [module], [timestamp]) ON [PRIMARY]
GO

CREATE INDEX [IX_traxmods_1] ON [dbo].[traxmods]([customer_no], [module]) ON [PRIMARY]
GO

INSERT INTO TraxMods (Module, Description, Customer_No, ChemWatch_ID)
SELECT 22, 'Out Mod 22', 12186, 1 UNION ALL
SELECT 21, 'In Mod 21', 12186, 2
GO

INSERT INTO TraxData (Module, TimeStamp, DataVal, Customer_No, ChemWatch_ID)
SELECT 21, '01/01/2006 10:00', 1000, 12186, 1 UNION ALL
SELECT 21, '01/01/2006 11:00', 1100, 12186, 2 UNION ALL
SELECT 21, '01/01/2006 12:00', 1200, 12186, 3 UNION ALL
SELECT 21, '01/01/2006 13:00', 1300, 12186, 4 UNION ALL
SELECT 21, '01/01/2006 14:00', 1400, 12186, 5
GO

INSERT INTO TraxData (Module, TimeStamp, DataVal, Customer_No, ChemWatch_ID)
SELECT 22, '01/01/2006 10:01', 1001, 12186, 6 UNION ALL
SELECT 22, '01/01/2006 11:01', 1101, 12186, 7 UNION ALL
SELECT 22, '01/01/2006 12:01', 1202, 12186, 8 UNION ALL
SELECT 22, '01/01/2006 13:01', 1303, 12186, 9 UNION ALL
SELECT 22, '01/01/2006 14:01', 1404, 12186, 10
GO

SELECT * FROM TraxData WHERE Module > 20
SELECT * FROM TraxMods WHERE Module > 20

Here are the results I need...

22  Out Mod 22  2006-01-01 10:01:00.000  1001.00  2006-01-01 10:00:00.000  1000.00
22  Out Mod 22  2006-01-01 11:01:00.000  1101.00  2006-01-01 11:00:00.000  1100.00
22  Out Mod 22  2006-01-01 12:01:00.000  1202.00  2006-01-01 12:00:00.000  1200.00
22  Out Mod 22  2006-01-01 13:01:00.000  1303.00  2006-01-01 13:00:00.000  1300.00
22  Out Mod 22  2006-01-01 14:01:00.000  1404.00  2006-01-01 14:00:00.000  1400.00

From this SQL, which, while it works, runs way too slow on a table with 600,000+ records...

DECLARE @OutMod int
SET @OutMod = 22

SELECT Q1.Module, J1.Description, Q1.TimeStamp AS OutTime, Q1.DataVal AS OutTemp,
       (SELECT TOP 1 TimeStamp
        FROM TraxData Q2
        WHERE Q2.Customer_No = Q1.Customer_No
          AND Q2.TimeStamp <= Q1.TimeStamp
          AND Q2.Module = Q1.Module - 1
        ORDER BY Q2.TimeStamp DESC) AS InTime,
       (SELECT TOP 1 DataVal
        FROM TraxData Q3
        WHERE Q3.Customer_No = Q1.Customer_No
          AND Q3.TimeStamp <= Q1.TimeStamp
          AND Q3.Module = Q1.Module - 1
        ORDER BY Q3.TimeStamp DESC) AS InTemp
FROM TraxData Q1
INNER JOIN TraxMods J1
    ON J1.Customer_No = Q1.Customer_No
    AND J1.Module = Q1.Module
WHERE Q1.Customer_No = 12186
  AND Q1.TimeStamp > '01/01/2006'
  AND Q1.Module = @OutMod
ORDER BY 2, 3

Thanks Tara!
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2006-02-14 : 16:50:08

As always, break it down into steps.

Step 1: get the previous TimeStamp and Module for each TraxData transaction:

select t.*,
       (select Max(TimeStamp)
        from TraxData t2
        where t2.Module + 1 = t.Module
          and t2.Customer_No = t.Customer_No
          and t2.TimeStamp < t.TimeStamp) as PrevTimeStamp,
       t.Module - 1 as PrevModule
from TraxData t

And now, to get the DataVal, you should be able to say:

select a.*, t.DataVal as PreviousVal
from (above SQL) a
left outer join TraxData t
    on t.Customer_No = a.Customer_No
    and t.Module = a.PrevModule
    and t.TimeStamp = a.PrevTimeStamp
where a.Module = @OutMod

Join to other tables and add additional filters in the last step. It should be more efficient than what you have.
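Putting the two steps above together into one runnable statement against the TraxData DDL posted earlier gives something like the following sketch. Note the assumptions: the placeholder "(above SQL)" is expanded into a derived table, and Ken's Customer_No and TimeStamp filters from the original query are pushed into that derived table (the filter placement is an editorial guess, not part of Jeff's post).

```sql
-- Sketch: Jeff's step 1 as a derived table (a), joined back to TraxData
-- to pick up the DataVal belonging to the previous module's latest row.
DECLARE @OutMod int
SET @OutMod = 22

SELECT a.*, t.DataVal AS PreviousVal
FROM
(
    SELECT t.*,
           (SELECT MAX(TimeStamp)
            FROM TraxData t2
            WHERE t2.Customer_No = t.Customer_No
              AND t2.Module + 1 = t.Module
              AND t2.TimeStamp < t.TimeStamp) AS PrevTimeStamp,
           t.Module - 1 AS PrevModule
    FROM TraxData t
    WHERE t.Customer_No = 12186          -- filters pushed in (assumption)
      AND t.Module = @OutMod
      AND t.TimeStamp > '01/01/2006'
) a
LEFT OUTER JOIN TraxData t
    ON t.Customer_No = a.Customer_No
    AND t.Module = a.PrevModule
    AND t.TimeStamp = a.PrevTimeStamp
```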
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2006-02-14 : 16:50:30

By the way -- thanks for the great DML and DDL! Much appreciated.
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2006-02-14 : 16:59:50

I don't understand the data, but to show you that I tried:

SELECT m.module, m.[description], d1.[timestamp], d1.dataval, d2.[timestamp], d2.dataval
FROM traxdata d1
INNER JOIN
(
    SELECT customer_no, MIN(chemwatch_id) AS chemwatch_id
    FROM traxdata m
    WHERE module = 22
    GROUP BY customer_no
) t
    ON d1.customer_no = t.customer_no
    AND d1.chemwatch_id = t.chemwatch_id
INNER JOIN traxmods m
    ON d1.module = m.module
INNER JOIN traxdata d2
    ON m.customer_no = d2.customer_no
    AND m.chemwatch_id = d2.chemwatch_id

Mine only gets one row, though, but perhaps it gives you some ideas.

Tara Kizer
aka tduggan
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 17:18:31

Tara - I threw you off by including the ChemWatch_ID column; ignore it. I only included it because the tables are clustered on Customer_No, ChemWatch_ID.

Module 21 is a temperature that measures water coming in. Module 22 is a temperature that measures water going out. I need to know what the last temperature coming in was for each temperature going out. They will never have the same timestamp, because data is only recorded when the temperature fluctuates 10 degrees on output, and 2 degrees on input. So there could be a lot more "in" temp records than "out" temp records.

Thanks a bunch!
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2006-02-14 : 17:22:01

Does Jeff's solution work for you?

Tara Kizer
aka tduggan
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 17:25:38

JSmith - I executed the first step like this...

DECLARE @OutMod int
SET @OutMod = 5

select t.*,
       (select Max(TimeStamp)
        from TraxData t2
        where t2.Module + 1 = t.Module
          and t2.Customer_No = t.Customer_No
          and t2.TimeStamp < t.TimeStamp) as PrevTimeStamp,
       t.Module - 1 as PrevModule
from TraxData t
where t.Module = @OutMod

24 minutes later and it's still not done. Module 5 has 97,819 rows; Module 4 has 92,374 rows. The table has a total of 678,000 rows. Is it actually looking through all 92,374 Module 4 records for the latest record, once for each of the 97,819 Module 5 records? I would think there would be a more efficient way of doing this.

Thanks for your help!
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-14 : 17:41:23

JSmith/Tara - Another question: in Jeff's example, would the filters in step 2 be applied to step 1 when the queries are combined?

Gotta go for now - it's V-Day & the fam is waiting. Will try again tomorrow morning. Thanks again, guys.

Tara - If the last name change is due to marriage, then congratulations! Get off the computer & go spend time with your new family!
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-15 : 08:57:39

I'm back! That step 1 SQL took 4:45 hours. Obviously I would never generate it for the 97,000+ rows, but even generating it for a month takes a long time. Looking at the execution plan, 84% was dedicated to a process titled Index Spool/Eager Spool, which claims that it creates a temporary index, and had a row count of over 988 million. Ouch!

As you can see from the DDL, I have an index on the TraxData table which combines the Customer_No, Module, and TimeStamp columns. Shouldn't that be sufficient? I am going to run the Index Tuning Wizard on it now.
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2006-02-15 : 09:13:42

First off -- aren't you running this for one module, for a certain period of time? How many rows are there for that? Never test a SQL statement by just running it for all of your rows (especially if you have millions) -- run it for a realistic amount based on what the final query will need to do. For example, if you are writing a report that multiplies units by rate to come up with an Amount, and you want to test it, you don't run it for every row of data in your table, right? And that final report probably won't run for every row in your table, right? They both will be filtered.

Even though I said take it in steps and labelled the first as "step 1", it doesn't mean that you should execute that SQL for all of your rows -- how could you test it and verify the results? Run it for a small sample, check the results, make sure it works. Then plug it into the next SQL statement, again with proper filters, and run them together. Use QA to help you tweak the query and see how it affects the execution plan.

What is the primary key of these tables, by the way? I see indexes -- are those unique indexes?

Finally, one more thing you can do is to adjust the inner select (step 1) like this:

select t.*,
       (select Max(TimeStamp)
        from TraxData t2
        where t2.Module = t.Module - 1
          and t2.Customer_No = t.Customer_No
          and t2.TimeStamp < t.TimeStamp) as PrevTimeStamp,
       t.Module - 1 as PrevModule
from TraxData t

Note the subtle change -- this may allow for better index use on the table in the correlated subquery.

Either way, with your sample data and DDL, my example showed a big performance increase over what you started with.
Ken Blum
Constraint Violating Yak Guru
383 Posts
Posted - 2006-02-15 : 10:56:47

Thanks Jeff, & I totally agree with you. However, I was curious to see the worst-case scenario, and since I was leaving the office I thought I would just see what happens.

I just ran the following SQL, which would be a "real-world" situation: one module for the current month. Module 5 has 5,102 rows and module 4 has 3,589 rows for February. It took over 4 minutes until I got rid of that Eager Spool thing in the execution plan by setting the WHERE clause filter order to correlate with the order of the columns in the composite index. Now it runs in 3 minutes, and the execution plan says 92% is an Index Seek on my composite index. I did not know that the order of the filters was relevant when attempting to have the SQL use a composite index. Now I do!

select t.*,
       (select Max(TimeStamp)
        from TraxData t2
        where t2.Customer_No = t.Customer_No
          and t2.Module = t.Module - 1
          and t2.TimeStamp < t.TimeStamp) as PrevTimeStamp,
       t.Module - 1 as PrevModule
from TraxData t
where t.Module = 5
  and t.Customer_No = 12186
  and t.TimeStamp > '02/01/2006'

So I ran the Index Tuning Wizard on this query and it did not give me any new recommendations. It appears to me that I have hit my limit in optimizing this query. I think the base problem is in the data, where module 4 is sending too much data -- sometimes every 10 seconds. I'm thinking of creating views which will produce readings every 15 minutes, or maybe only where the value of the reading changes by 5-10 degrees, and then running the SQL off of the views. Would you agree with this approach?

I really appreciate your time and expertise. Basically I wanted to make sure that I wasn't missing something obvious to a SQL expert. Thanks again to both you and Tara.
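For reference, a view along the lines described above -- thinning TraxData to at most one reading per 15-minute bucket -- could be sketched as below. This is an editorial sketch against the posted DDL, not code from the thread; the view name and the DATEDIFF-based bucketing (integer division of minutes since a base date, which works on SQL Server 2000) are assumptions about how the thinning would be done.

```sql
-- Sketch: keep only the latest reading per customer/module/15-minute bucket.
CREATE VIEW dbo.TraxData15Min
AS
SELECT d.Customer_No, d.Module, d.TimeStamp, d.DataVal
FROM dbo.TraxData d
INNER JOIN
(
    SELECT Customer_No, Module,
           MAX(TimeStamp) AS TimeStamp
    FROM dbo.TraxData
    GROUP BY Customer_No, Module,
             -- 15-minute bucket number, counted from an arbitrary base date
             DATEDIFF(minute, '20000101', TimeStamp) / 15
) b
    ON b.Customer_No = d.Customer_No
    AND b.Module = d.Module
    AND b.TimeStamp = d.TimeStamp
```

Running the previous-reading query against a view like this trades precision (the "previous" reading can be up to 15 minutes stale) for a large reduction in the number of module 4 rows the correlated subquery has to scan.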