Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
oitsubob
Yak Posting Veteran
70 Posts |
Posted - 2005-10-10 : 21:02:33
|
| Hey All!Got a question for the experts. I have a table with about 10 million rows containing attendance related information. One of the fields we'll be querying often is the the reportdate datetime field, which by the way is indexed, that tells us the date of the attendance event. For example, on 1/1/2005, Bob took 8 hours of Vacation.One of my team members wants to write a query against the table for events occurring within a specific week or in a specific month and year. He's requesting we create two new fields: one containing the week number and another containg in year and month in the format 'yyymm'.What I'd like to know is, what is the most efficient way of storing and querying this data? Is it really better to create two new columns as he's suggested? Other options include creating indexed computed columns, using datepart() functions in the queries to retrieve week and yyyymm values, using the between operator to pull dates within ranges or even create a view containing these fields. Which method is best and why?Thanks,Bob |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2005-10-10 : 21:35:44
|
| Adding regular columns will add space, both for data and indexes. Creating a computed column saves some space, and it can be indexed as well, but it still uses more space. Properly using DateDiff and DateAdd can let you get the best possible performance WITHOUT adding any new columns:DECLARE @myDate datetime, @month1 datetime, @month2 datetime, @week1 datetime, @week2 datetimeSET @myDate='1/1/2005'SET @month1=DateAdd(month, DateDiff(month, 0, @myDate), 0) -- first day of @myDate's monthSET @month2=DateAdd(month, 1, @month1) -- first day of month afterSET @week1=DateAdd(week, DateDiff(week, 0, @myDate), 0) -- first day of week for @myDateSET @week2=DateAdd(week, 1, @week1) -- first day of week after @myDate's weekSELECT * FROM myTable WHERE dateCol BETWEEN @month1 AND @month2 -- all rows for month of @myDateSELECT * FROM myTable WHERE dateCol BETWEEN @week1 AND @week2 -- all rows for week of @myDateThese date boundaries are very easy and fast to calculate, and by using them in a BETWEEN clause, the query optimizer can more likely use an index seek for performance. |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2005-10-10 : 23:53:51
|
Normally, you should avoid using BETWEEN for date queries, because it will include both end points. Usually, you will want to select a date range like this:Where Date >= StartDateTime and Date < EndDateTimeThat will select everything starting with and including the StartDateTime and up to, but not including the EndDateTime. The following example will select the month of October, 2005.Where Date >= ‘2005-10-01’ and Date < ‘2005-11-01’If you used BETWEEN for this:Where Date between ‘2005-10-01’ and ‘2005-11-01’It would include rows where the datetime was exactly ‘2005-11-01 00:00:00.000’Here are some examples of how to query for last year, this year, last month, and this month:where -- Select range for last year DateCol >= dateadd(year,datediff(year,0,getdate())-1,0) and DateCol < dateadd(year,datediff(year,0,getdate()),0)where -- Select range for this year DateCol >= dateadd(year,datediff(year,0,getdate()),0) and DateCol < dateadd(year,datediff(year,0,getdate())+1,0)where -- Select range for last month DateCol >= dateadd(month,datediff(month,0,getdate())-1,0) and DateCol < dateadd(month,datediff(month,0,getdate()),0)where -- Select range for this month DateCol >= dateadd(month,datediff(month,0,getdate()),0) and DateCol < dateadd(month,datediff(month,0,getdate())+1,0) CODO ERGO SUM |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-10-11 : 05:20:43
|
| "Properly using DateDiff and DateAdd can let you get the best possible performance WITHOUT adding any new columns"I haven't done it, but I have toyed with computed columns for Week / Year / Year + Month and indexing them for performance.I don't know if that would be more efficient that using a WHERE clause against a date range [where Date Column also indexed], but I have certainly had Date Range columns escalating to Table Scan awfully easily!It probably becomes [more?!] important not to use SELECT * (to prevent YEAR / etc. columns being computed when not used), and ideally to only use YEAR / etc. computed columns in WHERE clause, and not in SELECTI'd appreciate anyone's thoughts on whether this has any merit over just a plain Date RangeKristen |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2005-10-11 : 11:02:45
|
| If the table is a transaction table, I would avoid adding all the extra date columns, because all the variations can really get out of hand. It the table is purely for reporting, you might consider it.I would go the way of creating a Date Dimension table with all of the variations of year, month, quarter, week, etc. and just join to the reporting table via a date id. That way, you can keep adding columns for different variations on date. I have a generic one that I developed that has 66 different columns, and that is with nothing specific to our business or industry.CODO ERGO SUM |
 |
|
|
oitsubob
Yak Posting Veteran
70 Posts |
Posted - 2005-10-11 : 11:40:00
|
| In our case, the table is going to specifically be used for reporting. I kind of like the idea of a date dimension table Michael suggested as you only have to store one record with all the different date variations.I really appreciate everyone taking the time to respond. You've given me some great ideas I'm going to take back and share with my team.If you have any other suggestions, let me know.Otherwise, thanks again! :)Bob |
 |
|
|
madhivanan
Premature Yak Congratulator
22864 Posts |
|
|
|
|
|
|
|