ferrethouse
Constraint Violating Yak Guru
352 Posts
Posted - 2011-12-20 : 18:26:53
Just looking for some feedback on what people do to deal with very large tables (I realize that "large" is subjective). We have several large tables, but one in particular is 26 gigs (including indexes) and has 20 million rows. It is a heavy read/write "Activity" table. We also have a large "Subscription" table. Some of our reports have become quite slow for clients with many students in our system. I've optimized the indexes as much as I can, and I've worked with the developers to try to optimize the queries themselves. We aren't a huge company, so the idea of a data warehouse isn't really an option (we don't have the expertise or fiscal resources to do it properly).

I've been toying with a couple of ideas...

1. Precompile reports daily and render those reports instead of running them live. The problem with this option is that most of our reports allow for filtering based on many criteria, including time. Precompiled reports don't always make sense when filtering by time is necessary.

2. Move data older than 3 months daily from the "Activity" table to an "ActivityArchive" table, and join onto the "ActivityArchive" table only when the user specifically requests data older than 3 months. This would make the reports faster unless the user is specifically looking for older data.

Just wondering what other strategies DBAs employ to deal with large, fast-growing tables and report/application performance.

Thanks,
Craig
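Roughly what idea 2 might look like as a nightly batched move (sketch only: it assumes the archive table has an identical column list and no triggers or foreign keys on it, which OUTPUT ... INTO requires, and that the activity date column is something like ActivityStarted):

DECLARE @cutoff datetime;
SET @cutoff = DATEADD(MONTH, -3, GETDATE());

WHILE 1 = 1
BEGIN
    -- Move rows in small batches so each transaction (and the log) stays small
    DELETE TOP (5000) A
    OUTPUT deleted.*                  -- copy the deleted rows...
        INTO dbo.ActivityArchive      -- ...straight into the archive table
    FROM dbo.Activity AS A
    WHERE A.ActivityStarted < @cutoff;

    IF @@ROWCOUNT = 0 BREAK;          -- nothing older than the cutoff is left
END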
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts
Posted - 2011-12-20 : 19:30:34
Maybe describe the structure of the tables?

quote:
1. Precompile reports daily and render those reports instead of running them live. The problem with this option is that most of our reports allow for filtering based on many criteria, including time. Precompiled reports don't always make sense when filtering by time is necessary.

There are many ways to do this, and many of those ways are... bad. Could you maybe show us a sample query?

Gail posted a nice article on catch-all type queries. Probably one of the best uses for dynamic SQL I can think of.
http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
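The OPTION (RECOMPILE) flavour of the pattern from that article, applied to the Activity table, would look roughly like this (the procedure and parameter names are made up, and the column names are taken from the schema posted further down the thread):

CREATE PROCEDURE dbo.ActivityReport
    @StudentsID int      = NULL,
    @CourseID   int      = NULL,
    @FromDate   datetime = NULL,
    @ToDate     datetime = NULL
AS
BEGIN
    SET NOCOUNT ON;

    -- Each filter is skipped when its parameter is NULL; OPTION (RECOMPILE)
    -- lets the optimizer build a plan for the values actually supplied.
    SELECT A.ActivityID, A.StudentsID, A.CourseID,
           A.ActivityStarted, A.ActivityFinished, A.Score
    FROM dbo.Activity AS A
    WHERE (@StudentsID IS NULL OR A.StudentsID = @StudentsID)
      AND (@CourseID   IS NULL OR A.CourseID   = @CourseID)
      AND (@FromDate   IS NULL OR A.ActivityStarted >= @FromDate)
      AND (@ToDate     IS NULL OR A.ActivityStarted <  @ToDate)
    OPTION (RECOMPILE);
END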
Sachin.Nand
2937 Posts
Posted - 2011-12-21 : 01:04:47
Following are the different options you could try:

1) Partition your table.
2) Move the table or your indexes to a new filegroup and then move it to a faster disk.
3) Use plan guides if you think the queries are recompiling or using bad plans.
4) Your hashes and sorts might be spilling to tempdb, so optimize your tempdb.
5) Check for the existence of parallel plans.
6) Move your read-only tables to an average-performing disk and keep the live data on a much faster disk.

PBUH
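A minimal sketch of option 1, assuming monthly partitions keyed on an activity date column (the object names, boundary dates, and filegroup mapping are placeholders):

CREATE PARTITION FUNCTION pfActivityByMonth (datetime)
AS RANGE RIGHT FOR VALUES ('20111001', '20111101', '20111201');

-- ALL TO ([PRIMARY]) keeps the sketch simple; each partition could instead
-- be mapped to its own filegroup, which also covers options 2 and 6.
CREATE PARTITION SCHEME psActivityByMonth
AS PARTITION pfActivityByMonth ALL TO ([PRIMARY]);

-- The table itself becomes partitioned when its clustered index is built
-- (or rebuilt) on the scheme, along the lines of:
-- CREATE CLUSTERED INDEX <name> ON dbo.Activity (ActivityStarted)
--     WITH (DROP_EXISTING = ON) ON psActivityByMonth (ActivityStarted);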
ferrethouse
Constraint Violating Yak Guru
352 Posts
Posted - 2011-12-21 : 11:33:38
Thanks, I'll play around with the OPTION (RECOMPILE) hint from Charlie's link and investigate Sachin's suggestions. If that fails I'll post some queries and table schemas.

It sounds like, unless you are dealing with absolutely massive tables, most people just optimize queries, indexes, and schemas in order to scale, rather than implementing a data warehouse or a "workaround" solution like what I described above. Is that accurate?
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts
Posted - 2011-12-21 : 11:44:32
Meh -- "it depends" is the general answer.

For your scenario I'd maybe look at partitioning your data somehow (it's very hard to advise because we don't know the table structures). If most of your reports only run on 'newer' data then you could make a partitioning scheme based on date. An indexed view over the top means that you could still treat it as one big virtual table if you wanted to.

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
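As a sketch of the "one big virtual table" idea, using the live/archive split from the first post (a plain view rather than an indexed one is shown here, since indexed views can't contain UNION ALL; the column list is taken from the schema posted below):

CREATE VIEW dbo.ActivityAll
AS
    SELECT ActivityID, StudentsID, CourseID, ActivityStarted,
           ActivityFinished, Score
    FROM dbo.Activity
    UNION ALL
    SELECT ActivityID, StudentsID, CourseID, ActivityStarted,
           ActivityFinished, Score
    FROM dbo.ActivityArchive;

With trusted CHECK constraints on ActivityStarted in both tables, the optimizer can skip whichever table cannot contain the requested date range, so reports over recent data never touch the archive.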
X002548
Not Just a Number
15586 Posts

How can a table with only 200 million rows be 26GB??? How many columns in the bad boy, and are they all varchar(max)?

Brett
8-)
Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
Want to help yourself?
http://msdn.microsoft.com/en-us/library/ms130214.aspx
http://weblogs.sqlteam.com/brettk/
http://brettkaiser.blogspot.com/
ferrethouse
Constraint Violating Yak Guru
352 Posts
Posted - 2011-12-21 : 15:46:10
quote:
Originally posted by X002548

How can a table with only 200 million rows be 26GB??? How many columns in the bad boy, and are they all varchar(max)?

Brett
8-)
Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
Want to help yourself?
http://msdn.microsoft.com/en-us/library/ms130214.aspx
http://weblogs.sqlteam.com/brettk/
http://brettkaiser.blogspot.com/

Only 20 million rows :)

Not sure why it is so large. It is 15 gigs for the table itself and 10 gigs for the indexes (about 12 indexes). Here is the structure (there are a couple of text fields and one nvarchar(max))...

ActivityID        int       no  4
StudentsID        int       no  4
CourseID          int       no  4
ActivityStarted   datetime  no  8
ActivityFinished  datetime  no  8
AttemptID         varchar   no  50
IP                varchar   no  50
Status            int       no  4
LessonID          int       no  4
Score             numeric   no  9
Details           text      no  16
SystemVars        text      no  16
Bookmark          varchar   no  255
ExitReason        varchar   no  50
SuspendData       nvarchar  no  -1
LoginID           int       no  4
ReferenceNum      nvarchar  no  120
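With two text columns and an nvarchar(max) in there, a rough way to see how much of the 15 gigs is sitting in those LOB columns (note this scans the whole table, so it is slow on 20 million rows):

SELECT COUNT(*)                                       AS total_rows,
       SUM(CAST(DATALENGTH(Details)     AS bigint))   AS details_bytes,
       SUM(CAST(DATALENGTH(SystemVars)  AS bigint))   AS systemvars_bytes,
       SUM(CAST(DATALENGTH(SuspendData) AS bigint))   AS suspenddata_bytes
FROM dbo.Activity;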
X002548
Not Just a Number
15586 Posts

What does sp_spaceused <table> give you?
ferrethouse
Constraint Violating Yak Guru
352 Posts
Posted - 2011-12-21 : 16:18:33
quote:
Originally posted by X002548

What does sp_spaceused <table> give you?

name       rows        reserved       data           index_size     unused
Activity   19365946    27105680 KB    16196456 KB    10893384 KB    15840 KB
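With roughly 10 GB spread across 12 indexes on a heavy write table, it can also be worth checking which of them are actually read (the counters reset when the SQL Server service restarts):

SELECT i.name          AS index_name,
       s.user_seeks, s.user_scans, s.user_lookups,
       s.user_updates  AS writes_maintained
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON  s.object_id   = i.object_id
       AND s.index_id    = i.index_id
       AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.Activity')
ORDER BY s.user_updates DESC;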