
 Strategies for dealing with large tables


ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-12-20 : 18:26:53
Just looking for some feedback on what people do to deal with very large tables (I realize that "large" is subjective). We have several large tables but one in particular is 26 gigs (including indexes) and has 20 million rows. It is a heavy read/write "Activity" table. We also have a large "Subscription" table. Some of our reports have become quite slow for clients with many students in our system. I've optimized the indexes as much as I can and I've worked with the developers to try to optimize the queries themselves. We aren't a huge company so the idea of a data warehouse isn't really an option (we don't have the expertise or fiscal resources to do it properly). I've been toying with a couple of ideas...

1. Precompile reports daily and render those reports instead of running them live. The problem with this option is that most of our reports allow for filtering based on many criteria including time. Precompiled reports don't always make sense when filtering by time is necessary.

2. Moving data older than 3 months daily from the "Activity" table to an "ActivityArchive" table and join onto the "ActivityArchive" table only when the user specifically requests data older than 3 months. This would make the reports faster unless the user specifically is looking for older data.
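A daily archive job like option 2 can be done in small batches so the transaction stays short and doesn't block readers. This is only a hedged sketch: it assumes ActivityArchive has the same columns (in the same order) as Activity, no triggers or foreign keys pointing at it, and that ActivityStarted marks a row's age; the batch size is arbitrary.

```sql
-- Sketch of a daily archive job (assumed table/column names from the post).
DECLARE @cutoff datetime = DATEADD(MONTH, -3, GETDATE());

WHILE 1 = 1
BEGIN
    -- Move up to 5000 old rows per loop; OUTPUT captures the deleted rows
    -- and inserts them into the archive table in the same statement.
    DELETE TOP (5000) FROM dbo.Activity
    OUTPUT deleted.* INTO dbo.ActivityArchive
    WHERE ActivityStarted < @cutoff;

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left older than the cutoff
END
```

An index leading on ActivityStarted would keep each batch from scanning the whole table.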

Just wondering what other strategies DBAs employ to deal with large and fast growing tables and report/application performance.

Thanks,
Craig

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2011-12-20 : 19:30:34
maybe describe the structure of the tables?

quote:

1. Precompile reports daily and render those reports instead of running them live. The problem with this option is that most of our reports allow for filtering based on many criteria including time. Precompiled reports don't always make sense when filtering by time is necessary.


there are many ways to do this and many of those ways are..... bad.

Could you maybe show us a sample query?

Gail posted a nice article on catch-all type queries. Probably one of the best uses of dynamic SQL I can think of.

http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/
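For what that style looks like in practice, here's a hedged sketch of a "catch-all" report proc (parameter and column names are taken from the Activity structure posted later in this thread; the proc name is made up). OPTION (RECOMPILE) makes the optimizer build a plan for the parameter values actually supplied instead of one generic plan for every combination of filters.

```sql
-- Sketch only: optional-filter report query with a recompile hint.
CREATE PROCEDURE dbo.ActivityReport
    @StudentsID int      = NULL,
    @CourseID   int      = NULL,
    @From       datetime = NULL,
    @To         datetime = NULL
AS
SELECT ActivityID, StudentsID, CourseID, ActivityStarted, Score
FROM dbo.Activity
WHERE (@StudentsID IS NULL OR StudentsID = @StudentsID)
  AND (@CourseID   IS NULL OR CourseID   = @CourseID)
  AND (@From IS NULL OR ActivityStarted >= @From)
  AND (@To   IS NULL OR ActivityStarted <  @To)
OPTION (RECOMPILE);
```

Note the recompile-on-every-call cost; Gail's article covers the trade-off against dynamic SQL.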

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION

Sachin.Nand

2937 Posts

Posted - 2011-12-21 : 01:04:47
Here are some options you could try:

1) Partition your table.

2) Move the table or its indexes to a new filegroup, and put that filegroup on a faster disk.

3) Use plan guides if you think queries are recompiling or using bad plans.

4) Your hashes and sorts might be spilling to tempdb, so optimize tempdb.

5) Check for the existence of parallel plans.

6) Move read-only/historical data to average-performing disks and keep the live data on faster disks.
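Option 1 might look roughly like this (hedged sketch: partitioning is an Enterprise Edition feature in SQL Server 2008, the boundary dates and object names here are illustrative, and in practice each partition would go to its own filegroup rather than PRIMARY):

```sql
-- Sketch: monthly range partitioning on ActivityStarted.
CREATE PARTITION FUNCTION pfActivityByMonth (datetime)
AS RANGE RIGHT FOR VALUES ('2011-10-01', '2011-11-01', '2011-12-01');

CREATE PARTITION SCHEME psActivityByMonth
AS PARTITION pfActivityByMonth ALL TO ([PRIMARY]);

-- Partition the table by rebuilding its clustered index on the scheme:
-- CREATE CLUSTERED INDEX cix_Activity ON dbo.Activity (ActivityStarted)
--     WITH (DROP_EXISTING = ON) ON psActivityByMonth (ActivityStarted);
```

Queries that filter on the partitioning column can then touch only the relevant partitions.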


PBUH


ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-12-21 : 11:33:38
Thanks, I'll play around with the OPTION (RECOMPILE) hint from Charlie's link and investigate Sachin's suggestions. If that fails I'll post some queries and table schemas.

It does sound like, unless you are dealing with absolutely massive tables, most people just optimize queries, indexes, and schemas in order to scale, rather than implementing a data warehouse or a "workaround" solution like what I described above. Is that accurate?

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2011-12-21 : 11:44:32
Meh -- "it depends" is the general answer.

For your scenario I'd maybe look at partitioning your data somehow (it's very hard to advise because we don't know the table structures). If most of your reports only run on 'newer' data then you could make a partitioning scheme based on date.

A view over the top means that you could still treat it as one big virtual table if you wanted to.
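One hedged sketch of that "one big virtual table" idea, for the case where the data is split into current and archive tables as in option 2 above, is a UNION ALL (partitioned) view. Note this is a plain view, not an indexed view (indexed views can't contain UNION); with CHECK constraints on ActivityStarted in each base table, SQL Server can eliminate the table that can't match a date filter.

```sql
-- Sketch: single logical table over current + archive (assumed names).
CREATE VIEW dbo.ActivityAll
AS
SELECT * FROM dbo.Activity          -- rows from the last 3 months
UNION ALL
SELECT * FROM dbo.ActivityArchive;  -- older rows
```

Reports query dbo.ActivityAll; ones filtered to recent dates should never touch the archive table.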

Charlie

X002548
Not Just a Number

15586 Posts

Posted - 2011-12-21 : 13:38:42
how can a table with only 200 million rows be 26GB???

how many columns in the bad boy, and are they all varchar(max)?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx

http://weblogs.sqlteam.com/brettk/

http://brettkaiser.blogspot.com/



X002548
Not Just a Number

15586 Posts

Posted - 2011-12-21 : 13:41:01
quote:
Originally posted by Transact Charlie
Meh



Is that a technical term?

Brett

8-)




ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-12-21 : 15:46:10
quote:
Originally posted by X002548

how can a table with only 200 million rows be 26GB???

how many columns in the bad boy, and are the all varchar(max)?




Only 20 million rows :)

Not sure why it is so large. It is 15 gigs for the table itself and 10 gigs for the indexes (about 12 indexes). Here is the structure (there are a couple of text fields and one nvarchar(max))...

Column            Type      Nullable  Length
ActivityID        int       no        4
StudentsID        int       no        4
CourseID          int       no        4
ActivityStarted   datetime  no        8
ActivityFinished  datetime  no        8
AttemptID         varchar   no        50
IP                varchar   no        50
Status            int       no        4
LessonID          int       no        4
Score             numeric   no        9
Details           text      no        16
SystemVars        text      no        16
Bookmark          varchar   no        255
ExitReason        varchar   no        50
SuspendData       nvarchar  no        -1
LoginID           int       no        4
ReferenceNum      nvarchar  no        120
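With two text columns and an nvarchar(max) in there, the LOB pages plus ~12 indexes could easily account for the size. A hedged way to see where the space actually goes is to break it down per index from sys.dm_db_partition_stats (pages are 8 KB each; table name assumed from the post):

```sql
-- Sketch: space used per index on dbo.Activity, largest first.
SELECT  i.name                              AS index_name,
        SUM(ps.used_page_count) * 8 / 1024  AS used_mb,
        SUM(ps.row_count)                   AS row_count
FROM    sys.dm_db_partition_stats AS ps
JOIN    sys.indexes AS i
        ON i.object_id = ps.object_id
       AND i.index_id  = ps.index_id
WHERE   ps.object_id = OBJECT_ID('dbo.Activity')
GROUP BY i.name
ORDER BY used_mb DESC;
```

If a couple of wide nonclustered indexes dominate, consolidating them may claw back several gigs.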

X002548
Not Just a Number

15586 Posts

Posted - 2011-12-21 : 16:16:52
what does sp_spaceused <table>

give you?



Brett

8-)




ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-12-21 : 16:18:33
quote:
Originally posted by X002548

what does sp_spaceused <table>

give you?



name      rows      reserved     data         index_size   unused
Activity  19365946  27105680 KB  16196456 KB  10893384 KB  15840 KB
   
