Size of database exploded over the weekend

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

SQL Server Administration (2008)

Size of database exploded over the weekend

Author

Topic

ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-09-06 : 12:31:31

It went from 90 gigs to 270 gigs. Our largest table went from 10 gigs to 70 gigs. I checked the IIS logs and there wasn't any unusual activity. I queried the row count of the large tables and they didn't increase an unusual amount. I'm having trouble figuring out where this extra size came from? Any ideas on where to begin?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-09-06 : 12:35:45

is it log file which went up?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Kristen
Test

22859 Posts

Posted - 2011-09-06 : 12:49:40

Housekeeping maintenance ran over the weekend - rebuilding Indexes and Statistics?

What method are you using to discover that a specific table increased from 10GB to 70GB?

Maybe you have an unlimited-text column in the table and some plonker is filling it full of crud ?

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2011-09-06 : 16:07:07

My bet is on a REBUILD or REORGANIZE task in a maintenance plan with a very large number specified for the Change Free Space per page percentage option. What I have seen happen is that this is mistaken for how much space to fill the pages with and someone puts in 90 - which then translates to a fill factor of 10.

Jeff

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-06 : 16:50:04

Someone used the SSMS GUI to alter the table and you lost your compression settings?

N 56°04'39.26"
E 12°55'05.63"

ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-09-06 : 19:21:23

Figured it out. I changed the database setting a few days ago to leave 90% space on every index when I meant to put 10%. The indexes were rebuilt Sunday night and the database size exploded. I should pay more attention to what I'm reading before changing a database setting ;)

ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-09-06 : 19:22:43

quote:
Originally posted by jeffw8713

My bet is on a REBUILD or REORGANIZE task in a maintenance plan with a very large number specified for the Change Free Space per page percentage option. What I have seen happen is that this is mistaken for how much space to fill the pages with and someone puts in 90 - which then translates to a fill factor of 10.

Jeff

Bingo!!! I guess I'm not the only one who has made this mistake (he says sheepishly).

Thanks everyone!

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2011-09-06 : 22:09:49

Nope - we all learn from experience...sorry to say.

It is an easy trap to fall into, you think you are setting the fill factor value (90) and it turns out it converts that to a fill factor of 10. When I do use that particular task to rebuild indexes - I generally don't use it to resize the indexes.

I would recommend that you take a look at SQL Fool's reindex procedure - which will only rebuild indexes (or reorganize) that really need to be touched. You can swap out that task with an Execute SQL Task to call out to her procedure. You could also look at Ola's backup utilities.

Sorry, I don't have links available but you should be able to find them through google easily enough.

Jeff

Kristen
Test

22859 Posts

Posted - 2011-09-07 : 03:07:21

I would also question if 10% slack is worth having at all?

I've read articles where 100% fill has no noticeably impact on additions, but of course keeps the indexes smaller, and that improves the lookup-performance.

If 10% slack gives yo, say, one spare key slot and you insert two new rows into that index page SQL will have to split it anyway ... and lots of data, such as Dates, tends to be created with values clustered close together.

Also, don't set 90% on indexes for identity column, and try to use sequence-GUIDs if you have them in an index (and keep that at 100% too) ... but I'm sure you know all that.

+1 to Jeff's comment about using a procedure that will only rebuild indexes that are fragmented. Much less impact on database logs etc.

Sachin.Nand

2937 Posts

Posted - 2011-09-07 : 06:15:52

quote:
Originally posted by jeffw8713

What I have seen happen is that this is mistaken for how much space to fill the pages with and someone puts in 90 - which then translates to a fill factor of 10.

Jeff

That's not true.

A value of Change free space of 90 will be equivalent FILLFACTOR value of 90.

PBUH

ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-09-07 : 11:12:58

quote:
Originally posted by Sachin.Nand

quote:
Originally posted by jeffw8713

What I have seen happen is that this is mistaken for how much space to fill the pages with and someone puts in 90 - which then translates to a fill factor of 10.

Jeff

That's not true.

A value of Change free space of 90 will be equivalent FILLFACTOR value of 90.

PBUH

I just wish they would use "FILL FACTOR" in both locations. Choosing to use different words just causes people like me to mess up.

ferrethouse
Constraint Violating Yak Guru

352 Posts

Posted - 2011-09-07 : 12:05:11

Here is the link that Jeff mentioned above. Looks like a great script...

http://sqlfool.com/2011/06/index-defrag-script-v4-1/

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-09-07 : 21:35:48

In almost every case, I recommend not using fillfactors at all.

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2011-09-07 : 21:52:53

quote:
Originally posted by russell

In almost every case, I recommend not using fillfactors at all.

Why?

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-09-07 : 22:39:15

Because after a lot of testing, I can't find any evidence that it helps for typical OLTP systems. Tara once posted the same thing, but being me, I had to try for myself and see.

Ran a ton of updates/inserts guranteed to cause splits (w/o FF vs with FF) and write performance was not impacted to nearly the degree that read performance was by reading more pages.

I'll try to dig up my old test scripts and results. Was a while back.

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2011-09-07 : 23:52:59

Ah yes, this is something that I don't quite agree with. It is actually quite easy to prove that using an IDENTITY column as the clustered index key does not guarantee that the table will not become fragmented.

In fact, I performed a simple test a while back where I built a very simple table and inserted enough rows to get at least 50 pages of data. The clustered index was an IDENTITY - and right after inserting the data the table was already fragmented.

On top of that, if the table contains any VARCHAR data types that are mostly empty on first creation - then filled over time that will cause page splits to occur and hence, additional fragmentation.

And, of course - if any of the rows are ever deleted that will cause gaps on the pages which can only be filled by performing an index rebuild.

So, I guess if you are using IDENTITY or a Sequential GUID as the clustering key - then a 100% fill factor will work as long as you also know that the table will become fragmented over time and needs to be rebuilt.

If, however - you are not using an IDENTITY column as the clustering key then you really need to consider what fill factor to use. BTW - I am NOT a proponent of putting an IDENTITY on every table to define the clustering key. I do use IDENTITY where it makes sense, but I don't necessarily automatically make it the primary key - or necessarily specify it as the clustering key.

Jeff

Kristen
Test

22859 Posts

Posted - 2011-09-08 : 01:34:57

"I guess if you are using IDENTITY or a Sequential GUID as the clustering key - then a 100% fill factor will work as long as you also know that the table will become fragmented over time and needs to be rebuilt"

That's not a reason for having a <100% fill factor though. There will never be an insert into a "gap" (well, technically there can be by using SET IDENTITY_INSERT ON, but other than that new entries will cause an index on the Identity column to be added at the end)

Still need regular housekeeping to reindex or reorganise the indexes.

"If you are not using an IDENTITY column as the clustering key then you really need to consider what fill factor to use"

Maybe

but Russell, Tara's and my view is that 100% Fill Factor (i.e. no slack space) makes no appreciable difference to newly inserted index entries, but having spare space in the index from using Fill Factor < 100% increases the number of reads - sufficient to be appreciable.

I started off my SQL-life with Fill Factor = 90% on all non-sequential indexes. I've moved to having 100% on all of them. We have far more SELECTs than INSERTs (which I would expect is common for most OLTP)

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2011-09-08 : 22:11:27

Kristen,

We will have to agree to disagree then. I prefer to review the usage patterns and determine a suitable fill factor to allow for how the tables are organized. I don't have the luxury of changing the schema - since most of my systems are vendor supplied. I can change the fill factors on indexes and have found that having 100% will cause too many page splits and reduce system performance which directly affects the user experience.

I also have cases where using an IDENTITY for the clustered index actually increases the number of reads. In quite a few cases...

The most obvious case is the document table, where documents are all related to a patient. Using an IDENTITY causes those documents to be spread throughout the table - and when retrieving a patient chart we end up reading multiple extents into memory even though we may only have a single row for that patient in the extent.

The document table is pushing 300GB now - which really can cause performance issues because we end up using a lot of the buffer cache for this table and most of the data in the cache is not actually useful.

Jeff

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-09-08 : 22:25:51

A fill factor is really just a pre-determined page split.

A less than very well thought out FF is going to cause more reads right off the bat AND incur page splits.

If you do a lot of reads, a FF will immediately incur a small but noticeable performance penalty.

The practice of performance tuning is really the practice of reducing reads.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-09-08 : 23:08:59

It's not about agreeing or disagreeing on this topic, it's about testing it out. The page splits weren't a big deal in terms of performance, but I was interested in making it faster. Boy was I in for a rude awakening when changing the fillfactor caused huge performance degradation on reads and very, very little performance improvement in writes. This was in a load test environment so wasn't a big deal, but it sure did open my eyes. I tried many different values too, and only 100% fill factor had acceptable performance for reads. Our IO subsystem can handle the write cost/page split, reading more pages was the problem.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2011-09-09 : 18:42:11

Tara - I understand that, but there are a lot of factors involved and I think it needs to be evaluated and tested for each case and not just arbitrarily stated that all tables should have a 100% fill factor.

I am not saying that using any other value is better or worse. Just that you need to evaluate the usage patterns and tune for that usage pattern. Using a 100% fill factor on an insert heavy table where the clustering key is not a monotonically increasing value may cause more problems than excepting the few extra reads required because the data crosses multiple pages (which may not even be a problem because SQL Server doesn't read single pages - it reads extents).

Russell - absolutely agree. However, I see the recommendation to put an IDENTITY on all tables and use that as the cluster key which can cause more reads and more data to be read into the buffer cache than is needed.

The example I gave previously is a very good example of this. To get a single patient's chart which will sometimes have more than 500 documents - can be quite a lot of reads because those documents are spread throughout the table and not grouped by the patient. If the non-clustered index is used - then we get a bookmark or key lookup because we cannot create a covering index (too many columns required) which increases the number of reads.

If that table were restructured to use the PatientID as the leading clustering key - than all patient rows would be located on contiguous pages and the read-ahead algorithm would actually work. As it is now - read ahead doesn't provide any benefit because the extents that are read in most likely will not have the data we are looking for.

As it stands now, the clustered index is on the document ID (IDENTITY) - which forces multiple extents to be read for a single patient's data. What we end up with is an extent in memory where we have at most a couple of rows that satisfy the 500 rows of data needed. And, in most cases - the other patient requests will not be satisfied by those pages/extents either.

Using a 100% fill factor on that table is not a problem right now - but it sure doesn't help us either. Changing to the PatientID could really help a lot, but the sacrifice is that we now will suffer more page splits and possibly faster fragmentation. We can reduce how fast the table will fragment - reducing how often the table has to be rebuilt (OFFLINE because of the XML and Text data types) by using a lower fill factor. Of course, we will suffer increased reads because of that - but, overall we reduce the number of reads and the impact on the buffer cache.

I am constrained by the fact that this is a vendor supplied system. I cannot change the indexing strategy - and cannot add indexes without going through the vendors development process.

Jeff

Next Page

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources