Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 SQL Server Administration (2008)
 Can't understand why ...

Author  Topic 

sql_er
Constraint Violating Yak Guru

267 Posts

Posted - 2013-07-24 : 13:32:06
Hi guys,

I've been given an update query recently and told that it takes hours/days to run. This update query joins 2 tables on a few columns and updates 2 columns. It also has a WHERE clause.

Table 1 has ~ 350 million records and is 280 GB in size. It is partitioned by date ranges

Table 2 has only 11,000 records (tiny)

The query updates Table 1.

Here is the puzzle: When I update TOP (200) rows [or less] in Table 1, it takes a split of a second, but when I update TOP (300) rows or more, it runs for 5 minutes.

I tried it many times and I get the same results.

Any idea as to why the timing jumps so much when I increase the required output from 200 to 300?

Please advise.

Thanks in advance!

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-07-24 : 13:45:54
Look at the query plan and statistics. Turn on query plan (control-m) and run SET STATISTICS IO ON. Then run the two queries and examine the query plans.
Go to Top of Page

sql_er
Constraint Violating Yak Guru

267 Posts

Posted - 2013-07-24 : 13:51:58
98% of the time/cost is spent on scanning this huge table, which is a heap.

It is scanning because the conditions in the where clause can't use indexes (due to isnull,ltrim,rtrim functions being applied to them on join)

Should i add a clustered index? It might be a big task considering the size of the table (280 GB)

Thank you!
Go to Top of Page

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-07-24 : 14:34:22
If it is scanning without using the (non-clustered) index that you already have because of non-sargability in the where clause. If that is the case, adding a clustered index is not going to help. If it is the three things that you indicated - isnull, ltrim, rtrim - it may be possible to rewrite the query so you don't use those functions.

Also, as you correctly guessed, you have to be careful about adding a clustered index on a large table because when you do that it has to reorganize all the data in the table, which can be time-consuming. And, it will take up disk space as well (logged operation)

Yep, I used a big word "sargability", only because I needed a lame reason to post a link to this blog
http://myshallowsqlblog.wordpress.com/saragable-the-word-only-a-geek-could-love/
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2013-07-24 : 15:01:09
There seem to be an internal threshold for about 220 rows.
Do the update and increase the 200 to 201, 202 etc until you find the threshold.



Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
Go to Top of Page

sql_er
Constraint Violating Yak Guru

267 Posts

Posted - 2013-07-24 : 16:54:03
James K. - thanks for the info. I'll try to think of a way to rewrite a query not to use isnull, ltrim, rtrim.

SwePeso - what do you mean by an internal threshold?

Btw, I started taking advantage of the partitions by adding date ranges to the update query to be in line with the partition function specifications and that made a world of difference. I'm now able to update 5 million records in under 5 minutes.

Thanks!
Go to Top of Page

wowguide
Starting Member

3 Posts

Posted - 2013-08-04 : 03:50:51
unspammed
Go to Top of Page
   

- Advertisement -