aggregate table or complex view? - SQL Server Forums

Home | Weblogs | Forums

Site Sponsored By: SQLDSC - SQL Server Desired State Configuration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

aggregate table or complex view?

Author

Topic

jannable
Starting Member

25 Posts

Posted - 2006-12-02 : 09:12:49

INTRO

I have two tables: Project and ProjectVersion. Whenever a user requests any information, they usually want to know it on a project. However, most of the data for a project is actually the sum of some data from its related versions. What's more, more than half of the data from the version is calculated based on other data from itself. Currently, I am using a view to calculate this data.

For example:
ProjectVersion.EligibleFunds (MONEY)
ProjectVersion.IsObligated (BIT)
ProjectVersion.FederalPercent (NUMERIC)
vwProjectVersion.FederalFunds = EligibleFunds * FederalPercent
vwProjectVersion.FederalObligated = (CASE WHEN IsObligated = 1 THEN FederalFunds ELSE 0 END)

So, when the user requests information on a project, it sums all these values up for the project using another view.

For example:
vwProject.FederalObligated = SUM(vwProjetVersion.FederalObligated)

There are many more complicated values that have to get calculated. About 80 in total for vwProject.

The reason I use a view is because the application uses these calculated values all over the place and it would have been awful to have had the application handling all of these calculations. Not to mention the code would've had to be duplicated somewhere within the database anyway. As it is, if there's a problem with a calculation, I can change it in one place and all is happy.

SLOW SYSTEM

Here's the problem though. The system is slow and I suspect that this view is the main culprit. It's being used thousands of times a day. There are only about 20,000 records (a small table). About 15 times per day, the user downloads all of the data (not my idea!) into an excel file. About 60 time per day, all of the projects' values are summed and grouped for different reports. And about 500 times per day, transactions are being done on the projects.

DEADLOCKS

One of the things that I thought was a good idea was setting up a trigger to update a value here and there. For example, whenever a payment is approved, a trigger fires to update Project.PaymentStepSequence. This is done so that it's easier to compute which approval step the project is in at any given time. Unfortunately, I suspect that it might be the cause of the deadlocking issues that we're having. We get about 50 deadlocks per day. Is that normal? Is it normal for me to want that to be down to 0?

IDEA #1

One of the things that I have been considering is creating a ProjectDetail table that stores the values like FederalObligated that are only calcuated based on the project's versions. Whenever a payment, expense, or version gets updated, inserted, or deleted, the application called a stored procedure that updates the 80 or so values for that project. Essentially, behaving like a trigger. The intention of this is to put more work on the insert/update than on the select. However, I have doubts about this because we had tried this approach before using a trigger and ran into all sorts of deadlocks. When I moved away from that, we never really got rid of all the triggers (because of some really nice and simple variables like StepSequence).

SUMMARY

I guess what I'm trying to find out is if I should create myself a table that stores all of the aggregate data and put more work into keeping it up-to-date? Or should I eliminate the few small triggers and continue with what I've got with the view? Or is there a better solution altogether that I'm not realizing?

I lean towards eliminating the triggers because the view should be able to handle this sort of job. I mean, that's what SQL excels at: pulling lots of data from multiple tables really fast. Correct me if I'm wrong, but updates and inserts are more expensive than selects and will likely cause more deadlocks, right? Well, shouldn't I avoid doing even MORE inserts and updates if my main problem already is deadlocks?

Anyway, thanks in advance to anybody who can help.

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2006-12-02 : 12:44:55

Do you know what queries are deadlocking? You can't eliminate the deadlocks unless you know with certainty what is causing them. There are traceflags you can turn on that will cause sql server to log detailed information about which spids were deadlocked, and what sql they were running at the time. see links below for more info:

http://www.sql-server-performance.com/deadlocks.asp
http://www.sql-server-performance.com/at_sql_locking.asp
http://msdn2.microsoft.com/en-us/library/ms178104.aspx

SqlSpec: a fast and comprehensive data dictionary generator for
SQL Server 2000/2005, Analysis Server 2005, Access 97/2000/XP/2003
http://www.elsasoft.org

jannable
Starting Member

25 Posts

Posted - 2006-12-02 : 23:27:04

Thanks Jezemine. I've tried that but am unable to duplicate the problem. It seems to only happen on the live system and randomly. I will try the trace flags again on Monday though. I may see something new.

As for the idea of an aggregate table that requires constant updating. Is this an idea that is just stupid or what? Know of any examples of it ever being done before?

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2006-12-03 : 10:28:36

I can't say I really like the idea of a table that stores calculated values that require constant updating - especially if you aren't fairly certain that implementing it will fix the problem.

Is it possible for you to turn the trace flags on in prod? Probably not. But certainly there you'd get some info since you are getting 50 deadlocks a day. From my understanding the overhead of the deadlock traceflags is pretty small. One drawback is they require you cycle the sql service though.

EDIT: My experience with perf problems is, you don't really know where they are unless you measure, and in this case measuring means turning on the trace flags. Maybe someone else has some better ideas?

SqlSpec: a fast and comprehensive data dictionary generator for
SQL Server 2000/2005, Analysis Server 2005, Access 97/2000/XP/2003
http://www.elsasoft.org

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-12-03 : 17:41:49

Also check the indexing on the tables, are the view's underlying tables properly indexed?
Are the datatypes of the parameters (in queries and procs) the same as the tables?

You could try to set WITH(NOLOCK) hint in the view, if you think it is the culprit, and business rules allow it.

Depending on how "up to date" the view has to be, a process could run (sql job), and materialize the view every 5 minutes or so.
How fast is it to do a select <everything> from the view, and how many records are we talking about ?

But first do as Jez said, run the traceflags, and see which objects and processes are involved in the deadlocks.

rockmoose

jannable
Starting Member

25 Posts

Posted - 2006-12-03 : 18:50:49

quote:
You could try to set WITH(NOLOCK) hint in the view, if you think it is the culprit, and business rules allow it.

I've read that using NOLOCK is actually not a recommended solution to a deadlocking problem. About two months ago, I had actually littered much of the SQL with NOLOCK and it didn't seem to make matters any better. I removed them all since then.

quote:
How fast is it to do a select <everything> from the view, and how many records are we talking about ?

It's certainly not quick. About 9 seconds to return all columns of all 20,000 records. Pathetic, I know. I looked at the execution plan of the view and noticed that significant time is spent joining on the document table. It checks to see if a contract has been uploaded for the specific project. I will add an index on the Document.Code column; currently it's only indexed on the primary key: Document.ID.

quote:
But first do as Jez said, run the traceflags, and see which objects and processes are involved in the deadlocks.

I'm going to try it on the prod tomorrow (after experimenting on the dev first). One of the links that jezemine gave said that the trace flag is minimal so it shouldn't be an issue. I just have to remember to turn if off when I'm done.

I'll keep you all posted. It'll probably take me some time trying to decifer what the results are telling me.

jannable
Starting Member

25 Posts

Posted - 2006-12-04 : 14:27:53

I finally managed to run a trace and obtain information on a deadlock that happened today. I couldn't figure out the DBCC TRACEON method so I used Profiler instead. The only problem is that I can't figure out what the actual object the two processes were fighting over. The numbers don't make any sense to me. Does anybody know how to look up the "associated objid"?

Here's the deadlock graph in case it helps.

[url]http://jannable.com/deadlock.gif[/url]

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2006-12-04 : 14:51:43

Look at object_Name() in BOL...

I am amazed that the "D" word has not mentioned...

IME, deadlocks are usually the result of dodgy application code, TSQL code or design..

What does that trigger look like?

DavidM

Production is just another testing cycle

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-12-04 : 15:03:24

Also, if you suspect that the View is slow and causing problems, then you need to show us the definition of the view and all tables it uses and what indexes they have. And also a typically call to that View -- i.e., is it being filtered by a calculated value which therefore cannot use indexes?

When optimizing any system, always start with the basics -- are proper indexes in place, is your design sound, is your SQL well-written and efficient? Those 3 things must be carefully checked and verified before you can begin to trouble-shoot other potential slow-downs in your system.

- Jeff

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-12-04 : 15:07:55

> "I am amazed that the "D" word has not mentioned..."

Now you'r making me feel bad

, thanks actually.

The deadlock graph might suggest better indexes, since we have page locks which are pretty coarse compared to key locks.

rockmoose

jannable
Starting Member

25 Posts

Posted - 2006-12-04 : 15:36:51

quote:
Originally posted by byrmol

Look at object_Name() in BOL...

I ran it for the obvious number: 271049915. But it only returned NULL. Of course, I went crazy and ran it for all other numbers and also only got NULL.

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-12-04 : 15:55:41

You have to run it in the database where the object exists.

rockmoose

jannable
Starting Member

25 Posts

Posted - 2006-12-04 : 16:14:21

Indeed. I still get NULL when running it on the correct db. Here's the SQL that I have tried. All returning NULL:

SELECT object_name(25459859)
SELECT object_name(72057594879410176)
SELECT object_name(17249)
SELECT object_name(170993081)

Here's the new deadlock graph: [url]http://jannable.com/deadlock2.gif[/url]

Oh, and I'll post some more detailed info about the tables/views when I get a chance.

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-12-04 : 16:44:32

That's a bummer. I would expect a name from associated objectid 72057594879410176.
The page, owner etc.. I would not expect a result from object_name().

Actually I can rant a bit about the new deadlock graph, it looks really good,
but why could they not just have written out the names of objects, or at least make the text "copy-paste" - able !!!
In sql2005 it is one step backwards when they removed the names of deadlock objects, (they were there in the sql2000 errorlog with traceflag 3605)!

You can look in the xml file that is the source of the picture, it contains some more info, if you care to read it.

rockmoose

jannable
Starting Member

25 Posts

Posted - 2006-12-04 : 18:06:43

I know what you mean about the non-copy-paste-able-ness. It really sucks!

Anyway, I remember not being able to open the xml file earlier but your message provoked me to try again. It worked this time! And much to my surprise, an entirely different view was the culprit. Yay! I'm going to be digging deeper into this other view tomorrow. I'll keep you guys posted.

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2006-12-04 : 18:48:33

quote:
Originally posted by jannable

And much to my surprise, an entirely different view was the culprit. Yay! I'm going to be digging deeper into this other view tomorrow.

an example of the necessity of measuring before implementing perf fixes.

ask any dev where the most time is spent in their code, 90% give you the wrong answer with great confidence. the other 10% are right only because they measured.

EDIT: btw, when investigating deadlocks I usually look at the output to the errorlog on 2005 - it has more information than profiler, like object names, etc.

SqlSpec: a fast and comprehensive data dictionary generator for
SQL Server 2000/2005, Analysis Server 2005, Access 97/2000/XP/2003
http://www.elsasoft.org

jannable
Starting Member

25 Posts

Posted - 2006-12-05 : 16:26:50

So, in regards to the deadlocking issue, I believe I fixed it by eliminated two small triggers in the payment section. Long forgotten code that was never suspected of being a problem. Thanks to all of you for helping me figure out how to hunt down deadlocks. I'm deploying tonight so by the end of the week I'll know for sure whether I fixed the problem.

Ok. So now the slow view problem. The application is suffering from timeouts (which I originally thought were tied to the deadlock issue). The view (vwProject) used to calculate all the values in the TotalVersions subquery by using another view on the ProjectVersion table. I just finished changing it so that it's using some new calculated columns instead. Doing this isn't much of a speed improvement but it helps a little bit.

For anybody who is interested in spotting some obvious problems, you can see the view and the execution plan here: [url]http://jannable.com/vwProject.zip[/url]. I hope that's enough to spot some obvious stupidity. I'd love to receive some feedback. Perhaps I'm just trying to do too much with one view?

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2006-12-06 : 14:01:52

That's a very fugly view...

I don't know how SQL2005 handles UDFs, but you have lots of them and they would absolutely kill performance in SQL2K...

DavidM

Production is just another testing cycle

jannable
Starting Member

25 Posts

Posted - 2006-12-06 : 14:07:50

quote:
Originally posted by byrmol

That's a very fugly view...

I 100% agree. It's disgusting. Only thing is, I don't know how to reasonably avoid the functions. I've made sure that they're all just doing basic scalar operations (like CASE statements). They don't hit any other tables at least. I only need them so that I'm not rewriting the same large CASE statement 50 times to help compute different values. I'm all for a better way.

jannable
Starting Member

25 Posts

Posted - 2006-12-06 : 14:44:52

Another option I thought about was to make the root variables calculated via a subquery. And then have a larger outer query that would calculate new variables from the subquery's derived values. But that doesn't seem like an improvement. Maybe it is. It feels like a bandaid solution. I must be missing the real problem.

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2006-12-06 : 14:47:31

Obviously if this view is run without restriction (No WHERE clause), then it will have to scan most of the data.

Do you call this view with restrictions?

DavidM

Production is just another testing cycle

Next Page

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources