Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
yonabout
Posting Yak Master
112 Posts

Posted - 2005-12-13 : 07:18:19
I'm developing a report application with a VB front end. Part of it allows the users to create their own reports (kind of like MS Query, but specific to my company) without having to worry about joins etc., and part of it allows them to run their 'Standard' reports, most of which are stupidly complicated.

There are 20 standards, and all of them are fine except 3, which are based on the same selection. They all join on to a particular table which is currently around a million rows big, but which will grow to many times that. A 'Select * from' statement on this table takes around 20 seconds to return the data. Each of the reports joins onto a bunch of other tables as well.

The reports are written as stored procedures so they can accept variables passed in from the VB front end. As I've had to use ##tables to store the data so it can be accessed by VB recordsets, to get around the problem of users sharing the tables I've made them unique by appending usernames and timestamps using dynamic SQL within the procedures. The reports each generate 8 slightly different selections based on the variables input into the procedure.

When these reports are run, they take about 2 - 3 minutes on a good day, and up to 10 if the server's running slow.

So far I've run the Profiler and analysed the traces when running these reports, but the indexing suggestions only make a 1 percent increase in performance. I've tried separating the 8 sub-queries out, in case the server is trying to parse everything instead of just the one it needs. I've done this by writing an individual procedure for each query and calling sub-procedures for each one, but the performance only improves by about 3 - 4 seconds.

I've also thought about aliasing the tables, so instead of joining onto this massive table, they just join on to a selection of it based on the user's inputs. I think this would work, but I'm still going to have the same problem if anyone uses parameters which select the entire table anyway.

Any ideas?

Cheers,
Yonabout
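The "usernames and timestamps" approach described above might be sketched roughly as follows. This is not the poster's actual code - the table name, column list and variable names are invented for illustration, and because the name is concatenated into dynamic SQL, @UserID would need to be restricted to safe characters:

```sql
-- Hypothetical sketch of building a user-unique ##table name and creating it
-- via dynamic SQL. All identifiers here are illustrative, not from the thread.
DECLARE @TableName sysname
DECLARE @Sql       nvarchar(4000)

-- e.g. ##TransSel_jbloggs_20051213071819
SET @TableName = '##TransSel_' + LTRIM(RTRIM(@UserID)) + '_'
    + REPLACE(REPLACE(REPLACE(CONVERT(char(19), GETDATE(), 120), '-', ''), ':', ''), ' ', '')

SET @Sql = 'CREATE TABLE [' + @TableName + '] '
         + '(Personuid int NOT NULL, Conts decimal(10,2) NULL)'
EXEC (@Sql)
```

The timestamp portion makes the name unique per run, and the username makes it recognisable for cleanup; the VB front end can then open a recordset against the generated name.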
Kristen
Test
22859 Posts

Posted - 2005-12-13 : 08:38:28
"A 'Select * from' statement on this table takes around 20 seconds to return the data."That's constrained by the time to physically shift the data from the server to your machine, and probably display it too, so don't worry about the "20 seconds" for that one.Please post the DDL of the tables in question, and if possible a few INSERT statements, a sample query [as constructed by your application] and then we can easily chuck that into SQL Server and "have a go" - and then make some fantastic suggestions which will cause your query to run in merely a few milliseconds Kristen |
yonabout
Posting Yak Master
112 Posts

Posted - 2005-12-13 : 10:48:28
OK, thanks. This is a bit of a monster, so I'll post this - if you need more, please let me know. I'm guessing you can't attach text files, so here's the DDL of the largest table (from the execution plan, this is where 99% of the resources are getting used):

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[InvestmentHistory]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[InvestmentHistory]
GO

CREATE TABLE [dbo].[InvestmentHistory] (
    [ParentInvestmentUID] [int] NOT NULL ,
    [DateReceived] [datetime] NOT NULL ,
    [ContributionTypeCatID] [int] NOT NULL ,
    [ContributionReasonCatID] [int] NOT NULL ,
    [TransactionTypeCatID] [int] NOT NULL ,
    [TransactionEventCatID] [int] NOT NULL ,
    [Amount] [money] NULL ,
    [AmountCurrency] [char] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [DateInvested] [datetime] NULL ,
    [Units] [decimal](18, 6) NULL ,
    [IsTxUID] [int] NULL ,
    [RowIdentity] [int] IDENTITY (1, 1) NOT NULL ,
    [WFCreatorID] [int] NOT NULL ,
    [CreatorStepID] [int] NOT NULL ,
    [WFOwnerID] [int] NULL ,
    [OwnerStepID] [int] NULL ,
    [CreationStamp] [datetime] NULL ,
    [RecType] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [AccountTransactionUID] [int] NULL ,
    [TaxDate] [datetime] NULL
) ON [PRIMARY]
GO

This is what I've done in the procedures. The user is passing in variables from the front end - a from and to date, and a contribution reason, transaction type and transaction event. (The user selects these as text strings, but they get joined to a code table within the query as they are integers in the database.)

The dates are used to get the selection based on either the DateReceived or DateInvested in the table. The other 3 parameters determine the rest of the selection. Because there are 3 parameters, there are 8 permutations of query, i.e.:

1 1 1
1 1 0
1 0 1
1 0 0
0 1 1
0 1 0
0 0 1
0 0 0

There are 2 other parameters that get passed in, but these are to deal with the dynamic SQL and creating user-unique ##tables. I'm going to give you an undynamic version, as the speed is pretty much the same either way. Here we go.

Pass the parameters in and create the #table:

@TempPersonTable char (50),
@UserID char (20),
@FromDate char (50),
@ToDate char (50),
@TransType char (50),
AS
Begin
begin
    /* Create a temporary table to hold the raw data extracted according to the criteria. */
    Create table [dbo].[#TransactionSelection] (
        Personuid int, NINO char(9), EERef char(20), surname char(40), inits char(5),
        Fund char(40), RecDate datetime, InvDate datetime, TransType char(50),
        TransEvent char(50), ContSource char(50), ContType char(50), Conts decimal(10,2),
        PayCentre char(100), Location char(100), Scheme char(100), Benefit char(100),
        FundManager char(100)
    )
end

Next there are a raft of IF statements, one for each of the 8 permutations I mentioned above. I've included the first couple so you can see my logic:

IF @Transtype <> 'All' and @TransEvent <> 'All' and @ContSrc <> 'All'
begin
    Statement1
end

IF @Transtype = 'All' and @TransEvent <> 'All' and @ContSrc <> 'All'
begin
    Statement2
end

etc. etc.

Finally, here is a typical statement inserting data into the #table (there are 8 of these, varying slightly dependent on the IF statements). I'm not convinced that this is the best way to do it, but in my defence, the user can select 'All' as one of the inputs. As 'All' doesn't actually exist in the database, I had to make sure they could select specific values or everything - I couldn't just say where TransEvent = @this and ContSrc = @that, because if @this and @that were 'All' they don't exist.

Insert into #TransactionSelection (Personuid, NINO, EERef, surname, inits, Fund, RecDate, InvDate, TransType, TransEvent, ContSource, ContType, Conts, PayCentre, Location, Scheme, Benefit, FundManager)
select p.Personuid, p.NationalIDNumber, e.employeereference, p.Surname, p.Initials, i.name,
       ih.datereceived, ih.dateinvested,
       d.longdesc as TransactionType, d1.longdesc as TransactionEvent,
       d2.longdesc as ContributionSource, d3.longdesc as ContributionType,
       isnull(sum(ih.amount), 0.00) as Conts,
       pc.name, l.name, s.name, b.name, fm.name
from person p
    inner join employee e on p.personuid = e.personuid
    inner join schememember sm on e.employeeuid = sm.employeeuid
    left join location l on e.employeeuid = l.parentlocationuid
    left join paycentre pc on e.employeeuid = pc.parentpaycentreuid
    inner join scheme s on sm.schemeuid = s.schemeuid
    inner join memberbenefit mb on sm.memberuid = mb.memberuid
    inner join memberinvestment mi on mb.memberbenefituid = mi.memberbenefituid
    inner join benefit b on mb.benefituid = b.benefituid
    inner join investment i on mi.investmentuid = i.investmentuid
    inner join fundmanager fm on i.fundmanageruid = fm.fundmanageruid
    inner join investmenthistory ih on mi.memberinvestmentuid = ih.parentinvestmentuid
    inner join description d on ih.transactiontypecatid = d.parentuid
    inner join description d1 on ih.transactioneventcatid = d1.parentuid
    inner join description d2 on ih.contributionreasoncatid = d2.parentuid
    inner join description d3 on ih.contributiontypecatid = d3.parentuid

There is another join which improved the performance of the query. When the user selects the people they want to run the report across, the application generates a temporary table of UIDs associated with those people. Prior to this, I was selecting data for everyone, then cutting it down later. I've included it for completeness, but you can ignore it:

    inner join ' + @TempPersonTable + ' on p.personuid = ' + @TempPersonTable + '.personuid

This query is for where 'All' has NOT been selected as an option, therefore the values in the variables are valid within the database:

where (dbo.FuncRealDate(ih.datereceived) >= convert(datetime, @fromdate, 103)
   and dbo.FuncRealDate(ih.datereceived) <= convert(datetime, @todate, 103))
  and d.longdesc in (@transtype)
  and ltrim(rtrim(d1.longdesc)) in (@transevent)
  and ltrim(rtrim(d2.longdesc)) in (@contSrc)

FuncRealDate is a function to strip timestamps from dates in the database, because if a user wanted data <= 13/12/2005, a timestamped date like 13/12/2005 00:00:01 is classed as being after the 13th and isn't picked up.

group by p.Personuid, p.NationalIDNumber, e.employeereference, p.Surname, p.Initials,
         pc.name, l.name, s.name, b.name, i.name, fm.name,
         ih.datereceived, ih.dateinvested,
         d.longdesc, d1.longdesc, d2.longdesc, d3.longdesc

There is other stuff in the procedure - there's a bit that pivots the data into something less normalised so the users can read it - but the section I've just laid out is the bit that's killing it.

Thanks for your time.
Yonabout
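As an aside, the eight IF permutations can often be collapsed into a single statement with "catch-all" predicates. This is not from the thread - just a sketch reusing the post's column and variable names:

```sql
-- One WHERE clause replaces the 8 branches: each filter collapses to TRUE
-- when the corresponding parameter is 'All', so 'All' never has to exist
-- as a value in the description table.
where (d.longdesc               = @transtype  or @transtype  = 'All')
  and (ltrim(rtrim(d1.longdesc)) = @transevent or @transevent = 'All')
  and (ltrim(rtrim(d2.longdesc)) = @contSrc    or @contSrc    = 'All')
```

The trade-off is that one cached plan must then serve all eight shapes of the query, which can perform worse than separate sub-procedures that each get their own plan; it mainly buys maintainability, not speed.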
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts

Posted - 2005-12-13 : 11:16:47
| Why the "group by"...you're not aggregating data are you?The use of the "funcrealdate" would be also a killer re using any suitable index.There are alternatives....search here (or mindsdoor.net) for examples (and discussions as to when/why not to use functions in SQL for performance) |
yonabout
Posting Yak Master
112 Posts

Posted - 2005-12-13 : 11:30:58
The group by is for the isnull(sum(ih.amount), 0.00) as Conts bit.

I know that user functions can cripple things (because they aren't transactional?), but it was running like a dog before I added that bit - it was a recent addition in response to user comments (other than the ones threatening to kill me unless I speed it up!).
Kristen
Test
22859 Posts

Posted - 2005-12-13 : 14:20:31
"Create table [dbo]. [#TransactionSelection] ... "Put a PRIMARY KEY on it (often makes a surprising difference to performance on #Temp Tables). Set any columns to NOT NULL that you can (helps the optimiser). For "best practice" put a COLLATE on all the CHAR/VARCHAR columns [just in case you run it on a server/TEMPDB with weird collation in the future]"IF @Transtype <> 'All' and @TransEvent <> 'All' and @ContSrc <> 'All'beginStatement1end"I would propose that each of these "Statement1" chappies is an EXEC to a separate Sproc. That way each child Sproc can get its own Query Plan that is not dependant on the execution path - I have no idea if this is important, but I've had good results with it in the past."from dbo.person p inner join dbo.employee e on p.personuid = e.personuid ...Put the OWNER on each table - supposedly increases the chance of the query plan being cached."where (dbo.FuncRealDate(ih.datereceived) >= convert(datetime, @fromdate, 103) and dbo.FuncRealDate(ih.datereceived) <= convert(datetime, @todate, 103))"Make @fromdate and @todate into datatype "datetime" so that there is no CONVERT in the WHERE. Shouldn't make a difference as it should be optimised out - but if the optimiser misses it it WILL make a difference. (If you need a CHAR parameter to the Sproc use a "working variable" within the Sproc to hold the converted values as a datetime, and use that in the WHERE clause)There is no such thing as a third parameter in CONVERTing to a datetime, so if you need to enforce the style of the date, in the conversion, use SET DATEFORMAT xxx...... 
CONVERT(datetime, @fromdate) ..."and d.longdesc in (@transtype)and ltrim(rtrim(d1.longdesc)) in ( @transevent) and ltrim(rtrim(d2.longdesc)) in ( @contSrc)"You can't use a variable in this construction to represent a comma delimited list of values - SQL is just going to do an EQUALS, so you might as well use "=" to give the optimiser a potentially better "route"(If you NEED to have the query work with a comma delimited set of values for any of these three parameters ask again and we can point you at how to "split" CSV variables).You only provided the DDL for the [InvestmentHistory] table, so I don't know what the characteristics ofd.longdesc, d1.longdesc, d2.longdesc, d3.longdescare, but they look fairly big, and thus probably "CPU intensive", in a Group By. You might want to consider just storing enough PK information [and the SUM] in your Temporary Table so that you could then JOIN it back to the "details" tables to get the final output data (which might also reduce the number of tables you need in your initial query, as the WHERE only covers ih, d, d1 and d2 - and that appears to make about half the JOIN'ed tables redundant (unless they are there to enforce some referential integrity)Finally, make sure that BOTH sides of each JOIN have indexes (ideally clustered indexes) ... so:from person p inner join employee e on p.personuid = e.personuid needs indexes onp.personuidand e.personuidto help the optimiser.Kristen |
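Applied to the procedure in this thread, the temp-table and date-parameter suggestions above might look roughly like this. A sketch only - the column list is abbreviated and the key choice is illustrative; the real key must be a column set that is genuinely unique per row of the result:

```sql
-- Temp table with NOT NULL, COLLATE and a PRIMARY KEY, per the advice above.
-- The PK is left unnamed so concurrent sessions don't clash on the constraint name.
Create table #TransactionSelection (
    Personuid  int NOT NULL,
    RecDate    datetime NOT NULL,
    TransType  char(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    Conts      decimal(10,2) NOT NULL,
    PRIMARY KEY (Personuid, RecDate, TransType)  -- illustrative key only
)

-- Convert the CHAR parameters once into "working variables", then use the
-- plain datetime variables in the WHERE clause with no CONVERT around them.
DECLARE @dtFrom datetime, @dtTo datetime
SET DATEFORMAT dmy
SET @dtFrom = CONVERT(datetime, LTRIM(RTRIM(@FromDate)))
SET @dtTo   = CONVERT(datetime, LTRIM(RTRIM(@ToDate)))
```

With the dates held as datetime variables, the comparison in the WHERE is a straight column-to-variable test, which gives the optimiser its best chance of using an index on the date column.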
yonabout
Posting Yak Master
112 Posts

Posted - 2005-12-14 : 04:58:36
Cheers,

I have tried splitting the statements out into individual sprocs before, but I've never set the temp tables up with keys or constraints, and I never knew that putting the owner in front of the table name could make a difference, so I'll see what happens.

Thanks for your time,
Yonabout