Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
ddamico
Yak Posting Veteran
76 Posts |
Posted - 2011-01-27 : 09:53:50
|
Good day,I am working with a very larged data set and am running into performance slowness introducing aggregation of count distinct and wondering if there is a work aroundLarge table has .5 billion rowsAccount Table Account TransactionType TranCount TranValue TransactionDateHousehold Table Household AccountTimePeriod Period StartDate EndDateA household can have multiple accounts with transactions for each day. I am trying to get a count of the distinct transaction daysfor a household across all accounts.SELECT H.Household, A.TransactionType, P.Period, TranValue = COUNT(DISTINCT(TransactionDate))FROM Household HJOIN TimePeriod P ON P.Period IN ('M0','M1','M2','M3')JOIN Account A ON A.TransactionType = 3 AND H.Account = A.Account AND A.TransactionDate BETWEEN P.StartDate AND P.EndDateGROUP BY H.Household, A.TransactionType, P.PeriodBecause multiple accounts for a household could have transaction on the same day i don't want them counted more than once hence the distinct. Is there another/better way to accomplish this with new functionalityThanks. |
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-01-27 : 10:11:26
|
FROM Household HJOIN TimePeriod PON P.Period IN ('M0','M1','M2','M3')Appears to be missing the hoin criteria here - is itFROM Household HJOIN Account AON A.TransactionType = 3AND H.Account = A.AccountJOIN TimePeriod PON A.TransactionDate BETWEEN P.StartDate AND P.EndDateAND P.Period IN ('M0','M1','M2','M3')How many rows do you get back?How long does it take to insert them into a temp table? i.e.SELECT H.Household, A.TransactionType, P.Period, TransactionDateFROM Household HJOIN Account AON A.TransactionType = 3AND H.Account = A.AccountJOIN TimePeriod PON A.TransactionDate BETWEEN P.StartDate AND P.EndDateAND P.Period IN ('M0','M1','M2','M3')GROUP BY H.Household, A.TransactionType, P.Period, TransactionDatecould it be because there is an index that doesn't include the transactiondate?==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-27 : 11:31:10
|
COUNT(DISTINCT TransactionDate) is a slow aggregate operation compared to this.;WITH cteSource(Household, TransactionType, Period, TransactionDate)AS ( SELECT h.Household, a.TransactionType, tp.Period, a.TransactionDate FROM dbo.TimePeriod AS tp INNER JOIN dbo.Account AS a ON a.TransactionDate BETWEEN tp.StartDate AND tp.EndDate AND a.TransactionType = 3 INNER JOIN dbo.Household AS h ON h.Account = a.Account WHERE tp.Period IN ('M0', 'M1', 'M2',' M3') GROUP BY h.Household, a.TransactionType, tp.Period, a.TransactionDate)SELECT Household, TransactionType, Period, COUNT(*) AS TranValueFROM cteSourceGROUP BY Household, TransactionType, Period N 56°04'39.26"E 12°55'05.63" |
|
|
|
|
|
|
|