AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2001-04-30 : 10:51:49
|
Roger writes "Is there a way to retrieve a field value from the previously read row in order to use it to calculate a field in the current row . . ." Article Link. |
|
julianhaynes
Starting Member
3 Posts |
Posted - 2002-02-21 : 12:40:23
|
I prefer to avoid CROSS joins...

SELECT a.DayCount, a.Sales, SUM(b.Sales) AS RunningTotal
FROM Sales a
JOIN Sales b ON b.DayCount <= a.DayCount
GROUP BY a.DayCount, a.Sales
ORDER BY a.DayCount, a.Sales

... but of course this is the same...

Scan count 5000, logical reads 37544, physical reads 0, read-ahead reads 0. |
|
|
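The self-join running total above can be sketched end to end. This is a minimal illustration only, using Python's sqlite3 module as a stand-in engine; the Sales table and its three rows are hypothetical data mirroring the thread's DayCount/Sales example:

```python
import sqlite3

# Hypothetical Sales table mirroring the thread's example: one row per day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (DayCount INTEGER, Sales REAL)")
conn.executemany("INSERT INTO Sales VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 5.0)])

# Self-join on b.DayCount <= a.DayCount: each row aggregates itself and all
# earlier rows, which is why reads grow quadratically with the row count.
rows = conn.execute("""
    SELECT a.DayCount, a.Sales, SUM(b.Sales) AS RunningTotal
    FROM Sales a JOIN Sales b ON b.DayCount <= a.DayCount
    GROUP BY a.DayCount, a.Sales
    ORDER BY a.DayCount
""").fetchall()
print(rows)  # [(1, 10.0, 10.0), (2, 20.0, 30.0), (3, 5.0, 35.0)]
```

The totals come out right, but the join touches a triangular number of rows, which matches the heavy scan counts quoted in the post.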
mschweisguth
Starting Member
1 Post |
Posted - 2002-05-12 : 18:57:04
|
What I am thinking is: you can use "order by" on a read-only cursor and then insert the results into a new temp table.

quote: The reason for the performance difference is that the Celko solution and its variants (all the v1 < v2 ones) are quadratic. To compute the total for the Nth record, it computes the total of the N-1 records below it. When it moves on to computing the next total, it redoes all this work and adds the Nth value. On average, each total requires N/2 records to be accessed. Do it for a dataset twice the size and you should see it take about four times as long. Not acceptable where the volume of data can increase significantly over time. The cursor approach visits each record just once. The update approach is interesting. The only problem is ensuring that the records come in the right order. I am considering copying the data into a temp table with a clustered index on the ordered field to force the records into the required order, but the docs do not guarantee that the update will take the records in clustered index order. And you can't have an ORDER BY in an update.

quote: Interesting. Hmmm. Very interesting about the cursor being quicker. Just proves that there never is one hard and fast rule for anything.
|
|
|
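The quadratic-versus-cursor argument quoted above can be made concrete without a database at all. This sketch (illustrative only, with made-up values) contrasts re-summing every prefix from scratch with carrying the total forward in a single ordered pass:

```python
# Restating the quoted point: recomputing each prefix sum from scratch touches
# N*(N+1)/2 values in total, while a single ordered pass (the cursor approach)
# touches each of the N values exactly once.
values = [10.0, 20.0, 5.0, 7.0]

# Quadratic: re-sum the whole prefix for every row.
quadratic = [sum(values[: i + 1]) for i in range(len(values))]

# Linear: carry the running total forward, visiting each value once.
linear, total = [], 0.0
for v in values:
    total += v
    linear.append(total)

assert quadratic == linear  # same answer, very different amount of work
print(linear)  # [10.0, 30.0, 35.0, 42.0]
```

Doubling the dataset roughly doubles the linear pass but quadruples the quadratic one, which is exactly the scaling behavior the post describes.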
richardd
Starting Member
3 Posts |
Posted - 2002-05-13 : 05:28:23
|
quote: You received the following message from mschweisguth: I was looking at your SQL. I once did something like this, but I was afraid it was undocumented behavior, primarily because I remembered reading something, somewhere, that said that even if you add a clustered index to a table, a SELECT * statement isn't guaranteed to return ordered output unless you use an ORDER BY clause. Do you know if this is specified someplace in SQL-92? -thanks.

Michael, I have no idea! I just know it works in MS SQL, and I don't use anything else. Maybe someone else can help. Cheers, Richard |
|
|
merry
Starting Member
1 Post |
Posted - 2003-05-15 : 18:50:20
|
Hi, I am a self-taught SQL Server developer. I am so amazed at the cool solutions that you guys throw out for reasonably complex problems. Can you please share what makes one comfortable writing down these queries? Is it just common sense, or is it intelligence, or is it remembering some basic SQL tricks and building on that, or...? I bet none of the cool solutions that I see on this site arrived in the author's mind as they are seen here. What is the method or mind-set one should be in to turn ideas into SQL queries? Show me light. Sorry if my attempt to not sound desperate failed in this email, but I want to be good at the art. Send the email straight to raju_sql@yahoo.com. |
|
|
AjarnMark
SQL Slashing Gunting Master
3246 Posts |
Posted - 2003-05-19 : 14:24:19
|
Merry, I'd say the bulk of it comes from years of experience. I know that's not the answer most people want to hear, but the fact is that the more you work with SQL, the more you start thinking in terms of set-based operations. And the more experience you have, the more often you can say to yourself, "Now, I remember something kind of like this a while back, and we did something like ____." Then you test it, massage it, test it again, etc. Of course, reading forums like this one, reading the articles posted here, and getting your hands on a couple of good books all help lay a solid foundation. But nothing is quite as valuable as experience.

------------------------------------------------------
The more you know, the more you know you don't know. |
|
|
GreySky
Starting Member
20 Posts |
Posted - 2003-05-19 : 15:48:14
|
I was wondering if SQL Server 2000 *functions* can be used to emulate the in-line functions I describe below.

In the MS Access newsgroup (the one sponsored by Microsoft), there are often questions about how to perform "cumulative" whatevers. As all are aware, MS Access doesn't have the stored procedure capabilities SQL Server has. However, Jet is extended to allow the use of *functions* that greatly enhance query capabilities.

The "Celko" method is commonly used in Access. However, it has major performance issues when large datasets are involved, and of course isn't updatable.

The most efficient method for calculating in-line values for *very large* recordsets in MS Access is to have code/SQL that:
1) resets hold values used by inline functions;
2) manages how the Cartesian Product Table is built within MS Access (either by TOP 100 PERCENT clauses within subqueries, or by managed CPT builds using base tables whose joins dictate the CPT build order); and
3) executes a query that calls an in-line function (described below).

These steps are a modified approach to the explanation below, as the query is initiated from code instead of code initiated by the query. Below is how to open queries that then perform cumulative functions in MS Access. While written for MS Access people, it still contains a few ideas not discussed here.

--------------- CUMULATIVES FOR MS ACCESS ---------------

There are essentially two competing methods to accomplish accumulations by performing line-by-line analysis, each with its advantages and drawbacks. The two methods are (a) queries and (b) functions.

Questions that determine the appropriate answer:
-- How many records will be displayed as a result? 100 or fewer? Thousands?
-- Will the query need to be updatable?
-- Will the user scroll through these records?
-- Will the user need to make changes and have the calculations immediately update?
-- Will the user reorder the records, thus requiring the recalculation of the cumulative totals?
-- Are there groupings? In other words, will you be looking for the last cumulative value after Group A, then after Group B, etc.?

An overview of the competing methodologies, given this table:

tblOrderItems
--OrderItemID - AutoNumber
--OrderItem - Text 255
--OrderValue - Currency

Version 1: Queries Only

There are a few ways of implementing this as a subquery. Here, I provide an example of using a subquery within the SELECT statement to return the cumulative values up to that point.

SELECT MainOrderItems.OrderItemID, MainOrderItems.OrderValue,
    (SELECT Sum([OrderValue]) FROM tblOrderItems AS SubQuery
     WHERE SubQuery.[OrderItemID] <= MainOrderItems.OrderItemID) AS Cumulative
FROM tblOrderItems AS MainOrderItems
ORDER BY MainOrderItems.OrderItemID;

Advantages:
--This version is fully encapsulated within SQL.
--SQL geeks love it because it makes them feel uber and reinforces their opinion that code is for the uninitiated.

Disadvantages:
--Doesn't allow for updates.*
--The displayed sort order must be managed.
--Don't try this with thousands of records if you want an answer anytime soon.

*OK, OK, so there IS a way to make the results updatable. Use DSum in the SELECT statement instead of a subquery. For example:

SELECT tblOrderItems.*,
    Nz(DSum("[OrderValue]", "tblOrderItems",
        "[OrderItemID]<" & [tblOrderItems].[OrderItemID])) + [OrderValue] AS Cumulative
FROM tblOrderItems;

This combination allows the cumulative action to work while maintaining updatability. AND the numbers will automatically recalculate if you make a change. If you don't want the automatic recalculation, you can use a DLookup against the first query that uses the subquery.

Version 2: Functions

There are two distinct ways to utilize functions. I break them down into (a) inline functions and (b) non-inline functions, or in other words: (a) hard but blazingly fast, and (b) somewhat easy and flexible, but slow.

2(a) Inline functions take the current value of the record, perform a mathematical (or other) function, "store" the result of the calculation, and return the answer for display. Inline functions use the "stored" value for the next call. Inline functions can be used in queries, forms, and reports.

Advantages:
--Absolute performance screamers. Inline functions are hands-down the fastest executors.
--Able to easily handle the largest record counts; linear time-requirement growth.

Disadvantages:
--Hardest to manage.*
--Best used in conjunction with aggregate functions, such as Last, with or without Group By.**
--Not designed for scrolling, but you can do it if you really have to.***

*Inline functions carry the additional requirement of resetting the "stored" value from execution to execution. Within queries, this can be achieved by having one query return the first record; if the record being executed is the first record, it resets the stored value. For reports, it can be reset in Report_Open. For forms, it usually isn't best to use inline functions -- use non-inline functions or queries.

**Inline functions are superior at returning an aggregate result based upon line-by-line analysis in an incredibly fast manner (quite possibly unbeatable speed-wise, even when compared against code using cursors). Essentially, you create a query that uses the inline function. Then you create another query that performs the aggregation against the first query. If function execution order is important, there's an important caveat: the query that runs the inline function must be based upon a query that includes the SELECT TOP 100 PERCENT ... keywords. "TOP 100 PERCENT" ensures that your stated ORDER BY is performed against the subquery. This ensures the inline function is called in the exact order you demand (such as forcing groups to execute in order). Then the aggregate query can simply take the "last" of the inline query to return the blitzkrieg mathematical results.

Note: Cumulative multiplication in conjunction with aggregate functions does not require a function. You can use an expression similar to this one:

IIf(Sum(IIf([Value]<0,1,0)) Mod 2=1,-1,1) * Sgn(Min(Abs([Value]))) * Exp(Sum(Log(Abs(IIf([Value]=0,1,[Value])))))

***While technically the results can be set to be updatable, it just doesn't work. This is because the function is called, as the user scrolls through the recordset, for the row the user is on. This wreaks absolute havoc with "stored" variables and will return unexpected values. If you are going to scroll the results, you *should* set the result type to Snapshot. If in-line functions must be used with a scrollable and updatable recordset, then you must implement methods that dramatically hamper performance, because you have to forget about using stored values and instead dynamically calculate the proper result based upon the user's position within the recordset.

Here's an example of an in-line function. This function performs a cumulative multiply and resets the value when a new CoGroupID is passed. (In the real world, this should be performed using the expression listed above; this is for illustration only.)

Function InlineMultiply(lngInlineValue As Long, bInitialize As Boolean, lngCoGroupID As Long) As Long
On Error GoTo Err_Overflow
    Static lngHoldValue As Long
    Static lngHoldCoGroup As Long
    Static lngHoldCompanyID As Long

    ' determine whether to initialize
    If bInitialize Then
        lngHoldValue = 1
        lngHoldCoGroup = lngCoGroupID
        lngHoldCompanyID = 0
    End If

    ' determine if a different co group was passed (requires resetting)
    If lngHoldCoGroup <> lngCoGroupID Then
        lngHoldValue = 1
        lngHoldCoGroup = lngCoGroupID
    End If

    ' return the hold value * inline value
    lngHoldValue = lngHoldValue * lngInlineValue
    InlineMultiply = lngHoldValue

Exit_InlineMultiply:
    Exit Function

Err_Overflow:
    InlineMultiply = 0
    lngHoldValue = 1
    lngHoldCoGroup = 0
    Resume Exit_InlineMultiply
End Function

The first query involved utilizes the TOP 100 PERCENT keywords to force the appropriate execution order of the function. Otherwise, there would be no guarantee that all CoGroupID 1's would execute before any CoGroupID 2's were passed (thus resetting the value prematurely).

SELECT TOP 100 PERCENT tblCompanies.*
FROM tblCompanies
WHERE tblCompanies.Value Is Not Null
ORDER BY tblCompanies.CoGroupID, tblCompanies.CompanyName;

This query forces the ORDER BY to be performed (note the Order By on CoGroupID in conjunction with TOP 100 PERCENT). It is then used as a subquery in the next query, which actually calls the inline function:

SELECT qrySortedCompanyData.CompanyID, qrySortedCompanyData.CoGroupID,
    qrySortedCompanyData.CompanyName, qrySortedCompanyData.Value,
    InlineMultiply([Value],
        [qrySortedCompanyData].[CompanyID]=[qryFirstCompany].[CompanyID],
        [qrySortedCompanyData].[CoGroupID]) AS TheInline
FROM qrySortedCompanyData, qryFirstCompany
ORDER BY qrySortedCompanyData.CoGroupID, qrySortedCompanyData.CompanyName;

This query can then be used to make summaries:

SELECT qryRunningMultiply.CoGroupID, Last(qryRunningMultiply.TheInline) AS LastValue
FROM qryRunningMultiply
GROUP BY qryRunningMultiply.CoGroupID;

This summary shows the cumulative value for each group.

2(b) Non-Inline Functions

Non-inline functions are chiefly designed for use on forms. They take the RecordsetClone and a bookmark, and perform their calculations against that recordset.

Advantages:
--Fully updatable recordsets.
--Immediate recalculations.
--Easy to write and maintain.
--The function uses the current sort, even if the user changes it.

Disadvantages:
--Not exactly speedy.
--Not for thousands of records.

The following example returns the row number:

-TextBox ControlSource:
=IIf(IsNull([TheValue]),Null,LineNumber([RecordsetClone],[Bookmark]))

-Non-inline function:
Function LineNumber(rsc As Recordset, strBookmark As String) As Long
    ' align to the current bookmark and return its absolute position
    rsc.Bookmark = strBookmark
    LineNumber = rsc.AbsolutePosition + 1
End Function

This next example returns the cumulative sum of the values showing on the datasheet.

-TextBox ControlSource:
=IIf(IsNull([TheValue]),Null,CumulativeValues([RecordsetClone],[Bookmark],"TheValue"))

-Non-inline function:
Function CumulativeValues(rsc As Recordset, strBookmark As String, strFieldName As String) As Long
    Dim lngTotal As Long
    ' align to the current bookmark
    rsc.Bookmark = strBookmark
    ' move previous until BOF is encountered
    While Not rsc.BOF
        lngTotal = lngTotal + rsc.Fields(strFieldName)
        rsc.MovePrevious
    Wend
    ' return the value
    CumulativeValues = lngTotal
End Function

----------
David Atkins, MCP |
|
|
vhyn99
Starting Member
3 Posts |
Posted - 2004-04-24 : 11:07:51
|
Nice code... it helps. //vhyn99 |
|
|
objectuser
Starting Member
14 Posts |
Posted - 2004-07-22 : 08:46:11
|
I've recently needed to have running totals in an OLTP application. I was concerned about the performance, but it was a tradeoff between performance and redundant data. I was able to get the #2 query above to perform satisfactorily by limiting the range of rows.

I was not able to verify the results posted in the original reply. I found that the first solution (the temporary table) performed the worst of the bunch. On my laptop, it took 36 seconds. This compares to about 7 seconds with the correlated subquery.

As an aside, does anyone know if SQL Server will be adding analytic functions like Oracle has? They are a much better solution to this problem (performance-wise). I work in both SQL Server and Oracle, so I ran this test on Oracle (again on my laptop). For the correlated subquery solution, the results were similar--about 7s. However, when I used Oracle's analytic functions, the results went down two orders of magnitude to about 70ms. Here's how it looks:

select day_count, sales,
       sum(sales) over (order by day_count range unbounded preceding) running_total
from sales
order by day_count

I use these kinds of things all the time when I'm working with Oracle--they are very useful. I'm hoping that SQL Server will add something analogous soon.

--Kevin |
|
|
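For reference, the analytic form Kevin shows did eventually arrive in SQL Server (the running-total frame in the OVER clause landed in SQL Server 2012), and SQLite 3.25+ implements the same syntax. Here is a sketch of the identical query run through Python's sqlite3 module, with made-up data; it assumes the host Python bundles a new enough SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day_count INTEGER, sales REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 5.0)])

# SUM(...) OVER (ORDER BY ...): the analytic running total. The engine makes
# one ordered pass instead of re-aggregating a triangular join per row.
# Requires SQLite 3.25+ (window function support).
rows = conn.execute("""
    SELECT day_count, sales,
           SUM(sales) OVER (ORDER BY day_count
                            RANGE UNBOUNDED PRECEDING) AS running_total
    FROM sales
    ORDER BY day_count
""").fetchall()
print(rows)  # [(1, 10.0, 10.0), (2, 20.0, 30.0), (3, 5.0, 35.0)]
```

This is the linear-time answer to the whole thread: the window aggregate gives the cursor's single-pass cost with the set-based syntax of the subquery solutions.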
parevr23
Starting Member
4 Posts |
Posted - 2005-04-17 : 01:42:07
|
quote: Originally posted by JustinBigelow

Why the need for coalesce?

When I started to read the article, I decided to try my own solution first and compare it to what the author came up with. My solution is as follows:

select s1.daycount, s1.sales,
       RunningTotal = (select sum(s2.sales) from sales s2
                       where s2.daycount <= s1.daycount)
from sales s1
order by s1.daycount, s1.sales

I came up with this after refreshing myself on correlated subqueries in "Professional SQL Server 7.0 Programming", so I guess you could call this the "Wrox Solution" ;) This is pretty close to the "Celko" solution, so I timed both in QA, and the Wrox beat the Celko by a whopping ONE SECOND (heavy sarcasm)! I don't understand the need for the coalesce function; they both seem to produce the same results. Would somebody mind enlightening me? Thanks. Justin

Hey, I found this page on Google and I find it very interesting; you guys are great! Along the same lines, I need some help with a running total issue. I have a table that holds a list of transactions (not a huge table; there will probably be a couple hundred records a month or so), and I am not sure whether to calculate a running total and save it in the table for each transaction, or run a query to get it every time. What are the advantages? My table looks something like this:

Transaction(ID, Date, Description, Amount, Balance)

The issue I run into is that transactions will not come in insertion order but in order of the Date field. I need to calculate the total based on the day in the Date field, not the day the record was entered. |
|
|
bojan.jovicic
Starting Member
1 Post |
Posted - 2006-03-28 : 06:13:52
|
quote: Originally posted by mono

The cursor approach visits each record just once. The update approach is interesting. The only problem is ensuring that the records come in the right order. I am considering copying the data into a temp table with a clustered index on the ordered field to force the records into the required order, but the docs do not guarantee that the update will take the records in clustered index order. And you can't have an ORDER BY in an update.

I have found another article discussing exactly that solution: http://www.sql-server-performance.com/mm_cursor_friendly_problem.asp It looks like the update order is guaranteed with the techniques used there.

-- Bojan Jovicic
B.Sc. IT Engineer
Microsoft Certified Solution Developer
Microsoft Business Solutions Certified Professional |
|
|
stevekgoodwin
Starting Member
3 Posts |
Posted - 2007-02-26 : 01:25:42
|
Nice article. There's a bug in the CROSS JOIN example, though: the phrase "AS RunningTotal" needs to be cut'n'pasted three lines higher, so that it reads:

SUM(Sales) AS RunningTotal |
|
|
ismailc
Constraint Violating Yak Guru
290 Posts |
Posted - 2009-01-05 : 08:22:26
|
Great article! Is there a way of resetting the running total to 0 based on certain criteria and then continuing the calculation? When the date is the 1st of the month (Split_Date LIKE '01%'), set Accum_Tot to 0.

Regards |
|
|
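One way to get the reset asked about above, in engines that support window functions, is to partition the running sum by month so the accumulation restarts at each month boundary. This is a sketch under assumptions: the table, its columns, and the data are made up to echo the post's Split_Date/Accum_Tot names, and SQLite (3.25+) via Python stands in for SQL Server, where the equivalent would use PARTITION BY with a month expression on the date column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (split_date TEXT, amount REAL)")
conn.executemany("INSERT INTO tx VALUES (?, ?)", [
    ("2009-01-01", 5.0), ("2009-01-15", 3.0),
    ("2009-02-01", 4.0), ("2009-02-20", 2.0),
])

# Partitioning by the month restarts the accumulation at every month
# boundary: exactly "reset to 0 on the 1st and keep counting".
rows = conn.execute("""
    SELECT split_date, amount,
           SUM(amount) OVER (PARTITION BY strftime('%Y-%m', split_date)
                             ORDER BY split_date) AS accum_tot
    FROM tx
    ORDER BY split_date
""").fetchall()
print(rows)
# [('2009-01-01', 5.0, 5.0), ('2009-01-15', 3.0, 8.0),
#  ('2009-02-01', 4.0, 4.0), ('2009-02-20', 2.0, 6.0)]
```

The partition key is just "the group within which the total accumulates", so any reset criterion that can be computed per row works the same way.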
HershelCohen
Starting Member
1 Post |
Posted - 2009-03-12 : 12:34:57
|
quote: Originally posted by julianhaynes

I prefer to avoid CROSS joins...

SELECT a.DayCount, a.Sales, SUM(b.Sales) AS RunningTotal
FROM Sales a
JOIN Sales b ON b.DayCount <= a.DayCount
GROUP BY a.DayCount, a.Sales
ORDER BY a.DayCount, a.Sales

... but of course this is the same...

Scan count 5000, logical reads 37544, physical reads 0, read-ahead reads 0.

This solution worked for me. Thank you, Julian.

-Hershel |
|
|
Jeff Moden
Aged Yak Warrior
652 Posts |
Posted - 2010-05-14 : 19:40:37
|
quote: Originally posted by HershelCohen

quote: Originally posted by julianhaynes

I prefer to avoid CROSS joins...

SELECT a.DayCount, a.Sales, SUM(b.Sales) AS RunningTotal
FROM Sales a
JOIN Sales b ON b.DayCount <= a.DayCount
GROUP BY a.DayCount, a.Sales
ORDER BY a.DayCount, a.Sales

... but of course this is the same...

Scan count 5000, logical reads 37544, physical reads 0, read-ahead reads 0.

This solution worked for me. Thank you, Julian.

-Hershel

I know this is an old post, but I just had to warn you... you have no idea what a mistake you made, Hershel. Please see the following article for why I say so: [url]http://www.sqlservercentral.com/articles/T-SQL/61539/[/url] |
|
|
sageora
Starting Member
4 Posts |
Posted - 2010-05-17 : 16:12:24
|
How about using CROSS APPLY instead of CROSS JOIN? This gives us the ability to apply custom conditions to the correlated subquery:

CREATE TABLE [dbo].[Debts](
    [id] [int] NULL,
    [Value] [float] NULL,
    [ag_id] [int] NULL
)
go
insert into dbo.debts(id, ag_id, Value) values(1, 25, 101.0)
insert into dbo.debts(id, ag_id, Value) values(2, 25, 200.0)
insert into dbo.debts(id, ag_id, Value) values(3, 35, 1650.0)
insert into dbo.debts(id, ag_id, Value) values(4, 35, 804.0)
insert into dbo.debts(id, ag_id, Value) values(5, 35, 525.0)
insert into dbo.debts(id, ag_id, Value) values(6, 35, 765.0)
go
select o.*, running.*
from dbo.debts o
cross apply
(
    select SUM(value) sm
    from dbo.debts i
    where i.id <= o.id  --put date field here
      and i.ag_id = o.ag_id
) running

Regards |
|
|
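The CROSS APPLY version above has no direct equivalent in every engine, but the same per-group running total can be written as a plain correlated subquery. This sketch reuses the post's Debts data in Python's sqlite3 purely to show the expected output; note that it shares the triangular-join cost of the other subquery solutions in this thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE debts (id INTEGER, ag_id INTEGER, value REAL)")
conn.executemany("INSERT INTO debts VALUES (?, ?, ?)", [
    (1, 25, 101.0), (2, 25, 200.0), (3, 35, 1650.0),
    (4, 35, 804.0), (5, 35, 525.0), (6, 35, 765.0),
])

# Same logic as the CROSS APPLY: for each outer row, sum all rows in the
# same ag_id group with an id no greater than the outer row's id.
rows = conn.execute("""
    SELECT o.id, o.ag_id, o.value,
           (SELECT SUM(i.value) FROM debts i
            WHERE i.id <= o.id AND i.ag_id = o.ag_id) AS running
    FROM debts o
    ORDER BY o.id
""").fetchall()
print(rows)
```

Each ag_id group accumulates independently: group 25 ends at 301.0 and group 35 at 3744.0 with this data.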
Jeff Moden
Aged Yak Warrior
652 Posts |
Posted - 2010-05-23 : 13:46:37
|
quote: Originally posted by sageora

How about using CROSS APPLY instead of CROSS JOIN? This gives us the ability to apply custom conditions to the correlated subquery:

CREATE TABLE [dbo].[Debts](
    [id] [int] NULL,
    [Value] [float] NULL,
    [ag_id] [int] NULL
)
go
insert into dbo.debts(id, ag_id, Value) values(1, 25, 101.0)
insert into dbo.debts(id, ag_id, Value) values(2, 25, 200.0)
insert into dbo.debts(id, ag_id, Value) values(3, 35, 1650.0)
insert into dbo.debts(id, ag_id, Value) values(4, 35, 804.0)
insert into dbo.debts(id, ag_id, Value) values(5, 35, 525.0)
insert into dbo.debts(id, ag_id, Value) values(6, 35, 765.0)
go
select o.*, running.*
from dbo.debts o
cross apply
(
    select SUM(value) sm
    from dbo.debts i
    where i.id <= o.id  --put date field here
      and i.ag_id = o.ag_id
) running

Regards

I haven't tested the particular code above, but it looks to me like it has a triangular join in it. See the following article for why that's a bad thing, and why it's both slower and more resource-intensive than a cursor: [url]http://www.sqlservercentral.com/articles/T-SQL/61539/[/url]

--Jeff Moden
"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER!"
"RBAR is pronounced 'ree-bar' and is a 'Modenism' for Row-By-Agonizing-Row." |
|
|
beyt34
Starting Member
1 Post |
Posted - 2011-01-06 : 03:22:54
|
Another way, using WITH (a recursive CTE). Example:

;WITH TmpTable(DayCount, Sales, RunningTotal)
AS
(
    SELECT 0 AS DayCount, CAST(0 AS money) AS Sales, CAST(0 AS money) AS RunningTotal
    UNION ALL
    SELECT tmp.DayCount, tmp.Sales, tmp.Sales + t.RunningTotal AS RunningTotal
    FROM #tmpTable AS tmp, TmpTable AS t
    WHERE (tmp.DayCount = t.DayCount + 1)
)
SELECT * FROM TmpTable
WHERE DayCount > 0
OPTION (MAXRECURSION 32767); |
|
|
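The recursive CTE above translates almost verbatim to standard WITH RECURSIVE. A runnable sketch via Python's sqlite3, where an in-memory table with made-up data stands in for #tmpTable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp (DayCount INTEGER, Sales REAL)")
conn.executemany("INSERT INTO tmp VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 5.0)])

# The anchor row seeds (0, 0, 0); each recursive step joins the row whose
# DayCount is one higher and adds its Sales to the carried total. This walk
# requires DayCount to be densely numbered 1..N, just like the original.
rows = conn.execute("""
    WITH RECURSIVE r(DayCount, Sales, RunningTotal) AS (
        SELECT 0, 0.0, 0.0
        UNION ALL
        SELECT t.DayCount, t.Sales, t.Sales + r.RunningTotal
        FROM tmp t JOIN r ON t.DayCount = r.DayCount + 1
    )
    SELECT * FROM r WHERE DayCount > 0
""").fetchall()
print(rows)  # [(1, 10.0, 10.0), (2, 20.0, 30.0), (3, 5.0, 35.0)]
```

Unlike the triangular-join solutions, the recursion carries the total forward one row at a time, so each source row is visited once; the trade-off is the dense-numbering requirement and the engine's recursion limit (MAXRECURSION in T-SQL).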