Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-01-09 : 08:01:16
|
For a set of data points (x, y), this algorithm can be used to fit the data to any of the following curves:1. Straight line (linear regresion); y = A + b*x2. Exponential curve; y = A*EXP(b*x); nb a > 03. Logarithmic curve; y = A + b*LN(x)4. Power curve; y = A*x^b; nb a > 0The coefficient of determination is R2 (how well does the curve fit)-- Prepare test dataCREATE TABLE cf ( x decimal(38, 10), y decimal(38, 10) ) -- Calculate Linear regressionINSERT cfSELECT 40.5, 104.5 UNION ALLSELECT 38.6, 102 UNION ALLSELECT 37.9, 100 UNION ALLSELECT 36.2, 97.5 UNION ALLSELECT 35.1, 95.5 UNION ALLSELECT 34.6, 94SELECT 'Linear regression' AS Type, A, b, R2FROM dbo.fnCurveFitting(1)UNION ALLSELECT 'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2FROM dbo.fnBestFit()-- Calculate Exponential regressionDELETEFROM cfINSERT cfSELECT .72, 2.16 UNION ALLSELECT 1.31, 1.61 UNION ALLSELECT 1.95, 1.16 UNION ALLSELECT 2.58, .85 UNION ALLSELECT 3.14, .5SELECT 'Exponential regression' AS Type, A, b, R2FROM dbo.fnCurveFitting(1)UNION ALLSELECT 'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2FROM dbo.fnBestFit()-- Calculate Logarithmic regressionDELETEFROM cfINSERT cfSELECT 3, 1.5 UNION ALLSELECT 4, 9.3 UNION ALLSELECT 6, 23.4 UNION ALLSELECT 10, 45.8 UNION ALLSELECT 12, 60.1SELECT 'Logarithmic regression' AS Type, A, b, R2FROM dbo.fnCurveFitting(1)UNION ALLSELECT 'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2FROM dbo.fnBestFit()-- Calculate Power regressionDELETEFROM cfINSERT cfSELECT 10, .95 UNION ALLSELECT 12, 1.05 UNION ALLSELECT 15, 1.25 UNION ALLSELECT 17, 1.41 UNION ALLSELECT 20, 1.73 UNION ALLSELECT 22, 2 UNION ALLSELECT 25, 2.53 UNION ALLSELECT 27, 2.98 UNION ALLSELECT 30, 3.85 UNION ALLSELECT 32, 4.59 UNION ALLSELECT 35, 6.02SELECT 'Power regression' AS Type, A, b, R2FROM dbo.fnCurveFitting(1)UNION ALLSELECT 'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2FROM dbo.fnBestFit()DROP TABLE cf Peter LarssonHelsingborg, Sweden |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-01-09 : 08:02:03
|
Here are the functions.When using a type that is not valid, the function defaults to linear regression.CREATE FUNCTION dbo.fnCurveFitting( @Type TINYINT)RETURNS @p TABLE (A DECIMAL(38, 10), b DECIMAL(38, 10), R2 DECIMAL(38, 10))AS/*Type = 1 Linear y = a + b*xType = 2 Exponential y = a*e^(b*x) nb a > 0Type = 3 Logarithmic y = a + b*ln(x)Type = 4 Power y = a*x^b nb a > 0*/BEGIN DECLARE @n DECIMAL(38, 10), @x DECIMAL(38, 10), @x2 DECIMAL(38, 10), @y DECIMAL(38, 10), @xy DECIMAL(38, 10), @y2 DECIMAL(38, 10), @d DECIMAL(38, 10), @a DECIMAL(38, 10), @b DECIMAL(38, 10), @r2 DECIMAL(38, 10) SELECT @n = COUNT(*), @x = CASE WHEN @Type = 2 THEN SUM(x) WHEN @Type = 3 THEN SUM(LOG(x)) WHEN @Type = 4 THEN SUM(LOG(x)) ELSE SUM(x) END, @x2 = CASE WHEN @Type = 2 THEN SUM(x * x) WHEN @Type = 3 THEN SUM(LOG(x) * LOG(x)) WHEN @Type = 4 THEN SUM(LOG(x) * LOG(x)) ELSE SUM(x * x) END, @y = CASE WHEN @Type = 2 THEN SUM(LOG(y)) WHEN @Type = 3 THEN SUM(y) WHEN @Type = 4 THEN SUM(LOG(y)) ELSE SUM(y) END, @xy = CASE WHEN @Type = 2 THEN SUM(x * LOG(y)) WHEN @Type = 3 THEN SUM(LOG(x) * y) WHEN @Type = 4 THEN SUM(LOG(x) * LOG(y)) ELSE SUM(x * y) END, @y2 = CASE WHEN @Type = 2 THEN SUM(LOG(y) * LOG(y)) WHEN @Type = 3 THEN SUM(y * y) WHEN @Type = 4 THEN SUM(LOG(y) * LOG(y)) ELSE SUM(y * y) END, @d = @n * @x2 - @x * @x FROM cf IF @d = 0 RETURN SELECT @a = (@x2 * @y - @x * @xy) / @d, @b = (@n * @xy - @x * @y) / @d, @r2 = (@a * @y + @b * @xy - @y * @y / @n) / (@y2 - @y * @y / @n) INSERT @p SELECT CASE WHEN @Type = 2 THEN EXP(@a) WHEN @Type = 3 THEN @a WHEN @Type = 4 THEN EXP(@a) ELSE @a END, @b, @r2 RETURNENDCREATE FUNCTION dbo.fnBestFit()RETURNS @p TABLE (Type TINYINT, A DECIMAL(38, 10), b DECIMAL(38, 10), R2 DECIMAL(38, 10))ASBEGIN INSERT @p SELECT 1, A, b, R2 FROM dbo.fnCurveFitting(1) INSERT @p SELECT 2, A, b, R2 FROM dbo.fnCurveFitting(2) INSERT @p SELECT 3, A, b, R2 FROM dbo.fnCurveFitting(3) INSERT @p SELECT 4, A, b, R2 FROM dbo.fnCurveFitting(4) DELETE FROM @p WHERE R2 <> (SELECT MAX(R2) FROM @p) RETURNEND Peter LarssonHelsingborg, Sweden |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-01-09 : 08:02:51
|
Harsh, it seems that doing this SQL wise would be faster than doing it in front-end, since SQL Server does this set-based.Peter LarssonHelsingborg, Sweden |
|
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2007-01-09 : 08:08:07
|
Is it? Oh wow! That's a wonderful observation then.I think you are right, if you know the correct way to implement the solution, the language hardly matters.Harsh AthalyeIndia."The IMPOSSIBLE is often UNTRIED" |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-01-09 : 08:33:13
|
running those log running queries again, are we peter?Boredom on the horizon? Too much free time? Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-01-09 : 08:42:45
|
Free time? When writing algorithms here? No no... Peter LarssonHelsingborg, Sweden |
|
|
daydreamer82
Starting Member
3 Posts |
Posted - 2007-04-11 : 16:46:38
|
hello,This algorithm is providential for me. Im a student experimenting path discovering on Internet. I succeeded to put some traceroute data into a mysql database. And now, i tried to find a way to find shortests paths...Thanks a lot ;) PS. If you have the mysql version ... ;) |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
|
daydreamer82
Starting Member
3 Posts |
Posted - 2007-04-11 : 17:00:36
|
Ho yes, you are right. I lost myself during the registration ;) Soo sorry. Admin? hem... |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2007-04-11 : 17:05:39
|
Peter,Why stop half way?Shouldn't your script insert the data into an Excel spreadsheet, and create a graph with the proper trend line too?CODO ERGO SUM |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-04-11 : 17:07:53
|
I love you too, Michael Peter LarssonHelsingborg, Sweden |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2007-04-11 : 17:30:10
|
Sometimes, I just have to let out my inner twit.CODO ERGO SUM |
|
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2007-04-11 : 19:05:52
|
To keep the outer one company? e4 d5 xd5 Nf6 |
|
|
HumanJHawkins
Starting Member
4 Posts |
Posted - 2010-11-02 : 17:41:49
|
This is wonderful. It looks like it will be greatly helpful toward some trend calculation that I hope to do. I think I can use this as is, but I would like to learn from and understand it better. So, apologies in advance for the noob questions that follow...Can anyone point to places in the code if/where intermediate data (such as slope, intercept, or "c") are calculated? I think it's clear that r2 is R squared. But I've never seen "a" as a variable name in this sort of equation. So that makes me wonder if "b" is just a variable that was needed, or if it is the "b" that is commonly used in calculating regressions.Thanks for the code, and thanks in advance for any further explanation.HumanJHawkins |
|
|
HumanJHawkins
Starting Member
4 Posts |
Posted - 2010-11-02 : 19:03:07
|
One more follow-up... I'm using MS SQL Server 2005. In my dataset, my y-column includes negative numbers. I am getting the error: "A domain error occurred."I believe this is due to the use of "LOG(y)", where LOG() is expecting a positive number. Is there a common strategy for dealing with this issue?Again, thanks in advance,HumanJHawkins |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-11-03 : 18:40:40
|
Are you using the BestFit function? Well... That hasn't been optimized for this situations.However I have a SQLCLR that has! See http://www.developerworkshop.net/software.htmlThere are scripts and samples too. N 56°04'39.26"E 12°55'05.63" |
|
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-11-09 : 14:35:00
|
EXEC sp_configure 'clr enabled', 1; RECONFIGURE WITH OVERRIDE; With override option detailed here.http://msdn.microsoft.com/en-us/library/ms176069.aspx |
|
|
HumanJHawkins
Starting Member
4 Posts |
Posted - 2010-11-10 : 12:53:57
|
I guess I should get more sleep... I appear to add confusion with every visit. Anyway, here is a thread that describes how to get around issues one may encounter with enabling CLR:[url]http://social.msdn.microsoft.com/Forums/en/sqlnetfx/thread/50c11a95-046e-472e-b788-d12c091da1f5[/url]Thanks for all of your help. |
|
|
gingerninja
Starting Member
2 Posts |
Posted - 2010-12-08 : 06:10:24
|
Hi,Bit of a long shot this one, but I have stumbled on this forum post. I have a requirement to replicate some of the Excel Solver functionality within SQL. Ideally I wanted a purely TSQL function, but this looks unlikely now, so I'm also looking at CLR options.What I need to do is analyse some financial fund data (36 price values) against somewhere between 2-5 benchmarks (also 36 price values each) and determine the sensitivities (weights) of the benchmarks that best follow the fund. In other words find a set of benchmark weights which minimizes the tracking error between the resulting benchmark and the fund. It's explained perfectly here: - http://www.andreassteiner.net/performanceanalysis/?External_Performance_Analysis:Style_AnalysisThis is done with a few clicks in Excel, which is why it's so frustrating that I can't find a SQL Server contained solution. However, I do appreciate that it's a quadratic problem, so may not be so easily portable to SQL Server. I've have looked at the Frontline Solver (http://www.solver.com) and building a C# Dll, but I'd rather avoid that if possible.Anyone got any thoughts or experience that might assist?Many thanks,Stephen |
|
|
AvinashPatwari
Starting Member
1 Post |
Posted - 2011-10-12 : 01:51:31
|
Swepeso, I am so grateful for your post. IT saved my life. In my college project , I have to use similar kind of functionality for forecasting of events. But again, I will be more helpful if I get any more insight on Polynomial regression .. I am not able to find fool-proof and suitable method like yours. Your expertise is needed. |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2013-12-08 : 09:25:55
|
No, it's the final EXP.e^820.97 can not be represented in SQL Server. Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA |
|
|
Next Page
|
|
|
|
|