Author |
Topic |
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-28 : 16:45:17
|
I've been tasked with determining the 95th Percentile price per part sold by my company in the past 6 months. Can anyone give me some direction of how one might calculate that number using SQL? |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2009-01-28 : 16:57:54
|
I think it would be helpful of you could provide an example of exactly what that means, with tables, sample data, and sample results.CODO ERGO SUM |
 |
|
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-28 : 17:08:59
|
Well, let's say you had a sales table with sales order line items. SO# Item # PartNo NetPrice 100000 1 Widget 21.50 100010 1 Widget 18.50 100100 4 Widget 23.50 100099 6 Widget 24.00 123000 1 Widget 14.50 How would you write a very basic query so one could determine the correct 95th percentile price. |
 |
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2009-01-28 : 17:24:49
|
come on David, you've been around these forums long enough to know what "tables, sample data, and results" mean Be One with the OptimizerTG |
 |
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2009-01-28 : 17:26:18
|
without trying to guess the specifics, have a look at the ranking function NTILE() in Books Online.Be One with the OptimizerTG |
 |
|
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-28 : 17:37:17
|
quote: Originally posted by TG come on David, you've been around these forums long enough to know what "tables, sample data, and results" mean Be One with the OptimizerTG
I'm not trying to be difficult. I've given a table, sample data, and what I need is the 95th percentile. Unfortunately, I'm using SQL 2000. I guess I don't know enough about what I'm asking about to phrase a better question. |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2009-01-28 : 17:47:00
|
So, based on the sample data you provided, what is the "95th Percentile price per part sold".CODO ERGO SUM |
 |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2009-01-28 : 17:47:34
|
95th percentile is approximately two standard deviations above the mean.So calculate the mean of your data and the standard deviation, and there you are.________________________________________________If it is not practically useful, then it is practically useless.________________________________________________ |
 |
|
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-29 : 12:37:20
|
In an effort to be helpful here is a very abbreviated sample table: SalesNO Price Part Number100000 100 Widget100001 99 Widget100002 98 Widget100003 97 Widget100004 96 Widget100005 95 Widget100006 94 Widget100007 93 Widget100008 92 Widget100009 91 Widget100010 90 Widget100011 89 Widget100012 88 Widget100013 100 Widget100014 99 Widget100015 98 Widget100016 97 Widget100017 96 Widget100018 95 Widget100019 94 Widget100020 93 Widget100021 92 Widget100022 91 Widget100023 90 Widget100024 89 Widget100025 88 Widget100026 87 Widget100027 86 Widget100028 85 Widget100029 84 Widget100030 83 Widget100031 82 Widget100032 81 Widget100033 80 Widget100034 100 Widget100035 99 Widget100036 98 Widget100037 97 Widget100038 96 Widget100039 95 Widget100040 94 Widget100041 93 Widget100042 92 Widget100043 91 Widget100044 90 Widget100045 89 Widget100046 88 Widget100047 87 Widget100048 86 Widget100049 85 Widget100050 84 Widget100051 100 Widget100052 99 Widget100053 98 Widget100054 97 Widget100055 96 Widget100056 95 Widget100057 94 Widget100058 93 Widget100059 92 Widget100060 91 Widget100061 90 Widget100062 89 Widget100063 88 Widget100064 87 Widget100065 86 Widget100066 85 Widget100067 84 Widget100068 100 Widget100069 99 Widget100070 98 Widget100071 97 Widget100072 96 Widget100073 95 Widget100074 94 Widget100075 93 Widget100076 92 Widget100077 91 Widget100078 90 Widget100079 89 Widget100080 88 Widget100081 87 Widget100082 86 Widget100083 85 Widget100084 84 Widget100085 83 Widget100086 82 Widget100087 81 Widget100088 80 Widget100089 79 Widget100090 78 Widget100091 77 Widget100092 76 Widget100093 75 Widget100094 74 Widget100095 73 Widget100096 72 Widget100097 71 Widget100098 70 Widget100099 69 Widget According to Excel the 95th percentile is 99.05. Any ideas? |
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-01-29 : 12:47:49
|
http://www.sqlteam.com/article/computing-percentiles-in-sql-server |
 |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2009-01-29 : 17:09:04
|
[code]select *into #TestDatafrom(select 100000 as SalesNO, 100 as Price, 'Widget' as PartNumberunion select 100001, 99, 'Widget'union select 100002, 98, 'Widget'union select 100003, 97, 'Widget'union select 100004, 96, 'Widget'union select 100005, 95, 'Widget'union select 100006, 94, 'Widget'union select 100007, 93, 'Widget'union select 100008, 92, 'Widget'union select 100009, 91, 'Widget'union select 100010, 90, 'Widget'union select 100011, 89, 'Widget'union select 100012, 88, 'Widget'union select 100013, 100, 'Widget'union select 100014, 99, 'Widget'union select 100015, 98, 'Widget'union select 100016, 97, 'Widget'union select 100017, 96, 'Widget'union select 100018, 95, 'Widget'union select 100019, 94, 'Widget'union select 100020, 93, 'Widget'union select 100021, 92, 'Widget'union select 100022, 91, 'Widget'union select 100023, 90, 'Widget'union select 100024, 89, 'Widget'union select 100025, 88, 'Widget'union select 100026, 87, 'Widget'union select 100027, 86, 'Widget'union select 100028, 85, 'Widget'union select 100029, 84, 'Widget'union select 100030, 83, 'Widget'union select 100031, 82, 'Widget'union select 100032, 81, 'Widget'union select 100033, 80, 'Widget'union select 100034, 100, 'Widget'union select 100035, 99, 'Widget'union select 100036, 98, 'Widget'union select 100037, 97, 'Widget'union select 100038, 96, 'Widget'union select 100039, 95, 'Widget'union select 100040, 94, 'Widget'union select 100041, 93, 'Widget'union select 100042, 92, 'Widget'union select 100043, 91, 'Widget'union select 100044, 90, 'Widget'union select 100045, 89, 'Widget'union select 100046, 88, 'Widget'union select 100047, 87, 'Widget'union select 100048, 86, 'Widget'union select 100049, 85, 'Widget'union select 100050, 84, 'Widget'union select 100051, 100, 'Widget'union select 100052, 99, 'Widget'union select 100053, 98, 'Widget'union select 100054, 97, 'Widget'union select 100055, 96, 'Widget'union select 100056, 95, 'Widget'union select 100057, 94, 'Widget'union select 100058, 93, 'Widget'union select 100059, 92, 'Widget'union select 100060, 91, 'Widget'union select 100061, 90, 'Widget'union select 100062, 89, 'Widget'union select 100063, 88, 'Widget'union select 100064, 87, 'Widget'union select 100065, 86, 'Widget'union select 100066, 85, 'Widget'union select 100067, 84, 'Widget'union select 100068, 100, 'Widget'union select 100069, 99, 'Widget'union select 100070, 98, 'Widget'union select 100071, 97, 'Widget'union select 100072, 96, 'Widget'union select 100073, 95, 'Widget'union select 100074, 94, 'Widget'union select 100075, 93, 'Widget'union select 100076, 92, 'Widget'union select 100077, 91, 'Widget'union select 100078, 90, 'Widget'union select 100079, 89, 'Widget'union select 100080, 88, 'Widget'union select 100081, 87, 'Widget'union select 100082, 86, 'Widget'union select 100083, 85, 'Widget'union select 100084, 84, 'Widget'union select 100085, 83, 'Widget'union select 100086, 82, 'Widget'union select 100087, 81, 'Widget'union select 100088, 80, 'Widget'union select 100089, 79, 'Widget'union select 100090, 78, 'Widget'union select 100091, 77, 'Widget'union select 100092, 76, 'Widget'union select 100093, 75, 'Widget'union select 100094, 74, 'Widget'union select 100095, 73, 'Widget'union select 100096, 72, 'Widget'union select 100097, 71, 'Widget'union select 100098, 70, 'Widget'union select 100099, 69, 'Widget') TestDataselect PartNumber, avg(Price) as Mean, stdev(Price) as StandardDeviation, avg(Price) + (stdev(Price) * 1.65) as '95thPercentile'from #TestDatagroup by PartNumberdrop table #TestData[/code]________________________________________________If it is not practically useful, then it is practically useless.________________________________________________ |
 |
|
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-30 : 09:06:01
|
So, my actual query is this: SELECT 'XXX' AS [BU Name], somast.fsono AS [Sales Order #], ' ' AS [Sales Order Type], somast.forderdate AS [Sales Order Date], aritem.fpartno AS [Part Number], aritem.FMDESCRIPT AS [Part Description], dbo.aritem.FPRODCL AS [Product Class], '' AS [Group Code], armast.fcinvoice AS [Invoice Number], armast.finvdate AS [Invoice Date], aritem.ftotprice AS [Invoice Amount], aritem.fordqty AS [Quantity Ordered], aritem.fshipqty AS [Quantity Invoiced], 0 AS [List Price Per Unit], dbo.aritem.FTOTTXNPRICE AS [Net/Invoice Price Per Unit], COALESCE(somefieldhere,0) AS [Discount Per Unit], COALESCE(tmp.[COGS Material],0) AS [COGS Material], COALESCE(tmp.[COGS Labor],0) AS [COGS Labor], COALESCE(tmp.[COGS Overhead],0) AS [COGS Overhead], slcdpm.FCUSRCHR1 AS [Customer Type], somast.fcustno AS [Customer Number], somast.fcompany AS [Customer Name], Syaddr.fccountry AS [Region], somast.fsoldby AS [Sales Rep] FROM armast INNER JOIN aritem ON armast.fcinvoice = aritem.fcinvoice INNER JOIN slcdpm ON slcdpm.fcustno = armast.fcustno INNER JOIN somast ON left (aritem.fsokey, 6) = somast.fsono INNER JOIN sorels ON aritem.fsokey = sorels.fsono + sorels.finumber + sorels.frelease INNER JOIN SYADDR ON sorels.fshptoaddr = syaddr.fcaddrkey AND somast.fcustno = syaddr.fcaliaskey INNER JOIN inmast on inmast.fpartno = aritem.fpartno and inmast.frev = aritem.FREVLEFT JOIN (SELECT jomast.fsono,(jopact.flabact + jopact.flabinv) as [COGS Labor],(jopact.fmatlact + jopact.fmatlinv) as [COGS Material],(jopact.fovhdact + jopact.fovhdinv) as [COGS Overhead] FROM sorels Inner JOIN jomast ON sorels.fsono = jomast.fsono AND sorels.finumber + sorels.frelease = jomast.fkey inner join jopact on jomast.fjobno = jopact.fjobno)tmpON tmp.fsono=somast.fsono The List Price needs to be calculated from 95th percentile of the item below it: dbo.aritem.FTOTTXNPRICE AS [Net/Invoice Price Per Unit].I read the article, but I'm still lost as to how to pull that off in a query such as this. I appreciate all the help so far, but I need a little more. |
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-01-30 : 09:32:09
|
just make a function as shown in link. then call it from your query passing reqd field to get percentile value. |
 |
|
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-30 : 09:49:28
|
How about if I change it into a query which populates a temporary table? If I use a UDF, won't it execute at each record of this query and would therefore drag? |
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-01-30 : 10:00:02
|
yup./..it will. or else try solution given by blindman. |
 |
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2009-01-30 : 10:17:46
|
Blindman, how did you come up with the constant: 1.65? If I "reverse engineer" from OPs desired answer of 99.05 then I get instead: (approx) 1.315667078050060605However, I also see some articles about how excel's percentile calc doesn't agree with statistical calculations.ie: Statistical flaws in Excelhttp://www.coventry.ac.uk/ec/~nhunt/pottel.pdfBe One with the OptimizerTG |
 |
|
DavidChel
Constraint Violating Yak Guru
474 Posts |
Posted - 2009-01-30 : 14:28:16
|
I have it working Blindman's way but I have no idea if it's accurate. |
 |
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2009-01-30 : 14:43:31
|
quote: Originally posted by DavidChel I have it working Blindman's way but I have no idea if it's accurate.
If Excel is your baseline then can't you spot check a few datasets there?Be One with the OptimizerTG |
 |
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2009-02-02 : 11:46:10
|
Remember there is a difference between the number of standard deviations from the mean required to encompass 95% of the data, and the number of standard deviations above zero that represents the CUMULATIVE percentage of data.I didn't use excel to verify or research it to come up with that constant. I just googled "cumulative percentile standard deviations", or some such thing. I don't guarantee the constant is the correct one he wants, but the technique and code should be the same.________________________________________________If it is not practically useful, then it is practically useless.________________________________________________ |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
|
|
|
|