Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Median values for grouped data

Author  Topic 

austinv
Starting Member

3 Posts

Posted - 2013-06-13 : 12:41:48
I've searched web and forums to some extent and have seen several different ways of calculating a median average, but only for simple columns that aren't grouped. I have data that is similar to this and need to extract a median value for each "bank." Any help would be greatly appreciated!
    Declare @Temp Table(bank varchar(3), Data int)

Insert into @Temp Values('A01',1)
Insert into @Temp Values('A01',2)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A02',7)
Insert into @Temp Values('A02',9)
Insert into @Temp Values('A02',10)


So for A01 I need a median as well as A02. As there are hundreds of banks in the data, I'd rather not write a query for each.

MIK_2008
Master Smack Fu Yak Hacker

1054 Posts

Posted - 2013-06-13 : 12:57:22
Long time, I've studied about median .. it would be great if you could provide the formula for calculating median.

Cheers
MIK
Go to Top of Page

austinv
Starting Member

3 Posts

Posted - 2013-06-13 : 13:01:25
One of the more elegant ones I've seen is this:
    Declare @Temp Table(bank varchar(3), Data int)

Insert into @Temp Values('A01',1)
Insert into @Temp Values('A01',2)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A02',7)
Insert into @Temp Values('A02',9)
Insert into @Temp Values('A02',10)

SELECT x.bank
,x.data median
FROM @Temp x, @temp y
GROUP BY x.bank, x.data
HAVING
SUM(CASE WHEN y.data <= x.data
THEN 1 ELSE 0 END)>=(COUNT(*)+1)/2 AND
SUM(CASE WHEN y.data >= x.data
THEN 1 ELSE 0 END)>=(COUNT(*)/2)+1

but it ignores the difference in banks and only returns the median for all of the data values
Go to Top of Page

MIK_2008
Master Smack Fu Yak Hacker

1054 Posts

Posted - 2013-06-13 : 13:10:12
Is this what you're looking for?


Declare @Temp Table(bank varchar(3), Data int)

Insert into @Temp Values('A01',1)
Insert into @Temp Values('A01',2)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',5)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A01',6)
Insert into @Temp Values('A02',7)
Insert into @Temp Values('A02',9)
Insert into @Temp Values('A02',10)

select bank,
AVG(data) as Median
from
(
select bank,data,
ROW_NUMBER() over (partition by bank order by data ASC) as DataRank,
COUNT(*) over (partition by bank) as BankCount
from
@Temp
) x
where
x.DataRank in (x.BankCount/2+1, (x.BankCount+1)/2)
group by
x.bank


Cheers
MIK
Go to Top of Page

austinv
Starting Member

3 Posts

Posted - 2013-06-13 : 13:18:06
Absolutely! Thanks so much!
Go to Top of Page

MIK_2008
Master Smack Fu Yak Hacker

1054 Posts

Posted - 2013-06-13 : 13:21:48
you're welcome ..

Cheers
MIK
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2013-06-13 : 21:21:21
If you have sql 2012 check out: PERCENTILE_DISC and PERCENTILE_CONT

select distinct percentile_disc(.5) within group (order by data)
over (partition by bank)
from @temp




Be One with the Optimizer
TG
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2013-06-15 : 02:03:01
See http://www.sqltopia.com/?page_id=62



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page
   

- Advertisement -