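SQL Server Analytic Functions

Analytic functions compute a value over a group of rows (a partition) and are always used together with the OVER clause. Unlike an aggregate with GROUP BY, which collapses each group into one row, an analytic function returns a value for every row in the group. This makes it easy to compute year-over-year and period-over-period comparisons and to obtain medians.

Use the following script to insert the sample data: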

;with cte_data as 
(
select 'Document Control' as Department,'Arifin' as LastName,17.78 as Rate 
union all 
select 'Document Control','Norred',16.82 
union all 
select 'Document Control','Kharatishvili',16.82
union all 
select 'Document Control','Chai',10.25 
union all 
select 'Document Control','Berge',10.25 
union all 
select 'Information Services','Trenary',50.48
union all 
select 'Information Services','Conroy',39.66 
union all 
select 'Information Services','Ajenstat',38.46
union all 
select 'Information Services','Wilson',38.46
union all 
select 'Information Services','Sharma',32.45
union all 
select 'Information Services','Connelly',32.45
union all 
select 'Information Services','Berg',27.40
union all 
select 'Information Services','Meyyappan',27.40
union all 
select 'Information Services','Bacon',27.40
union all 
select 'Information Services','Bueno',27.40
)
select Department
    ,LastName
    ,Rate
into #data
from cte_data
go


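1. CUME_DIST and PERCENT_RANK

CUME_DIST is computed as: (number of rows in the partition whose value is less than or equal to the current row's value) / (total number of rows in the partition). The result is always greater than 0 and at most 1.

PERCENT_RANK is computed as: (RANK of the current row within the partition - 1) / (total number of rows in the partition - 1). The first row in each partition always gets 0.

The following code computes the cumulative distribution and the percent rank, together with the rank and the row count they are derived from: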

select Department
    ,LastName
    ,Rate
    ,cume_dist() over(partition by Department order by Rate) as CumeDist
    ,percent_rank() over(partition by Department order by Rate) as PctRank
    ,rank() over(partition by Department order by Rate asc) as rank_number
    ,count(0) over(partition by Department) as count_in_group
from #data
order by Department
    ,Rate desc
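
To check the two formulas by hand, the query below recomputes both values from plain window counts and places them next to the built-in results. It is only a sketch against the #data table created above; the column names CumeDistManual and PctRankManual are made up for this illustration.

```sql
select Department
    ,LastName
    ,Rate
    ,cume_dist() over(partition by Department order by Rate) as CumeDist
    -- rows with Rate <= current Rate (RANGE includes ties) / rows in the partition
    ,count(*) over(partition by Department order by Rate
                   range between unbounded preceding and current row) * 1.0
        / count(*) over(partition by Department) as CumeDistManual
    ,percent_rank() over(partition by Department order by Rate) as PctRank
    -- (rank - 1) / (rows in the partition - 1); a one-row partition yields 0
    ,case when count(*) over(partition by Department) = 1 then 0
          else (rank() over(partition by Department order by Rate) - 1) * 1.0
               / (count(*) over(partition by Department) - 1)
     end as PctRankManual
from #data
order by Department
    ,Rate desc
```

For example, the two 'Document Control' rows with Rate 16.82 have 4 of 5 rows at or below them, so CumeDist is 0.8; their rank is 3, so the percent rank is (3 - 1) / (5 - 1) = 0.5.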

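2. PERCENTILE_CONT and PERCENTILE_DISC

PERCENTILE_CONT and PERCENTILE_DISC both return the value of a column at a given percentile. They differ in that the former is continuous (CONT) and the latter is discrete (DISC). PERCENTILE_CONT treats the ordered values as a continuous distribution and interpolates linearly between the two neighboring values when the percentile falls between rows, so the result may be a value that does not appear in the data. PERCENTILE_DISC never interpolates: it returns the smallest value in the partition whose CUME_DIST is greater than or equal to the requested percentile, so the result is always an actual value from the data.

The following script returns the median (the 0.5 percentile) of each department: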

select Department
    ,LastName
    ,Rate
    ,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Rate) 
                            OVER (PARTITION BY Department) AS MedianCont
    ,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Rate) 
                            OVER (PARTITION BY Department) AS MedianDisc
    ,row_number() over(partition by Department order by Rate) as rn
from #data
order by Department
    ,Rate asc
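
With this sample data the two medians happen to coincide in both departments (16.82 and 32.45), so the difference between the functions is not visible. As an illustration only, the sketch below asks for the 0.9 percentile instead, where the requested position falls between two distinct rates; the aliases P90Cont and P90Disc are made up for this example.

```sql
select distinct Department
    -- interpolates between the two rows surrounding position 1 + 0.9 * (N - 1)
    ,PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Rate)
                            OVER (PARTITION BY Department) AS P90Cont
    -- returns the smallest Rate whose CUME_DIST is >= 0.9
    ,PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY Rate)
                            OVER (PARTITION BY Department) AS P90Disc
from #data
order by Department
```

Working through the interpolation by hand, 'Document Control' (5 rows) has position 1 + 0.9 * 4 = 4.6, which gives 0.4 * 16.82 + 0.6 * 17.78 = 17.396 for P90Cont, while P90Disc should be 17.78. For 'Information Services' (10 rows) the position is 9.1, which gives 0.9 * 39.66 + 0.1 * 50.48 = 40.742, while P90Disc should be 39.66.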

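3. LAG and LEAD

LAG and LEAD return, within a single query and without a self-join, the value of a column from the row N rows before (LAG) or N rows after (LEAD) the current row in the partition's order. The second argument is the offset N and the third is the default returned when no such row exists. This makes them well suited to year-over-year and period-over-period calculations.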

select Department
    ,LastName
    ,Rate
    ,lag(Rate,1,0) over(partition by Department order by LastName) as LastRate
    ,lead(Rate,1,0) over(partition by Department order by LastName) as NextRate
from #data
order by Department
    ,LastName
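
The sample table has no time column, so the year-over-year use case is easier to see on a small time series. The sketch below uses hypothetical quarterly sales figures (the sales CTE, its columns, and its amounts are invented for illustration) and assumes there are no missing quarters, since LAG counts rows rather than calendar periods.

```sql
;with sales as
(
select SalesYear, SalesQuarter, Amount
from (values
     (2022,1,100.0),(2022,2,110.0),(2022,3,120.0),(2022,4,130.0)
    ,(2023,1,125.0),(2023,2,121.0),(2023,3,150.0),(2023,4,143.0)
    ) as v(SalesYear, SalesQuarter, Amount)  -- hypothetical data
)
select SalesYear
    ,SalesQuarter
    ,Amount
    -- previous quarter: offset 1; no default, so the first row gets NULL
    ,lag(Amount,1) over(order by SalesYear, SalesQuarter) as PrevQuarter
    -- same quarter last year: offset 4, because there are 4 rows per year
    ,lag(Amount,4) over(order by SalesYear, SalesQuarter) as SameQuarterLastYear
    -- period-over-period change; nullif guards against division by zero
    ,(Amount - lag(Amount,1) over(order by SalesYear, SalesQuarter))
        / nullif(lag(Amount,1) over(order by SalesYear, SalesQuarter),0) as QoQChange
    -- year-over-year change
    ,(Amount - lag(Amount,4) over(order by SalesYear, SalesQuarter))
        / nullif(lag(Amount,4) over(order by SalesYear, SalesQuarter),0) as YoYChange
from sales
order by SalesYear
    ,SalesQuarter
```

Leaving out the third argument makes LAG return NULL instead of 0 when there is no earlier row, which keeps the ratios NULL for the first periods rather than producing misleading values.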

References:

Analytic Functions (Transact-SQL)