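After being stuck on something similar for a few hours, I came up with a less messy Teradata query that outputs the 25th, 50th, and 75th percentiles. It can be extended to produce a "five-number summary". For the minimum and maximum, change the static values (0.01 and 0.99 here) to suit your population estimates.
Someone asked for an elegant approach, so I am sharing mine.
Here is the code: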
SELECT MAX(PER_MIN) AS PER_MIN,
       MAX(PER_25)  AS PER_25,
       MAX(PER_50)  AS PER_50,
       MAX(PER_75)  AS PER_75,
       MAX(PER_MAX) AS PER_MAX
FROM (
    -- Each CASE keeps the value only on the row whose rank matches the target position
    SELECT CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.01 AS INT)
                THEN DURATION_MACRO_CURR END AS PER_MIN,
           CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.25 AS INT)
                THEN DURATION_MACRO_CURR END AS PER_25,
           CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.50 AS INT)
                THEN DURATION_MACRO_CURR END AS PER_50,
           CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.75 AS INT)
                THEN DURATION_MACRO_CURR END AS PER_75,
           CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.99 AS INT)
                THEN DURATION_MACRO_CURR END AS PER_MAX
    FROM PROD_EXP_DL_CVM.PROD_CVM
    WHERE PW_END_DATE = '2016-10-18'
) BASE
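To make the row-matching rule in the query above easier to check by hand, here is a minimal Python sketch of the same idea. It is only an illustration, not Teradata code: the function name `percentile_by_position` and the sample data are hypothetical, and it assumes the `CAST(... AS INT)` truncates toward zero. Percentages are passed as whole numbers so the arithmetic stays exact, similar to DECIMAL arithmetic in SQL.

```python
def percentile_by_position(values, percent):
    """Return the value whose 1-based rank equals (count * percent) // 100.

    Mirrors: ROW_NUMBER() OVER (ORDER BY x) = CAST(COUNT(*) OVER () * p AS INT),
    assuming the cast truncates. Returns None when no row can match.
    """
    ordered = sorted(values)
    target = (len(ordered) * percent) // 100  # truncated target row number
    if target < 1:
        # ROW_NUMBER() starts at 1, so a target of 0 matches no row (NULL in SQL)
        return None
    return ordered[target - 1]


# Hypothetical sample: durations 1..100
durations = list(range(1, 101))
summary = {
    "PER_MIN": percentile_by_position(durations, 1),
    "PER_25": percentile_by_position(durations, 25),
    "PER_50": percentile_by_position(durations, 50),
    "PER_75": percentile_by_position(durations, 75),
    "PER_MAX": percentile_by_position(durations, 99),
}
print(summary)
```

One thing the sketch makes visible: with very few rows, the truncated target position can be 0, in which case no row matches and that column comes back empty. With three rows, for example, `percentile_by_position([5, 1, 3], 1)` returns `None`.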
Here is the desired output:
I would use conditional aggregation to do this:
select min(DURATION_MACRO_CURR) as min_val,
       -- aliases that start with a digit are quoted so they are valid identifiers
       min(case when seqnum / 0.25 >= cnt then DURATION_MACRO_CURR end) as "25_percentile",
       min(case when seqnum / 0.50 >= cnt then DURATION_MACRO_CURR end) as "50_percentile",
       min(case when seqnum / 0.75 >= cnt then DURATION_MACRO_CURR end) as "75_percentile",
       max(DURATION_MACRO_CURR) as max_val
from (select pc.*,
             row_number() over (order by DURATION_MACRO_CURR) as seqnum,
             count(*) over () as cnt
      from PROD_EXP_DL_CVM.PROD_CVM pc
      where pc.PW_END_DATE = '2016-10-18'
     ) pc;
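The threshold test in the conditional-aggregation query can be traced with a small Python sketch. This is an illustration of the logic only, with a hypothetical function name and made-up sample values: each row gets a 1-based `seqnum` in sorted order, and the percentile is the smallest value whose `seqnum / fraction` reaches the total row count.

```python
def percentile_by_threshold(values, fraction):
    """Return the smallest value whose rank satisfies seqnum / fraction >= cnt.

    Mirrors: min(case when seqnum / p >= cnt then x end).
    Returns None for empty input, like MIN over no rows yielding NULL.
    """
    ordered = sorted(values)
    cnt = len(ordered)
    candidates = [
        value
        for seqnum, value in enumerate(ordered, start=1)  # row_number() equivalent
        if seqnum / fraction >= cnt
    ]
    return min(candidates) if candidates else None


# Hypothetical sample data
durations = [40, 10, 30, 20]
five_number = {
    "min_val": min(durations),
    "25_percentile": percentile_by_threshold(durations, 0.25),
    "50_percentile": percentile_by_threshold(durations, 0.50),
    "75_percentile": percentile_by_threshold(durations, 0.75),
    "max_val": max(durations),
}
print(five_number)
```

Because the last row always satisfies `seqnum / fraction >= cnt` for any fraction up to 1, this rule returns a value for every non-empty input, even when there are only a handful of rows.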