i have table customers, users , revenue similar below (in reality thousands of records):
customer user revenue 001 james 500 002 james 750 003 james 450 004 sarah 100 005 sarah 500 006 sarah 150 007 sarah 600 008 james 150 009 james 100
what want return highest spending customers make 80% of total revenue user.
to manually order james' customers revenue, work out percentage of total , running total percentage, return records point running total hits 80%:
customer user revenue % of total running total % 002 james 750 0.38 0.38 001 james 500 0.26 0.64 003 james 450 0.23 0.87 <- greater 80%, last record 008 james 150 0.08 0.95 009 james 100 0.05 1.00
i've tried using cte far have come blank. there way via single query rather manually in excel sheet?
sql server 2012+
only
you use windowed sum
:
with cte ( select *, 1.0 * revenue/sum(revenue) over(partition [user]) percentile, 1.0 * sum(revenue) over(partition [user] order [revenue] desc) /sum(revenue) over(partition [user]) running_percentile tab ) select * cte running_percentile <= 0.8;
sql server 2008:
with cte ( select *, row_number() over(partition [user] order revenue desc) rn t ), cte2 ( select c.customer, c.[user], c.[revenue] ,percentile = 1.0 * revenue / nullif(c3.s,0) ,running_percentile = 1.0 * c2.s / nullif(c3.s,0) cte c cross apply (select sum(revenue) s cte c2 c.[user] = c2.[user] , c2.rn <= c.rn) c2 cross apply (select sum(revenue) s cte c2 c.[user] = c2.[user]) c3 ) select * cte2 running_percentile <= 0.8;
output:
╔══════════╦═══════╦═════════╦════════════════╦════════════════════╗ ║ customer ║ user ║ revenue ║ percentile ║ running_percentile ║ ╠══════════╬═══════╬═════════╬════════════════╬════════════════════╣ ║ 2 ║ james ║ 750 ║ 0,384615384615 ║ 0,384615384615 ║ ║ 1 ║ james ║ 500 ║ 0,256410256410 ║ 0,641025641025 ║ ║ 7 ║ sarah ║ 600 ║ 0,444444444444 ║ 0,444444444444 ║ ╚══════════╩═══════╩═════════╩════════════════╩════════════════════╝
edit 2:
that looks there, niggle it's missing last row, third row james takes him on 0.80 needs included.
with cte ( select *, row_number() over(partition [user] order revenue desc) rn t ), cte2 ( select c.customer, c.[user], c.[revenue] ,percentile = 1.0 * revenue / nullif(c3.s,0) ,running_percentile = 1.0 * c2.s / nullif(c3.s,0) cte c cross apply (select sum(revenue) s cte c2 c.[user] = c2.[user] , c2.rn <= c.rn) c2 cross apply (select sum(revenue) s cte c2 c.[user] = c2.[user]) c3 ) select a.* cte2 cross apply (select min(running_percentile) rp cte2 running_percentile >= 0.8 , cte2.[user] = a.[user]) s a.running_percentile <= s.rp;
output:
╔══════════╦═══════╦═════════╦════════════════╦════════════════════╗ ║ customer ║ user ║ revenue ║ percentile ║ running_percentile ║ ╠══════════╬═══════╬═════════╬════════════════╬════════════════════╣ ║ 2 ║ james ║ 750 ║ 0,384615384615 ║ 0,384615384615 ║ ║ 1 ║ james ║ 500 ║ 0,256410256410 ║ 0,641025641025 ║ ║ 3 ║ james ║ 450 ║ 0,230769230769 ║ 0,871794871794 ║ ║ 7 ║ sarah ║ 600 ║ 0,444444444444 ║ 0,444444444444 ║ ║ 5 ║ sarah ║ 500 ║ 0,370370370370 ║ 0,814814814814 ║ ╚══════════╩═══════╩═════════╩════════════════╩════════════════════╝
looks perfect, translated big table , returns need, spent 5 minutes working through , still can't follow you've done!
sql server 2008
not support in over()
clause, row_number
does.
first cte calculate position within group:
╔═══════════╦════════╦══════════╦════╗ ║ customer ║ user ║ revenue ║ rn ║ ╠═══════════╬════════╬══════════╬════╣ ║ 2 ║ james ║ 750 ║ 1 ║ ║ 1 ║ james ║ 500 ║ 2 ║ ║ 3 ║ james ║ 450 ║ 3 ║ ║ 8 ║ james ║ 150 ║ 4 ║ ║ 9 ║ james ║ 100 ║ 5 ║ ║ 7 ║ sarah ║ 600 ║ 1 ║ ║ 5 ║ sarah ║ 500 ║ 2 ║ ║ 6 ║ sarah ║ 150 ║ 3 ║ ║ 4 ║ sarah ║ 100 ║ 4 ║ ╚═══════════╩════════╩══════════╩════╝
second cte:
c2
subquery calculate running total based on rankrow_number
c3
calculate full sum per user
in final query s
subquery finds lowest running
total exceeds 80%.
edit 3:
using row_number
redundant.
with cte ( select c.customer, c.[user], c.[revenue] ,percentile = 1.0 * revenue / nullif(c3.s,0) ,running_percentile = 1.0 * c2.s / nullif(c3.s,0) t c cross apply (select sum(revenue) s t c2 c.[user] = c2.[user] , c2.revenue >= c.revenue) c2 cross apply (select sum(revenue) s t c2 c.[user] = c2.[user]) c3 ) select a.* cte cross apply (select min(running_percentile) rp cte c2 running_percentile >= 0.8 , c2.[user] = a.[user]) s a.running_percentile <= s.rp order [user], revenue desc;
Comments
Post a Comment