2017-04-19 146 views
1

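Filter to non-overlapping ranges - Amazon RedShift

Background: I have ranges that are updated frequently to set the price for different quantities of material. Once certain quotas are met, the price drops. The problem is determining the current price after a range has been updated or added.

I am looking to filter the non-contiguous ranges out of a dataset. Here is some test code: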

drop table if exists public.test_ranges; 
create table public.test_ranges (
    category  integer 
    ,lower_bound integer 
    ,upper_bound integer 
    ,cost   numeric(10,2) 
    ,modifieddate timestamp 
); 

insert into public.test_ranges values (1,0,70456,0,'2015-09-29'); 
insert into public.test_ranges values (1,53956,60000,1.28,'2015-02-11'); 
insert into public.test_ranges values (1,70456,90000,1.02,'2015-09-29'); 
insert into public.test_ranges values (1,90000,120000,0.88,'2015-02-11'); 
insert into public.test_ranges values (1,120000,999999999,0.79,'2015-02-11'); 

insert into public.test_ranges values (2,0,48786,0,'2015-11-02'); 
insert into public.test_ranges values (2,22500,25000,0.43,'2015-02-17'); 
insert into public.test_ranges values (2,48786,50000,0.37,'2015-11-02'); 
insert into public.test_ranges values (2,50000,100000,0.21,'2015-02-17'); 
insert into public.test_ranges values (2,100000,175000,0.19,'2015-02-17'); 
insert into public.test_ranges values (2,175000,999999999,0.17,'2015-02-17'); 

insert into public.test_ranges values (3,0,585969,0,'2015-11-02'); 
insert into public.test_ranges values (3,346667,375000,0.15,'2014-09-12'); 
insert into public.test_ranges values (3,375000,500000,0.14,'2014-09-12'); 
insert into public.test_ranges values (3,500000,600000,0.13,'2014-09-12'); 
insert into public.test_ranges values (3,585969,999999999,0.02,'2015-11-02'); 
insert into public.test_ranges values (3,600000,670000,0.12,'2014-09-12'); 

select * from public.test_ranges order by 1,2; 

That code returns:

category  lower_bound  upper_bound  cost  modifieddate
--------  -----------  -----------  ----  ------------
1         0            70456        0.00  2015-09-29
1         53956        60000        1.28  2015-02-11
1         70456        90000        1.02  2015-09-29
1         90000        120000       0.88  2015-02-11
1         120000       999999999    0.79  2015-02-11
2         0            48786        0.00  2015-11-02
2         22500        25000        0.43  2015-02-17
2         48786        50000        0.37  2015-11-02
2         50000        100000       0.21  2015-02-17
2         100000       175000       0.19  2015-02-17
2         175000       999999999    0.17  2015-02-17
3         0            585969       0.00  2015-11-02
3         346667       375000       0.15  2014-09-12
3         375000       500000       0.14  2014-09-12
3         500000       600000       0.13  2014-09-12
3         585969       999999999    0.02  2015-11-02
3         600000       670000       0.12  2014-09-12

Desired result:

category  lower_bound  upper_bound  cost  modifieddate
--------  -----------  -----------  ----  ------------
1         0            70456        0.00  2015-09-29
1         70456        90000        1.02  2015-09-29
1         90000        120000       0.88  2015-02-11
1         120000       999999999    0.79  2015-02-11
2         0            48786        0.00  2015-11-02
2         48786        50000        0.37  2015-11-02
2         50000        100000       0.21  2015-02-17
2         100000       175000       0.19  2015-02-17
2         175000       999999999    0.17  2015-02-17
3         0            585969       0.00  2015-11-02
3         585969       999999999    0.02  2015-11-02

Thanks in advance for any help.

+0

Redshift or Postgres? –

+0

Redshift........ – Josh

+0

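Could you clarify your requirements? Are you saying that rows fully contained within other rows should not be shown? What happens if rows partially overlap (e.g. 1-10 and 5-15)? Also, I assume the upper_bound value is exclusive ("less than" rather than "less than or equal to")? –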

Answer

0

This cannot be done perfectly without recursive common table expressions. At the time of writing, Redshift does not support them.
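For illustration only, here is a rough sketch of what the recursive approach could look like on an engine that does support recursive CTEs. It runs in SQLite through Python's sqlite3 module, which is an assumption made so the example is runnable; it is not Redshift. The sketch also assumes that every chain starts at lower_bound = 0 and that at most one row per category starts at any given lower_bound. The names `CHAIN_QUERY` and `contiguous_ranges` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (category, lower_bound, upper_bound, cost, modifieddate).
ROWS = [
    (1, 0, 70456, 0.00, "2015-09-29"),
    (1, 53956, 60000, 1.28, "2015-02-11"),
    (1, 70456, 90000, 1.02, "2015-09-29"),
    (1, 90000, 120000, 0.88, "2015-02-11"),
    (1, 120000, 999999999, 0.79, "2015-02-11"),
    (2, 0, 48786, 0.00, "2015-11-02"),
    (2, 22500, 25000, 0.43, "2015-02-17"),
    (2, 48786, 50000, 0.37, "2015-11-02"),
    (2, 50000, 100000, 0.21, "2015-02-17"),
    (2, 100000, 175000, 0.19, "2015-02-17"),
    (2, 175000, 999999999, 0.17, "2015-02-17"),
    (3, 0, 585969, 0.00, "2015-11-02"),
    (3, 346667, 375000, 0.15, "2014-09-12"),
    (3, 375000, 500000, 0.14, "2014-09-12"),
    (3, 500000, 600000, 0.13, "2014-09-12"),
    (3, 585969, 999999999, 0.02, "2015-11-02"),
    (3, 600000, 670000, 0.12, "2014-09-12"),
]

# Seed each category with the row that starts at 0, then repeatedly append the
# row whose lower_bound equals the upper_bound of the row added before it.
# Assumption: no row has lower_bound = upper_bound, so the recursion terminates.
CHAIN_QUERY = """
with recursive chain as (
    select category, lower_bound, upper_bound, cost, modifieddate
    from test_ranges
    where lower_bound = 0
    union all
    select t.category, t.lower_bound, t.upper_bound, t.cost, t.modifieddate
    from chain c
    join test_ranges t
      on t.category = c.category
     and t.lower_bound = c.upper_bound
)
select category, lower_bound, upper_bound, cost, modifieddate
from chain
order by category, lower_bound
"""


def contiguous_ranges(rows):
    """Return only the rows that form an unbroken chain starting at 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table test_ranges (category integer, lower_bound integer, "
        "upper_bound integer, cost real, modifieddate text)"
    )
    conn.executemany("insert into test_ranges values (?, ?, ?, ?, ?)", rows)
    kept = conn.execute(CHAIN_QUERY).fetchall()
    conn.close()
    return kept


if __name__ == "__main__":
    for row in contiguous_ranges(ROWS):
        print(row)
```

On the sample data this keeps 11 rows, the same ones listed under the desired result. Rows such as 53956-60000 are never reached because no kept row ends where they begin.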

A partial solution (it will not give you the correct result for category 3):

-- Keep a row if it starts at 0, ends at the maximum (999999999),
-- or touches a neighbouring row on both sides.
select tr1.*
from public.test_ranges tr1
    left join public.test_ranges tr_left
        on tr1.category = tr_left.category and tr1.lower_bound = tr_left.upper_bound
    left join public.test_ranges tr_right
        on tr1.category = tr_right.category and tr_right.lower_bound = tr1.upper_bound
where tr1.lower_bound = 0
    or tr1.upper_bound = 999999999
    or (tr_left.upper_bound is not null and tr_right.lower_bound is not null)
order by tr1.category, tr1.lower_bound;
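
To see where category 3 goes wrong, the same query can be run against just those rows. The sketch below does this in SQLite through Python's sqlite3 module, again an assumption made for the sake of a runnable example; the `public.` schema prefix is dropped because SQLite has no such schema, and the names `NEIGHBOUR_QUERY` and `neighbour_filter` are hypothetical.

```python
import sqlite3

# Category 3 rows from the question:
# (category, lower_bound, upper_bound, cost, modifieddate).
CATEGORY_3 = [
    (3, 0, 585969, 0.00, "2015-11-02"),
    (3, 346667, 375000, 0.15, "2014-09-12"),
    (3, 375000, 500000, 0.14, "2014-09-12"),
    (3, 500000, 600000, 0.13, "2014-09-12"),
    (3, 585969, 999999999, 0.02, "2015-11-02"),
    (3, 600000, 670000, 0.12, "2014-09-12"),
]

# The partial solution: a row is kept when it starts at 0, ends at the
# maximum, or has a touching neighbour on both its left and its right.
NEIGHBOUR_QUERY = """
select tr1.*
from test_ranges tr1
    left join test_ranges tr_left
        on tr1.category = tr_left.category and tr1.lower_bound = tr_left.upper_bound
    left join test_ranges tr_right
        on tr1.category = tr_right.category and tr_right.lower_bound = tr1.upper_bound
where tr1.lower_bound = 0
    or tr1.upper_bound = 999999999
    or (tr_left.upper_bound is not null and tr_right.lower_bound is not null)
order by tr1.category, tr1.lower_bound
"""


def neighbour_filter(rows):
    """Run the neighbour-check query on the given rows and return what it keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table test_ranges (category integer, lower_bound integer, "
        "upper_bound integer, cost real, modifieddate text)"
    )
    conn.executemany("insert into test_ranges values (?, ?, ?, ?, ?)", rows)
    kept = conn.execute(NEIGHBOUR_QUERY).fetchall()
    conn.close()
    return kept


if __name__ == "__main__":
    for row in neighbour_filter(CATEGORY_3):
        print(row)
```

The rows 375000-500000 and 500000-600000 each touch a neighbour on both sides, so they pass the check even though the chain they belong to begins at the 346667 row, which is itself excluded. The query only looks one step in each direction, which is why following the whole chain needs recursion.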