2016-11-22 93 views

Averaging using Oracle analytics

How can I use analytic functions to get a rolling average when the sample size changes?

create table MyVals (Item_no char(10), epoch number, Yield number, Skip_Period char(1), Reset_Period char(1)); 

insert into MyVals values ('A00001',1705, 12, 'N','N');  /* 17.18181818 average of epochs 1705..1610 & 1607..1606 */  
insert into MyVals values ('A00001',1704, 13, 'N','N');  /* 19.45454545 average of epochs 1704..1610 & 1607..1605 */  
insert into MyVals values ('A00001',1703, 9, 'N','N');  /* 20.36363636 average of epochs 1703..1610 & 1607..1604 */  
insert into MyVals values ('A00001',1702, 11, 'N','N');  /* 21.5  average of epochs 1702..1610 & 1607..1604 */  
insert into MyVals values ('A00001',1701, 4, 'N','N');  /* 22.66666667 average of epochs 1701..1610 & 1607..1604 */  
insert into MyVals values ('A00001',1613, 16, 'N','N');  /* 25  average of epochs 1613..1610 & 1607..1604 */  
insert into MyVals values ('A00001',1612, 33, 'N','N');  /* 26.28571429 average of epochs 1612..1610 & 1607..1604 */  
insert into MyVals values ('A00001',1611, 2, 'N','N');  /* 25.16666667 average of epochs 1611..1610 & 1607..1604 */  
insert into MyVals values ('A00001',1610, 1, 'N','N');  /* 29.8  average of epochs 1610 & 1607..1604  */  
insert into MyVals values ('A00001',1609, 66, 'Y','N');  /* 37  average of epochs 1607..1604    */  
insert into MyVals values ('A00001',1608, 23, 'Y','N');  /* 37  average of epochs 1607..1604    */  
insert into MyVals values ('A00001',1607, 22, 'N','N');  /* 37  average of epochs 1607..1604    */  
insert into MyVals values ('A00001',1606, 66, 'N','N');  /* 42  average of epochs 1606..1604    */  
insert into MyVals values ('A00001',1605, 37, 'N','N');  /* 30  average of epochs 1605..1604    */  
insert into MyVals values ('A00001',1604, 23, 'N','Y');  /* 23  average of epochs 1604     */  
insert into MyVals values ('A00001',1603, 77, 'N','N');  /* 44.83333333 average of epochs 1603..1511    */  
insert into MyVals values ('A00001',1602, 15, 'N','N');  /* 38.4  average of epochs 1602..1511    */  
insert into MyVals values ('A00001',1601, 82, 'N','N');  /* 44.25 average of epochs 1601..1511    */  
insert into MyVals values ('A00001',1513, 4, 'N','N');  /* 31.66666667 average of epochs 1513..1511    */  
insert into MyVals values ('A00001',1512, 7, 'N','N');  /* 45.5  average of epochs 1512..1511    */  
insert into MyVals values ('A00001',1511, 84, 'N','N');  /* 84  average of epochs 1511     */  

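How do I get the average yield over at most the previous 13 records, counting only rows where Skip_Period = 'N' and not going back past a row where Reset_Period = 'Y'?

So the window for the average changes depending on the Skip_Period and Reset_Period values:

If a row has Reset_Period = 'Y', then don't go back any further than that record. If a row has Skip_Period = 'Y', then exclude that period from the average sample.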
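To pin the rule down, here is a minimal Python sketch (not Oracle SQL) that recomputes the expected values in the insert comments. The function name `rolling_averages` and the `max_span` parameter are illustrative assumptions; the data is the 21 rows inserted above, and the window is read as "the current row plus at most 12 earlier rows".

```python
# (epoch, yield, skip_period, reset_period), oldest first; values copied from the inserts
ROWS = [
    (1511, 84, "N", "N"), (1512, 7, "N", "N"), (1513, 4, "N", "N"),
    (1601, 82, "N", "N"), (1602, 15, "N", "N"), (1603, 77, "N", "N"),
    (1604, 23, "N", "Y"), (1605, 37, "N", "N"), (1606, 66, "N", "N"),
    (1607, 22, "N", "N"), (1608, 23, "Y", "N"), (1609, 66, "Y", "N"),
    (1610, 1, "N", "N"), (1611, 2, "N", "N"), (1612, 33, "N", "N"),
    (1613, 16, "N", "N"), (1701, 4, "N", "N"), (1702, 11, "N", "N"),
    (1703, 9, "N", "N"), (1704, 13, "N", "N"), (1705, 12, "N", "N"),
]


def rolling_averages(rows, max_span=13):
    """Return {epoch: average} under the rule described above (hypothetical helper)."""
    result = {}
    for i, row in enumerate(rows):
        sample = []
        # Walk backwards from the current row over at most max_span rows.
        window = rows[max(0, i - max_span + 1): i + 1]
        for _epoch, yld, skip, reset in reversed(window):
            if skip == "N":          # skipped periods use up a slot but add no value
                sample.append(yld)
            if reset == "Y":         # never look further back than a reset row
                break
        result[row[0]] = sum(sample) / len(sample) if sample else None
    return result


averages = rolling_averages(ROWS)
# averages[1705] is 189 / 11 (about 17.18), and averages[1604] is 23.0
```

Because the two skipped rows still occupy slots in the 13-row span, epoch 1705 averages only 11 values, which is the behaviour the expected figures show.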

I can't work out how to write a RANGE BETWEEN expression that gives me the rolling average I need using analytics.

Any suggestions are welcome :)

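The figures in the comments are the rolling averages I expect, based on a maximum of 13 rows, excluding rows with skip_period = 'Y' and not going back past a row with reset_period = 'Y' if such a row falls within the 13. – lidbanger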

Answer

I think this is what you're after:

WITH res AS (SELECT item_no,
                    epoch AS epoch_number,  -- the question's table names this column EPOCH
                    yield,
                    skip_period,
                    reset_period,
                    -- running count of reset rows: each reset_period = 'Y' starts a new group
                    SUM(CASE WHEN reset_period = 'Y' THEN 1 ELSE 0 END)
                      OVER (PARTITION BY item_no ORDER BY epoch) grp
             FROM   myvals)
SELECT item_no,
       epoch_number,
       yield,
       skip_period,
       reset_period,
       grp,
       -- AVG ignores NULLs, so skipped rows take a slot in the 13-row frame but add no value
       AVG(CASE WHEN skip_period = 'N' THEN yield END)
         OVER (PARTITION BY item_no, grp
               ORDER BY epoch_number
               ROWS 12 PRECEDING) rolling_avg_yield
FROM   res
ORDER BY epoch_number DESC;

ITEM_NO EPOCH_NUMBER YIELD SKIP_PERIOD RESET_PERIOD GRP ROLLING_AVG_YIELD
------- ------------ ----- ----------- ------------ --- -----------------
A00001          1705    12 N           N              1  17.1818181818182
A00001          1704    13 N           N              1  19.4545454545455
A00001          1703     9 N           N              1  20.3636363636364
A00001          1702    11 N           N              1              21.5
A00001          1701     4 N           N              1  22.6666666666667
A00001          1613    16 N           N              1                25
A00001          1612    33 N           N              1  26.2857142857143
A00001          1611     2 N           N              1  25.1666666666667
A00001          1610     1 N           N              1              29.8
A00001          1609    66 Y           N              1                37
A00001          1608    23 Y           N              1                37
A00001          1607    22 N           N              1                37
A00001          1606    66 N           N              1                42
A00001          1605    37 N           N              1                30
A00001          1604    23 N           Y              1                23
A00001          1603    77 N           N              0  44.8333333333333
A00001          1602    15 N           N              0              38.4
A00001          1601    82 N           N              0             44.25
A00001          1513     4 N           N              0  31.6666666666667
A00001          1512     7 N           N              0              45.5
A00001          1511    84 N           N              0                84

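First, you need to work out the groups you're averaging over. We can do this by generating a value of 1 or 0 depending on whether the row starts a new reset group (reset_period = 'Y'), and then doing a running sum over those values.

Once we have that, it's just a matter of including that column in the partition, and then doing a conditional average over the current row and the 12 preceding rows, counting only rows where the skip period is 'N'.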
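The two steps can be mimicked outside the database to see why they work. The following is a rough Python sketch, not the Oracle query itself: `grp` plays the role of the running SUM, and the hypothetical helper `rolling_avg` stands in for the AVG over `ROWS 12 PRECEDING` within one group. The rows are the ones from the question.

```python
from itertools import accumulate

# (epoch, yield, skip_period, reset_period), oldest first; values copied from the question
rows = [
    (1511, 84, "N", "N"), (1512, 7, "N", "N"), (1513, 4, "N", "N"),
    (1601, 82, "N", "N"), (1602, 15, "N", "N"), (1603, 77, "N", "N"),
    (1604, 23, "N", "Y"), (1605, 37, "N", "N"), (1606, 66, "N", "N"),
    (1607, 22, "N", "N"), (1608, 23, "Y", "N"), (1609, 66, "Y", "N"),
    (1610, 1, "N", "N"), (1611, 2, "N", "N"), (1612, 33, "N", "N"),
    (1613, 16, "N", "N"), (1701, 4, "N", "N"), (1702, 11, "N", "N"),
    (1703, 9, "N", "N"), (1704, 13, "N", "N"), (1705, 12, "N", "N"),
]

# Step 1: running sum of the reset flag, so every reset row opens a new group.
grp = list(accumulate(1 if reset == "Y" else 0 for *_, reset in rows))


def rolling_avg(i, frame=13):
    """Step 2: average non-skipped yields over the last `frame` rows of row i's group."""
    same_group = [j for j in range(i + 1) if grp[j] == grp[i]]
    window = same_group[-frame:]                      # current row plus up to 12 before it
    sample = [rows[j][1] for j in window if rows[j][2] == "N"]
    return sum(sample) / len(sample) if sample else None


latest = rolling_avg(len(rows) - 1)   # epoch 1705: 189 / 11, about 17.18
```

Partitioning by the group is what stops the frame at the reset row: rows from an earlier group are never candidates for the window, so no explicit "stop here" condition is needed.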

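This is almost right. The rule about the skip period is odd, which is why we disagree on the first two values. When skip period = 'Y', the value is excluded from the average, but the sample size is reduced as well. So the average for epoch 1705 is taken over 1705..1610 and 1607..1606, 11 samples in total. – lidbanger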

11? Shouldn't it be 13, i.e. 1705..1610 and 1607..1604? Also, it's only the first two averages that differ; the others agree. I think you might have gone wrong somewhere? – Boneist

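No, that's correct. It's just a weird business rule :/ Rows with skip period = 'Y' are excluded from the average, but the span from the start period to the end period can't exceed 13. – lidbanger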