2016-08-14
0

Append missing month and year values to a SELECT

I have this SELECT:

SELECT month, year, ROUND(AVG(q_overall) OVER (rows BETWEEN 10000 preceding and current row),2) as avg 
FROM (
    SELECT EXTRACT(Month FROM date) as month, EXTRACT(Year FROM date) as year, ROUND(AVG(q_overall),1) as q_overall 
    FROM fb_parsed 
    WHERE business_id = 1 
    GROUP BY year, month 
    ORDER BY year, month) a 

Output:

month  year  avg
-----------------
12     2012  5
1      2013  4.5
2      2013  4.1
4      2013  4.8
5      2013  4.7

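I need to add the missing rows to this result (in this example, month 3 of 2013 is missing). The avg must be the same as in the previous row, which means I need to append this row to the result:

3      2013  4.1

Can I do this with a self join and generate_series, or with some UNION select?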

Answers

1

You can simplify your query. It doesn't need a subquery:

SELECT EXTRACT(Month FROM date) as month, 
     EXTRACT(Year FROM date) as year, 
     ROUND(AVG(q_overall), 1) as q_overall, 
     ROUND(AVG(AVG(q_overall)) OVER (rows BETWEEN 10000 preceding and current row), 2) 
FROM fb_parsed 
WHERE business_id = 1 
GROUP BY year, month; 

The window function needs an ORDER BY. I assume you really intend:

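SELECT EXTRACT(Month FROM date) as month, 
     EXTRACT(Year FROM date) as year, 
     ROUND(AVG(q_overall), 1) as q_overall, 
     -- the SELECT aliases are not visible inside OVER (...), so the expressions are repeated 
     ROUND(AVG(AVG(q_overall)) OVER (ORDER BY EXTRACT(Year FROM date), EXTRACT(Month FROM date)), 2) as avg 
FROM fb_parsed 
WHERE business_id = 1 
GROUP BY year, month; 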

Then, to fill in the missing months, you can use generate_series():

SELECT EXTRACT(Month FROM ym.date) as month, 
     EXTRACT(Year FROM ym.date) as year, 
     ROUND(AVG(AVG(p.q_overall)) OVER (ORDER BY EXTRACT(Year FROM ym.date), EXTRACT(Month FROM ym.date)), 2) as avg 
     -- (the window ORDER BY repeats the EXTRACT expressions because the aliases year and month are not visible there) 
FROM (-- one row per calendar month between the first and the last date 
      SELECT generate_series(date_trunc('month', min(date)), 
          date_trunc('month', max(date)), 
          interval '1 month') as date 
     FROM fb_parsed 
    ) ym LEFT JOIN 
    fb_parsed p 
    ON EXTRACT(year FROM ym.date) = EXTRACT(year FROM p.date) AND 
     EXTRACT(month FROM ym.date) = EXTRACT(month FROM p.date) AND 
     p.business_id = 1 
GROUP BY year, month; 

I think this does what you want.
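
To see why the LEFT JOIN against generate_series() gives the missing month the same average as the previous row, here is a minimal self-contained sketch. The fb_parsed CTE is only a hypothetical stand-in with made-up rows, and joining on date_trunc() and grouping by ym.date are my own simplifications, not the exact query from the answer.

```sql
-- Hypothetical stand-in for fb_parsed; all rows below are made-up sample data.
WITH fb_parsed(business_id, date, q_overall) AS (
    VALUES (1, DATE '2012-12-10', 5.0),
           (1, DATE '2013-01-15', 4.0),
           (1, DATE '2013-02-20', 3.0),
           (1, DATE '2013-04-05', 5.0)   -- March 2013 is missing on purpose
),
ym AS (
    -- one row per calendar month between the first and the last date
    SELECT generate_series(date_trunc('month', min(date)),
                           date_trunc('month', max(date)),
                           interval '1 month') AS date
    FROM fb_parsed
    WHERE business_id = 1
)
SELECT EXTRACT(month FROM ym.date) AS month,
       EXTRACT(year FROM ym.date) AS year,
       -- inner AVG: per-month average (NULL for a month without rows);
       -- outer AVG: running average of the monthly averages, NULLs skipped
       ROUND(AVG(AVG(p.q_overall)) OVER (ORDER BY ym.date), 2) AS avg
FROM ym
LEFT JOIN fb_parsed p
       ON date_trunc('month', p.date) = ym.date
      AND p.business_id = 1
GROUP BY ym.date
ORDER BY ym.date;
```

With this sample data the avg column should come out as 5.00, 4.50, 4.00, 4.00, 4.25. March 2013 has no rows, so its inner AVG is NULL, the outer AVG skips it, and the running value stays at 4.00.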

+0

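Thank you very much, Gordon, you really helped me. PostgreSQL threw some errors on your last query because it doesn't know year and month in the ORDER BY on line 3, and the fb_parsed table holds a lot of other data, so I had to add some WHERE clauses, but now it works perfectly. Thanks. – Michal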

0

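Can I do this with self joins and generate_series?

Yes, you're close, but your current query calculates a cumulative average. The tricky part is filling the gaps with the previous value (this would be easier if PostgreSQL supported the IGNORE NULLS option of LAST_VALUE ...)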

SELECT month, 
     year, 
     MAX(q_overall) -- assign the value to all rows within the same group 
     OVER (PARTITION BY grp) AS q_overall 
FROM 
(
    SELECT EXTRACT(month FROM all_months.date) AS month, 
     EXTRACT(year FROM all_months.date) AS year, 
     p.q_overall, 
     -- assign a new group number whenever there's a value in q_overall 
     SUM(CASE WHEN p.q_overall IS NULL THEN 0 ELSE 1 END) 
     OVER (ORDER BY all_months.date 
      ROWS UNBOUNDED PRECEDING) AS grp 
    FROM 
    (-- create all months between min and max date 
     SELECT generate_series(date_trunc('month', min(date)), 
           date_trunc('month', max(date)), 
           interval '1 month') as date 
     FROM fb_parsed 
    ) AS all_months 
    LEFT JOIN 
    (-- do the average per month calculation 
     SELECT EXTRACT(Month FROM date) as month, 
       EXTRACT(Year FROM date) as year, 
       ROUND(AVG(q_overall),1) as q_overall 
     FROM fb_parsed 
     WHERE business_id = 1 
     GROUP BY year, month 
    ) AS p 
    -- match each generated month to its monthly average, if there is one 
    ON EXTRACT(year FROM all_months.date) = p.year 
    AND EXTRACT(month FROM all_months.date) = p.month 
) AS dt 
ORDER BY year, month; 
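
The gap-filling trick used in the query above (a running count of non-NULL values as the group number, then MAX per group) can be tried in isolation. This is only a sketch: the monthly CTE is a hypothetical, hand-typed stand-in for the per-month averages, with the missing month already present as a NULL row.

```sql
-- Hypothetical monthly averages; March 2013 has no value.
WITH monthly(year, month, q_overall) AS (
    VALUES (2012, 12, 5.0),
           (2013,  1, 4.5),
           (2013,  2, 4.1),
           (2013,  3, NULL),
           (2013,  4, 4.8)
)
SELECT year,
       month,
       -- every group holds exactly one non-NULL value, so MAX picks it
       MAX(q_overall) OVER (PARTITION BY grp) AS filled
FROM (
    SELECT year, month, q_overall,
           -- the counter goes up only on rows that have a value,
           -- so a NULL row stays in the group of the row before it
           SUM(CASE WHEN q_overall IS NULL THEN 0 ELSE 1 END)
           OVER (ORDER BY year, month
                 ROWS UNBOUNDED PRECEDING) AS grp
    FROM monthly
) AS dt
ORDER BY year, month;
```

Here the filled column should read 5.0, 4.5, 4.1, 4.1, 4.8, i.e. March 2013 takes over the February value.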

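Edit:

Oops, this is overcomplicated. You asked for a cumulative average, so the NULLs don't change the result, and there's no need to fill the gaps with the previous value.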
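
A quick way to convince yourself that NULLs don't change a cumulative average is a tiny sketch like this one. The table t and its columns n and x are made up for illustration.

```sql
-- Row 3 has no value; AVG simply skips it.
SELECT n,
       x,
       ROUND(AVG(x) OVER (ORDER BY n), 2) AS running_avg
FROM (VALUES (1, 5.0),
             (2, 4.0),
             (3, NULL),
             (4, 3.0)) AS t(n, x)
ORDER BY n;
```

running_avg should be 5.00, 4.50, 4.50, 4.00: the NULL row just repeats the previous running value, which is exactly what the question asks for.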

0

Final query:

SELECT EXTRACT(Month FROM ym.date) as month, 
     EXTRACT(Year FROM ym.date) as year, 
     ROUND(AVG(AVG(q_overall)) OVER (ORDER BY EXTRACT(Year FROM ym.date), EXTRACT(Month FROM ym.date)), 2) 
FROM 
(SELECT generate_series(date_trunc('month', min(date)), 
         date_trunc('month', max(date)), 
         interval '1 month') as date 
FROM fb_parsed WHERE business_id = 1 AND site = 'facebook') 
ym LEFT JOIN 
    fb_parsed p 
    ON EXTRACT(year FROM ym.date) = EXTRACT(year FROM p.date) AND 
     EXTRACT(month FROM ym.date) = EXTRACT(month FROM p.date) AND 
     p.business_id = 1 AND site = 'facebook' 
GROUP BY year, month;