2017-08-16 86 views

PostgreSQL query: column sum for the latest available date of each month

Given a PSQL table that looks like this:

date  | data 
2015-01-23 | 15 
2015-01-23 | 11 
2015-02-25 | 15 
2015-02-25 | 11 
2015-01-25 | 24 
2015-01-25 | 2 
2015-01-25 | 13 
2015-01-29 | 5 
2015-02-28 | 12 
2015-02-28 | 1 
2015-05-15 | 12 
2015-05-16 | 1 
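
For anyone who wants to reproduce this, the sample rows above could be loaded roughly as follows. This is only a sketch: the table name my_table and the column types (date, integer) are assumptions, since the question does not show the table definition.

```sql
-- Assumed definition; the real table may use other types or extra columns.
CREATE TABLE my_table (
    date date,
    data integer
);

-- The twelve sample rows from the question.
INSERT INTO my_table (date, data) VALUES
    ('2015-01-23', 15),
    ('2015-01-23', 11),
    ('2015-02-25', 15),
    ('2015-02-25', 11),
    ('2015-01-25', 24),
    ('2015-01-25', 2),
    ('2015-01-25', 13),
    ('2015-01-29', 5),
    ('2015-02-28', 12),
    ('2015-02-28', 1),
    ('2015-05-15', 12),
    ('2015-05-16', 1);
```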

How can I get the sum of data for the last available date of each month? Example result:

date  | data 
2015-01-29 | 5 
2015-02-28 | 13 
2015-05-16 | 1 

This is what I have tried so far:

SELECT year,month,max(day),sum(data) FROM 
    (
    SELECT 
     date, 
     date_part('year', date) AS year, 
     date_part('month', date) AS month, 
     date_part('day', date) AS day, 
     sum(data)    AS tdata 
    FROM table a 
    GROUP BY date, date_part('year', date), date_part('month', date), date_part('day', date) 
    ORDER BY year ASC, month ASC, day ASC 
) dataq 
GROUP BY year,month 

The sums I get from this seem to be wrong.

Please don't tag unrelated products. – Strawberry

Answers

1

You should compute the sums in the inner query, grouping by day. In the outer query, select the latest date in each month:

select distinct on (year, month) 
    make_date(year::int, month::int, day::int) as date, 
    data 
from (
    select 
     date_part('year', date) as year, 
     date_part('month', date) as month, 
     date_part('day', date) as day, 
     sum(data) as data 
    from my_table 
    group by date 
    ) s 
order by year, month, day desc 

    date | data 
------------+------ 
2015-01-29 | 5 
2015-02-28 | 13 
2015-05-16 | 1 
(3 rows)  
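
The same idea can probably be written without splitting the date into parts. A sketch, assuming the my_table, date and data names from the query above: group by the full date, then let DISTINCT ON keep only the latest date inside each date_trunc('month', ...) bucket.

```sql
-- Sum per day first, then keep one row per month: the one with the latest date.
SELECT DISTINCT ON (date_trunc('month', date))
    date,
    sum(data) AS data
FROM my_table
GROUP BY date
-- DISTINCT ON keeps the first row of each month in this ordering,
-- so sorting the dates in descending order picks the last available day.
ORDER BY date_trunc('month', date), date DESC;
```

On the sample data from the question this should give the same three rows as the result shown above.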

Thanks, this works great. – r1pster

1

I think you just need to exclude the days you don't want to sum, for example with NOT EXISTS, as follows:

SELECT year,month,max(day),sum(tdata) tdata FROM 
    (
    SELECT 
     d, 
     date_part('year', d) AS year, 
     date_part('month', d) AS month, 
     date_part('day', d) AS day, 
     sum(data)    AS tdata 
    FROM tab a 
    WHERE NOT EXISTS 
    (
     SELECT * 
     FROM tab a2 
     WHERE date_part('year', a.d) = date_part('year', a2.d) AND 
      date_part('month', a.d) = date_part('month', a2.d) AND 
      date_part('day', a.d) < date_part('day', a2.d) 
    ) 
    GROUP BY d, date_part('year', d), date_part('month', d), date_part('day', d) 
    ORDER BY year ASC, month ASC, day ASC 
) dataq 
GROUP BY year,month 

SQLFiddle

Thanks. This works, but for some reason it is very slow. Running it on a table with 500k rows takes about 15 seconds. – r1pster
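
A likely reason for the slowness is that the NOT EXISTS subquery is correlated and compares date_part() results, so it may be evaluated against many rows for each outer row. One possible alternative, sketched here with the tab and d names from the answer above and not benchmarked on the 500k-row table, is to find the last date of each month once and join back to it:

```sql
-- Step 1 (subquery m): one row per month holding its latest available date.
-- Step 2: join the table to those dates and sum only the matching rows.
SELECT t.d,
       sum(t.data) AS tdata
FROM tab t
JOIN (
    SELECT max(d) AS last_d
    FROM tab
    GROUP BY date_trunc('month', d)
) m ON t.d = m.last_d
GROUP BY t.d
ORDER BY t.d;
```

Whether this is actually faster depends on the data and the planner. An index on d (hypothetical, not mentioned in the thread) might help the join, and EXPLAIN ANALYZE would show what is really happening.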
