2016-07-06 87 views
2

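Currently I have this fairly large PostgreSQL query, which gets the daily, weekly, and monthly averages of event occurrences in a single query. It works as follows: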

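  1. Aggregate the daily, weekly, and monthly counts into intermediate tables by taking the count() of events grouped by event name and date.
  2. Select the average count from each intermediate table with avg() grouped by event, then union the results. Because I want a separate column for each of daily, weekly, and monthly, I put a filler value of 0 into the columns that would otherwise be empty.
  3. Sum all the columns. The 0s basically act as a no-op, so I end up with exactly one value per column for each event.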

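The query is fairly large, though, and I feel like I'm doing a lot of repetitive work. Is there any way to make this query perform better or make it smaller? I haven't really written queries like this before, so I'm not sure.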

WITH monthly_counts as (
    SELECT 
    event, 
    count(*) as count 
    FROM tracking_stuff 
    WHERE 
    event = 'thing' 
    OR event = 'thing2' 
    OR event = 'thing3' 
    GROUP BY event, date_trunc('month', created_at) 
), 
weekly_counts as (
    SELECT 
    event, 
    count(*) as count 
    FROM tracking_stuff 
    WHERE 
    event = 'thing' 
    OR event = 'thing2' 
    OR event = 'thing3' 
    GROUP BY event, date_trunc('week', created_at) 
), 
daily_counts as (
    SELECT 
    event, 
    count(*) as count 
    FROM tracking_stuff 
    WHERE 
    event = 'thing' 
    OR event = 'thing2' 
    OR event = 'thing3' 
    GROUP BY event, date_trunc('day', created_at) 
), 
query as (
    SELECT 
    event, 
    0 as daily_avg, 
    0 as weekly_avg, 
    avg(count) as monthly_avg 
    FROM monthly_counts 
    GROUP BY event 
    UNION 
    SELECT 
    event, 
    0 as daily_avg, 
    avg(count) as weekly_avg, 
    0 as monthly_avg 
    FROM weekly_counts 
    GROUP BY event 
    UNION 
    SELECT 
    event, 
    avg(count) as daily_avg, 
    0 as weekly_avg, 
    0 as monthly_avg 
    FROM daily_counts 
    GROUP BY event 
) 
SELECT 
    event, 
    sum(daily_avg) as daily_avg, 
    sum(weekly_avg) as weekly_avg, 
    sum(monthly_avg) as monthly_avg 
FROM query 
GROUP BY event; 

Answers

1

I would write the query this way:

select event, daily_avg, weekly_avg, monthly_avg 
from (
    select event, avg(count) monthly_avg 
    from (
     select event, count(*) 
     from tracking_stuff 
     where event in ('thing1', 'thing2', 'thing3') 
     group by event, date_trunc('month', created_at) 
    ) s 
    group by 1 
) monthly 
join (
    select event, avg(count) weekly_avg 
    from (
     select event, count(*) 
     from tracking_stuff 
     where event in ('thing1', 'thing2', 'thing3') 
     group by event, date_trunc('week', created_at) 
    ) s 
    group by 1 
) weekly using(event) 
join (
    select event, avg(count) daily_avg 
    from (
     select event, count(*) 
     from tracking_stuff 
     where event in ('thing1', 'thing2', 'thing3') 
     group by event, date_trunc('day', created_at) 
    ) s 
    group by 1 
) daily using(event) 
order by 1; 

If the where condition eliminates a significant part of the data (say, more than half), using a cte may make the query run slightly faster:

with the_data as (
    select event, created_at 
    from tracking_stuff 
    where event in ('thing1', 'thing2', 'thing3') 
    ) 

select event, daily_avg, weekly_avg, monthly_avg 
from (
    select event, avg(count) monthly_avg 
    from (
     select event, count(*) 
     from the_data 
     group by event, date_trunc('month', created_at) 
    ) s 
    group by 1 
) monthly 
-- etc ... 

Just out of curiosity, I ran a test on generated data:

create table tracking_stuff (event text, created_at timestamp); 
insert into tracking_stuff 
    select 'thing' || random_int(9), '2016-01-01'::date+ random_int(365) 
    from generate_series(1, 1000000); 
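
Note that random_int is not a PostgreSQL built-in, so the test data above needs a helper to be defined first. The answer does not show its definition. The sketch below is only an assumption: it returns an integer from 1 to max_value inclusive, which would make the events thing1 through thing9 and is consistent with the three filtered events keeping about one third of the rows.

```sql
-- Hypothetical helper, NOT shown in the original answer.
-- Assumed behaviour: a uniformly distributed integer in 1..max_value.
create function random_int(max_value int) returns int
language sql volatile
as $$
    -- random() is in [0, 1), so floor(...) is in 0..max_value-1
    select floor(random() * max_value)::int + 1;
$$;
```

Create this function before running the insert statement above.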

In each query I replaced thing with thing1, so the queries exclude 2/3 of the rows.

Average execution time over 10 tests:

Original query          1106 ms
My query without cte    1077 ms
My query with cte        902 ms
Clodoaldo's query       5187 ms
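
The answer does not say how these timings were taken. One common way to get comparable numbers, shown here only as a sketch, is to wrap each candidate query in explain analyze and read the "Execution Time" line, repeating the run several times and averaging. The monthly subquery stands in for a full query:

```sql
-- Runs the query for real and reports the plan with actual timings.
explain (analyze, buffers)
select event, avg(count) as monthly_avg
from (
    select event, count(*)
    from tracking_stuff
    where event in ('thing1', 'thing2', 'thing3')
    group by event, date_trunc('month', created_at)
) s
group by event;
```

The buffers option also shows how much of the table was served from cache, which helps explain differences between the first run and later ones.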
+0

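Just a really quick question, without having checked any facts... aren't joins more expensive than unions? And is there any reason not to use 'with' other than preference? – m0meni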

+1

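In this case the difference between 'union' and 'join' should be imperceptible. A similar remark probably applies to using a 'cte'. I usually use 'with' when I need recursion. – klin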

+0

A 'CTE' is an optimization fence for the planner. It may or may not make a difference. –
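
As a side note on the optimization fence: in PostgreSQL 12 and later, the fence can be controlled explicitly. By default, a non-recursive, side-effect-free CTE that is referenced only once may be folded into the outer query, while a CTE referenced several times (as the_data is in the answer above) is still computed once and materialized. The sketch below uses the answer's names and forces materialization. Writing not materialized instead asks the planner to inline the CTE.

```sql
-- PostgreSQL 12 or later only; older versions always materialize CTEs.
with the_data as materialized (
    select event, created_at
    from tracking_stuff
    where event in ('thing1', 'thing2', 'thing3')
)
select event, avg(count) as monthly_avg
from (
    select event, count(*)
    from the_data
    group by event, date_trunc('month', created_at)
) s
group by event;
```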

3

In 9.5+, use grouping sets:

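The data selected by the FROM and WHERE clauses is grouped separately by each specified grouping set, aggregates computed for each group just as for simple GROUP BY clauses, and then the results returned.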

select event, 
    avg(total) filter (where day is not null) as avg_day, 
    avg(total) filter (where week is not null) as avg_week, 
    avg(total) filter (where month is not null) as avg_month  
from (
    select 
     event, 
     date_trunc('day', created_at) as day, 
     date_trunc('week', created_at) as week, 
     date_trunc('month', created_at) as month, 
     count(*) as total 
    from tracking_stuff 
    where event in ('thing','thing2','thing3') 
    group by grouping sets ((event, 2), (event, 3), (event, 4)) 
) s 
group by event 
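
To see why the filter clauses pick out the right rows, it may help to run the inner query on its own. The sketch below spells out the grouping expressions instead of using column positions, but is otherwise the same subquery. Each output row belongs to exactly one grouping set, and the columns that are not part of that set come back as NULL, so a row with a non-null day is a daily count, and so on.

```sql
select
    event,
    date_trunc('day', created_at)   as day,
    date_trunc('week', created_at)  as week,
    date_trunc('month', created_at) as month,
    count(*) as total
from tracking_stuff
where event in ('thing', 'thing2', 'thing3')
group by grouping sets (
    (event, date_trunc('day', created_at)),    -- week and month are NULL
    (event, date_trunc('week', created_at)),   -- day and month are NULL
    (event, date_trunc('month', created_at))   -- day and week are NULL
)
order by event, day, week, month;
```

This relies on created_at never being NULL. If it can be, a NULL from the data is indistinguishable from a NULL introduced by the grouping set, and the grouping() function would be the safer test.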
+0

That's a very interesting tip! Although my intuition tells me this query should be fairly expensive. – klin