2013-04-04 164 views
0

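I have a table called classifications that contains a classification_indicator_id column.
I need to sum this ID per day over a 1-day series.
I need to add about 20 columns (each for a different classification_indicator_id).
I slightly modified the answer to my previous question (generate_series() in PostgreSQL does not work as expected):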

select 
data.d::date as "data", 
sum(c.classification_indicator_id)::integer as "Segment1", 
sum(c4.classification_indicator_id)::integer as "Segment2", 
sum(c5.classification_indicator_id)::integer as "Segment3" 
from 
    generate_series(
    '2013-03-25'::timestamp without time zone, 
    '2013-04-01'::timestamp without time zone, 
    '1 day'::interval 
) data(d) 
left join classifications c on (data.d::date = c.created::date and c.classification_indicator_id = 3) 
left join classifications c4 on (data.d::date = c4.created::date and c4.classification_indicator_id = 4) 
left join classifications c5 on (data.d::date = c5.created::date and c5.classification_indicator_id = 5) 
group by "data" 
ORDER BY "data" 

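But it still does not work properly. The sums are far too large in every row, and they grow as I add more columns. In the second table (with 4 columns), Segment1 for 2013-03-26 should be the same as in the first table, and so on.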

With 3 columns                       With 4 columns
data       | Segment1 | Segment2     data       | Segment1 | Segment2 | Segment3
-----------+----------+---------     -----------+----------+----------+---------
2013-03-25 |       12 |       16     2013-03-25 |       12 |       16 |       20
2013-03-26 |       18 |       24     2013-03-26 |      108 |      144 |      180

Answers

2

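As commented under your previous answer, you are running into much the same "proxy cross join": every additional LEFT JOIN on the same day multiplies the rows for that day, which inflates the sums.
I explained it in more detail in this related answer:
Two SQL LEFT JOINS produce incorrect result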
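
To make the row multiplication concrete, here is a minimal sketch. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the toy rows (two rows with id 3 and three rows with id 4 on one day) are made up for the demonstration.

```python
import sqlite3

# Toy stand-in for the classifications table (SQLite, not PostgreSQL).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE classifications (created TEXT, classification_indicator_id INTEGER)"
)
rows = [("2013-03-26", 3)] * 2 + [("2013-03-26", 4)] * 3  # made-up sample data
con.executemany("INSERT INTO classifications VALUES (?, ?)", rows)

# Two LEFT JOINs on the same day: every id-3 row pairs with every id-4 row,
# so 2 x 3 = 6 joined rows feed both sums.
joined = con.execute("""
    SELECT count(*),
           sum(c3.classification_indicator_id),
           sum(c4.classification_indicator_id)
    FROM (SELECT '2013-03-26' AS created) d
    LEFT JOIN classifications c3
           ON c3.created = d.created AND c3.classification_indicator_id = 3
    LEFT JOIN classifications c4
           ON c4.created = d.created AND c4.classification_indicator_id = 4
""").fetchone()

# Sums taken from the table directly, without the multiplying joins.
correct = con.execute("""
    SELECT sum(CASE WHEN classification_indicator_id = 3 THEN classification_indicator_id END),
           sum(CASE WHEN classification_indicator_id = 4 THEN classification_indicator_id END)
    FROM classifications
    WHERE created = '2013-03-26'
""").fetchone()

print(joined)   # (6, 18, 24)
print(correct)  # (6, 12)
```

The joined sums are each inflated by the number of matching rows in the other join, which is why the numbers grow with every column that is added.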

Your query should work like this:

SELECT d.created AS data 
     ,c3.segment1 
     ,c4.segment2 
     ,c5.segment3 
FROM (
    SELECT generate_series('2013-03-25'::date 
         ,'2013-04-01'::date 
         ,interval '1 day')::date AS created 
    ) d 
LEFT JOIN (
    SELECT created 
      ,sum(classification_indicator_id)::integer AS segment1 
    FROM classifications 
    WHERE classification_indicator_id = 3 
    GROUP BY 1 
    ) c3 USING (created) 
LEFT JOIN (
    SELECT created 
      ,sum(classification_indicator_id)::integer AS segment2 
    FROM classifications 
    WHERE classification_indicator_id = 4 
    GROUP BY 1 
    ) c4 USING (created) 
LEFT JOIN (
    SELECT created 
      ,sum(classification_indicator_id)::integer AS segment3 
    FROM classifications 
    WHERE classification_indicator_id = 5 
    GROUP BY 1 
    ) c5 USING (created) 
ORDER BY 1; 

This assumes created is a date, not a timestamp.
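
If created were a timestamp instead, an equality join against the date series would only match rows stamped exactly at midnight, so each subquery would have to reduce the value to a calendar day first (created::date in PostgreSQL). A rough sketch of that per-day aggregation, assuming SQLite via Python's sqlite3 as a stand-in, with its date() function playing the role of the cast and made-up sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE classifications (created TEXT, classification_indicator_id INTEGER)"
)
# Made-up rows with a time-of-day component.
con.executemany(
    "INSERT INTO classifications VALUES (?, ?)",
    [
        ("2013-03-25 09:15:00", 3),
        ("2013-03-25 17:40:00", 3),
        ("2013-03-26 08:00:00", 3),
    ],
)

# Aggregate per calendar day; date() stands in for PostgreSQL's created::date.
per_day = con.execute("""
    SELECT date(created) AS day,
           sum(classification_indicator_id) AS segment1
    FROM classifications
    WHERE classification_indicator_id = 3
    GROUP BY date(created)
    ORDER BY 1
""").fetchall()

print(per_day)  # [('2013-03-25', 6), ('2013-03-26', 3)]
```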

Or an even faster query, since that has become a topic:

SELECT d.created AS data 
     ,count(classification_indicator_id = 3 OR NULL)::int * 3 AS segment1 
     ,count(classification_indicator_id = 4 OR NULL)::int * 4 AS segment2 
     ,count(classification_indicator_id = 5 OR NULL)::int * 5 AS segment3 
FROM (
    SELECT generate_series('2013-03-25'::date 
         ,'2013-04-01'::date 
         ,interval '1 day')::date AS created 
    ) d 
LEFT JOIN classifications c USING (created) 
GROUP BY 1 
ORDER BY 1; 
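
The count(... OR NULL) idiom relies on count() ignoring NULL: the expression is TRUE when the comparison holds and NULL otherwise, so only matching rows are counted, and multiplying the count by the ID gives the same total as summing the ID. A small sketch of the idea, assuming SQLite through Python's sqlite3 as a stand-in for PostgreSQL; the three-day UNION ALL list replaces generate_series() and the sample rows are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE classifications (created TEXT, classification_indicator_id INTEGER)"
)
con.executemany(
    "INSERT INTO classifications VALUES (?, ?)",
    [("2013-03-25", 3), ("2013-03-25", 4), ("2013-03-25", 4), ("2013-03-26", 5)],
)

# One join, then conditional counts; days without rows stay in the result.
result = con.execute("""
    SELECT d.created,
           count(c.classification_indicator_id = 3 OR NULL) * 3 AS segment1,
           count(c.classification_indicator_id = 4 OR NULL) * 4 AS segment2,
           count(c.classification_indicator_id = 5 OR NULL) * 5 AS segment3
    FROM (SELECT '2013-03-25' AS created
          UNION ALL SELECT '2013-03-26'
          UNION ALL SELECT '2013-03-27') d
    LEFT JOIN classifications c ON c.created = d.created
    GROUP BY d.created
    ORDER BY d.created
""").fetchall()

for row in result:
    print(row)
# ('2013-03-25', 3, 8, 0)
# ('2013-03-26', 0, 0, 5)
# ('2013-03-27', 0, 0, 0)
```

One difference from the sum() variant: count() yields 0 for days with no matching rows, whereas sum() over no rows yields NULL.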

Thank you, we are testing this solution now, and I will let you know whether it helps us. But we think it will :) – ssuperczynski 2013-04-04 12:57:28


It works! And it is faster for us :) – ssuperczynski 2013-04-04 13:13:43


We will use both solutions, they are both great. From 10000 ms down to 100 ms, that's it! – ssuperczynski 2013-04-04 13:22:00

2

No need for multiple joins:

select 
    data.d::date as "data", 
    sum((classification_indicator_id = 3)::integer * classification_indicator_id)::integer as "Segment1", 
    sum((classification_indicator_id = 4)::integer * classification_indicator_id)::integer as "Segment2", 
    sum((classification_indicator_id = 5)::integer * classification_indicator_id)::integer as "Segment3"
from 
    generate_series(
     '2013-03-25'::timestamp without time zone, 
     '2013-04-01'::timestamp without time zone, 
     '1 day'::interval 
    ) data(d) 
    left join 
    classifications c on data.d::date = c.created::date 
group by "data" 
ORDER BY "data" 
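
Since about 20 such columns are needed, the column list can be generated instead of typed by hand. The sketch below is only an illustration under assumptions: it runs against SQLite through Python's sqlite3 rather than PostgreSQL, the helper segment_sums is hypothetical, the sample rows are made up, and the join to the day series is left out for brevity. It builds the boolean-cast columns and an equivalent CASE formulation and checks that both give the same sums; it says nothing about which one is faster.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE classifications (created TEXT, classification_indicator_id INTEGER)"
)
con.executemany(
    "INSERT INTO classifications VALUES (?, ?)",
    [("2013-03-25", 3), ("2013-03-25", 3), ("2013-03-25", 4), ("2013-03-26", 5)],
)

ids = (3, 4, 5)  # extend this tuple to cover all the IDs that are needed


def segment_sums(column_template):
    """Hypothetical helper: build one aggregate column per ID and run the query."""
    cols = ", ".join(column_template.format(i=i) for i in ids)
    sql = f"SELECT created, {cols} FROM classifications GROUP BY created ORDER BY created"
    return con.execute(sql).fetchall()


# Boolean cast to 0/1 multiplied by the ID, as in the query above.
by_cast = segment_sums(
    "sum(CAST(classification_indicator_id = {i} AS INTEGER) * classification_indicator_id)"
)
# The same conditional sum written with CASE.
by_case = segment_sums(
    "sum(CASE WHEN classification_indicator_id = {i} THEN classification_indicator_id ELSE 0 END)"
)

print(by_cast)  # [('2013-03-25', 6, 4, 0), ('2013-03-26', 0, 0, 5)]
print(by_case == by_cast)  # True
```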

This might be faster than multiple joins. 'CASE' would be faster still. – 2013-04-04 13:11:53


@Erwin _MIGHT_ ??? You must be kidding. Or my English is shaky and I don't understand what _might_ means :)) – 2013-04-04 13:14:18


I will test it too and let you know. – ssuperczynski 2013-04-04 13:14:18