2015-05-14
4

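SQL averages and NULLs

I have an application that logs data from sensors, and I would like to be able to take the average across multiple sensors, which could be one, two, three, or many more.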

Edit: these are temperature sensors, so 0 is a value that a sensor may legitimately store in the database.

My starting point was this SQL query:

SELECT grid.t5||'.000000' as ts, 
avg(t.sensorvalue) sensorvalue1 
, avg(w.sensorvalue)AS sensorvalue2 
FROM 
(SELECT generate_series(min(date_trunc('hour', ts))       
,max(ts), interval '5 min') AS t5 FROM device_history_20865735 where  
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid 

LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min' 
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min' 
--WHERE t.sensorvalue notnull 
GROUP BY grid.t5 ORDER BY grid.t5 

I take 5-minute averages because that suits my application better.

As expected, the result has NULL values for either sensorvalue1 or sensorvalue2:

ts;sensorvalue1;sensorvalue2 
"2015-05-13 09:00:00.000000";19.9300003051758; 
"2015-05-13 09:05:00.000000";20; 
"2015-05-13 09:10:00.000000";; 
"2015-05-13 09:15:00.000000";20.0599994659424; 
"2015-05-13 09:20:00.000000";; 
"2015-05-13 09:25:00.000000";20.1200008392334; 

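My goal is to compute, for every 5-minute interval, the average over all available sensors. Since the NULL values are the problem, I thought of using a CASE statement so that, if one value is NULL, I take the other sensor's value instead...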

SELECT grid.t5||'.000000' as ts, 
CASE 
     WHEN avg(t.sensorvalue) ISNULL THEN avg(w.sensorvalue) 
     ELSE avg(t.sensorvalue) 
END AS sensorvalue 
, 
CASE 
     WHEN avg(w.sensorvalue) ISNULL THEN avg(t.sensorvalue) 
     ELSE avg(w.sensorvalue) 
END AS sensorvalue2 
FROM 
(SELECT generate_series(min(date_trunc('hour', ts)),max(ts), interval '5 min') AS t5 
FROM device_history_20865735 where  
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid 

LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min' 
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min' 
GROUP BY grid.t5 ORDER BY grid.t5 

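But then, to compute the average, I have to wrap this in another SELECT and divide by the number of columns (i.e. sensors). With just two sensors that is OK, but there may be many sensors, and each of them can have NULL values in some rows...

The SQL is generated by the application (written in Python) and uses Postgres 9.4 syntax. Is there a simple way to achieve what I need? I feel I am going down a rather complicated route...

Edit #2: With your input I have produced the SQL below. It again seems rather complicated, but I am open to your ideas and to a review of whether it is reliable and maintainable: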

SELECT ts, sensortotal, sensorcount, 
CASE 
    WHEN sensorcount = 0 THEN -1000 
    ELSE sensortotal/sensorcount 
END AS sensorAvg 

FROM (
    WITH grid as (
      SELECT t5 
      FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5 
       FROM device_history_20865735 
       ) d 
      WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00' 
     ) 
    SELECT d1.t5 || '.000000' as ts 
      , Coalesce(avg(d1.sensorvalue), 0) + Coalesce(avg(d2.sensorvalue),0) as sensorTotal 
      , (CASE 
        WHEN avg(d1.sensorvalue) ISNULL THEN 0 
        ELSE 1 
      END + CASE 
      WHEN avg(d2.sensorvalue) ISNULL THEN 0 
      ELSE 1 
      END) as sensorCount 

    FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
      FROM grid LEFT JOIN 
       device_history_20865735 t 
       ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
      GROUP BY grid.t5 
     ) d1 LEFT JOIN 
     (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
      FROM grid LEFT JOIN 
       device_history_493417852 t 
       ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
     GROUP BY grid.t5 
     ) d2 on d1.t5 = d2.t5 
    GROUP BY d1.t5 
    ORDER BY d1.t5 
) tmp; 

Thanks!

+0

I am not sure how you want to calculate the average, but you could do `(coalesce(sum(t.sensorvalue),0) + coalesce(sum(w.sensorvalue),0)) / (count(t.sensorvalue) + count(w.sensorvalue))`. This can easily be extended to any number of sensors. – dnoeth

+0

Thanks @dnoeth! I need to compute it for each row of the grid, i.e. every 5 minutes, not over the whole column... – Kostas

Answers

0

To get an exact average, you need to compute each sensor's average separately before joining:

WITH grid as (
     SELECT t5 
     FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5 
      FROM device_history_20865735 
      ) d 
     WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00' 
    ) 
SELECT d1.t5 || '.000000' as ts, 
     avg(d1.sensorvalue) as sensorvalue1 
     , avg(d2.sensorvalue) as sensorvalue2 
FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
     FROM grid LEFT JOIN 
      device_history_20865735 t 
      ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
     GROUP BY grid.t5 
    ) d1 LEFT JOIN 
    (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
     FROM grid LEFT JOIN 
      device_history_493417852 t 
      ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
    GROUP BY grid.t5 
    ) d2 on d1.t5 = d2.t5 
GROUP BY d1.t5 
ORDER BY d1.t5; 
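
As a cross-check on the idea of aggregating each sensor first and only then combining, here is a minimal Python sketch. It is an illustration under assumptions, not a replacement for the query: the function names (`bucket_start`, `per_bucket_means`, `combined_means`) and the sample readings are made up, and it goes one step further than the query by also averaging across the sensors that have data in a bucket.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

BUCKET = timedelta(minutes=5)  # same 5-minute grid as the query


def bucket_start(ts, bucket=BUCKET):
    """Round a timestamp down to the start of its bucket."""
    epoch = datetime(1970, 1, 1)
    return ts - ((ts - epoch) % bucket)


def per_bucket_means(readings):
    """Map bucket start -> mean of one sensor's (ts, value) readings."""
    grouped = defaultdict(list)
    for ts, value in readings:
        if value is not None:  # like avg(), skip NULL readings but keep 0
            grouped[bucket_start(ts)].append(value)
    return {start: mean(values) for start, values in grouped.items()}


def combined_means(sensors, grid):
    """For each grid point, average the per-sensor means that exist."""
    per_sensor = [per_bucket_means(readings) for readings in sensors]
    result = {}
    for start in grid:
        present = [means[start] for means in per_sensor if start in means]
        result[start] = mean(present) if present else None
    return result


# Hypothetical sample data: two sensors, three grid points.
t0 = datetime(2015, 5, 13, 9, 0)
sensor_a = [(t0 + timedelta(minutes=1), 20.0), (t0 + timedelta(minutes=3), 22.0)]
sensor_b = [(t0 + timedelta(minutes=2), 0.0), (t0 + timedelta(minutes=6), 18.0)]
grid = [t0 + i * BUCKET for i in range(3)]
averages = combined_means([sensor_a, sensor_b], grid)
```

In this sample, the 09:00 bucket averages both sensors (a reading of 0 counts as a real value), the 09:05 bucket falls back to the only sensor that reported, and the 09:10 bucket stays empty (`None`, the analogue of NULL) because no sensor reported.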
+0

Thanks @Gordon! I get a syntax error, though... ERROR: syntax error at or near "GROUP" LINE 21: GROUP BY d1.t5 ^ – Kostas

+0

There is no relation between d1 and d2 in the LEFT JOIN; the ON condition is missing –

+0

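I have managed to get it running, and the results are exactly the same as with the SQL above... On the averaging question, though, is there a more "elegant" solution? :) – Kostas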

0

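It sounds like you want something like this:

(coalesce(value1, 0) + coalesce(value2, 0) + coalesce(value3, 0)) /
-- NULLIF turns a zero count into NULL, so a row where every value is NULL yields NULL instead of a division-by-zero error
NULLIF((value1 IS NOT NULL)::int + (value2 IS NOT NULL)::int + (value3 IS NOT NULL)::int, 0)
AS average

Basically, just do the math you want for each row. The only "tricky" part is how to "count" the non-NULL values. I used a cast (each IS NOT NULL test is parenthesized so that the cast and the addition apply to the boolean result), but there are other options, such as: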

CASE WHEN value1 IS NULL THEN 0 ELSE 1 END
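
Since the question says the SQL is generated from Python, the expression above could be built for any number of sensors with a small helper. This is only a sketch under assumptions: `row_average_sql` and `row_average` are hypothetical names, the column names are assumed to be plain identifiers chosen by the application (not user input), and the columns are assumed to be non-integer numerics so that the division is not truncated.

```python
import re

# Accept only plain SQL identifiers, since the names are pasted into the query text.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def row_average_sql(columns):
    """Build a SQL expression averaging the non-NULL values among `columns`."""
    if not columns:
        raise ValueError("need at least one column")
    for name in columns:
        if not _IDENT.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    total = " + ".join(f"coalesce({c}, 0)" for c in columns)
    count = " + ".join(f"({c} IS NOT NULL)::int" for c in columns)
    # NULLIF makes the result NULL when no column has a value.
    return f"({total}) / NULLIF({count}, 0)"


def row_average(values):
    """Pure-Python equivalent for one row; None plays the role of NULL."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


expr = row_average_sql(["value1", "value2"])
```

Here `expr` can be dropped into the SELECT list followed by `AS average`, and `row_average` gives a quick way to check individual rows by hand: a stored 0 still counts as a reading, while `None` is skipped.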