2017-06-30

Using Postgres 9.6: how can I efficiently compute statistics over a JSONB array nested inside a JSONB column?

I have this working, but I suspect there is a more efficient way. What is the best way to compute AVG, SUM, etc. over the MyEventLength array?

DROP TABLE IF EXISTS activity; 
DROP SEQUENCE IF EXISTS activity_id_seq; 
CREATE SEQUENCE activity_id_seq; 

CREATE TABLE activity (
    id INT CHECK (id > 0) NOT NULL DEFAULT NEXTVAL ('activity_id_seq'), 
    user_id INT, 
    events JSONB 
); 

INSERT INTO activity (user_id,events) VALUES 
(1, '{"MyEvent":{"MyEventLength":[450,790,1300,5400],"MyEventValue":[334,120,120,940]}}'), 
(1, '{"MyEvent":{"MyEventLength":[12],"MyEventValue":[4]}}'), 
(2, '{"MyEvent":{"MyEventLength":[450,790,1300,5400],"MyEventValue":[334,120,120,940]}}'), 
(1, '{"MyEvent":{"MyEventLength":[1000,2000],"MyEventValue":[450,550]}}'); 

So far, this is the best way I have found to compute the average of the MyEventLength array for user_id 1:

SELECT avg(recs::text::numeric) FROM (
    SELECT jsonb_array_elements(a.event_length) as recs FROM (
     SELECT events->'MyEvent'->'MyEventLength' as event_length from activity 
     WHERE user_id = 1 
    )a 
) b; 

Or this variation:

SELECT avg(recs) FROM (
    SELECT jsonb_array_elements_text(a.event_length)::numeric as recs FROM (
     SELECT events->'MyEvent'->'MyEventLength' as event_length from activity 
     WHERE user_id = 1 
    )a 
) b; 

Is there a better way to do this that doesn't require so many subselects?

Answer


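You need to pass scalar values, one per row, to avg(). Otherwise (if you try to pass it the output of a set-returning function such as jsonb_array_elements_text(..) directly), you will get an error like this: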

ERROR: set-valued function called in context that cannot accept a set 

So you definitely need at least one subquery or CTE.
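
For illustration, this is the kind of statement that triggers the error: the set-returning function is called directly inside the aggregate. It is only a sketch against the activity table from the question. The exact message depends on the server version (the text quoted above is what 9.6 reports; newer major versions also reject the query, but word the error differently).

```sql
-- Sketch: a set-returning function used directly as an aggregate argument.
-- Expected to FAIL on Postgres 9.6 with:
--   ERROR: set-valued function called in context that cannot accept a set
-- Uses the activity table defined in the question.
select avg(jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')::numeric)
from activity
where user_id = 1;
```

The options below avoid this by expanding the array into rows first and aggregating over those rows.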

Option 1, without a CTE:

select avg(v::numeric) 
from (
    select 
    jsonb_array_elements_text(events->'MyEvent'->'MyEventLength') 
    from activity 
    where user_id = 1 
) as a(v); 

Option 2, with a CTE (more readable):

with vals as (
    select 
    jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')::numeric as val 
    from activity 
    where user_id = 1 
) 
select avg(val) 
from vals 
; 

UPDATE, option 3: it turns out you can do this without any nested queries, using an implicit JOIN LATERAL:

select avg(val::text::numeric) 
from activity a, jsonb_array_elements(a.events->'MyEvent'->'MyEventLength') vals(val) 
where user_id = 1; 
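
Since the question also asks about SUM and other aggregates, the same lateral pattern can be extended to compute several statistics per user in one pass. The following is a minimal sketch, assuming the activity table and sample rows from the question; the aliases v, val, n_values, total_length and avg_length are illustrative names, not part of the original. It spells out CROSS JOIN LATERAL explicitly, which should behave the same as the comma form in option 3.

```sql
-- Sketch: several aggregates over MyEventLength, grouped per user.
-- jsonb_array_elements_text() yields one text row per array element;
-- each element is cast to numeric before aggregating.
select
    a.user_id,
    count(*)            as n_values,      -- number of array elements
    sum(v.val::numeric) as total_length,
    avg(v.val::numeric) as avg_length
from activity a
cross join lateral
    jsonb_array_elements_text(a.events->'MyEvent'->'MyEventLength') as v(val)
group by a.user_id
order by a.user_id;

-- With the sample rows from the question:
--   user_id 1 -> 7 elements, total_length 10952
--   user_id 2 -> 4 elements, total_length 7940
```

To restrict the result to a single user, add `where a.user_id = 1` before the `group by`, as in the options above.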

Awesome, thanks! – Clay