2017-04-24 97 views
3

Array intersect as an aggregate function for GROUP BY

I have the following table:

CREATE TABLE person
AS
    SELECT name, preferences
    FROM (VALUES
        ('John', ARRAY['pizza', 'meat']),
        ('John', ARRAY['pizza', 'spaghetti']),
        ('Bill', ARRAY['lettuce', 'pizza']),
        ('Bill', ARRAY['tomatoes'])
    ) AS t(name, preferences);

I want to group by person and use intersect(preferences) as the aggregate function, so I want the following output:

person | preferences 
------------------------------- 
John | ['pizza'] 
Bill | [] 

How should this be done in SQL? I think I need to do something like the following, but what would the X function look like?

SELECT person.name, array_agg(X) 
FROM  person 
LEFT JOIN unnest(preferences) preferences 
ON  true 
GROUP BY name 
+0

Maybe join against unnest(preferences)?

+0

@VaoTsun I think that is a good idea, but how do I intersect over that join (and apply 'array_agg' afterwards)?

+0

Is there a chance that an array contains duplicate values?

Answers

2

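Using FILTER with ARRAY_AGG: count how many arrays each value appears in per name, and keep only the values whose count equals the number of arrays for that name.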

SELECT name, array_agg(pref) FILTER (WHERE namepref = total)
FROM (
    SELECT name, pref, t1.count AS total, count(*) AS namepref
    FROM (
        SELECT name, preferences, count(*) OVER (PARTITION BY name)
        FROM person
    ) AS t1
    CROSS JOIN LATERAL unnest(preferences) AS pref
    GROUP BY name, total, pref
) AS t2
GROUP BY name;

Here is one way to do the same thing using the ARRAY constructor and DISTINCT.

WITH t AS (
    SELECT name, pref, t1.count AS total, count(*) AS namepref
    FROM (
        SELECT name, preferences, count(*) OVER (PARTITION BY name)
        FROM person
    ) AS t1
    CROSS JOIN LATERAL unnest(preferences) AS pref
    GROUP BY name, total, pref
)
SELECT DISTINCT
    name,
    ARRAY(SELECT pref FROM t AS t2 WHERE total = namepref AND t.name = t2.name)
FROM t;
+1

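This will *not* compute the intersection of the arrays; it produces an array containing every preference that occurs more than once. Try the following three records: ('Paul', ARRAY['pizza', 'meat']), ('Paul', ARRAY['pizza', 'salad']) and ('Paul', ARRAY['salad', 'beer']). The result should be empty, but your query produces {pizza,salad}.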

+0

@LaurenzAlbe Fixed.

+0

This will work unless an array contains the same value more than once.
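The duplicate-value caveat in the comment above could be handled by de-duplicating each array before counting. The following is only a sketch under that assumption, not the answerer's query: it reuses the person table from the question, the aliases p, u, d and t are illustrative, and the coalesce call is an added choice so that "nothing in common" yields an empty array rather than NULL.

```sql
-- Sketch: assumes the person(name, preferences) table from the question.
SELECT name,
       -- the filtered aggregate is NULL when no value is shared; coalesce maps that to {}
       coalesce(array_agg(pref) FILTER (WHERE namepref = total), '{}') AS preferences
FROM (
    SELECT p.name, d.pref, p.total, count(*) AS namepref
    FROM (
        SELECT name, preferences,
               count(*) OVER (PARTITION BY name) AS total  -- number of arrays per name
        FROM person
    ) AS p
    -- de-duplicate inside each array so a repeated value counts once per row
    CROSS JOIN LATERAL (
        SELECT DISTINCT u.e FROM unnest(p.preferences) AS u(e)
    ) AS d(pref)
    GROUP BY p.name, p.total, d.pref
) AS t
GROUP BY name;
```

For the sample data this should give {pizza} for John and {} for Bill. A name whose arrays are all empty would still drop out of the result, because the lateral join produces no rows for it.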

2

You can create your own aggregate function:

CREATE OR REPLACE FUNCTION arr_sec_agg_f(anyarray, anyarray) RETURNS anyarray
    LANGUAGE sql IMMUTABLE AS
    'SELECT CASE
              WHEN $1 IS NULL   -- no state yet: start with the first array
              THEN $2
              WHEN $2 IS NULL   -- skip NULL arrays
              THEN $1
              -- ARRAY(...) returns an empty array, not NULL, when nothing is shared,
              -- so an empty intermediate result is not mistaken for the initial state
              ELSE ARRAY(SELECT x FROM unnest($1) a(x)
                         INTERSECT
                         SELECT x FROM unnest($2) a(x))
           END';

CREATE AGGREGATE arr_sec_agg(anyarray) (
    SFUNC = arr_sec_agg_f(anyarray, anyarray),
    STYPE = anyarray
);

SELECT name, arr_sec_agg(preferences)
FROM person
GROUP BY name;

┌──────┬─────────────┐
│ name │ arr_sec_agg │
├──────┼─────────────┤
│ John │ {pizza}     │
│ Bill │ {}          │
└──────┴─────────────┘
(2 rows)
+0

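This is neat, I did not know this was possible. I will keep looking for a plain query, though, because I currently cannot change the schema. What should I do in this case? Accept this answer and ask again, noting that I cannot create my own functions and am therefore looking for a simple query?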

+0

No, just don't accept my answer.

1

If writing a custom aggregate (like the one @LaurenzAlbe provided) is not an option for you, you can usually express the same logic in a recursive CTE:

with recursive cte(name, pref_intersect, pref_prev, iteration) as (
    select   name,
             min(preferences),
             min(preferences),
             0
    from     your_table
    group by name
    union all
    select   name,
             array(select e from unnest(pref_intersect) e
                   intersect
                   select e from unnest(pref_next) e),
             pref_next,
             iteration + 1
    from     cte,
    lateral  (select   your_table.preferences pref_next
              from     your_table
              where    your_table.name        = cte.name
              and      your_table.preferences > cte.pref_prev
              order by your_table.preferences
              limit    1) n
)
select distinct on (name) name, pref_intersect
from     cte
order by name, iteration desc

http://rextester.com/ZQMGW66052

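The main idea here is to find an ordering in which you can "walk" through your rows. I used the natural ordering of the preferences arrays (because not many columns were shown). Ideally this ordering should be on a unique field (preferably the primary key), but here it is good enough: rows with a duplicate preferences value are skipped by the strict > comparison, and skipping them does not change the result of the intersection.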
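When no unique column is available, the same "walk" can be sketched by generating a unique position with row_number(). This is an illustrative variant rather than the query above: it assumes the person table from the question, and the names numbered, walk and rn are made up for the example.

```sql
-- Sketch: assumes the person(name, preferences) table from the question;
-- "numbered", "walk" and "rn" are illustrative names.
WITH RECURSIVE numbered AS (
    -- give every row a unique position within its name group
    SELECT name, preferences,
           row_number() OVER (PARTITION BY name ORDER BY preferences) AS rn
    FROM person
), walk(name, pref_intersect, rn) AS (
    -- start from the first array of each group
    SELECT name, preferences, rn
    FROM numbered
    WHERE rn = 1
    UNION ALL
    -- intersect the running result with the next array in the group
    SELECT n.name,
           ARRAY(SELECT a.e FROM unnest(w.pref_intersect) AS a(e)
                 INTERSECT
                 SELECT b.e FROM unnest(n.preferences) AS b(e)),
           n.rn
    FROM walk AS w
    JOIN numbered AS n ON n.name = w.name AND n.rn = w.rn + 1
)
-- keep only the last step of each group
SELECT DISTINCT ON (name) name, pref_intersect
FROM walk
ORDER BY name, rn DESC;
```

Because rn is unique within each name, every row is visited exactly once, including rows whose arrays are identical.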