2017-08-01 93 views

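Transpose rows into columns in BigQuery using standard SQL

Good morning,

I am trying to transpose some data in BigQuery. I have looked at other questions on Stack Overflow that ask the same thing, but the answers all seem to rely on legacy SQL (using GROUP_CONCAT_UNQUOTED) rather than standard SQL. I would use legacy SQL, but I have had problems with nested data in the past, so I have only used standard SQL since then.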

Here is my example. To give some context, I am trying to map out customer journeys, and my data looks like this:

uniqueid | page_flag  | order_of_pages
A        | Collection | 1
A        | Product    | 2
A        | Product    | 3
A        | Login      | 4
A        | Delivery   | 5
B        | Clearance  | 1
B        | Search     | 2
B        | Product    | 3
C        | Search     | 1
C        | Collection | 2
C        | Product    | 3

However, I want to transpose the data so that it looks like this:

uniqueid | 1          | 2          | 3       | 4     | 5
A        | Collection | Product    | Product | Login | Delivery
B        | Clearance  | Search     | Product | NULL  | NULL
C        | Search     | Collection | Product | NULL  | NULL

I have tried doing this with multiple left joins, but I get the error shown after the query:

select a.uniqueid, 
b.page_flag as page1, 
c.page_flag as page2, 
d.page_flag as page3, 
e.page_flag as page4, 
f.page_flag as page5 

from 

(select distinct uniqueid, 
(case when uniqueid is not null then 1 end) as page_hit1, 
(case when uniqueid is not null then 2 end) as page_hit2, 
(case when uniqueid is not null then 3 end) as page_hit3, 
(case when uniqueid is not null then 4 end) as page_hit4, 
(case when uniqueid is not null then 5 end) as page_hit5 
from `mytable`) a 

LEFT JOIN (
SELECT * 
from `mytable`) b on a.uniqueid = b.uniqueid 
and a.page_hit1 = b.order_of_pages 


LEFT JOIN (
SELECT * 
from `mytable`) c on a.uniqueid = c.uniqueid 
and a.page_hit2 = c.order_of_pages 


LEFT JOIN (
SELECT * 
from `mytable`) d on a.uniqueid = d.uniqueid 
and a.page_hit3 = d.order_of_pages 


LEFT JOIN (
SELECT * 
from `mytable`) e on a.uniqueid = e.uniqueid 
and a.page_hit4 = e.order_of_pages 


LEFT JOIN (
SELECT * 
from `mytable`) f on a.uniqueid = f.uniqueid 
and a.page_hit5 = f.order_of_pages 



Error: Query exceeded resource limits for tier 1. Tier 13 or higher required. 

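I have looked at the array functions, but I have never used them and I am not sure whether they are the right way to transpose data. Any advice would be grand.

Thanks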

Answer


For BigQuery Standard SQL

#standardSQL 
SELECT 
    uniqueid, 
    MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1, 
    MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2, 
    MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3, 
    MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4, 
    MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5 
FROM `mytable` 
GROUP BY uniqueid 

You can test or play with it using the dummy data from your question, as below

#standardSQL 
WITH `mytable` AS (
    SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL 
    SELECT 'A', 'Product', 2 UNION ALL 
    SELECT 'A', 'Product', 3 UNION ALL 
    SELECT 'A', 'Login', 4 UNION ALL 
    SELECT 'A', 'Delivery', 5 UNION ALL 
    SELECT 'B', 'Clearance', 1 UNION ALL 
    SELECT 'B', 'Search', 2 UNION ALL 
    SELECT 'B', 'Product', 3 UNION ALL 
    SELECT 'C', 'Search', 1 UNION ALL 
    SELECT 'C', 'Collection', 2 UNION ALL 
    SELECT 'C', 'Product', 3 
) 
SELECT 
    uniqueid, 
    MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1, 
    MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2, 
    MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3, 
    MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4, 
    MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5 
FROM `mytable` 
GROUP BY uniqueid 
ORDER BY uniqueid 

The result is

uniqueid  p1          p2          p3       p4     p5
A         Collection  Product     Product  Login  Delivery
B         Clearance   Search      Product  null   null
C         Search      Collection  Product  null   null
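
As a side note, BigQuery versions released after this answer was written also provide a PIVOT operator, which expresses the same conditional aggregation more compactly. The following is only a sketch under that assumption: it reuses the dummy data and the `p1` to `p5` aliases from above, and the list of pivot values still has to be written out by hand.

```sql
#standardSQL
WITH `mytable` AS (
    SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL
    SELECT 'A', 'Product', 2 UNION ALL
    SELECT 'A', 'Product', 3 UNION ALL
    SELECT 'A', 'Login', 4 UNION ALL
    SELECT 'A', 'Delivery', 5 UNION ALL
    SELECT 'B', 'Clearance', 1 UNION ALL
    SELECT 'B', 'Search', 2 UNION ALL
    SELECT 'B', 'Product', 3 UNION ALL
    SELECT 'C', 'Search', 1 UNION ALL
    SELECT 'C', 'Collection', 2 UNION ALL
    SELECT 'C', 'Product', 3
)
SELECT *
FROM (
    -- Keep only the grouping column, the value to spread, and the pivot key.
    SELECT uniqueid, page_flag, order_of_pages
    FROM `mytable`
)
PIVOT (
    -- One output column per listed order_of_pages value; missing positions become NULL.
    MAX(page_flag) FOR order_of_pages IN (1 AS p1, 2 AS p2, 3 AS p3, 4 AS p4, 5 AS p5)
)
ORDER BY uniqueid
```

Any column of the subquery that is neither aggregated nor used as the pivot key (here only `uniqueid`) acts as a grouping column, so the subquery should select just the columns you need.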

Depending on your needs, you can also consider the approach below (it does not pivot, though)

#standardSQL 
SELECT uniqueid, 
    STRING_AGG(page_flag, '>' ORDER BY order_of_pages) AS journey 
FROM `mytable` 
GROUP BY uniqueid 
ORDER BY uniqueid 

If run with the same dummy data as above, the result is

uniqueid  journey
A         Collection>Product>Product>Login>Delivery
B         Clearance>Search>Product
C         Search>Collection>Product

Excellent, thanks again, Mikhail. Both approaches work perfectly.