2017-06-05 77 views
1

Querying BigQuery with standard SQL

I have this table:

client_id session_id time action transaction_id 
1 1 15:01 view NULL  
1 1 15:02 basket NULL  
1 1 15:03 basket NULL  
1 1 15:04 purchase 1 
1 2 15:05 basket NULL  
1 2 15:06 purchase 2 
1 2 15:07 view NULL  

Within a session, I want all previous actions to register the transaction_id at their first occurrence (which is why transaction_id = NULL at 15:03):

session_id time transaction_id 
1 15:01 1 
1 15:02 1 
1 15:03 NULL  
1 15:04 1 
2 15:05 2 
2 15:06 2 
2 15:07 NULL  

Answers

1

Below is a solution for BigQuery standard SQL:

#standardSQL 
SELECT 
    client_id, session_id, time, action, 
    (CASE 
    WHEN ROW_NUMBER() 
     OVER (PARTITION BY client_id, session_id, grp, action ORDER BY time) = 1 
    THEN MAX(transaction_id) OVER (PARTITION BY client_id, session_id, grp) END 
) AS transaction_id 
FROM (
    SELECT *, 
    COUNTIF(transaction_id IS NOT NULL) 
     OVER(PARTITION BY client_id, session_id 
     ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp 
    FROM YourTable 
) 
-- ORDER BY client_id, session_id, time 

You can test it with dummy data as below:

#standardSQL 
WITH YourTable AS (
    SELECT 1 AS client_id, 1 AS session_id, '15:01' AS time, 'view' AS action, NULL AS transaction_id UNION ALL 
    SELECT 1, 1, '15:02', 'basket', NULL UNION ALL 
    SELECT 1, 1, '15:03', 'basket', NULL UNION ALL 
    SELECT 1, 1, '15:04', 'purchase', 1 UNION ALL 
    SELECT 1, 1, '15:05', 'basket', NULL UNION ALL 
    SELECT 1, 1, '15:06', 'basket', NULL UNION ALL 
    SELECT 1, 1, '15:07', 'purchase', 3 UNION ALL 
    SELECT 1, 2, '15:08', 'basket', NULL UNION ALL 
    SELECT 1, 2, '15:09', 'purchase', 2 UNION ALL 
    SELECT 1, 2, '15:10', 'view', NULL 
) 
SELECT 
    client_id, session_id, time, action, 
    (CASE 
    WHEN ROW_NUMBER() 
     OVER (PARTITION BY client_id, session_id, grp, action ORDER BY time) = 1 
    THEN MAX(transaction_id) OVER (PARTITION BY client_id, session_id, grp) END 
) AS transaction_id 
FROM (
    SELECT *, 
    COUNTIF(transaction_id IS NOT NULL) 
     OVER(PARTITION BY client_id, session_id 
     ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp 
    FROM YourTable 
) 
-- ORDER BY client_id, session_id, time 

The output is as expected:

client_id session_id time action  transaction_id 
1   1   15:01 view  1  
1   1   15:02 basket  1  
1   1   15:03 basket  null  
1   1   15:04 purchase 1  
1   1   15:05 basket  3  
1   1   15:06 basket  null  
1   1   15:07 purchase 3  
1   2   15:08 basket  2  
1   2   15:09 purchase 2  
1   2   15:10 view  null  
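
For readers who want to trace the logic without a BigQuery project, here is a minimal plain-Python sketch of what the query above appears to do, run on the question's original seven rows. The helper name `backfill` and the tuple layout are assumptions made for illustration; this is an imitation of the window logic, not BigQuery itself. `grp` counts the non-NULL transaction_ids in strictly earlier rows of the session, so each purchase closes its group, and only the first occurrence of each action inside a group receives the group's transaction_id.

```python
from itertools import groupby

# Rows from the question: (client_id, session_id, time, action, transaction_id)
rows = [
    (1, 1, "15:01", "view", None),
    (1, 1, "15:02", "basket", None),
    (1, 1, "15:03", "basket", None),
    (1, 1, "15:04", "purchase", 1),
    (1, 2, "15:05", "basket", None),
    (1, 2, "15:06", "purchase", 2),
    (1, 2, "15:07", "view", None),
]


def backfill(rows):
    """Hypothetical helper imitating the grp / ROW_NUMBER / MAX logic."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))

    # Step 1: grp = count of non-NULL transaction_ids in earlier rows
    # of the same (client_id, session_id), excluding the current row.
    tagged = []
    for _, session in groupby(ordered, key=lambda r: (r[0], r[1])):
        seen = 0
        for r in session:
            tagged.append((r, seen))
            if r[4] is not None:
                seen += 1

    # Step 2: MAX(transaction_id) per (client_id, session_id, grp).
    group_max = {}
    for r, grp in tagged:
        if r[4] is not None:
            key = (r[0], r[1], grp)
            group_max[key] = max(group_max.get(key, r[4]), r[4])

    # Step 3: only the first occurrence of an action in a group gets the id.
    first_seen = set()
    out = []
    for r, grp in tagged:
        key = (r[0], r[1], grp)
        action_key = key + (r[3],)
        is_first = action_key not in first_seen
        first_seen.add(action_key)
        out.append((r[1], r[2], group_max.get(key) if is_first else None))
    return out


result = backfill(rows)
for row in result:
    print(row)
```

On these rows the sketch reproduces the result requested in the question, including NULL for the repeated basket at 15:03 and for the view at 15:07, which falls in a group with no later purchase.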

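Thank you very much for your answer! How would the code change if there is no transaction in session_id = 1, but the first "view" (or another action) occurs in that first session, so that it should show transaction_id = 2 instead? – Zzema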


@Zzema - I don't see why the code would need to change. It still produces the result you expect (according to your question). Have you actually tried it?


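Yes, I tried it, thanks :) My comment was about a changed condition that was not written in the question... But after reading up on window functions, I figured out how to rewrite your answer. Thanks again – Zzema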

3

Hmmm . . . Assuming there can be only one transaction id per session, you can use window functions:

select t.*,
       (case when row_number() over (partition by client_id, session_id, action
                                     order by time) = 1
             then max(transaction_id) over (partition by client_id, session_id)
        end) as new_transaction_id
from t
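
As a rough check of this simpler variant, the following plain-Python sketch imitates it on the question's rows. The helper name `backfill_per_session` and the tuple layout are assumptions made for illustration. Because the maximum is taken over the whole session rather than over the rows leading up to a purchase, an action that first appears after the purchase still receives the session's transaction_id.

```python
# Rows from the question: (client_id, session_id, time, action, transaction_id)
rows = [
    (1, 1, "15:01", "view", None),
    (1, 1, "15:02", "basket", None),
    (1, 1, "15:03", "basket", None),
    (1, 1, "15:04", "purchase", 1),
    (1, 2, "15:05", "basket", None),
    (1, 2, "15:06", "purchase", 2),
    (1, 2, "15:07", "view", None),
]


def backfill_per_session(rows):
    """Hypothetical helper: first occurrence of each action per session
    gets the session-wide MAX(transaction_id)."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))

    # MAX(transaction_id) per (client_id, session_id), ignoring NULLs.
    session_max = {}
    for r in ordered:
        if r[4] is not None:
            key = (r[0], r[1])
            session_max[key] = max(session_max.get(key, r[4]), r[4])

    # ROW_NUMBER() = 1 per (client_id, session_id, action), ordered by time.
    seen_actions = set()
    out = []
    for r in ordered:
        key = (r[0], r[1])
        action_key = key + (r[3],)
        is_first = action_key not in seen_actions
        seen_actions.add(action_key)
        out.append((r[1], r[2], session_max.get(key) if is_first else None))
    return out


result_simple = backfill_per_session(rows)
for row in result_simple:
    print(row)
```

Under these assumptions the sketch matches the requested output for every row except the view at 15:07 in session 2, which comes out as 2 rather than NULL. If rows after the purchase must stay NULL, the grouping has to stop at each purchase.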

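Thank you very much for your answer! How would the code change if there is no transaction in session_id = 1, but the first "view" (or another action) occurs in that first session, so that it should show transaction_id = 2 instead? – Zzema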


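@Zzema . . . If there is no transaction in a session, then the value is NULL, as specified by your question: "And I want, inside the session, all previous actions to register the transaction_id at first occurrence".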
