2017-04-14 57 views

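I want to create channel paths at the user level in BigQuery. A path should end when a transaction occurs, and the next visit should start a new path. Currently each user has a single path that sums up all transactions. See the code below. I have also included the current output table and the desired output table. How can I create channel paths in BigQuery based on events?

My idea is to create a new column that counts transactions. The value would start at 0 and increase by 1 after each transaction. I would then combine this value with the user_id value and group the aggregated string by that combination. But I don't know how to do this.

Thanks in advance!

Guido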

#standardSQL 
WITH yourTable AS (
    SELECT 1 AS user_id,'1a' as visit_id, '2017-01-01 14:10:12' AS DATETIME, 
'google cpc' AS channelgrouping, 0 AS transaction , 1 as visit UNION ALL 
    SELECT 1, '1b', '2017-01-01 20:10:12', 'email', 1, 1 UNION ALL 
    SELECT 1, '1c','2017-01-03 08:10:12', 'direct', 0, 1 UNION ALL 
    SELECT 1, '1d','2017-01-04 13:10:14', 'organic', 1, 1 
) 
SELECT 
    user_id, 
    STRING_AGG(channelgrouping, ' > ' ORDER BY DATETIME) AS channelgrouping_path, 
    SUM(transaction) AS transaction, 
    SUM(visit) AS visits 
FROM yourTable 
GROUP BY user_id 

Output table

user_id | channelgrouping_path                  | Transactions | Visits
1       | google cpc > email > direct > organic | 2            | 4

Desired output table

user_id | channelgrouping_path | Transactions | Visits
1       | google cpc > email   | 1            | 2
1       | direct > organic     | 1            | 2

Answer

Try the following:

#standardSQL 
WITH yourTable AS (
    SELECT 1 AS user_id,'1a' AS visit_id, '2017-01-01 14:10:12' AS DATETIME, 
'google cpc' AS channelgrouping, 0 AS transaction , 1 AS visit UNION ALL 
    SELECT 1, '1b', '2017-01-01 20:10:12', 'email', 1, 1 UNION ALL 
    SELECT 1, '1c','2017-01-03 08:10:12', 'direct', 0, 1 UNION ALL 
    SELECT 1, '1d','2017-01-04 13:10:14', 'organic', 1, 1 
) 
SELECT 
    user_id, 
    STRING_AGG(channelgrouping, ' > ' ORDER BY DATETIME) AS channelgrouping_path, 
    SUM(transaction) AS transaction, 
    SUM(visit) AS visits 
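FROM (
    SELECT
        *,
        -- grp = number of transactions by this user before the current row;
        -- NULL for the user's first row, because the frame is empty
        SUM(transaction) OVER (
            PARTITION BY user_id ORDER BY datetime
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS grp
    FROM yourTable
)
GROUP BY user_id, IFNULL(grp, 0)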

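Thanks again Mikhail, this is exactly what I was looking for. Could you explain the `ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING` clause? And why did you add `IFNULL(grp, 0)` to the GROUP BY? – gvkleef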


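Check the analytic functions documentation at https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#analytic-functions, especially the Window Frame Clause. `ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING` performs the SUM over all rows before the current row. `IFNULL(grp, 0)` is needed because the first row has no preceding rows, so the sum would be NULL, and we need to "translate" it to 0 –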
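
To make the frame semantics above concrete, here is a minimal plain-Python emulation of the query on the question's sample rows. It is only a sketch, not BigQuery code: the helper names `add_grp` and `build_paths` are hypothetical, and `None` stands in for SQL NULL.

```python
# Sample rows from the question:
# (user_id, datetime, channelgrouping, transaction, visit)
rows = [
    (1, "2017-01-01 14:10:12", "google cpc", 0, 1),
    (1, "2017-01-01 20:10:12", "email", 1, 1),
    (1, "2017-01-03 08:10:12", "direct", 0, 1),
    (1, "2017-01-04 13:10:14", "organic", 1, 1),
]


def add_grp(rows):
    """Emulate SUM(transaction) OVER (PARTITION BY user_id ORDER BY datetime
    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) by appending grp to each row."""
    out = []
    running = {}  # user_id -> sum of transaction over the rows already seen
    for row in sorted(rows, key=lambda r: (r[0], r[1])):
        user_id, transaction = row[0], row[3]
        # The frame excludes the current row, so a user's first row has an
        # empty frame and gets None (SQL NULL).
        grp = running.get(user_id)
        out.append(row + (grp,))
        running[user_id] = (grp or 0) + transaction
    return out


def build_paths(rows):
    """Emulate GROUP BY user_id, IFNULL(grp, 0) with STRING_AGG and the two SUMs."""
    paths = {}
    for user_id, _dt, channel, transaction, visit, grp in add_grp(rows):
        key = (user_id, 0 if grp is None else grp)  # IFNULL(grp, 0)
        entry = paths.setdefault(key, {"channels": [], "transaction": 0, "visits": 0})
        entry["channels"].append(channel)  # rows arrive ordered by datetime
        entry["transaction"] += transaction
        entry["visits"] += visit
    return [
        (user_id, " > ".join(e["channels"]), e["transaction"], e["visits"])
        for (user_id, _grp), e in paths.items()
    ]


for path in build_paths(rows):
    print(path)
```

For the sample data, `grp` comes out as NULL, 0, 1, 1. After `IFNULL` the first two rows share key 0 and the last two share key 1, which yields the two paths from the desired output table.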


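Do you happen to know how I can get the visit timestamps of the first and last channel in a path? I thought I would ask you first before creating a new question. – gvkleef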