2017-07-26

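Recreating a GA funnel in BigQuery

I am trying to recreate a GA funnel (a custom report in Google 360) using BigQuery. The funnel in GA uses a unique count of the events that happen on each page. I found this query online, which works in most cases: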

SELECT
    COUNT(s0.firstHit) AS Landing_Page,
    COUNT(s1.firstHit) AS Model_Selection
FROM (
    SELECT
        s0.fullvisitorID,
        s0.firstHit,
        s1.firstHit
    FROM (
        # Begin Subquery #1 aka s0
        SELECT
            fullvisitorID,
            MIN(hits.hitNumber) AS firstHit
        FROM [64269470.ga_sessions_20170720]
        WHERE
            hits.eventInfo.eventAction IN ('landing_page')
            AND totals.visits = 1
        GROUP BY
            fullvisitorID
    ) s0
    # End Subquery #1 aka s0

    LEFT JOIN (
        # Begin Subquery #2 aka s1
        SELECT
            fullvisitorID,
            MIN(hits.hitNumber) AS firstHit
        FROM [64269470.ga_sessions_20170720]
        WHERE
            hits.eventInfo.eventAction IN ('model_selection_page')
            AND totals.visits = 1
        GROUP BY
            fullvisitorID
    ) s1
    ON
        s0.fullvisitorID = s1.fullvisitorID
)

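The query runs fine, and for Landing_Page I get the same value as in GA, but Model_Selection is about 10% higher. The discrepancy also grows with each further step of the funnel (I have only posted two steps for clarity). Any idea what I am missing here?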

Answer

This query does what you need, but in Standard SQL:

#standardSQL 
SELECT 
    SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page, 
    SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection 
FROM `64269470.ga_sessions_20170720` 

That's it. 4 lines, way faster and cheaper.

You can also play with mock data, like so:

#standardSQL 
WITH data AS(
    SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL 
    SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL 
    SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits UNION ALL 
    SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits 
) 

SELECT 
    SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page, 
    SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection 
FROM data 
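
To check what the query above should return for the mock data, here is a minimal Python sketch of the same counting logic. It is only a model of the query, not BigQuery code: the `sessions` list mirrors the four rows of the `data` CTE, and the function name `hit_level_counts` is my own.

```python
# Mock sessions mirroring the four rows of the `data` CTE:
# each session keeps its event actions in hit order.
sessions = [
    {"fullvisitorid": "1", "hits": ["landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "model_selection_page"]},
    {"fullvisitorid": "1", "hits": ["model_selection_page", "model_selection_page"]},
]


def hit_level_counts(sessions):
    """Count every matching hit, like SUM((SELECT COUNTIF(...) FROM UNNEST(hits)))."""
    landing = 0
    model_selection = 0
    for session in sessions:
        hits = session["hits"]
        landing += hits.count("landing_page")
        # Mirrors the EXISTS filter: 'model_selection_page' hits only count
        # in sessions that also contain at least one 'landing_page' hit.
        if "landing_page" in hits:
            model_selection += hits.count("model_selection_page")
    return landing, model_selection


print(hit_level_counts(sessions))  # (4, 1)
```

The fourth session contributes nothing to Model_Selection because it never fired 'landing_page', so the totals are 4 landing hits and 1 model selection hit.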

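Note that building this type of report in GA can be a bit tricky when you need to select visitors who fired the event 'landing_page' at least once and then fired the event 'model_selection_page'. Make sure you have built this report correctly in GA (one way might be to first build a custom report with only the customers for whom 'landing_page' was fired, and then apply a second filter looking for 'model_selection_page').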
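
One thing worth keeping in mind about the "and then" part: the `EXISTS` filter in the query only checks that 'landing_page' was fired somewhere in the same session, not that it came first. The Python sketch below contrasts the two readings. It is an illustration under my own assumptions: the two sessions are hypothetical, and `fired_in_order` is a helper name I made up.

```python
def fired_in_order(hits, first, second):
    """True if `second` occurs somewhere after the first occurrence of `first`."""
    if first not in hits:
        return False
    start = hits.index(first)
    return second in hits[start + 1:]


# Hypothetical sessions, each a list of event actions in hit order.
sessions = [
    ["landing_page", "model_selection_page"],  # landing first
    ["model_selection_page", "landing_page"],  # landing fired afterwards
]

# Presence only, which is what the EXISTS filter checks.
presence_count = sum(
    1 for hits in sessions
    if "landing_page" in hits and "model_selection_page" in hits
)

# Order-aware variant: 'model_selection_page' must follow 'landing_page'.
ordered_count = sum(
    1 for hits in sessions
    if fired_in_order(hits, "landing_page", "model_selection_page")
)

print(presence_count, ordered_count)  # 2 1
```

Whether the order matters depends on how the funnel is defined in GA, so treat this as something to check rather than as the cause of any particular mismatch.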

[EDIT]:

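In your comment you asked about doing this count at the session and user level. For a per-session count, you can limit the result of each subquery evaluation to 1, like so: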

SELECT 
    SUM((SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page, 
    SUM((SELECT 1 FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection 
FROM data 
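
As a sanity check, here is a small Python sketch of what the per-session version counts on the same mock data. Again this is only a model of the query: `sessions` mirrors the `data` CTE and `session_level_counts` is a name I chose.

```python
# Mock sessions mirroring the four rows of the `data` CTE.
sessions = [
    {"fullvisitorid": "1", "hits": ["landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "model_selection_page"]},
    {"fullvisitorid": "1", "hits": ["model_selection_page", "model_selection_page"]},
]


def session_level_counts(sessions):
    """Each session contributes at most 1 per step, like the LIMIT 1 subqueries."""
    landing = sum(1 for s in sessions if "landing_page" in s["hits"])
    model_selection = sum(
        1 for s in sessions
        if "landing_page" in s["hits"] and "model_selection_page" in s["hits"]
    )
    return landing, model_selection


print(session_level_counts(sessions))  # (3, 1)
```

The second session fired 'landing_page' twice but is counted once, which is the difference from the hit-level totals.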

The idea for counting distinct users is the same, but you have to apply a COUNT(DISTINCT) operation, like so:

SELECT 
    COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page, 
    COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection 
FROM data 
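
The user-level version can be modelled in Python with sets. In this sketch the first four sessions mirror the `data` CTE; the fifth session for visitor '2' is a hypothetical row I added so that the distinct count is not trivially 1, and `user_level_counts` is my own name.

```python
sessions = [
    {"fullvisitorid": "1", "hits": ["landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "model_selection_page"]},
    {"fullvisitorid": "1", "hits": ["model_selection_page", "model_selection_page"]},
    # Hypothetical extra visitor, not part of the mock data in the answer.
    {"fullvisitorid": "2", "hits": ["landing_page"]},
]


def user_level_counts(sessions):
    """Count distinct visitors per step, like COUNT(DISTINCT (SELECT fullvisitorid ...))."""
    landing_users = set()
    model_selection_users = set()
    for s in sessions:
        if "landing_page" in s["hits"]:
            landing_users.add(s["fullvisitorid"])
            # Same session must contain both events, as in the EXISTS filter.
            if "model_selection_page" in s["hits"]:
                model_selection_users.add(s["fullvisitorid"])
    return len(landing_users), len(model_selection_users)


print(user_level_counts(sessions))  # (2, 1)
```

With only the four original rows, both counts would be 1, since every row belongs to visitor '1'.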

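Hi Willian, thanks for your answer. That's an interesting approach you've used. Quick question, though: how would I use this structure to distinguish between users and sessions? It looks like it is counting totals. Thanks! – Jacob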

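@Jacob Thanks to another question that referenced this one, I found your comment; sorry it took so long to reply. I've edited my answer, and I hope this is what you were looking for. Let me know if it works :) –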