2013-05-02 39 views

I was playing with BigQuery and ran into a problem: my query fails with "Response too large to return", even with LIMIT 1:

SELECT * FROM (
SELECT a.title, a.counter , MAX(b.num_characters) as max 
FROM (
    SELECT title, count(*) as counter FROM publicdata:samples.wikipedia 
    GROUP EACH BY title 
    ORDER BY counter DESC 
    LIMIT 10 
) a JOIN 
(SELECT title,num_characters FROM publicdata:samples.wikipedia 
) b ON a.title = b.title 
GROUP BY a.title, a.counter) 
LIMIT 1; 

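The query is valid, but I get "Response too large to return". The first subquery runs fine on its own. What I want to do is get more columns, but that is where it fails.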

Answer

Don't worry about the "LIMIT 1": the response becomes too large before the query ever reaches that stage.

Try skipping the second subquery, since it only selects 2 columns from the large dataset without filtering it at all. A working alternative is:

SELECT 
    a.title, a.counter, MAX(b.num_characters) AS max 
FROM 
    publicdata:samples.wikipedia b JOIN(
    SELECT 
    title, COUNT(*) AS counter 
    FROM 
    publicdata:samples.wikipedia 
    GROUP EACH BY title 
    ORDER BY 
    counter DESC 
    LIMIT 10) a 
    ON a.title = b.title 
GROUP BY 
    a.title, 
    a.counter 

This runs in 15.4 seconds.
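
To make the shape of that query easier to see, here is a minimal sketch that runs the same pattern (top-N subquery joined straight back to the base table) on a toy table. It uses Python's sqlite3 module rather than BigQuery, so the table name `wikipedia`, the sample rows, the alias `max_chars`, and the `LIMIT 2` are all assumptions made for illustration, and SQLite has no `GROUP EACH BY`, so plain `GROUP BY` stands in for it.

```python
import sqlite3

# Hypothetical toy rows standing in for publicdata:samples.wikipedia:
# one row per revision, with the title and the revision's length.
rows = [
    ("A", 100), ("A", 250), ("A", 180),
    ("B", 90), ("B", 300),
    ("C", 40),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wikipedia (title TEXT, num_characters INTEGER)")
conn.executemany("INSERT INTO wikipedia VALUES (?, ?)", rows)

# Same structure as the query above: find the most frequent titles in a
# subquery, then join the base table directly (no unfiltered wrapper
# subquery) to pick up MAX(num_characters) for just those titles.
query = """
SELECT a.title, a.counter, MAX(b.num_characters) AS max_chars
FROM wikipedia b
JOIN (
    SELECT title, COUNT(*) AS counter
    FROM wikipedia
    GROUP BY title
    ORDER BY counter DESC
    LIMIT 2
) a ON a.title = b.title
GROUP BY a.title, a.counter
ORDER BY a.counter DESC
"""
result = conn.execute(query).fetchall()
print(result)
```

This only shows the logic of the join. It says nothing about BigQuery's response-size limits or timings, which depend on how BigQuery executes the query.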

We can make it faster by using TOP():

SELECT 
    a.title title, counter, MAX(num_characters) max 
FROM 
    publicdata:samples.wikipedia b 
JOIN 
    (
    SELECT 
    TOP(title, 10) AS title, COUNT(*) AS counter 
    FROM 
    publicdata:samples.wikipedia 
    ) a 
    ON a.title=b.title 
GROUP BY 
    title, counter 

TOP() works as a simpler and faster alternative to the SELECT COUNT(*) / GROUP BY / LIMIT pattern.

https://developers.google.com/bigquery/docs/query-reference#top-function

Now it runs in only 6.5 s, processing 15.9 GB.
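
As a rough illustration of what TOP() replaces, the sketch below computes "the n most frequent values with their counts" in two ways on a toy list: once spelled out as count, group, sort, and limit, and once with a single helper call. This is plain Python, not BigQuery. The function names `top_by_group` and `top_shortcut` and the sample titles are hypothetical, and the sketch does not model how BigQuery implements TOP() internally.

```python
from collections import Counter

# Hypothetical sample of the "title" column.
titles = ["A", "B", "A", "C", "A", "B"]


def top_by_group(values, n):
    """Spelled-out version: COUNT(*) per group, ORDER BY count DESC, LIMIT n."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    # Sort by descending count, breaking ties by value for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:n]


def top_shortcut(values, n):
    """One-call version, playing the role that TOP(title, n) plays in the query."""
    return Counter(values).most_common(n)


long_form = top_by_group(titles, 2)
short_form = top_shortcut(titles, 2)
print(long_form, short_form)
```

On this sample both functions return the same two titles with the same counts. The point is only that the one-call form expresses the same intent with less to write.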